Getting dataset metadata using INFORMATION_SCHEMA

INFORMATION_SCHEMA is a series of views that provide access to metadata about datasets, tables, and views.

You can query the INFORMATION_SCHEMA.SCHEMATA and INFORMATION_SCHEMA.SCHEMATA_OPTIONS views to retrieve metadata about datasets in a project.

Required permissions

To get information or metadata about a dataset, you must be assigned the dataset-level READER role, or you must be assigned a project-level IAM role that includes bigquery.datasets.get permissions. All predefined, project-level IAM roles include bigquery.datasets.get permissions except for bigquery.jobUser.

For more information on IAM roles and permissions in BigQuery, see Access Control. For more information on dataset-level roles, see Primitive roles for datasets.

SCHEMATA view

When you query the INFORMATION_SCHEMA.SCHEMATA view, the query results contain one row for each dataset in a project to which the current user has access.

The INFORMATION_SCHEMA.SCHEMATA view has the following schema:

Column name Data type Value
CATALOG_NAME STRING The name of the project that contains the dataset
SCHEMA_NAME STRING The dataset's name also referred to as the datasetId
SCHEMA_OWNER STRING The value is always NULL
CREATION_TIME TIMESTAMP The dataset's creation time
LAST_MODIFIED_TIME TIMESTAMP The dataset's last modified time
LOCATION STRING The dataset's geographic location

For more information on dataset properties, see the dataset resource page in the REST API documentation.

Examples

The following example retrieves all columns from the INFORMATION_SCHEMA.SCHEMATA view except for schema_owner which is reserved for future use. The metadata returned is for all datasets in the default project — myproject.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.INFORMATION_SCHEMA.[VIEW] for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     * EXCEPT(schema_owner)
    FROM
     INFORMATION_SCHEMA.SCHEMATA
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * EXCEPT(schema_owner) FROM INFORMATION_SCHEMA.SCHEMATA'

The results should look like the following:

  +----------------+---------------+---------------------+---------------------+-----------------+
  |  catalog_name  |  schema_name  |    creation_time    | last_modified_time  |    location     |
  +----------------+---------------+---------------------+---------------------+-----------------+
  | myproject      | mydataset1    | 2018-11-07 19:50:24 | 2018-11-07 19:50:24 | US              |
  | myproject      | mydataset2    | 2018-07-16 04:24:22 | 2018-07-16 04:24:22 | US              |
  | myproject      | mydataset3    | 2018-02-07 21:08:45 | 2018-05-01 23:32:53 | asia-northeast1 |
  +----------------+---------------+---------------------+---------------------+-----------------+
  

SCHEMATA_OPTIONS view

When you query the INFORMATION_SCHEMA.SCHEMATA_OPTIONS view, the query results contain one row for each dataset in a project to which the current user has access.

The INFORMATION_SCHEMA.SCHEMATA_OPTIONS view has the following schema:

Column name Data type Value
CATALOG_NAME STRING The name of the project that contains the dataset
SCHEMA_NAME STRING The dataset's name also referred to as the datasetId
OPTION_NAME STRING One of the name values in the options table
OPTION_TYPE STRING One of the data type values in the options table
OPTION_VALUE STRING One of the value options in the options table

Options table
OPTION_NAME OPTION_TYPE OPTION_VALUE
default_table_expiration_days FLOAT64 The default lifetime, in days, of all tables in the dataset
friendly_name STRING The dataset's descriptive name
description STRING A description of the dataset
labels ARRAY<STRUCT<STRING, STRING>> An array of STRUCT's that represent the labels on the dataset

For more information on dataset properties, see the dataset resource page in the REST API documentation.

Examples

Example 1:

The following example retrieves the default table expiration times for all datasets in your default project (myproject) by querying the INFORMATION_SCHEMATA.SCHEMATA_OPTIONS view.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.INFORMATION_SCHEMA.[VIEW] for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     *
    FROM
     INFORMATION_SCHEMA.SCHEMATA_OPTIONS
    WHERE
     option_name="default_table_expiration_days"
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * FROM INFORMATION_SCHEMA.SCHEMATA_OPTIONS
WHERE option_name="default_table_expiration_days"'

The results should look like the following:

  +----------------+---------------+-------------------------------+-------------+---------------------+
  |  catalog_name  |  schema_name  |          option_name          | option_type |    option_value     |
  +----------------+---------------+-------------------------------+-------------+---------------------+
  | myproject      | mydataset3    | default_table_expiration_days | FLOAT64     | 0.08333333333333333 |
  | myproject      | mydataset2    | default_table_expiration_days | FLOAT64     | 90.0                |
  | myproject      | mydataset1    | default_table_expiration_days | FLOAT64     | 30.0                |
  +----------------+---------------+-------------------------------+-------------+---------------------+
  

Example 2:

The following example retrieves the labels for all datasets in your default project (myproject) by querying the INFORMATION_SCHEMATA.SCHEMATA_OPTIONS view.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.INFORMATION_SCHEMA.[VIEW] for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     *
    FROM
     INFORMATION_SCHEMA.SCHEMATA_OPTIONS
    WHERE
     option_name="labels"
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * FROM INFORMATION_SCHEMA.SCHEMATA_OPTIONS
WHERE option_name="labels"'

The results should look like the following:

  +----------------+---------------+-------------+---------------------------------+------------------------+
  |  catalog_name  |  schema_name  | option_name |          option_type            |      option_value      |
  +----------------+---------------+-------------+---------------------------------+------------------------+
  | myproject      | mydataset1    | labels      | ARRAY<STRUCT<STRING, STRING>>   | [STRUCT("org", "dev")] |
  | myproject      | mydataset2    | labels      | ARRAY<STRUCT<STRING, STRING>>   | [STRUCT("org", "dev")] |
  +----------------+---------------+-------------+---------------------------------+------------------------+
  

Next steps

Was this page helpful? Let us know how we did:

Send feedback about...

Need help? Visit our support page.