Getting dataset metadata using INFORMATION_SCHEMA
INFORMATION_SCHEMA
contains these views for dataset metadata:
SCHEMATA
SCHEMATA_OPTIONS
Before you begin
Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.
Required permissions
To get SCHEMATA
and SCHEMATA_OPTIONS
views for dataset metadata, you need the bigquery.datasets.get
IAM permission at the project level.
Each of the following predefined IAM roles includes the permissions that you need in order to get SCHEMATA
and SCHEMATA_OPTIONS
views:
roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataOwner
roles/bigquery.dataViewer
For more information about granular BigQuery permissions, see roles and permissions.
Syntax
Queries against any of these views should include a region qualifier.
If you do not specify a regional qualifier, metadata is retrieved from the us
multi-region.
-- Returns metadata for datasets in a region.
SELECT * FROM region-us.INFORMATION_SCHEMA.SCHEMATA;
SCHEMATA
view
When you query the INFORMATION_SCHEMA.SCHEMATA
view, the query results contain
one row for each dataset in a project to which the current user has access.
The INFORMATION_SCHEMA.SCHEMATA
view has the following schema:
Column name | Data type | Value |
---|---|---|
CATALOG_NAME |
STRING |
The name of the project that contains the dataset |
SCHEMA_NAME |
STRING |
The dataset's name also referred to as the datasetId |
SCHEMA_OWNER |
STRING |
The value is always NULL |
CREATION_TIME |
TIMESTAMP |
The dataset's creation time |
LAST_MODIFIED_TIME |
TIMESTAMP |
The dataset's last modified time |
LOCATION |
STRING |
The dataset's geographic location |
DDL |
STRING |
The CREATE SCHEMA
DDL statement that can be used to create the dataset |
DEFAULT_COLLATION_NAME |
STRING |
The name of the default collation specification
if it exists; otherwise, NULL .
|
Examples
The following example retrieves all columns from the
INFORMATION_SCHEMA.SCHEMATA
view except for schema_owner
which is reserved
for future use. The metadata returned is for all datasets in the default project
— myproject
.
To run the query against a project other than your default project, add the
project ID to the dataset in the following format:
`project_id`.INFORMATION_SCHEMA.view
for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA
.
SELECT * EXCEPT (schema_owner) FROM INFORMATION_SCHEMA.SCHEMATA;
The result is similar to the following. For readability, some columns are excluded from the result.
+----------------+---------------+---------------------+---------------------+------------+------------------------------------------+ | catalog_name | schema_name | creation_time | last_modified_time | location | ddl | +----------------+---------------+---------------------+---------------------+------------+------------------------------------------+ | myproject | mydataset1 | 2018-11-07 19:50:24 | 2018-11-07 19:50:24 | US | CREATE SCHEMA `myproject.mydataset1` | | | | | | | OPTIONS( | | | | | | | location="us" | | | | | | | ); | +----------------+---------------+---------------------+---------------------+------------+------------------------------------------+ | myproject | mydataset2 | 2018-07-16 04:24:22 | 2018-07-16 04:24:22 | US | CREATE SCHEMA `myproject.mydataset2` | | | | | | | OPTIONS( | | | | | | | default_partition_expiration_days=3.0, | | | | | | | location="us" | | | | | | | ); | +----------------+---------------+---------------------+---------------------+------------+------------------------------------------+ | myproject | mydataset3 | 2018-02-07 21:08:45 | 2018-05-01 23:32:53 | US | CREATE SCHEMA `myproject.mydataset3` | | | | | | | OPTIONS( | | | | | | | description="My dataset", | | | | | | | location="us" | | | | | | | ); | +----------------+---------------+---------------------+---------------------+------------+------------------------------------------+
SCHEMATA_OPTIONS
view
When you query the INFORMATION_SCHEMA.SCHEMATA_OPTIONS
view, the query results
contain one row for each option of each dataset in a project.
The INFORMATION_SCHEMA.SCHEMATA_OPTIONS
view has the following schema:
Column name | Data type | Value |
---|---|---|
CATALOG_NAME |
STRING |
The name of the project that contains the dataset |
SCHEMA_NAME |
STRING |
The dataset's name also referred to as the datasetId |
OPTION_NAME |
STRING |
One of the name values in the options table |
OPTION_TYPE |
STRING |
One of the data type values in the options table |
OPTION_VALUE |
STRING |
One of the value options in the options table |
Options table
OPTION_NAME |
OPTION_TYPE |
OPTION_VALUE |
---|---|---|
default_table_expiration_days |
FLOAT64 |
The default lifetime, in days, of all tables in the dataset |
friendly_name |
STRING |
The dataset's descriptive name |
description |
STRING |
A description of the dataset |
labels |
ARRAY<STRUCT<STRING, STRING>> |
An array of STRUCT 's that represent the labels on the
dataset |
Examples
Example 1:
The following example retrieves the default table expiration times for all
datasets in your default project (myproject
) by querying the
INFORMATION_SCHEMA.SCHEMATA_OPTIONS
view.
To run the query against a project other than your default project, add the
project ID to the dataset in the following format:
`project_id`.INFORMATION_SCHEMA.view
for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS
.
SELECT * FROM INFORMATION_SCHEMA.SCHEMATA_OPTIONS WHERE option_name = 'default_table_expiration_days';
The result is similar to the following:
+----------------+---------------+-------------------------------+-------------+---------------------+ | catalog_name | schema_name | option_name | option_type | option_value | +----------------+---------------+-------------------------------+-------------+---------------------+ | myproject | mydataset3 | default_table_expiration_days | FLOAT64 | 0.08333333333333333 | | myproject | mydataset2 | default_table_expiration_days | FLOAT64 | 90.0 | | myproject | mydataset1 | default_table_expiration_days | FLOAT64 | 30.0 | +----------------+---------------+-------------------------------+-------------+---------------------+
Example 2:
The following example retrieves the labels for all
datasets in your default project (myproject
) by querying the
INFORMATION_SCHEMA.SCHEMATA_OPTIONS
view.
To run the query against a project other than your default project, add the
project ID to the dataset in the following format:
`project_id`.INFORMATION_SCHEMA.view
;
for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS
.
SELECT * FROM INFORMATION_SCHEMA.SCHEMATA_OPTIONS WHERE option_name = 'labels';
The result is similar to the following:
+----------------+---------------+-------------+---------------------------------+------------------------+ | catalog_name | schema_name | option_name | option_type | option_value | +----------------+---------------+-------------+---------------------------------+------------------------+ | myproject | mydataset1 | labels | ARRAY<STRUCT<STRING, STRING>> | [STRUCT("org", "dev")] | | myproject | mydataset2 | labels | ARRAY<STRUCT<STRING, STRING>> | [STRUCT("org", "dev")] | +----------------+---------------+-------------+---------------------------------+------------------------+