Stay organized with collections Save and categorize content based on your preferences.

Getting dataset metadata using INFORMATION_SCHEMA

INFORMATION_SCHEMA contains these views for dataset metadata:

  • SCHEMATA
  • SCHEMATA_OPTIONS

Before you begin

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.

Required permissions

To get SCHEMATA and SCHEMATA_OPTIONS views for dataset metadata, you need the bigquery.datasets.get IAM permission at the project level.

Each of the following predefined IAM roles includes the permissions that you need in order to get SCHEMATA and SCHEMATA_OPTIONS views:

  • roles/bigquery.admin
  • roles/bigquery.dataEditor
  • roles/bigquery.dataOwner
  • roles/bigquery.dataViewer

For more information about granular BigQuery permissions, see roles and permissions.

Syntax

Queries against any of these views should include a region qualifier.

If you do not specify a regional qualifier, metadata is retrieved from the us multi-region.

-- Returns metadata for datasets in a region.
SELECT * FROM region-us.INFORMATION_SCHEMA.SCHEMATA;

SCHEMATA view

When you query the INFORMATION_SCHEMA.SCHEMATA view, the query results contain one row for each dataset in a project to which the current user has access.

The INFORMATION_SCHEMA.SCHEMATA view has the following schema:

Column name Data type Value
CATALOG_NAME STRING The name of the project that contains the dataset
SCHEMA_NAME STRING The dataset's name also referred to as the datasetId
SCHEMA_OWNER STRING The value is always NULL
CREATION_TIME TIMESTAMP The dataset's creation time
LAST_MODIFIED_TIME TIMESTAMP The dataset's last modified time
LOCATION STRING The dataset's geographic location
DDL STRING The CREATE SCHEMA DDL statement that can be used to create the dataset
DEFAULT_COLLATION_NAME STRING The name of the default collation specification if it exists; otherwise, NULL.

Examples

To run the query against a project other than your default project, add the project ID to the dataset in the following format:

`PROJECT_ID`.INFORMATION_SCHEMA.SCHEMATA
for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA.

SELECT
  * EXCEPT (schema_owner)
FROM
  INFORMATION_SCHEMA.SCHEMATA;

The result is similar to the following. For readability, some columns are excluded from the result.

+----------------+---------------+---------------------+---------------------+------------+------------------------------------------+
|  catalog_name  |  schema_name  |    creation_time    | last_modified_time  |  location  |                   ddl                    |
+----------------+---------------+---------------------+---------------------+------------+------------------------------------------+
| myproject      | mydataset1    | 2018-11-07 19:50:24 | 2018-11-07 19:50:24 | US         | CREATE SCHEMA `myproject.mydataset1`     |
|                |               |                     |                     |            | OPTIONS(                                 |
|                |               |                     |                     |            |   location="us"                          |
|                |               |                     |                     |            | );                                       |
+----------------+---------------+---------------------+---------------------+------------+------------------------------------------+
| myproject      | mydataset2    | 2018-07-16 04:24:22 | 2018-07-16 04:24:22 | US         | CREATE SCHEMA `myproject.mydataset2`     |
|                |               |                     |                     |            | OPTIONS(                                 |
|                |               |                     |                     |            |   default_partition_expiration_days=3.0, |
|                |               |                     |                     |            |   location="us"                          |
|                |               |                     |                     |            | );                                       |
+----------------+---------------+---------------------+---------------------+------------+------------------------------------------+
| myproject      | mydataset3    | 2018-02-07 21:08:45 | 2018-05-01 23:32:53 | US         | CREATE SCHEMA `myproject.mydataset3`     |
|                |               |                     |                     |            | OPTIONS(                                 |
|                |               |                     |                     |            |   description="My dataset",              |
|                |               |                     |                     |            |   location="us"                          |
|                |               |                     |                     |            | );                                       |
+----------------+---------------+---------------------+---------------------+------------+------------------------------------------+

SCHEMATA_OPTIONS view

When you query the INFORMATION_SCHEMA.SCHEMATA_OPTIONS view, the query results contain one row for each option of each dataset in a project.

The INFORMATION_SCHEMA.SCHEMATA_OPTIONS view has the following schema:

Column name Data type Value
CATALOG_NAME STRING The name of the project that contains the dataset
SCHEMA_NAME STRING The dataset's name also referred to as the datasetId
OPTION_NAME STRING One of the name values in the options table
OPTION_TYPE STRING One of the data type values in the options table
OPTION_VALUE STRING One of the value options in the options table

Options table
OPTION_NAME OPTION_TYPE OPTION_VALUE
default_table_expiration_days FLOAT64 The default lifetime, in days, of all tables in the dataset
default_partition_expiration_days FLOAT64 The default lifetime, in days, of all partitioned tables in the dataset
friendly_name STRING The dataset's descriptive name
description STRING A description of the dataset
labels ARRAY<STRUCT<STRING, STRING>> An array of STRUCT's that represent the labels on the dataset

Examples

Retrieve the default table expiration time for all datasets in your project

To run the query against a project other than your default project, add the project ID to the dataset in the following format:

`PROJECT_ID`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS
for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS.

SELECT
  *
FROM
  INFORMATION_SCHEMA.SCHEMATA_OPTIONS
WHERE
  option_name = 'default_table_expiration_days';

The result is similar to the following:

  +----------------+---------------+-------------------------------+-------------+---------------------+
  |  catalog_name  |  schema_name  |          option_name          | option_type |    option_value     |
  +----------------+---------------+-------------------------------+-------------+---------------------+
  | myproject      | mydataset3    | default_table_expiration_days | FLOAT64     | 0.08333333333333333 |
  | myproject      | mydataset2    | default_table_expiration_days | FLOAT64     | 90.0                |
  | myproject      | mydataset1    | default_table_expiration_days | FLOAT64     | 30.0                |
  +----------------+---------------+-------------------------------+-------------+---------------------+
  

Retrieve labels for all datasets in your project

To run the query against a project other than your default project, add the project ID to the dataset in the following format:

`PROJECT_ID`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS
; for example, `myproject`.INFORMATION_SCHEMA.SCHEMATA_OPTIONS.

SELECT
  *
FROM
  INFORMATION_SCHEMA.SCHEMATA_OPTIONS
WHERE
  option_name = 'labels';

The result is similar to the following:

  +----------------+---------------+-------------+---------------------------------+------------------------+
  |  catalog_name  |  schema_name  | option_name |          option_type            |      option_value      |
  +----------------+---------------+-------------+---------------------------------+------------------------+
  | myproject      | mydataset1    | labels      | ARRAY<STRUCT<STRING, STRING>>   | [STRUCT("org", "dev")] |
  | myproject      | mydataset2    | labels      | ARRAY<STRUCT<STRING, STRING>>   | [STRUCT("org", "dev")] |
  +----------------+---------------+-------------+---------------------------------+------------------------+