Getting table metadata using INFORMATION_SCHEMA

The following INFORMATION_SCHEMA views contain table metadata:

  • TABLES and TABLE_OPTIONS for metadata about tables
  • COLUMNS and COLUMN_FIELD_PATHS for metadata about columns and fields
  • PARTITIONS for metadata about table partitions (Preview)
  • TABLE_STORAGE for metadata about current table storage usage (Preview)
  • TABLE_STORAGE_TIMELINE_BY_PROJECT and TABLE_STORAGE_TIMELINE_BY_ORGANIZATION for metadata about historical table storage usage (Preview)

The table storage views give you a convenient way to observe your current and historical storage consumption, including logical, compressed, and maintenance bytes. This information can help you with tasks like planning for future growth and understanding the update patterns for tables, even for tables that don't have a last_update timestamp column as part of the schema.

TABLES and TABLE_OPTIONS also contain high-level information about views. For detailed information, query the INFORMATION_SCHEMA.VIEWS view instead.

Before you begin

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.

Required permissions

Use the following table to understand what IAM permissions you need in order to query INFORMATION_SCHEMA views.

Note that for the INFORMATION_SCHEMA.TABLE_STORAGE_TIMELINE_BY_ORGANIZATION view, you must have permissions granted at the organization level. The other views only require permissions granted at the project level. For more information, see Manage access to projects, folders, and organization.

View Permissions Roles that grant these permissions
TABLES bigquery.tables.get
bigquery.tables.list
bigquery.routines.get
bigquery.routines.list
roles/bigquery.admin
roles/bigquery.dataViewer
roles/bigquery.metadataViewer
TABLE_OPTIONS bigquery.tables.get
bigquery.tables.list
bigquery.routines.get
bigquery.routines.list
roles/bigquery.admin
roles/bigquery.dataViewer
roles/bigquery.metadataViewer
COLUMNS bigquery.tables.get
bigquery.tables.list
roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataViewer
roles/bigquery.metadataViewer
COLUMN_FIELD_PATHS bigquery.tables.get
bigquery.tables.list
roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataViewer
roles/bigquery.metadataViewer
PARTITIONS bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataViewer
TABLE_STORAGE bigquery.tables.get
bigquery.tables.list
roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataViewer
roles/bigquery.metadataViewer
TABLE_STORAGE_TIMELINE_BY_ORGANIZATION

TABLE_STORAGE_TIMELINE_BY_PROJECT
bigquery.tables.get
bigquery.tables.list

roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataViewer
roles/bigquery.metadataViewer

For more information about granular BigQuery permissions, see roles and permissions.

Syntax

Queries against any of these views must have a dataset or region qualifier.

-- Returns metadata for tables in a single dataset.
SELECT * FROM myDataset.INFORMATION_SCHEMA.TABLES;

-- Returns metadata for tables in a region.
SELECT * FROM region-us.INFORMATION_SCHEMA.TABLES;

For queries with a dataset qualifier, you must have permissions for the dataset. For queries with a region qualifier, you must have permissions for the project.

TABLES view

When you query the INFORMATION_SCHEMA.TABLES view, the query results contain one row for each table or view in a dataset. For detailed information about views, query the INFORMATION_SCHEMA.VIEWS view instead.

The INFORMATION_SCHEMA.TABLES view has the following schema:

Column name Data type Value
table_catalog STRING The project ID of the project that contains the dataset.
table_schema STRING The name of the dataset that contains the table or view. Also referred to as the datasetId.
table_name STRING The name of the table or view. Also referred to as the tableId.
table_type STRING The table type; one of the following:
is_insertable_into STRING YES or NO depending on whether the table supports DML INSERT statements
is_typed STRING The value is always NO
creation_time TIMESTAMP The table's creation time
ddl STRING The DDL statement that can be used to recreate the table, such as CREATE TABLE or CREATE VIEW
clone_time TIMESTAMP For table clones (Preview), the time when the base table was cloned to create this table. If time travel was used, then this field contains the time travel timestamp. Otherwise, the clone_time field is the same as the creation_time field. Applicable only to tables with table_type set to CLONE.
base_table_catalog STRING For table clones (Preview), the base table's project. Applicable only to tables with table_type set to CLONE.
base_table_schema STRING For table clones (Preview), the base table's dataset. Applicable only to tables with table_type set to CLONE.
base_table_name STRING For table clones (Preview), the base table's name. Applicable only to tables with table_type set to CLONE.
default_collation_name STRING The name of the default collation specification if it exists; otherwise, NULL.

Example query

Example 1:

The following example retrieves table metadata for all of the tables in the dataset named mydataset. The query selects all of the columns from the INFORMATION_SCHEMA.TABLES view except for is_typed, which is reserved for future use. The metadata that's returned is for all types of tables in mydataset in your default project.

mydataset contains the following tables:

  • mytable1: a standard BigQuery table
  • myview1: a BigQuery view

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.

  SELECT
    * EXCEPT(is_typed)
  FROM
    mydataset.INFORMATION_SCHEMA.TABLES;

The result is similar to the following. For readability, some columns are excluded from the result.

+----------------+---------------+----------------+------------+--------------------+---------------------+---------------------------------------------+
| table_catalog  | table_schema  |   table_name   | table_type | is_insertable_into |    creation_time    |                     ddl                     |
+----------------+---------------+----------------+------------+--------------------+---------------------+---------------------------------------------+
| myproject      | mydataset     | mytable1       | BASE TABLE | YES                | 2018-10-29 20:34:44 | CREATE TABLE `myproject.mydataset.mytable1` |
|                |               |                |            |                    |                     | (                                           |
|                |               |                |            |                    |                     |   id INT64                                  |
|                |               |                |            |                    |                     | );                                          |
| myproject      | mydataset     | myview1        | VIEW       | NO                 | 2018-12-29 00:19:20 | CREATE VIEW `myproject.mydataset.myview1`   |
|                |               |                |            |                    |                     | AS SELECT 100 as id;                        |
+----------------+---------------+----------------+------------+--------------------+---------------------+---------------------------------------------+
Example 2:

The following example retrieves all tables of type BASE TABLE from the INFORMATION_SCHEMA.TABLES view. The is_typed column is excluded. The metadata returned is for tables in mydataset in your default project.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.

  SELECT
    * EXCEPT(is_typed)
  FROM
    mydataset.INFORMATION_SCHEMA.TABLES
  WHERE
    table_type = 'BASE TABLE';

The result is similar to the following. For readability, some columns are excluded from the result.

  +----------------+---------------+----------------+------------+--------------------+---------------------+---------------------------------------------+
  | table_catalog  | table_schema  |   table_name   | table_type | is_insertable_into |    creation_time    |                     ddl                     |
  +----------------+---------------+----------------+------------+--------------------+---------------------+---------------------------------------------+
  | myproject      | mydataset     | mytable1       | BASE TABLE | YES                | 2018-10-31 22:40:05 | CREATE TABLE myproject.mydataset.mytable1 |
  |                |               |                |            |                    |                     | (                                           |
  |                |               |                |            |                    |                     |   id INT64                                  |
  |                |               |                |            |                    |                     | );                                          |
  +----------------+---------------+----------------+------------+--------------------+---------------------+---------------------------------------------+
  

Example 3:

The following example retrieves table_name and ddl columns from the INFORMATION_SCHEMA.TABLES view for the population_by_zip_2010 table in the census_bureau_usa dataset. This dataset is part of the BigQuery public dataset program.

Because the table you're querying is in another project, you add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view. In this example, the value is `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.TABLES.

SELECT
  table_name, ddl
FROM
  `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.TABLES
WHERE
  table_name = 'population_by_zip_2010';

The result is similar to the following:

+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|       table_name       |                                                                                                            ddl                                                                                                             |
+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| population_by_zip_2010 | CREATE TABLE `bigquery-public-data.census_bureau_usa.population_by_zip_2010`                                                                                                                                               |
|                        | (                                                                                                                                                                                                                          |
|                        |   geo_id STRING OPTIONS(description="Geo code"),                                                                                                                                                                           |
|                        |   zipcode STRING NOT NULL OPTIONS(description="Five digit ZIP Code Tabulation Area Census Code"),                                                                                                                          |
|                        |   population INT64 OPTIONS(description="The total count of the population for this segment."),                                                                                                                             |
|                        |   minimum_age INT64 OPTIONS(description="The minimum age in the age range. If null, this indicates the row as a total for male, female, or overall population."),                                                          |
|                        |   maximum_age INT64 OPTIONS(description="The maximum age in the age range. If null, this indicates the row as having no maximum (such as 85 and over) or the row is a total of the male, female, or overall population."), |
|                        |   gender STRING OPTIONS(description="male or female. If empty, the row is a total population summary.")                                                                                                                    |
|                        | )                                                                                                                                                                                                                          |
|                        | OPTIONS(                                                                                                                                                                                                                   |
|                        |   labels=[("freebqcovid", "")]                                                                                                                                                                                             |
|                        | );                                                                                                                                                                                                                         |
+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  

TABLE_OPTIONS view

When you query the INFORMATION_SCHEMA.TABLE_OPTIONS view, the query results contain one row for each option, for each table or view in a dataset. For detailed information about views, query the INFORMATION_SCHEMA.VIEWS view instead.

The INFORMATION_SCHEMA.TABLE_OPTIONS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The project ID of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table or view also referred to as the datasetId
TABLE_NAME STRING The name of the table or view also referred to as the tableId
OPTION_NAME STRING One of the name values in the options table
OPTION_TYPE STRING One of the data type values in the options table
OPTION_VALUE STRING One of the value options in the options table
Options table
OPTION_NAME OPTION_TYPE OPTION_VALUE
partition_expiration_days FLOAT64 The default lifetime, in days, of all partitions in a partitioned table
expiration_timestamp FLOAT64 The time when this table expires
kms_key_name STRING The name of the Cloud KMS key used to encrypt the table
friendly_name STRING The table's descriptive name
description STRING A description of the table
labels ARRAY<STRUCT<STRING, STRING>> An array of STRUCT's that represent the labels on the table
require_partition_filter BOOL Whether queries over the table require a partition filter
enable_refresh BOOL Whether automatic refresh is enabled for a materialized view
refresh_interval_minutes FLOAT64 How frequently a materialized view is refreshed

For external tables, the following options are possible:

Options
allow_jagged_rows

BOOL

If true, allow rows that are missing trailing optional columns.

Applies to CSV data.

allow_quoted_newlines

BOOL

If true, allow quoted data sections that contain newline characters in the file.

Applies to CSV data.

compression

STRING

The compression type of the data source. Supported values include: GZIP. If not specified, the data source is uncompressed.

Applies to CSV and JSON data.

description

STRING

A description of this table.

enable_logical_types

BOOL

If true, convert Avro logical types into their corresponding SQL types. For more information, see Logical types.

Applies to Avro data.

enum_as_string

BOOL

If true, infer Parquet ENUM logical type as STRING instead of BYTES by default.

Applies to Parquet data.

enable_list_inference

BOOL

If true, use schema inference specifically for Parquet LIST logical type.

Applies to Parquet data.

encoding

STRING

The character encoding of the data. Supported values include: UTF8 (or UTF-8), ISO_8859_1 (or ISO-8859-1).

Applies to CSV data.

expiration_timestamp

TIMESTAMP

The time when this table expires. If not specified, the table does not expire.

Example: "2025-01-01 00:00:00 UTC".

field_delimiter

STRING

The separator for fields in a CSV file.

Applies to CSV data.

format

STRING

The format of the external data. Supported values include: AVRO, CSV, DATASTORE_BACKUP, GOOGLE_SHEETS, NEWLINE_DELIMITED_JSON (or JSON), ORC, PARQUET.

The value JSON is equivalent to NEWLINE_DELIMITED_JSON.

decimal_target_types

ARRAY<STRING>

Determines how to convert a Decimal type. Equivalent to ExternalDataConfiguration.decimal_target_types

Example: ["NUMERIC", "BIGNUMERIC"].

json_extension

STRING

For JSON data, indicates a particular JSON interchange format. If not specified, BigQuery reads the data as generic JSON records.

Supported values include:
GEOJSON. Newline-delimited GeoJSON data. For more information, see Creating an external table from a newline-delimited GeoJSON file.

hive_partition_uri_prefix

STRING

A common prefix for all source URIs before the partition key encoding begins. Applies only to hive-partitioned external tables.

Applies to Avro, CSV, JSON, Parquet, and ORC data.

Example: "gs://bucket/path".

ignore_unknown_values

BOOL

If true, ignore extra values that are not represented in the table schema, without returning an error.

Applies to CSV and JSON data.

max_bad_records

INT64

The maximum number of bad records to ignore when reading the data.

Applies to: CSV, JSON, and Sheets data.

null_marker

STRING

The string that represents NULL values in a CSV file.

Applies to CSV data.

projection_fields

STRING

A list of entity properties to load.

Applies to Datastore data.

quote

STRING

The string used to quote data sections in a CSV file. If your data contains quoted newline characters, also set the allow_quoted_newlines property to true.

Applies to CSV data.

require_hive_partition_filter

BOOL

If true, all queries over this table require a partition filter that can be used to eliminate partitions when reading data. Applies only to hive-partitioned external tables.

Applies to Avro, CSV, JSON, Parquet, and ORC data.

sheet_range

STRING

Range of a Sheets spreadsheet to query from.

Applies to Sheets data.

Example: “sheet1!A1:B20”,

skip_leading_rows

INT64

The number of rows at the top of a file to skip when reading the data.

Applies to CSV and Sheets data.

uris

ARRAY<STRING>

An array of fully qualified URIs for the external data locations.

Example: ["gs://bucket/path/*"].

Example query

Example 1:

The following example retrieves the default table expiration times for all tables in mydataset in your default project (myproject) by querying the INFORMATION_SCHEMA.TABLE_OPTIONS view.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS.

  SELECT
    *
  FROM
    mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
  WHERE
    option_name = 'expiration_timestamp';

The result is similar to the following:

  +----------------+---------------+------------+----------------------+-------------+--------------------------------------+
  | table_catalog  | table_schema  | table_name |     option_name      | option_type |             option_value             |
  +----------------+---------------+------------+----------------------+-------------+--------------------------------------+
  | myproject      | mydataset     | mytable1   | expiration_timestamp | TIMESTAMP   | TIMESTAMP "2020-01-16T21:12:28.000Z" |
  | myproject      | mydataset     | mytable2   | expiration_timestamp | TIMESTAMP   | TIMESTAMP "2021-01-01T21:12:28.000Z" |
  +----------------+---------------+------------+----------------------+-------------+--------------------------------------+
  

Example 2:

The following example retrieves metadata about all tables in mydataset that contain test data. The query uses the values in the description option to find tables that contain "test" anywhere in the description. mydataset is in your default project — myproject.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS.

  SELECT
    *
  FROM
    mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
  WHERE
    option_name = 'description'
    AND option_value LIKE '%test%';

The result is similar to the following:

  +----------------+---------------+------------+-------------+-------------+--------------+
  | table_catalog  | table_schema  | table_name | option_name | option_type | option_value |
  +----------------+---------------+------------+-------------+-------------+--------------+
  | myproject      | mydataset     | mytable1   | description | STRING      | "test data"  |
  | myproject      | mydataset     | mytable2   | description | STRING      | "test data"  |
  +----------------+---------------+------------+-------------+-------------+--------------+
  

COLUMNS view

When you query the INFORMATION_SCHEMA.COLUMNS view, the query results contain one row for each column (field) in a table.

The INFORMATION_SCHEMA.COLUMNS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The project ID of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table also referred to as the datasetId
TABLE_NAME STRING The name of the table or view also referred to as the tableId
COLUMN_NAME STRING The name of the column
ORDINAL_POSITION INT64 The 1-indexed offset of the column within the table; if it's a pseudo column such as _PARTITIONTIME or _PARTITIONDATE, the value is NULL
IS_NULLABLE STRING YES or NO depending on whether the column's mode allows NULL values
DATA_TYPE STRING The column's standard SQL data type
IS_GENERATED STRING The value is always NEVER
GENERATION_EXPRESSION STRING The value is always NULL
IS_STORED STRING The value is always NULL
IS_HIDDEN STRING YES or NO depending on whether the column is a pseudo column such as _PARTITIONTIME or _PARTITIONDATE
IS_UPDATABLE STRING The value is always NULL
IS_SYSTEM_DEFINED STRING YES or NO depending on whether the column is a pseudo column such as _PARTITIONTIME or _PARTITIONDATE
IS_PARTITIONING_COLUMN STRING YES or NO depending on whether the column is a partitioning column
CLUSTERING_ORDINAL_POSITION INT64 The 1-indexed offset of the column within the table's clustering columns; the value is NULL if the table is not a clustered table
COLLATION_NAME STRING The name of the collation specification if it exists; otherwise, NULL

If a STRING or ARRAY<STRING> is passed in, the collation specification is returned if it exists; otherwise NULL is returned

Example query

The following example retrieves metadata from the INFORMATION_SCHEMA.COLUMNS view for the population_by_zip_2010 table in the census_bureau_usa dataset. This dataset is part of the BigQuery public dataset program.

Because the table you're querying is in another project, the bigquery-public-data project, you add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.TABLES.

The following columns are excluded from the query results because they are currently reserved for future use:

  • IS_GENERATED
  • GENERATION_EXPRESSION
  • IS_STORED
  • IS_UPDATABLE
  SELECT
    * EXCEPT(is_generated, generation_expression, is_stored, is_updatable)
  FROM
    `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS
  WHERE
    table_name = 'population_by_zip_2010';

The result is similar to the following. For readability, some columns are excluded from the result.

+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
|       table_name       | column_name | ordinal_position | is_nullable | data_type | is_hidden | is_system_defined | is_partitioning_column | clustering_ordinal_position |
+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
| population_by_zip_2010 | zipcode     |                1 | NO          | STRING    | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | geo_id      |                2 | YES         | STRING    | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | minimum_age |                3 | YES         | INT64     | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | maximum_age |                4 | YES         | INT64     | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | gender      |                5 | YES         | STRING    | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | population  |                6 | YES         | INT64     | NO        | NO                | NO                     |                        NULL |
+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
  

COLUMN_FIELD_PATHS view

Query results contain one row for each column nested within a RECORD (or STRUCT) column.

When you query the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view, the query results contain one row for each column nested within a RECORD (or STRUCT) column.

The INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The project ID of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table also referred to as the datasetId
TABLE_NAME STRING The name of the table or view also referred to as the tableId
COLUMN_NAME STRING The name of the column
FIELD_PATH STRING The path to a column nested within a `RECORD` or `STRUCT` column
DATA_TYPE STRING The column's standard SQL data type
DESCRIPTION STRING The column's description
COLLATION_NAME STRING The name of the collation specification if it exists; otherwise, NULL

If a STRING, ARRAY<STRING>, or STRING field in a STRUCT is passed in, the collation specification is returned if it exists; otherwise NULL is returned

Example query

The following example retrieves metadata from the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view for the commits table in the github_repos dataset. This dataset is part of the BigQuery public dataset program.

Because the table you're querying is in another project, the bigquery-public-data project, you add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS.

The commits table contains the following nested and nested and repeated columns:

  • author: nested RECORD column
  • committer: nested RECORD column
  • trailer: nested and repeated RECORD column
  • difference: nested and repeated RECORD column

To view metadata about the author and difference columns, run the following query.

SELECT
  *
FROM
  `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE
  table_name = 'commits'
  AND (column_name = 'author' OR column_name = 'difference');

The result is similar to the following. For readability, some columns are excluded from the result.

  +------------+-------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
  | table_name | column_name |     field_path      |                                                                      data_type                                                                      | description |
  +------------+-------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
  | commits    | author      | author              | STRUCT<name STRING, email STRING, time_sec INT64, tz_offset INT64, date TIMESTAMP>                                                                  | NULL        |
  | commits    | author      | author.name         | STRING                                                                                                                                              | NULL        |
  | commits    | author      | author.email        | STRING                                                                                                                                              | NULL        |
  | commits    | author      | author.time_sec     | INT64                                                                                                                                               | NULL        |
  | commits    | author      | author.tz_offset    | INT64                                                                                                                                               | NULL        |
  | commits    | author      | author.date         | TIMESTAMP                                                                                                                                           | NULL        |
  | commits    | difference  | difference          | ARRAY<STRUCT<old_mode INT64, new_mode INT64, old_path STRING, new_path STRING, old_sha1 STRING, new_sha1 STRING, old_repo STRING, new_repo STRING>> | NULL        |
  | commits    | difference  | difference.old_mode | INT64                                                                                                                                               | NULL        |
  | commits    | difference  | difference.new_mode | INT64                                                                                                                                               | NULL        |
  | commits    | difference  | difference.old_path | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.new_path | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.old_sha1 | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.new_sha1 | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.old_repo | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.new_repo | STRING                                                                                                                                              | NULL        |
  +------------+-------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
  

PARTITIONS view

When you query the INFORMATION_SCHEMA.PARTITIONS view, the query results contain one row for each partition.

The INFORMATION_SCHEMA.PARTITIONS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The project ID of the project that contains the table
TABLE_SCHEMA STRING The name of the dataset that contains the table, also referred to as the datasetId
TABLE_NAME STRING The name of the table, also referred to as the tableId
PARTITION_ID STRING A single partition's ID. For unpartitioned tables, the value is NULL. For partitioned tables that contain rows with NULL values in the partitioning column, the value is __NULL__.
TOTAL_ROWS INTEGER The total number of rows in the partition
TOTAL_LOGICAL_BYTES INTEGER The total number of logical bytes in the partition
TOTAL_BILLABLE_BYTES INTEGER The total number of billable bytes in the partition
LAST_MODIFIED_TIME TIMESTAMP The time when the data was most recently written to the partition
STORAGE_TIER STRING The partition's storage tier:

Example query

The following example calculates the amount of bytes used by each storage tier in all of the tables in a dataset named mydataset.

To run the query:

Console

  1. Open the BigQuery page in the Google Cloud console.

    Go to the BigQuery page

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Google Cloud console.

    SELECT storage_tier, SUM(total_billable_bytes) billable_bytes
    FROM `mydataset.INFORMATION_SCHEMA.PARTITIONS`
    GROUP BY storage_tier
    
  3. Click Run.

bq

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT storage_tier, SUM(total_billable_bytes) billable_bytes
 FROM `mydataset.INFORMATION_SCHEMA.PARTITIONS`
 GROUP BY storage_tier'

The results look similar to the following table.

  +--------------+----------------+
  | storage_tier | billable_bytes |
  +--------------+----------------+
  | LONG_TERM    |  1311495144879 |
  | ACTIVE       |    66757629240 |
  +--------------+----------------+
  

TABLE_STORAGE view

SELECT
  CONCAT(v1.table_catalog, ":", v1.table_schema, ".", v1.table_name) AS unmodified_table_name,
FROM
  `region-REGION`.INFORMATION_SCHEMA.TABLES v1
LEFT JOIN `region-REGION`.INFORMATION_SCHEMA.TABLE_STORAGE v2 ON v1.table_name = v2.table_name
WHERE v2.table_name IS NULL;

The INFORMATION_SCHEMA.TABLE_STORAGE view provides a current snapshot of storage usage for tables and materialized views. When you query the INFORMATION_SCHEMA.TABLE_STORAGE view, the query results contain one row for each table or materialized view. The data in this table is not kept in real time, and might be delayed by a few seconds to a few minutes.

The view data is regionalized, so you must use a region qualifier in queries against the view. If you don't specify a project, the default project is used.

The following examples show how to return data from a project or region.

To view storage information for tables in a specified project, run the following query:

SELECT
  *
FROM
  myProject.`region-REGION`.INFORMATION_SCHEMA.TABLE_STORAGE;

To view storage information for tables in a specified region, run the following query:

SELECT
  *
FROM
  `region-REGION`.INFORMATION_SCHEMA.TABLE_STORAGE;

The INFORMATION_SCHEMA.TABLE_STORAGE view has the following schema:

Column name Data type Value
PROJECT_ID STRING The project ID of the project that contains the dataset
PROJECT_NAME INT64 The project number of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table or materialized view, also referred to as the datasetId
TABLE_NAME STRING The name of the table or materialized view, also referred to as the tableId
CREATION_TIME TIMESTAMP The table's creation time
TOTAL_ROWS INT64 The total number of rows in the table or materialized view
TOTAL_PARTITIONS INT64 The number of partitions present in the table or materialized view. Unpartitioned tables return 0.
TOTAL_LOGICAL_BYTES INT64 Total number of logical bytes in the table or materialized view
ACTIVE_LOGICAL_BYTES INT64 Number of logical bytes that are less than 90 days old
LONG_TERM_LOGICAL_BYTES INT64 Number of logical bytes that are more than 90 days old
TOTAL_PHYSICAL_BYTES INT64 Total number of physical bytes used for storage, including active, long term, and time travel (for deleted tables) bytes
ACTIVE_PHYSICAL_BYTES INT64 Number of physical bytes less than 90 days old
LONG_TERM_PHYSICAL_BYTES INT64 Number of physical bytes more than 90 days old
TIME_TRAVEL_PHYSICAL_BYTES INT64 Number of physical bytes used by time travel storage (deleted or changed data)

Examples

The following example shows you which projects in your organization are currently using the most storage.

SELECT
  project_id,
  SUM(total_logical_bytes) AS total_logical_bytes
FROM
  `region-REGION`.INFORMATION_SCHEMA.TABLE_STORAGE
GROUP BY
  project_id
ORDER BY
  total_logical_bytes DESC;

The result is similar to the following:

+---------------------+---------------------+
|     project_id      | total_logical_bytes |
+---------------------+---------------------+
| projecta            |     971329178274633 |
+---------------------+---------------------+
| projectb            |     834638211024843 |
+---------------------+---------------------+
| projectc            |     562910385625126 |
+---------------------+---------------------+

TABLE_STORAGE_TIMELINE_BY_* views

The table storage timeline views return one row for every event that triggers a storage change for the table, like writing, updating, or deleting a row. This means there can be multiple rows for a table for a single day. When querying a view for a time range, use the most recent timestamp on the day of interest.

The following table storage timeline views are available:

  • INFORMATION_SCHEMA.TABLE_STORAGE_TIMELINE_BY_PROJECT returns information for all tables in the current or specified project.
  • INFORMATION_SCHEMA.TABLE_STORAGE_TIMELINE_BY_ORGANIZATION returns information for all tables in the parent folder of the current or specified project, including the projects in subfolders under it.

The view data is regionalized, so you must use a region qualifier in queries against the view.

The table storage timeline views have the following schema:

Column name Data type Value
TIMESTAMP TIMESTAMP Timestamp of when storage was last recalculated. Recalculation is triggered by changes to the data in the table.
DELETED BOOLEAN Indicates whether or not the table is deleted
PROJECT_ID STRING The project ID of the project that contains the dataset
PROJECT_NAME INT64 The project number of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table or materialized view, also referred to as the datasetId
TABLE_NAME STRING The name of the table or materialized view, also referred to as the tableId
CREATION_TIME TIMESTAMP The table's creation time
TOTAL_ROWS INT64 The total number of rows in the table or materialized view
TOTAL_PARTITIONS INT64 The number of partitions for the table or materialized view. Unpartitioned tables will return 0.
TOTAL_LOGICAL_BYTES INT64 Total number of logical bytes in the table or materialized view
ACTIVE_LOGICAL_BYTES INT64 Number of logical bytes that are less than 90 days old
LONG_TERM_LOGICAL_BYTES INT64 Number of logical bytes that are more than 90 days old
TOTAL_PHYSICAL_BYTES INT64 Total number of physical bytes used for storage, including active, long term, and time travel (for deleted tables) bytes
ACTIVE_PHYSICAL_BYTES INT64 Number of physical bytes less than 90 days old
LONG_TERM_PHYSICAL_BYTES INT64 Number of physical bytes more than 90 days old
TIME_TRAVEL_PHYSICAL_BYTES INT64 Number of physical bytes used by time travel storage (deleted or changed data)

Examples

Example 1:

The following example shows you which tables are using the most storage in a specific dataset.

  SELECT
    timestamp AS start_time,
    table_name,
    total_logical_bytes
  FROM
    `region-REGION`.INFORMATION_SCHEMA.TABLE_STORAGE_TIMELINE_BY_PROJECT
  WHERE
    table_schema = "TABLE_SCHEMA"
  ORDER BY
    total_logical_byes DESC;

The result is similar to the following:

------------------------+---------------------+----------------------+
|         start_time    |      table_name     | total_logical_bytes  |
+-----------------------+---------------------+----------------------+
| 2022-03-30 17:39:54   | table1              |              3322    |
| 2022-03-30 17:39:53   | table2              |              1657    |
+-----------------------+---------------------+----------------------+

Example 2:

The following example shows you the sum of physical storage that's used by each project in your organization for a given point in time:

WITH most_recent_records as (
  SELECT
    project_id,
    table_schema,
    table_name,
    MAX(timestamp) as max_timestamp
  FROM
    `region-REGION`.INFORMATION_SCHEMA.TABLE_STORAGE_TIMELINE_BY_ORGANIZATION
  WHERE
    timestamp <= 'TIMESTAMP'
  GROUP BY
    project_id, table_schema, table_name
  )
  SELECT
    i_s.project_id,
    SUM(i_s.total_physical_bytes) AS TotalPhysicalBytes
  FROM
    `region-REGION`.INFORMATION_SCHEMA.TABLE_STORAGE_TIMELINE_BY_ORGANIZATION as i_s
  JOIN
    most_recent_records
  ON
    i_s.project_id = most_recent_records.project_id
    AND i_s.table_schema = most_recent_records.table_schema
    AND i_s.table_name = most_recent_records.table_name
    AND i_s.timestamp = most_recent_records.max_timestamp
  GROUP BY
    project_id;

The result is similar to the following:

-----------------+------------------------+
|  project_id    |  TotalPhysicalBytes    |
+----------------+------------------------+
| projecta       | 3844                   |
| projectb       | 16022778               |
| projectc       | 8934009                |
+----------------+------------------------+