Getting table metadata using INFORMATION_SCHEMA

INFORMATION_SCHEMA is a series of views that provide access to metadata about datasets, tables, and views.

You can query the INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.TABLE_OPTIONS views to retrieve metadata about tables and views in a project. You can also query the INFORMATION_SCHEMA.COLUMNS and INFORMATION_SCHEMA.COLUMN_FIELD_PATHS views to retrieve metadata about the columns (fields) in a table.

The TABLES and TABLE_OPTIONS views also contain high-level information about views. For detailed information, query the INFORMATION_SCHEMA.VIEWS view instead.

Required permissions

To get information about tables, you must be assigned the READER role on the dataset, or you must be assigned a project-level IAM role that includes bigquery.tables.get permissions. If you are granted bigquery.tables.get permissions at the project level, you can get information about all tables in the project. All predefined, project-level IAM roles include bigquery.tables.get permissions except for bigquery.jobUser and bigquery.user.

In addition, a user assigned the bigquery.user role has bigquery.datasets.create permissions. This allows a user assigned to the bigquery.user role to get information about tables in any dataset that user creates. When a user assigned to the bigquery.user role creates a dataset, that user is given OWNER access to the dataset. OWNER access to a dataset gives the user full control over it and all the tables in it.

For more information on IAM roles and permissions in BigQuery, see Access Control. For more information on dataset-level roles, see Primitive roles for datasets.

TABLES view

When you query the INFORMATION_SCHEMA.TABLES view, the query results contain one row for each table or view in a dataset.

Queries against the INFORMATION_SCHEMA.TABLES view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables or views.

The INFORMATION_SCHEMA.TABLES view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The name of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table or view (also referred to as the datasetId)
TABLE_NAME STRING The name of the table or view (also referred to as the tableId)
TABLE_TYPE STRING The table type:
IS_INSERTABLE_INTO STRING YES or NO depending on whether the table supports DML INSERT statements
IS_TYPED STRING The value is always NO
CREATION_TIME TIMESTAMP The table's creation time

Examples

Example 1:

The following example retrieves all columns from the INFORMATION_SCHEMA.TABLES view except for is_typed which is reserved for future use. The metadata returned is for all tables in mydataset in your default project — myproject.

mydataset contains the following tables:

  • mytable1: a standard BigQuery table
  • myview1: a BigQuery view

Queries against the INFORMATION_SCHEMA.TABLES view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.[DATASET].INFORMATION_SCHEMA.[VIEW] for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     * EXCEPT(is_typed)
    FROM
     mydataset.INFORMATION_SCHEMA.TABLES
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * EXCEPT(is_typed) FROM mydataset.INFORMATION_SCHEMA.TABLES'

The results should look like the following:

  +----------------+---------------+----------------+------------+--------------------+---------------------+
  | table_catalog  | table_schema  |   table_name   | table_type | is_insertable_into |    creation_time    |
  +----------------+---------------+----------------+------------+--------------------+---------------------+
  | myproject      | mydataset     | mytable1       | BASE TABLE | YES                | 2018-10-29 20:34:44 |
  | myproject      | mydataset     | myview1        | VIEW       | NO                 | 2018-12-29 00:19:20 |
  +----------------+---------------+----------------+------------+--------------------+---------------------+
  

Example 2:

The following example retrieves all tables of type BASE TABLE from the INFORMATION_SCHEMA.TABLES view. The is_typed column is excluded. The metadata returned is for tables in mydataset in your default project — myproject.

Queries against the INFORMATION_SCHEMA.TABLES view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.[DATASET].INFORMATION_SCHEMA.[VIEW] for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     * EXCEPT(is_typed)
    FROM
     mydataset.INFORMATION_SCHEMA.TABLES
    WHERE
     table_type="BASE TABLE"
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * EXCEPT(is_typed) FROM mydataset.INFORMATION_SCHEMA.TABLES
WHERE table_type="BASE TABLE"'

The results should look like the following:

  +----------------+---------------+----------------+------------+--------------------+---------------------+
  | table_catalog  | table_schema  |   table_name   | table_type | is_insertable_into |    creation_time    |
  +----------------+---------------+----------------+------------+--------------------+---------------------+
  | myproject      | mydataset     | mytable1       | BASE TABLE | NO                 | 2018-10-31 22:40:05 |
  +----------------+---------------+----------------+------------+--------------------+---------------------+
  

TABLE_OPTIONS view

When you query the INFORMATION_SCHEMA.TABLE_OPTIONS view, the query results contain one row for each table or view in a dataset.

Queries against the INFORMATION_SCHEMA.TABLE_OPTIONS view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables or views.

The INFORMATION_SCHEMA.TABLE_OPTIONS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The name of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table or view (also referred to as the datasetId)
TABLE_NAME STRING The name of the table or view (also referred to as the tableId)
OPTION_NAME STRING One of the name values in the options table
OPTION_TYPE STRING One of the data type values in the options table
OPTION_VALUE STRING One of the value options in the options table
Options table
OPTION_NAME OPTION_TYPE OPTION_VALUE
partition_expiration_days FLOAT64 The default lifetime, in days, of all partitions in a partitioned table
expiration_timestamp FLOAT64 The default lifetime, in days, of the table
kms_key_name STRING The name of the Cloud KMS key used to encrypt the table
friendly_name STRING The table's descriptive name
description STRING A description of the table
labels ARRAY<STRUCT<STRING, STRING>> An array of STRUCT's that represent the labels on the table

Examples

Example 1:

The following example retrieves the default table expiration times for all tables in mydataset in your default project (myproject) by querying the INFORMATION_SCHEMATA.TABLE_OPTIONS view.

Queries against the INFORMATION_SCHEMA.TABLE_OPTIONS view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.[DATASET].INFORMATION_SCHEMA.[VIEW] for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     *
    FROM
     mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
    WHERE
     option_name="expiration_timestamp"
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
WHERE option_name="expiration_timestamp"'

The results should look like the following:

  +----------------+---------------+------------+----------------------+-------------+--------------------------------------+
  | table_catalog  | table_schema  | table_name |     option_name      | option_type |             option_value             |
  +----------------+---------------+------------+----------------------+-------------+--------------------------------------+
  | myproject      | mydataset     | mytable1   | expiration_timestamp | TIMESTAMP   | TIMESTAMP "2020-01-16T21:12:28.000Z" |
  | myproject      | mydataset     | mytable2   | expiration_timestamp | TIMESTAMP   | TIMESTAMP "2021-01-01T21:12:28.000Z" |
  +----------------+---------------+------------+----------------------+-------------+--------------------------------------+
  

Example 2:

The following example retrieves metadata about all tables in mydataset that contain test data. The query uses the values in the description option to find tables that contain "test" anywhere in the description. mydataset is in your default project — myproject.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.[DATASET].INFORMATION_SCHEMA.[VIEW] for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     *
    FROM
     mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
    WHERE
     option_name="description" AND option_value LIKE "%test%"
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
WHERE option_name="description" AND option_value LIKE "%test%"'

The results should look like the following:

  +----------------+---------------+------------+-------------+-------------+--------------+
  | table_catalog  | table_schema  | table_name | option_name | option_type | option_value |
  +----------------+---------------+------------+-------------+-------------+--------------+
  | myproject      | mydataset     | mytable1   | description | STRING      | "test data"  |
  | myproject      | mydataset     | mytable2   | description | STRING      | "test data"  |
  +----------------+---------------+------------+-------------+-------------+--------------+
  

COLUMNS view

When you query the INFORMATION_SCHEMA.COLUMNS view, the query results contain one row for each column (field) in a table.

Queries against the INFORMATION_SCHEMA.COLUMNS view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables.

The INFORMATION_SCHEMA.COLUMNS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The name of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table (also referred to as the datasetId)
TABLE_NAME STRING The name of the table or view (also referred to as the tableId)
COLUMN_NAME STRING The name of the column
ORDINAL_POSITION INT64 The 1-indexed offset of the column within the table; if it's a pseudo column such as _PARTITIONTIME or _PARTITIONDATE, the value is NULL
IS_NULLABLE STRING YES or NO depending on whether the column's mode allows NULL values
DATA_TYPE STRING The column's standard SQL data type
IS_GENERATED STRING The value is always NEVER
GENERATION_EXPRESSION STRING The value is always NULL
IS_STORED STRING The value is always NULL
IS_HIDDEN STRING YES or NO depending on whether the column is a pseudo column such as _PARTITIONTIME or _PARTITIONDATE
IS_UPDATABLE STRING The value is always NULL
IS_SYSTEM_DEFINED STRING YES or NO depending on whether the column is a pseudo column such as _PARTITIONTIME or _PARTITIONDATE
IS_PARTITIONING_COLUMN STRING YES or NO depending on whether the column is a partitioning column
CLUSTERING_ORDINAL_POSITION STRING The 1-indexed offset of the column within the table's clustering columns; the value is NULL if the table is not a clustered table

Examples

The following example retrieves metadata from the INFORMATION_SCHEMA.COLUMNS view for the population_by_zip_2010 table in the census_bureau_usa dataset. This dataset is part of the BigQuery public dataset program.

Because this table is in the bigquery-public-data project, you add the project ID to the dataset in the following format: `[PROJECT_ID]`.[DATASET].INFORMATION_SCHEMA.[VIEW] for example, `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.TABLES.

The following columns are excluded from the query results because they are currently reserved for future use:

  • IS_GENERATED
  • GENERATION_EXPRESSION
  • IS_STORED
  • IS_UPDATABLE

Queries against the INFORMATION_SCHEMA.COLUMNS view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     * EXCEPT(is_generated, generation_expression, is_stored, is_updatable)
    FROM
     `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS
    WHERE
     table_name="population_by_zip_2010"
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * EXCEPT(is_generated, generation_expression, is_stored, is_updatable)
FROM `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS
WHERE table_name="population_by_zip_2010"'

The results should look like the following. For readability, table_catalog and table_schema are excluded from the results:

+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
|       table_name       | column_name | ordinal_position | is_nullable | data_type | is_hidden | is_system_defined | is_partitioning_column | clustering_ordinal_position |
+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
| population_by_zip_2010 | zipcode     |                1 | NO          | STRING    | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | geo_id      |                2 | YES         | STRING    | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | minimum_age |                3 | YES         | INT64     | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | maximum_age |                4 | YES         | INT64     | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | gender      |                5 | YES         | STRING    | NO        | NO                | NO                     |                        NULL |
| population_by_zip_2010 | population  |                6 | YES         | INT64     | NO        | NO                | NO                     |                        NULL |
+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
  

COLUMN_FIELD_PATHS view

When you query the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view, the query results contain one row for each column nested within a RECORD (or STRUCT) column.

Queries against the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables.

The INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The name of the project that contains the dataset
TABLE_SCHEMA STRING The name of the dataset that contains the table (also referred to as the datasetId)
TABLE_NAME STRING The name of the table or view (also referred to as the tableId)
COLUMN_NAME STRING The name of the column
FIELD_PATH STRING The path to a column nested within a `RECORD` (or `STRUCT`) column
DATA_TYPE STRING The column's standard SQL data type
DESCRIPTION STRING The column's description

Examples

The following example retrieves metadata from the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view for the commits table in the github_repos dataset. This dataset is part of the BigQuery public dataset program.

Because this table is in the bigquery-public-data project, you add the project ID to the dataset in the following format: `[PROJECT_ID]`.[DATASET].INFORMATION_SCHEMA.[VIEW]; for example, `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS.

The commits table contains the following nested and nested and repeated columns:

  • author: nested RECORD column
  • committer: nested RECORD column
  • trailer: nested and repeated RECORD column
  • difference: nested and repeated RECORD column

Your query will retrieve metadata about the author and difference columns.

Queries against the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view must have a dataset qualifier. The user submitting the query must have access to the dataset that contains the tables.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    SELECT
     *
    FROM
     `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
    WHERE
     table_name="commits"
     AND column_name="author"
     OR column_name="difference"
    
  3. Click Run.

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * FROM `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name="commits" AND column_name="author" OR column_name="difference"'

The results should look like the following. For readability, table_catalog and table_schema are excluded from the results.

  +------------+-------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
  | table_name | column_name |     field_path      |                                                                      data_type                                                                      | description |
  +------------+-------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
  | commits    | author      | author              | STRUCT<name STRING, email STRING, time_sec INT64, tz_offset INT64, date TIMESTAMP>                                                                  | NULL        |
  | commits    | author      | author.name         | STRING                                                                                                                                              | NULL        |
  | commits    | author      | author.email        | STRING                                                                                                                                              | NULL        |
  | commits    | author      | author.time_sec     | INT64                                                                                                                                               | NULL        |
  | commits    | author      | author.tz_offset    | INT64                                                                                                                                               | NULL        |
  | commits    | author      | author.date         | TIMESTAMP                                                                                                                                           | NULL        |
  | commits    | difference  | difference          | ARRAY<STRUCT<old_mode INT64, new_mode INT64, old_path STRING, new_path STRING, old_sha1 STRING, new_sha1 STRING, old_repo STRING, new_repo STRING>> | NULL        |
  | commits    | difference  | difference.old_mode | INT64                                                                                                                                               | NULL        |
  | commits    | difference  | difference.new_mode | INT64                                                                                                                                               | NULL        |
  | commits    | difference  | difference.old_path | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.new_path | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.old_sha1 | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.new_sha1 | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.old_repo | STRING                                                                                                                                              | NULL        |
  | commits    | difference  | difference.new_repo | STRING                                                                                                                                              | NULL        |
  +------------+-------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
  

Advanced example

The following advanced example queries the INFORMATION_SCHEMA.TABLES, TABLE_OPTIONS, and COLUMNS views to retrieve metadata about the tables in mydataset in your default project — myproject. mydataset contains 2 tables:

  • mytable1: Uses the same schema as the commits table in the github_repos public dataset
  • mytable2: Uses the same schema as the population_by_zip_2010 table in the census_bureau_usa public dataset

The results are used by user-defined functions to assemble the DDL statements necessary to recreate the tables. You can then use the DDL statements in the query results to recreate the tables in mydataset.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `[PROJECT_ID]`.[DATASET].INFORMATION_SCHEMA.[VIEW] for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.

To run the query:

Console

  1. Open the BigQuery web UI in the GCP Console.

    Go to the BigQuery web UI

  2. Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the GCP Console.

    CREATE TEMP FUNCTION MakePartitionByExpression(
      column_name STRING, data_type STRING
    ) AS (
      IF(
        column_name = '_PARTITIONTIME',
        'DATE(_PARTITIONTIME)',
        IF(
          data_type = 'TIMESTAMP',
          CONCAT('DATE(', column_name, ')'),
          column_name
        )
      )
    );
    
    CREATE TEMP FUNCTION MakePartitionByClause(
      columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING,  is_partitioning_column STRING, clustering_ordinal_position INT64>>
    ) AS (
      IFNULL(
        CONCAT(
          'PARTITION BY ',
          (SELECT MakePartitionByExpression(column_name, data_type)
           FROM UNNEST(columns) WHERE is_partitioning_column = 'YES'),
          '\n'),
        ''
      )
    );
    
    CREATE TEMP FUNCTION MakeClusterByClause(
      columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING,  is_partitioning_column STRING, clustering_ordinal_position INT64>>
    ) AS (
      IFNULL(
        CONCAT(
          'CLUSTER BY ',
          (SELECT STRING_AGG(column_name, ', ' ORDER BY clustering_ordinal_position)
            FROM UNNEST(columns) WHERE clustering_ordinal_position IS NOT NULL),
          '\n'
        ),
        ''
      )
    );
    
    CREATE TEMP FUNCTION MakeNullable(data_type STRING, is_nullable STRING)
    AS (
      IF(not STARTS_WITH(data_type, 'ARRAY<') and is_nullable = 'NO', ' NOT NULL', '')
    );
    
    CREATE TEMP FUNCTION MakeColumnList(
      columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING,  is_partitioning_column STRING, clustering_ordinal_position INT64>>
    ) AS (
      IFNULL(
        CONCAT(
          '(\n',
          (SELECT STRING_AGG(CONCAT('  ', column_name, ' ', data_type,  MakeNullable(data_type, is_nullable)), ',\n')
           FROM UNNEST(columns)),
          '\n)\n'
        ),
        ''
      )
    );
    
    CREATE TEMP FUNCTION MakeOptionList(
      options ARRAY<STRUCT<option_name STRING, option_value STRING>>
    ) AS (
      IFNULL(
        CONCAT(
          'OPTIONS (\n',
          (SELECT STRING_AGG(CONCAT('  ', option_name, '=', option_value), ',\n') FROM UNNEST(options)),
          '\n)\n'),
        ''
      )
    );
    
    WITH Components AS (
      SELECT
        CONCAT('`', table_catalog, '.', table_schema, '.', table_name, '`') AS table_name,
        ARRAY_AGG(
          STRUCT(column_name, data_type, is_nullable, is_partitioning_column, clustering_ordinal_position)
          ORDER BY ordinal_position
        ) AS columns,
        (SELECT ARRAY_AGG(STRUCT(option_name, option_value))
         FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS AS t2
         WHERE t.table_name = t2.table_name) AS options
      FROM mydataset.INFORMATION_SCHEMA.TABLES AS t
      LEFT JOIN mydataset.INFORMATION_SCHEMA.COLUMNS
      USING (table_catalog, table_schema, table_name)
      WHERE table_type = 'BASE TABLE'
      GROUP BY table_catalog, table_schema, t.table_name
    )
    SELECT
      CONCAT(
        'CREATE OR REPLACE TABLE ',
        table_name,
        '\n',
        MakeColumnList(columns),
        MakePartitionByClause(columns),
        MakeClusterByClause(columns),
        MakeOptionList(options))
    FROM Components
    

Command-line

Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.

To run the query, enter:

QUERY_TEXT=$(cat <<ENDQUERY
CREATE TEMP FUNCTION MakePartitionByExpression(
  column_name STRING, data_type STRING
) AS (
  IF(
    column_name = '_PARTITIONTIME',
    'DATE(_PARTITIONTIME)',
    IF(
      data_type = 'TIMESTAMP',
      CONCAT('DATE(', column_name, ')'),
      column_name
    )
  )
);
CREATE TEMP FUNCTION MakePartitionByClause(
  columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING,  is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
  IFNULL(
    CONCAT(
      'PARTITION BY ',
      (SELECT MakePartitionByExpression(column_name, data_type)
       FROM UNNEST(columns) WHERE is_partitioning_column = 'YES'),
      '\n'),
    ''
  )
);
CREATE TEMP FUNCTION MakeClusterByClause(
  columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING,  is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
  IFNULL(
    CONCAT(
      'CLUSTER BY ',
      (SELECT STRING_AGG(column_name, ', ' ORDER BY clustering_ordinal_position)
        FROM UNNEST(columns) WHERE clustering_ordinal_position IS NOT NULL),
      '\n'
    ),
    ''
  )
);
CREATE TEMP FUNCTION MakeNullable(data_type STRING, is_nullable STRING)
AS (
  IF(not STARTS_WITH(data_type, 'ARRAY<') and is_nullable = 'NO', ' NOT NULL', '')
);
CREATE TEMP FUNCTION MakeColumnList(
  columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING,  is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
  IFNULL(
    CONCAT(
      '(\n',
      (SELECT STRING_AGG(CONCAT('  ', column_name, ' ', data_type,  MakeNullable(data_type, is_nullable)), ',\n')
       FROM UNNEST(columns)),
      '\n)\n'
    ),
    ''
  )
);
CREATE TEMP FUNCTION MakeOptionList(
  options ARRAY<STRUCT<option_name STRING, option_value STRING>>
) AS (
  IFNULL(
    CONCAT(
      'OPTIONS (\n',
      (SELECT STRING_AGG(CONCAT('  ', option_name, '=', option_value), ',\n') FROM UNNEST(options)),
      '\n)\n'),
    ''
  )
);
WITH Components AS (
  SELECT
    CONCAT('\`', table_catalog, '.', table_schema, '.', table_name, '\`') AS table_name,
    ARRAY_AGG(
      STRUCT(column_name, data_type, is_nullable, is_partitioning_column, clustering_ordinal_position)
      ORDER BY ordinal_position
    ) AS columns,
    (SELECT ARRAY_AGG(STRUCT(option_name, option_value))
     FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS AS t2
     WHERE t.table_name = t2.table_name) AS options
  FROM mydataset.INFORMATION_SCHEMA.TABLES AS t
  LEFT JOIN mydataset.INFORMATION_SCHEMA.COLUMNS
  USING (table_catalog, table_schema, table_name)
  WHERE table_type = 'BASE TABLE'
  GROUP BY table_catalog, table_schema, t.table_name
)
SELECT
  CONCAT(
    'CREATE OR REPLACE TABLE ',
    table_name,
    '\n',
    MakeColumnList(columns),
    MakePartitionByClause(columns),
    MakeClusterByClause(columns),
    MakeOptionList(options))
FROM Components
ENDQUERY
)
bq query --nouse_legacy_sql "$QUERY_TEXT"

The output should look like the following:

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                f0_                                                                                |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CREATE OR REPLACE TABLE `myproject.mydataset.population_by_zip_2010`                                                                                              |
| (                                                                                                                                                                 |
|   zipcode STRING NOT NULL,                                                                                                                                        |
|   geo_id STRING,                                                                                                                                                  |
|   minimum_age INT64,                                                                                                                                              |
|   maximum_age INT64,                                                                                                                                              |
|   gender STRING,                                                                                                                                                  |
|   population INT64                                                                                                                                                |
| )                                                                                                                                                                 |
| OPTIONS (                                                                                                                                                         |
|   expiration_timestamp=TIMESTAMP "2019-04-17T02:10:32.055Z"                                                                                                       |
| )                                                                                                                                                                 |
| CREATE OR REPLACE TABLE `myproject.mydataset.commits`                                                                                                             |
| (                                                                                                                                                                 |
|   commit STRING,                                                                                                                                                  |
|   tree STRING,                                                                                                                                                    |
|   parent ARRAY<STRING>,                                                                                                                                           |
|   author STRUCT<name STRING, email STRING, time_sec INT64, tz_offset INT64, date TIMESTAMP>,                                                                      |
|   committer STRUCT<name STRING, email STRING, time_sec INT64, tz_offset INT64, date TIMESTAMP>,                                                                   |
|   subject STRING,                                                                                                                                                 |
|   message STRING,                                                                                                                                                 |
|   trailer ARRAY<STRUCT<key STRING, value STRING, email STRING>>,                                                                                                  |
|   difference ARRAY<STRUCT<old_mode INT64, new_mode INT64, old_path STRING, new_path STRING, old_sha1 STRING, new_sha1 STRING, old_repo STRING, new_repo STRING>>, |
|   difference_truncated BOOL,                                                                                                                                      |
|   repo_name ARRAY<STRING>,                                                                                                                                        |
|   encoding STRING                                                                                                                                                 |
| )                                                                                                                                                                 |
| OPTIONS (                                                                                                                                                         |
|   expiration_timestamp=TIMESTAMP "2019-04-17T03:12:03.248Z"                                                                                                       |
| )                                                                                                                                                                 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Next steps

หน้านี้มีประโยชน์ไหม โปรดแสดงความคิดเห็น

ส่งความคิดเห็นเกี่ยวกับ...

หากต้องการความช่วยเหลือ ให้ไปที่หน้าการสนับสนุน