INFORMATION_SCHEMA contains these views for table metadata:

- TABLES and TABLE_OPTIONS for metadata about tables
- COLUMNS and COLUMN_FIELD_PATHS for metadata about columns and fields

TABLES and TABLE_OPTIONS also contain high-level information about views. For detailed information, query the VIEWS view instead.
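For example, a query along the following lines returns the defining SQL of each view in a dataset. This is a sketch; it assumes the VIEWS view exposes table_name and view_definition columns and that mydataset is a dataset in your default project:

SELECT table_name, view_definition
FROM mydataset.INFORMATION_SCHEMA.VIEWS;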
Required permissions
For TABLES and TABLE_OPTIONS, you must be granted these permissions:

- bigquery.tables.get
- bigquery.tables.list
- bigquery.routines.get
- bigquery.routines.list

For COLUMNS and COLUMN_FIELD_PATHS, you must be granted these permissions:

- bigquery.tables.get
- bigquery.tables.list
Syntax
Queries against any of these views must have a dataset or region qualifier.
-- Returns metadata for tables in a single dataset.
SELECT * FROM myDataset.INFORMATION_SCHEMA.TABLES;
-- Returns metadata for tables in a region.
SELECT * FROM `region-us`.INFORMATION_SCHEMA.TABLES;
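The region-level view can also be aggregated per dataset. The following sketch (it assumes your default project has tables in the US multi-region) counts the tables in each dataset:

-- Counts tables per dataset in a region.
SELECT table_schema AS dataset_name, COUNT(*) AS table_count
FROM `region-us`.INFORMATION_SCHEMA.TABLES
GROUP BY table_schema
ORDER BY table_count DESC;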
TABLES view
When you query the INFORMATION_SCHEMA.TABLES view, the query results contain one row for each table or view in a dataset.
The INFORMATION_SCHEMA.TABLES view has the following schema:
Column name | Data type | Value |
---|---|---|
TABLE_CATALOG | STRING | The project ID of the project that contains the dataset |
TABLE_SCHEMA | STRING | The name of the dataset that contains the table or view, also referred to as the datasetId |
TABLE_NAME | STRING | The name of the table or view, also referred to as the tableId |
TABLE_TYPE | STRING | The table type, such as BASE TABLE for a standard BigQuery table or VIEW for a view |
IS_INSERTABLE_INTO | STRING | YES or NO depending on whether the table supports DML INSERT statements |
IS_TYPED | STRING | The value is always NO |
CREATION_TIME | TIMESTAMP | The table's creation time |
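TABLE_TYPE and CREATION_TIME can be combined, for example, to find recently created views. The following sketch assumes a dataset named mydataset in your default project:

-- Lists views created in the last 7 days.
SELECT table_name, creation_time
FROM mydataset.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'VIEW'
  AND creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY);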
Examples
Example 1:
The following example retrieves table metadata for all of the tables in the dataset named mydataset. The query selects all of the columns from the INFORMATION_SCHEMA.TABLES view except for is_typed, which is reserved for future use. The metadata returned is for all tables in mydataset in your default project, myproject.

mydataset contains the following tables:

- mytable1: a standard BigQuery table
- myview1: a BigQuery view

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.
To run the query:
Console
Open the BigQuery page in the Cloud Console.
Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

SELECT * EXCEPT(is_typed)
FROM mydataset.INFORMATION_SCHEMA.TABLES
Click Run.
bq
Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.
To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * EXCEPT(is_typed)
 FROM mydataset.INFORMATION_SCHEMA.TABLES'
The results should look like the following:
+----------------+---------------+----------------+------------+--------------------+---------------------+
| table_catalog  | table_schema  | table_name     | table_type | is_insertable_into | creation_time       |
+----------------+---------------+----------------+------------+--------------------+---------------------+
| myproject      | mydataset     | mytable1       | BASE TABLE | YES                | 2018-10-29 20:34:44 |
| myproject      | mydataset     | myview1        | VIEW       | NO                 | 2018-12-29 00:19:20 |
+----------------+---------------+----------------+------------+--------------------+---------------------+
Example 2:
The following example retrieves all tables of type BASE TABLE from the INFORMATION_SCHEMA.TABLES view. The is_typed column is excluded. The metadata returned is for tables in mydataset in your default project, myproject.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.
To run the query:
Console
Open the BigQuery page in the Cloud Console.
Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

SELECT * EXCEPT(is_typed)
FROM mydataset.INFORMATION_SCHEMA.TABLES
WHERE table_type="BASE TABLE"
Click Run.
bq
Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.
To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * EXCEPT(is_typed)
 FROM mydataset.INFORMATION_SCHEMA.TABLES
 WHERE table_type="BASE TABLE"'
The results should look like the following:
+----------------+---------------+----------------+------------+--------------------+---------------------+
| table_catalog  | table_schema  | table_name     | table_type | is_insertable_into | creation_time       |
+----------------+---------------+----------------+------------+--------------------+---------------------+
| myproject      | mydataset     | mytable1       | BASE TABLE | NO                 | 2018-10-31 22:40:05 |
+----------------+---------------+----------------+------------+--------------------+---------------------+
TABLE_OPTIONS view
When you query the INFORMATION_SCHEMA.TABLE_OPTIONS view, the query results contain one row for each option set on each table or view in a dataset.
The INFORMATION_SCHEMA.TABLE_OPTIONS view has the following schema:
Column name | Data type | Value |
---|---|---|
TABLE_CATALOG | STRING | The project ID of the project that contains the dataset |
TABLE_SCHEMA | STRING | The name of the dataset that contains the table or view, also referred to as the datasetId |
TABLE_NAME | STRING | The name of the table or view, also referred to as the tableId |
OPTION_NAME | STRING | One of the name values in the options table |
OPTION_TYPE | STRING | One of the data type values in the options table |
OPTION_VALUE | STRING | One of the value options in the options table |

Options table

OPTION_NAME | OPTION_TYPE | OPTION_VALUE |
---|---|---|
partition_expiration_days | FLOAT64 | The default lifetime, in days, of all partitions in a partitioned table |
expiration_timestamp | TIMESTAMP | The time when this table expires |
kms_key_name | STRING | The name of the Cloud KMS key used to encrypt the table |
friendly_name | STRING | The table's descriptive name |
description | STRING | A description of the table |
labels | ARRAY<STRUCT<STRING, STRING>> | An array of STRUCTs that represent the labels on the table |
require_partition_filter | BOOL | Whether queries over the table require a partition filter |
enable_refresh | BOOL | Whether automatic refresh is enabled for a materialized view |
refresh_interval_minutes | FLOAT64 | How frequently a materialized view is refreshed |
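Because each option appears as its own row, it can be convenient to collapse the options back into one row per table. The following sketch (mydataset is a placeholder) does that with ARRAY_AGG:

SELECT
  table_name,
  ARRAY_AGG(STRUCT(option_name, option_value)) AS options
FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
GROUP BY table_name;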
Examples
Example 1:
The following example retrieves the expiration times for all tables in mydataset in your default project (myproject) by querying the INFORMATION_SCHEMA.TABLE_OPTIONS view.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS.
To run the query:
Console
Open the BigQuery page in the Cloud Console.
Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

SELECT *
FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
WHERE option_name="expiration_timestamp"
Click Run.
bq
Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.
To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT *
 FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
 WHERE option_name="expiration_timestamp"'
The results should look like the following:
+----------------+---------------+------------+----------------------+-------------+--------------------------------------+
| table_catalog  | table_schema  | table_name | option_name          | option_type | option_value                         |
+----------------+---------------+------------+----------------------+-------------+--------------------------------------+
| myproject      | mydataset     | mytable1   | expiration_timestamp | TIMESTAMP   | TIMESTAMP "2020-01-16T21:12:28.000Z" |
| myproject      | mydataset     | mytable2   | expiration_timestamp | TIMESTAMP   | TIMESTAMP "2021-01-01T21:12:28.000Z" |
+----------------+---------------+------------+----------------------+-------------+--------------------------------------+
Example 2:
The following example retrieves metadata about all tables in mydataset that contain test data. The query uses the values in the description option to find tables that contain "test" anywhere in the description. mydataset is in your default project, myproject.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS.
To run the query:
Console
Open the BigQuery page in the Cloud Console.
Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

SELECT *
FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
WHERE option_name="description" AND option_value LIKE "%test%"
Click Run.
bq
Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.
To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT *
 FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
 WHERE option_name="description" AND option_value LIKE "%test%"'
The results should look like the following:
+----------------+---------------+------------+-------------+-------------+--------------+
| table_catalog  | table_schema  | table_name | option_name | option_type | option_value |
+----------------+---------------+------------+-------------+-------------+--------------+
| myproject      | mydataset     | mytable1   | description | STRING      | "test data"  |
| myproject      | mydataset     | mytable2   | description | STRING      | "test data"  |
+----------------+---------------+------------+-------------+-------------+--------------+
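The same pattern works for other options. As a sketch, the following query looks for tables in mydataset that carry a label whose key is department (a hypothetical label key); option_value holds a STRING rendering of the labels array, so a simple LIKE match is used:

SELECT table_name, option_value
FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS
WHERE option_name="labels"
  AND option_value LIKE "%department%"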
COLUMNS view
When you query the INFORMATION_SCHEMA.COLUMNS view, the query results contain one row for each column (field) in a table.
The INFORMATION_SCHEMA.COLUMNS view has the following schema:
Column name | Data type | Value |
---|---|---|
TABLE_CATALOG | STRING | The project ID of the project that contains the dataset |
TABLE_SCHEMA | STRING | The name of the dataset that contains the table, also referred to as the datasetId |
TABLE_NAME | STRING | The name of the table or view, also referred to as the tableId |
COLUMN_NAME | STRING | The name of the column |
ORDINAL_POSITION | INT64 | The 1-indexed offset of the column within the table; if it's a pseudo column such as _PARTITIONTIME or _PARTITIONDATE, the value is NULL |
IS_NULLABLE | STRING | YES or NO depending on whether the column's mode allows NULL values |
DATA_TYPE | STRING | The column's standard SQL data type |
IS_GENERATED | STRING | The value is always NEVER |
GENERATION_EXPRESSION | STRING | The value is always NULL |
IS_STORED | STRING | The value is always NULL |
IS_HIDDEN | STRING | YES or NO depending on whether the column is a pseudo column such as _PARTITIONTIME or _PARTITIONDATE |
IS_UPDATABLE | STRING | The value is always NULL |
IS_SYSTEM_DEFINED | STRING | YES or NO depending on whether the column is a pseudo column such as _PARTITIONTIME or _PARTITIONDATE |
IS_PARTITIONING_COLUMN | STRING | YES or NO depending on whether the column is a partitioning column |
CLUSTERING_ORDINAL_POSITION | INT64 | The 1-indexed offset of the column within the table's clustering columns; the value is NULL if the table is not a clustered table |
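For example, IS_PARTITIONING_COLUMN and CLUSTERING_ORDINAL_POSITION can be combined to show how each table in a dataset is partitioned and clustered. The following sketch assumes a dataset named mydataset in your default project:

SELECT
  table_name,
  column_name,
  is_partitioning_column,
  clustering_ordinal_position
FROM mydataset.INFORMATION_SCHEMA.COLUMNS
WHERE is_partitioning_column = 'YES'
  OR clustering_ordinal_position IS NOT NULL
ORDER BY table_name, clustering_ordinal_position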
Examples
The following example retrieves metadata from the INFORMATION_SCHEMA.COLUMNS view for the population_by_zip_2010 table in the census_bureau_usa dataset. This dataset is part of the BigQuery public dataset program.

Because the table you're querying is in another project, the bigquery-public-data project, you add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS.

The following columns are excluded from the query results because they are currently reserved for future use:

- IS_GENERATED
- GENERATION_EXPRESSION
- IS_STORED
- IS_UPDATABLE
To run the query:
Console
Open the BigQuery page in the Cloud Console.
Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

SELECT * EXCEPT(is_generated, generation_expression, is_stored, is_updatable)
FROM `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS
WHERE table_name="population_by_zip_2010"
Click Run.
bq
Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.
To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT * EXCEPT(is_generated, generation_expression, is_stored, is_updatable)
 FROM `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS
 WHERE table_name="population_by_zip_2010"'
The results should look like the following. For readability, table_catalog and table_schema are excluded from the results:

+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
| table_name             | column_name | ordinal_position | is_nullable | data_type | is_hidden | is_system_defined | is_partitioning_column | clustering_ordinal_position |
+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
| population_by_zip_2010 | zipcode     | 1                | NO          | STRING    | NO        | NO                | NO                     | NULL                        |
| population_by_zip_2010 | geo_id      | 2                | YES         | STRING    | NO        | NO                | NO                     | NULL                        |
| population_by_zip_2010 | minimum_age | 3                | YES         | INT64     | NO        | NO                | NO                     | NULL                        |
| population_by_zip_2010 | maximum_age | 4                | YES         | INT64     | NO        | NO                | NO                     | NULL                        |
| population_by_zip_2010 | gender      | 5                | YES         | STRING    | NO        | NO                | NO                     | NULL                        |
| population_by_zip_2010 | population  | 6                | YES         | INT64     | NO        | NO                | NO                     | NULL                        |
+------------------------+-------------+------------------+-------------+-----------+-----------+-------------------+------------------------+-----------------------------+
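The view can also be aggregated. As a sketch, the following query counts the columns in each table of the same public dataset and how many of them are nullable:

SELECT
  table_name,
  COUNT(*) AS column_count,
  COUNTIF(is_nullable = 'YES') AS nullable_columns
FROM `bigquery-public-data`.census_bureau_usa.INFORMATION_SCHEMA.COLUMNS
GROUP BY table_name
ORDER BY table_name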
COLUMN_FIELD_PATHS view
When you query the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view, the query results contain one row for each column nested within a RECORD (or STRUCT) column.
The INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view has the following schema:
Column name | Data type | Value |
---|---|---|
TABLE_CATALOG | STRING | The project ID of the project that contains the dataset |
TABLE_SCHEMA | STRING | The name of the dataset that contains the table, also referred to as the datasetId |
TABLE_NAME | STRING | The name of the table or view, also referred to as the tableId |
COLUMN_NAME | STRING | The name of the column |
FIELD_PATH | STRING | The path to a column nested within a RECORD or STRUCT column |
DATA_TYPE | STRING | The column's standard SQL data type |
DESCRIPTION | STRING | The column's description |
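For example, FIELD_PATH and DATA_TYPE make it possible to locate nested fields of a given type anywhere in a dataset. The following sketch assumes a dataset named mydataset in your default project:

SELECT table_name, column_name, field_path, data_type
FROM mydataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE data_type = 'TIMESTAMP'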
Examples
The following example retrieves metadata from the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view for the commits table in the github_repos dataset. This dataset is part of the BigQuery public dataset program.

Because the table you're querying is in another project, the bigquery-public-data project, you add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS.

The commits table contains the following nested and nested-and-repeated columns:

- author: nested RECORD column
- committer: nested RECORD column
- trailer: nested and repeated RECORD column
- difference: nested and repeated RECORD column

Your query retrieves metadata about the author and difference columns.
To run the query:
Console
Open the BigQuery page in the Cloud Console.
Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

SELECT *
FROM `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name="commits"
  AND (column_name="author" OR column_name="difference")
Click Run.
bq
Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.
To run the query, enter:

bq query --nouse_legacy_sql \
'SELECT *
 FROM `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
 WHERE table_name="commits"
   AND (column_name="author" OR column_name="difference")'
The results should look like the following. For readability, table_catalog and table_schema are excluded from the results.

table_name | column_name | field_path | data_type | description |
---|---|---|---|---|
commits | author | author | STRUCT<name STRING, email STRING, time_sec INT64, tz_offset INT64, date TIMESTAMP> | NULL |
commits | author | author.name | STRING | NULL |
commits | author | author.email | STRING | NULL |
commits | author | author.time_sec | INT64 | NULL |
commits | author | author.tz_offset | INT64 | NULL |
commits | author | author.date | TIMESTAMP | NULL |
commits | difference | difference | ARRAY<STRUCT<old_mode INT64, new_mode INT64, old_path STRING, new_path STRING, old_sha1 STRING, new_sha1 STRING, old_repo STRING, new_repo STRING>> | NULL |
commits | difference | difference.old_mode | INT64 | NULL |
commits | difference | difference.new_mode | INT64 | NULL |
commits | difference | difference.old_path | STRING | NULL |
commits | difference | difference.new_path | STRING | NULL |
commits | difference | difference.old_sha1 | STRING | NULL |
commits | difference | difference.new_sha1 | STRING | NULL |
commits | difference | difference.old_repo | STRING | NULL |
commits | difference | difference.new_repo | STRING | NULL |
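As a further sketch, restricting FIELD_PATH to paths that contain a dot returns only the leaf fields nested inside those columns:

SELECT field_path, data_type
FROM `bigquery-public-data`.github_repos.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name="commits"
  AND STRPOS(field_path, '.') > 0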
Advanced example
The following advanced example queries the INFORMATION_SCHEMA.TABLES, TABLE_OPTIONS, and COLUMNS views to retrieve metadata about the tables in mydataset in your default project, myproject. mydataset contains two tables:

- mytable1: Uses the same schema as the commits table in the github_repos public dataset
- mytable2: Uses the same schema as the population_by_zip_2010 table in the census_bureau_usa public dataset

The results are used by user-defined functions to assemble the DDL statements necessary to recreate the tables. You can then use the DDL statements in the query results to recreate the tables in mydataset.

To run the query against a project other than your default project, add the project ID to the dataset in the following format: `project_id`.dataset.INFORMATION_SCHEMA.view; for example, `myproject`.mydataset.INFORMATION_SCHEMA.TABLES.
To run the query:
Console
Open the BigQuery page in the Cloud Console.
Enter the following standard SQL query in the Query editor box. INFORMATION_SCHEMA requires standard SQL syntax. Standard SQL is the default syntax in the Cloud Console.

CREATE TEMP FUNCTION MakePartitionByExpression(
  column_name STRING, data_type STRING
) AS (
  IF(
    column_name = '_PARTITIONTIME',
    'DATE(_PARTITIONTIME)',
    IF(
      data_type = 'TIMESTAMP',
      CONCAT('DATE(', column_name, ')'),
      column_name
    )
  )
);
CREATE TEMP FUNCTION MakePartitionByClause(
  columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING, is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
  IFNULL(
    CONCAT(
      'PARTITION BY ',
      (SELECT MakePartitionByExpression(column_name, data_type)
       FROM UNNEST(columns) WHERE is_partitioning_column = 'YES'),
      '\n'),
    ''
  )
);
CREATE TEMP FUNCTION MakeClusterByClause(
  columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING, is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
  IFNULL(
    CONCAT(
      'CLUSTER BY ',
      (SELECT STRING_AGG(column_name, ', ' ORDER BY clustering_ordinal_position)
       FROM UNNEST(columns) WHERE clustering_ordinal_position IS NOT NULL),
      '\n'
    ),
    ''
  )
);
CREATE TEMP FUNCTION MakeNullable(data_type STRING, is_nullable STRING)
AS (
  IF(not STARTS_WITH(data_type, 'ARRAY<') and is_nullable = 'NO', ' NOT NULL', '')
);
CREATE TEMP FUNCTION MakeColumnList(
  columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING, is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
  IFNULL(
    CONCAT(
      '(\n',
      (SELECT STRING_AGG(CONCAT(' ', column_name, ' ', data_type, MakeNullable(data_type, is_nullable)), ',\n')
       FROM UNNEST(columns)),
      '\n)\n'
    ),
    ''
  )
);
CREATE TEMP FUNCTION MakeOptionList(
  options ARRAY<STRUCT<option_name STRING, option_value STRING>>
) AS (
  IFNULL(
    CONCAT(
      'OPTIONS (\n',
      (SELECT STRING_AGG(CONCAT(' ', option_name, '=', option_value), ',\n') FROM UNNEST(options)),
      '\n)\n'),
    ''
  )
);
WITH Components AS (
  SELECT
    CONCAT('`', table_catalog, '.', table_schema, '.', table_name, '`') AS table_name,
    ARRAY_AGG(
      STRUCT(column_name, data_type, is_nullable, is_partitioning_column, clustering_ordinal_position)
      ORDER BY ordinal_position
    ) AS columns,
    (SELECT ARRAY_AGG(STRUCT(option_name, option_value))
     FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS AS t2
     WHERE t.table_name = t2.table_name) AS options
  FROM mydataset.INFORMATION_SCHEMA.TABLES AS t
  LEFT JOIN mydataset.INFORMATION_SCHEMA.COLUMNS
    USING (table_catalog, table_schema, table_name)
  WHERE table_type = 'BASE TABLE'
  GROUP BY table_catalog, table_schema, t.table_name
)
SELECT
  CONCAT(
    'CREATE OR REPLACE TABLE ',
    table_name,
    '\n',
    MakeColumnList(columns),
    MakePartitionByClause(columns),
    MakeClusterByClause(columns),
    MakeOptionList(options))
FROM Components
bq
Use the query command and specify standard SQL syntax by using the --nouse_legacy_sql or --use_legacy_sql=false flag. Standard SQL syntax is required for INFORMATION_SCHEMA queries.
To run the query, enter:
QUERY_TEXT=$(cat <<ENDQUERY
CREATE TEMP FUNCTION MakePartitionByExpression(
column_name STRING, data_type STRING
) AS (
IF(
column_name = '_PARTITIONTIME',
'DATE(_PARTITIONTIME)',
IF(
data_type = 'TIMESTAMP',
CONCAT('DATE(', column_name, ')'),
column_name
)
)
);
CREATE TEMP FUNCTION MakePartitionByClause(
columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING, is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
IFNULL(
CONCAT(
'PARTITION BY ',
(SELECT MakePartitionByExpression(column_name, data_type)
FROM UNNEST(columns) WHERE is_partitioning_column = 'YES'),
'\n'),
''
)
);
CREATE TEMP FUNCTION MakeClusterByClause(
columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING, is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
IFNULL(
CONCAT(
'CLUSTER BY ',
(SELECT STRING_AGG(column_name, ', ' ORDER BY clustering_ordinal_position)
FROM UNNEST(columns) WHERE clustering_ordinal_position IS NOT NULL),
'\n'
),
''
)
);
CREATE TEMP FUNCTION MakeNullable(data_type STRING, is_nullable STRING)
AS (
IF(not STARTS_WITH(data_type, 'ARRAY<') and is_nullable = 'NO', ' NOT NULL', '')
);
CREATE TEMP FUNCTION MakeColumnList(
columns ARRAY<STRUCT<column_name STRING, data_type STRING, is_nullable STRING, is_partitioning_column STRING, clustering_ordinal_position INT64>>
) AS (
IFNULL(
CONCAT(
'(\n',
(SELECT STRING_AGG(CONCAT(' ', column_name, ' ', data_type, MakeNullable(data_type, is_nullable)), ',\n')
FROM UNNEST(columns)),
'\n)\n'
),
''
)
);
CREATE TEMP FUNCTION MakeOptionList(
options ARRAY<STRUCT<option_name STRING, option_value STRING>>
) AS (
IFNULL(
CONCAT(
'OPTIONS (\n',
(SELECT STRING_AGG(CONCAT(' ', option_name, '=', option_value), ',\n') FROM UNNEST(options)),
'\n)\n'),
''
)
);
WITH Components AS (
SELECT
CONCAT('\`', table_catalog, '.', table_schema, '.', table_name, '\`') AS table_name,
ARRAY_AGG(
STRUCT(column_name, data_type, is_nullable, is_partitioning_column, clustering_ordinal_position)
ORDER BY ordinal_position
) AS columns,
(SELECT ARRAY_AGG(STRUCT(option_name, option_value))
FROM mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS AS t2
WHERE t.table_name = t2.table_name) AS options
FROM mydataset.INFORMATION_SCHEMA.TABLES AS t
LEFT JOIN mydataset.INFORMATION_SCHEMA.COLUMNS
USING (table_catalog, table_schema, table_name)
WHERE table_type = 'BASE TABLE'
GROUP BY table_catalog, table_schema, t.table_name
)
SELECT
CONCAT(
'CREATE OR REPLACE TABLE ',
table_name,
'\n',
MakeColumnList(columns),
MakePartitionByClause(columns),
MakeClusterByClause(columns),
MakeOptionList(options))
FROM Components
ENDQUERY
)
bq query --nouse_legacy_sql "$QUERY_TEXT"
The output should look like the following (the generated statements are returned in a single unnamed column, f0_):

CREATE OR REPLACE TABLE `myproject.mydataset.population_by_zip_2010`
(
  zipcode STRING NOT NULL,
  geo_id STRING,
  minimum_age INT64,
  maximum_age INT64,
  gender STRING,
  population INT64
)
OPTIONS (
  expiration_timestamp=TIMESTAMP "2019-04-17T02:10:32.055Z"
)

CREATE OR REPLACE TABLE `myproject.mydataset.commits`
(
  commit STRING,
  tree STRING,
  parent ARRAY<STRING>,
  author STRUCT<name STRING, email STRING, time_sec INT64, tz_offset INT64, date TIMESTAMP>,
  committer STRUCT<name STRING, email STRING, time_sec INT64, tz_offset INT64, date TIMESTAMP>,
  subject STRING,
  message STRING,
  trailer ARRAY<STRUCT<key STRING, value STRING, email STRING>>,
  difference ARRAY<STRUCT<old_mode INT64, new_mode INT64, old_path STRING, new_path STRING, old_sha1 STRING, new_sha1 STRING, old_repo STRING, new_repo STRING>>,
  difference_truncated BOOL,
  repo_name ARRAY<STRING>,
  encoding STRING
)
OPTIONS (
  expiration_timestamp=TIMESTAMP "2019-04-17T03:12:03.248Z"
)