This document describes how to create and use tables partitioned by a `DATE`, `TIMESTAMP`, or `DATETIME` column. For information on ingestion-time partitioned tables, see Creating and using ingestion-time partitioned tables. For information on integer range partitioned tables, see Creating and using integer range partitioned tables.
After creating a partitioned table, you can:
- Control access to your table data
- Get information about your partitioned tables
- List the partitioned tables in a dataset
- Get partitioned table metadata using meta-tables
For more information on managing partitioned tables including updating partitioned table properties, copying a partitioned table, and deleting a partitioned table, see Managing partitioned tables.
Limitations
Partitioned tables are subject to the following limitations:
- The partitioning column must be a scalar `DATE`, `TIMESTAMP`, or `DATETIME` column. While the mode of the column can be `REQUIRED` or `NULLABLE`, it cannot be `REPEATED` (array-based).
- The partitioning column must be a top-level field. You cannot use a leaf field from a `RECORD` (`STRUCT`) as the partitioning column.
- You cannot use legacy SQL to query partitioned tables or to write query results to partitioned tables.
Creating partitioned tables
You can create a partitioned table in the following ways:
- By using the Cloud Console.
- By using a data definition language (DDL) `CREATE TABLE` statement with a `PARTITION BY` clause containing a partition expression.
- By using the `bq` command-line tool's `bq mk` command.
- Programmatically by calling the `tables.insert` API method.
- From query results.
- When you load data.
Table naming
When you create a table in BigQuery, the table name must be unique per dataset. The table name can:
- Contain up to 1,024 characters.
- Contain Unicode characters in category L (letter), M (mark), N (number), Pc (connector, including underscore), Pd (dash), Zs (space). For more information, see General Category.
For example, the following are all valid table names: `table-01`, `ग्राहक`, `00_お客様`, `étudiant`.
Required permissions
At a minimum, to create a table, you must be granted the following permissions:
- `bigquery.tables.create` permissions to create the table
- `bigquery.tables.updateData` to write data to the table by using a load job, a query job, or a copy job
- `bigquery.jobs.create` to run a query job, load job, or copy job that writes data to the table
Additional permissions such as `bigquery.tables.getData` might be required to access the data you're writing to the table.
The following predefined IAM roles include both `bigquery.tables.create` and `bigquery.tables.updateData` permissions:
- `bigquery.dataEditor`
- `bigquery.dataOwner`
- `bigquery.admin`
The following predefined IAM roles include `bigquery.jobs.create` permissions:
- `bigquery.user`
- `bigquery.jobUser`
- `bigquery.admin`
In addition, if a user has `bigquery.datasets.create` permissions, when that user creates a dataset, they are granted `bigquery.dataOwner` access to it. `bigquery.dataOwner` access gives the user the ability to create and update tables in the dataset.
For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Daily partitioning versus hourly, monthly, or yearly partitioning
When using a `TIMESTAMP`, `DATETIME`, or `DATE` column to partition data, you can create partitions with daily, hourly, monthly, or yearly granularity, depending on your data and needs.
Daily partitioning is the default partitioning type and, when used with clustering, works well for most BigQuery use cases. In particular, daily partitioning is the better choice when your data is spread out over a wide range of dates, or if data is continuously added over time. If your data spans a wide range of dates, daily partitioning helps you remain under your table's partition limits.
Choose hourly partitioning instead if your tables have a high volume of data that spans a short date range (typically less than six months of timestamp values). With hourly partitioning, you can address data at hour-level granularity; for example, when appending, truncating, or deleting data from a particular partition.
Choose monthly or yearly partitioning if your tables have a relatively small amount of data for each day, but span a wide date range. This partitioning option is also recommended if your workflow requires frequently updating or adding rows that span a wide date range (for example, more than 500 dates). Use monthly or yearly partitioning along with clustering on your timestamp, date, or datetime partitioning column to achieve the best performance in these scenarios. See Time-unit partitioning with clustering for more details and examples.
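A quick way to compare granularities is to estimate how many partitions a given date range produces, and check that against the per-table partition limit (see the BigQuery quotas page for the current limit). The following helper is an illustrative sketch, not part of BigQuery:

```python
from datetime import date

def partition_count(start: date, end: date, granularity: str) -> int:
    """Approximate number of partitions a date range produces at each
    time-unit granularity. Compare the result against the per-table
    partition limit in the current BigQuery quotas."""
    days = (end - start).days + 1
    if granularity == "HOUR":
        return days * 24
    if granularity == "DAY":
        return days
    if granularity == "MONTH":
        return (end.year - start.year) * 12 + (end.month - start.month) + 1
    if granularity == "YEAR":
        return end.year - start.year + 1
    raise ValueError(f"unknown granularity: {granularity}")

# Ten years of data: daily partitioning produces thousands of partitions,
# while monthly partitioning stays comfortably small.
start, end = date(2012, 1, 1), date(2021, 12, 31)
print(partition_count(start, end, "DAY"))    # 3653
print(partition_count(start, end, "MONTH"))  # 120
```

This is why monthly or yearly partitioning is suggested above for sparse data spread over a wide date range: the same rows fall into far fewer partitions.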
Creating an empty partitioned table with a schema definition
You cannot create an empty partitioned table that does not have a schema definition. The schema is required in order to identify the column used to create the partitions.
When you create an empty partitioned table with a schema definition, you can do the following:
- Provide the schema inline using the `bq` command-line tool.
- Specify a JSON schema file using the `bq` command-line tool.
- Provide the schema in a table resource when calling the API's `tables.insert` method.
For more information about specifying a table schema, see Specifying a schema.
After the partitioned table is created, you can do the following:
- Load data into it.
- Write query results to it.
- Copy data into it.
To create an empty partitioned table with a schema definition:
Console
In the Explorer panel, expand your project and select a dataset.
In the details panel, click Create table.
In the Create table panel, in the Source section:
- For Create table from, select Empty table.
In the Destination section:
- For Dataset name, choose the appropriate dataset, and in the Table name field, enter the name of the table you're creating.
- Verify that Table type is set to Native table.
In the Schema section, enter the schema definition. Enter schema information manually by doing one of the following:
Enable Edit as text and enter the table schema as a JSON array.
Click Add field and enter the schema information.
For Partition and cluster settings, click No partition, select Partition by field, and choose the `DATE`, `TIMESTAMP`, or `DATETIME` column. This option is not available if the schema does not contain a `DATE`, `TIMESTAMP`, or `DATETIME` column.

(Optional) For Partitioning filter, click the Require partition filter box to require users to include a `WHERE` clause that specifies the partitions to query. Requiring a partition filter might reduce cost and improve performance. For more information, see Querying partitioned tables.

(Optional) Click Advanced options and for Encryption, click Customer-managed key to use a Cloud Key Management Service key. If you leave the Google-managed key setting, BigQuery encrypts the data at rest.
Click Create table.
SQL
Data definition language (DDL) statements let you create and modify tables and views using standard SQL query syntax.
See more on Using Data Definition Language statements.
To create a partitioned table by using a DDL statement in the Cloud Console:
Open the BigQuery page in the Cloud Console.
Click Compose new query.
Type your `CREATE TABLE` DDL statement into the Query editor text area.

The following query creates a table named `newtable`, with daily partitions based on the `transaction_date` `DATE` column, and a partition expiration of three days.

```sql
CREATE TABLE mydataset.newtable (transaction_id INT64, transaction_date DATE)
PARTITION BY transaction_date
OPTIONS (
  partition_expiration_days = 3,
  description = "a table partitioned by transaction_date"
)
```
The following query creates the same table, but with hourly partitions instead. Note the use of `TIMESTAMP_TRUNC` to truncate the timestamp to the hour:

```sql
CREATE TABLE mydataset.newtable (transaction_id INT64, transaction_ts TIMESTAMP)
PARTITION BY TIMESTAMP_TRUNC(transaction_ts, HOUR)
OPTIONS (
  partition_expiration_days = 3,
  description = "a table partitioned by transaction_ts"
)
```
The following query creates the same table, but with monthly partitions instead. Note the use of `TIMESTAMP_TRUNC` to truncate the timestamp to the month:

```sql
CREATE TABLE mydataset.newtable (transaction_id INT64, transaction_ts TIMESTAMP)
PARTITION BY TIMESTAMP_TRUNC(transaction_ts, MONTH)
OPTIONS (
  partition_expiration_days = 3,
  description = "a table partitioned by transaction_ts"
)
```
The following query creates the same table, but with yearly partitions instead. Note the use of `TIMESTAMP_TRUNC` to truncate the timestamp to the year:

```sql
CREATE TABLE mydataset.newtable (transaction_id INT64, transaction_ts TIMESTAMP)
PARTITION BY TIMESTAMP_TRUNC(transaction_ts, YEAR)
OPTIONS (
  partition_expiration_days = 3,
  description = "a table partitioned by transaction_ts"
)
```
Click Run. When the query completes, the table appears in the list of tables for that dataset.
bq
Use the `bq mk` command with the `--table` flag (or `-t` shortcut), the `--schema` flag, and the `--time_partitioning_field` flag. You can supply the table's schema definition inline or by specifying a JSON schema file.

Optional parameters include `--expiration`, `--description`, `--time_partitioning_expiration`, `--destination_kms_key`, `--require_partition_filter`, `--time_partitioning_type`, and `--label`.

If you are creating a table in a project other than your default project, add the project ID to the dataset in the following format: `project_id:dataset`.
`--destination_kms_key` is not demonstrated here. For more information on using this flag, see Protecting data with Cloud Key Management Service keys.
Enter the following command to create an empty partitioned table with a schema definition:
```shell
bq mk --table \
  --expiration integer1 \
  --schema schema \
  --time_partitioning_field column \
  --time_partitioning_type unit_time \
  --time_partitioning_expiration integer2 \
  --[no]require_partition_filter \
  --description "description" \
  --label key:value,key:value \
  project_id:dataset.table
```
Replace the following:
- `integer1` is the default lifetime (in seconds) for the table. The minimum value is 3600 seconds (one hour). The expiration time evaluates to the current UTC time plus the integer value. If you set the table's expiration time when you create a time-unit partitioned table, the dataset's default table expiration setting is ignored. Setting this value deletes the table and all partitions after the specified time.
- `schema` is an inline schema definition in the format `field:data_type,field:data_type`, or the path to the JSON schema file on your local machine.
- `column` is the name of the `TIMESTAMP`, `DATETIME`, or `DATE` column used to create the partitions.
- `unit_time` is either `DAY`, `HOUR`, `MONTH`, or `YEAR`, based on the desired time-unit partitioning granularity. The default is `DAY` if `time_partitioning_type` is unspecified.
- `integer2` is the default lifetime (in seconds) for the table's partitions. There is no minimum value. The expiration time evaluates to the partition's date plus the integer value. The partition expiration is independent of the table's expiration but does not override it. If you set a partition expiration that is longer than the table's expiration, the table expiration takes precedence.
- `description` is a description of the table in quotes.
- `key:value` is a key-value pair that represents a label. You can enter multiple labels using a comma-separated list.
- `project_id` is your project ID.
- `dataset` is a dataset in your project.
- `table` is the name of the partitioned table you're creating.
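The interaction between table expiration and partition expiration described above can be sketched as follows: each partition is deleted at its partition date plus the partition expiration, but never later than the table's own expiration, which takes precedence when it is shorter. An illustrative calculation (not a BigQuery API):

```python
from datetime import datetime, timedelta, timezone

def partition_deletion_time(partition_date, partition_expiration_s,
                            table_expiration_time=None):
    """Earliest of (partition date + partition expiration) and the table's
    expiration time, since the table expiration takes precedence when it
    is shorter than the partition expiration."""
    candidate = partition_date + timedelta(seconds=partition_expiration_s)
    if table_expiration_time is not None and table_expiration_time < candidate:
        return table_expiration_time
    return candidate

created = datetime(2024, 1, 1, tzinfo=timezone.utc)
table_expires = created + timedelta(seconds=2592000)          # 30 days
p = datetime(2024, 1, 20, tzinfo=timezone.utc)
print(partition_deletion_time(p, 86400, table_expires))       # partition date + 1 day
print(partition_deletion_time(p, 30 * 86400, table_expires))  # capped at table expiration
```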
For more information about the `bq mk` command, see `bq mk`.
When you specify the schema using the `bq` command-line tool, you cannot include a `RECORD` (`STRUCT`) type, you cannot include a column description, and you cannot specify the column's mode. All modes default to `NULLABLE`. To include descriptions, modes, and `RECORD` types, supply a JSON schema file instead.
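A JSON schema file that supplies what the inline format cannot, that is, modes, descriptions, and a `RECORD` type, might look like the sketch below. The field names are illustrative; note that the `RECORD` column is not the partitioning column, since partitioning columns must be top-level scalars:

```python
import json

# Illustrative JSON schema with modes, a description, and a RECORD type.
# Write it to a file and pass the path to `bq mk --schema`.
schema = [
    {"name": "ts", "type": "TIMESTAMP", "mode": "REQUIRED",
     "description": "Partitioning column"},
    {"name": "column1", "type": "STRING", "mode": "NULLABLE"},
    {"name": "details", "type": "RECORD", "mode": "NULLABLE",
     "fields": [
         {"name": "source", "type": "STRING"},
         {"name": "priority", "type": "INTEGER"},
     ]},
]

with open("/tmp/myschema.json", "w") as f:
    json.dump(schema, f, indent=2)
```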
Examples:
Enter the following command to create an hourly partitioned table named `mypartitionedtable` in `mydataset` in your default project. The partitioning expiration is set to 86,400 seconds (1 day), the table's expiration is set to 2,592,000 seconds (30 days), the description is set to `This is my partitioned table`, and the label is set to `org:dev`. The command uses the `-t` shortcut instead of `--table`.

The `--require_partition_filter` flag is used to require users to include a `WHERE` clause that specifies the partitions to query. Requiring a partition filter might reduce cost and improve performance. For more information, see Querying partitioned tables.

The schema is specified inline as `ts:TIMESTAMP,column1:STRING,column2:INTEGER,column4:STRING`. The specified `TIMESTAMP` field `ts` is used to partition the data by hour. For hourly partitioning, use a `TIMESTAMP` or `DATETIME` column, rather than a `DATE` column.
```shell
bq mk -t \
  --expiration 2592000 \
  --schema 'ts:TIMESTAMP,column1:STRING,column2:INTEGER,column4:STRING' \
  --time_partitioning_field ts \
  --time_partitioning_type HOUR \
  --time_partitioning_expiration 86400 \
  --require_partition_filter \
  --description "This is my partitioned table" \
  --label org:dev \
  mydataset.mypartitionedtable
```
Enter the following command to create a daily partitioned table named `mypartitionedtable` in `myotherproject`, not your default project. The partitioning expiration is set to 259,200 seconds (3 days), the description is set to `This is my partitioned table`, and the label is set to `org:dev`. The command uses the `-t` shortcut instead of `--table`. This command does not specify a table expiration. If the dataset has a default table expiration, it is applied. If the dataset has no default table expiration, the table never expires, but the partitions expire in 3 days.

The schema is specified in a local JSON file: `/tmp/myschema.json`. The schema definition includes a `TIMESTAMP` field named `ts` that is used to partition the data by day.
```shell
bq mk -t \
  --schema /tmp/myschema.json \
  --time_partitioning_field ts \
  --time_partitioning_type DAY \
  --time_partitioning_expiration 259200 \
  --description "This is my partitioned table" \
  --label org:dev \
  myotherproject:mydataset.mypartitionedtable
```
After the table is created, you can use the `bq` command-line tool to update the partitioned table's table expiration, partition expiration, description, and labels.
API
Call the `tables.insert` method with a defined table resource that specifies the `timePartitioning` property and the `schema` property.
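A minimal sketch of such a table resource body follows. The project, dataset, table, and column names are placeholders, and the example builds only the request body, not the authenticated API call:

```python
import json

# Illustrative tables.insert request body for a daily partitioned table.
# All identifiers here are placeholders.
table_resource = {
    "tableReference": {
        "projectId": "myproject",
        "datasetId": "mydataset",
        "tableId": "mypartitionedtable",
    },
    "schema": {
        "fields": [
            {"name": "transaction_id", "type": "INTEGER"},
            {"name": "transaction_date", "type": "DATE"},
        ]
    },
    "timePartitioning": {
        "type": "DAY",                # DAY, HOUR, MONTH, or YEAR
        "field": "transaction_date",  # column-based (not ingestion-time) partitioning
        "expirationMs": "259200000",  # 3 days, expressed in milliseconds
    },
}

print(json.dumps(table_resource, indent=2))
```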
Go
Before trying this sample, follow the Go setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Go API reference documentation.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Node.js API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
Creating partitioned tables from query results
To create a partitioned table from a query result, write the results to a new destination table. You can create a partitioned table by querying either a partitioned table or a non-partitioned table. You cannot change an existing standard table to a partitioned table using query results.
When you create a partitioned table from a query result, you must use standard SQL. Currently, legacy SQL is not supported for querying partitioned tables or for writing query results to partitioned tables.
Partition decorators enable you to write the query results to a specific partition. For example, to write the results to the May 1, 2016, partition, use the following partition decorator:

`table_name$20160501`
When writing query results to a specific partition using a partition decorator, the data that is being written to the partition must conform to the table's partitioning scheme. All rows written to the partition should have values that fall within the partition's date.
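The decorator rule above can be sketched as follows: the decorator suffix is the partition's date in `YYYYMMDD` form, and every row written through it must fall on that date. These helpers are illustrative only, not a BigQuery API:

```python
from datetime import date, datetime, timezone

def partition_decorator(table: str, day: date) -> str:
    """Build a daily partition decorator such as mytable$20160501."""
    return f"{table}${day.strftime('%Y%m%d')}"

def row_conforms(ts: datetime, day: date) -> bool:
    """A row conforms to a daily partition if its timestamp falls on
    that partition's date (in UTC)."""
    return ts.astimezone(timezone.utc).date() == day

target = date(2018, 2, 1)
print(partition_decorator("mytable", target))                                # mytable$20180201
print(row_conforms(datetime(2018, 2, 1, 12, tzinfo=timezone.utc), target))   # True
print(row_conforms(datetime(2018, 1, 31, 23, tzinfo=timezone.utc), target))  # False
```

The two `bq query` examples that follow correspond to the `True` and `False` cases: the second command fails because its rows fall outside the partition's date.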
For example:
The following query retrieves data from February 1, 2018 and writes the data to the `$20180201` partition of table `mytable`. The table has two columns: a `TIMESTAMP` column named `TS` and an `INT64` column named `a`.
```shell
bq query \
  --nouse_legacy_sql \
  --destination_table='mytable$20180201' \
  'SELECT
     TIMESTAMP("2018-02-01") AS TS,
     2 AS a'
```
The following query retrieves data from January 31, 2018, and attempts to write the data to the `$20180201` partition of `mytable`. This query fails because the data you're attempting to write doesn't fall within the partition's date.
```shell
bq query \
  --nouse_legacy_sql \
  --destination_table='mytable$20180201' \
  'SELECT
     TIMESTAMP("2018-01-31") AS TS,
     2 AS a'
```
For information on appending to or restating (replacing) data in partitioned tables, see Appending to and overwriting partitioned table data. For more information on querying partitioned tables, see Querying partitioned tables.
Creating a partitioned table from a query result
To create a partitioned table from a query result:
Console
You cannot specify partitioning options for a destination table when you query data using the Cloud Console.
bq
Enter the `bq query` command, specify the `--destination_table` flag to create a permanent table based on the query results, and specify the `--time_partitioning_field` flag to create a partitioned destination table.

Specify the `--use_legacy_sql=false` flag to use standard SQL syntax.

To write the query results to a table that is not in your default project, add the project ID to the dataset name in the following format: `project_id:dataset`.
(Optional) Supply the `--location` flag and set the value to your location.
Enter the following command to create a new, partitioned destination table from a query result:
```shell
bq --location=location query \
  --destination_table project_id:dataset.table \
  --time_partitioning_field column \
  --time_partitioning_type unit_time \
  --use_legacy_sql=false \
  'query'
```
Replace the following:
- `location` is the name of your location. The `--location` flag is optional. For example, if you are using BigQuery in the Tokyo region, set the flag's value to `asia-northeast1`. You can set a default value for the location using the `.bigqueryrc` file.
- `project_id` is your project ID.
- `dataset` is the name of the dataset that will contain the new partitioned table.
- `table` is the name of the partitioned table that you're creating using the query results.
- `column` is the name of the `TIMESTAMP` or `DATE` column that's used to create the partitions.
- `unit_time` is either `DAY`, `HOUR`, `MONTH`, or `YEAR`, based on the desired time-unit partitioning granularity. This defaults to `DAY` if `time_partitioning_type` is unspecified.
- `query` is a query in standard SQL syntax. You cannot use legacy SQL to query partitioned tables or to write query results to partitioned tables.
Examples:
Enter the following command to write query results to a partitioned destination table named `mypartitionedtable` in `mydataset`. `mydataset` is in your default project. The query retrieves data from a non-partitioned table: the NHTSA Traffic Fatality public dataset. The table's `timestamp_of_crash` `TIMESTAMP` column is used to create the partitions.
```shell
bq query \
  --destination_table mydataset.mypartitionedtable \
  --time_partitioning_field timestamp_of_crash \
  --use_legacy_sql=false \
  'SELECT
     state_number,
     state_name,
     day_of_crash,
     month_of_crash,
     year_of_crash,
     latitude,
     longitude,
     manner_of_collision,
     number_of_fatalities,
     timestamp_of_crash
   FROM
     `bigquery-public-data`.nhtsa_traffic_fatalities.accident_2016
   LIMIT 100'
```
Enter the following command to write query results to a partitioned destination table named `mypartitionedtable` in `mydataset`. `mydataset` is in `myotherproject`, not your default project. The query retrieves data from a non-partitioned table: the NHTSA Traffic Fatality public dataset. The table's `timestamp_of_crash` `TIMESTAMP` column is used to create the partitions.
```shell
bq query \
  --destination_table myotherproject:mydataset.mypartitionedtable \
  --time_partitioning_field timestamp_of_crash \
  --use_legacy_sql=false \
  'SELECT
     state_number,
     state_name,
     day_of_crash,
     month_of_crash,
     year_of_crash,
     latitude,
     longitude,
     manner_of_collision,
     number_of_fatalities,
     timestamp_of_crash
   FROM
     `bigquery-public-data`.nhtsa_traffic_fatalities.accident_2016
   LIMIT 100'
```
API
To save query results to a permanent partitioned table, call the `jobs.insert` method, configure a query job, and include a value for the `destinationTable` and `timePartitioning` properties.

Specify your location in the `location` property in the `jobReference` section of the job resource.
Creating a partitioned table when loading data
You can create a partitioned table by specifying partitioning options when you load data into a new table. You do not need to create an empty partitioned table before loading data into it. You can create the partitioned table and load your data at the same time.
When you load data into BigQuery, you can supply the table schema, or for supported data formats, you can use schema auto-detection.
Partition decorators enable you to load data into a specific partition. For example, to load all data that is generated on May 1, 2016, into the `20160501` partition, use the following partition decorator:

`table_name$20160501`
When loading data into a specific partition using a partition decorator, the data that is being loaded into the partition must conform to the table's partitioning scheme. All rows written to the partition should have values that fall within the partition's date.
For more information on loading data, see Introduction to loading data into BigQuery.
Time-unit partitioning with clustering
Time-unit partitioning can be used with clustering. A time-unit partitioned table with clustering first partitions its data by the time-unit boundaries (day, hour, month, or year) of the partitioning column. Then within each partition boundary, data is clustered further by the clustering columns.
As an example, this command creates a table that is partitioned by day and clustered on two columns:

```shell
bq mk --time_partitioning_type=DAY \
  --time_partitioning_field=ts_column \
  --clustering_fields=column1,column2 \
  mydataset.mytable2 \
  "ts_column:TIMESTAMP,column1:INTEGER,column2:STRING"
```
When you retrieve the table's metadata, you can see that both daily timestamp partitioning and clustering are in effect:

```shell
bq show --format=prettyjson mydataset.mytable2
```

```json
...
"clustering": {
  "fields": [
    "column1",
    "column2"
  ]
},
...
"timePartitioning": {
  "field": "ts_column",
  "type": "DAY"
},
...
```
If you are exceeding the limit on the number of partitions per table, or if you have small amounts of data spread across many partitions that you mutate often, consider using a larger time-unit partitioning interval with clustering on the same partitioning column instead. This is the recommended way to use partitioned tables while staying within the partition limits.
For instance, this command creates a daily partitioned table that is also clustered on the partitioning column:

```shell
bq mk --time_partitioning_type=DAY \
  --time_partitioning_field=ts_column \
  --clustering_fields=ts_column,column1 \
  mydataset.mytable2 \
  "ts_column:TIMESTAMP,column1:INTEGER,column2:STRING"
```
Here's another example for the preceding table, but with a larger time-unit partitioning interval:

```shell
bq mk --time_partitioning_type=MONTH \
  --time_partitioning_field=ts_column \
  --clustering_fields=ts_column,column1 \
  mydataset.mytable2 \
  "ts_column:TIMESTAMP,column1:INTEGER,column2:STRING"
```
Controlling access to partitioned tables
To configure access to tables and views, you can grant an IAM role to an entity at the following levels, listed in order of range of resources allowed (largest to smallest):
- a high level in the Google Cloud resource hierarchy such as the project, folder, or organization level
- the dataset level
- the table/view level
Access with any resource protected by IAM is additive. For example, if an entity does not have access at a high level such as the project, you could grant the entity access at the dataset level, and then the entity has access to the tables and views in the dataset. Similarly, if the entity does not have access at the high level or the dataset level, you could grant the entity access at the table or view level.
Granting IAM roles at a higher level in the Google Cloud resource hierarchy such as the project, folder, or organization level gives the entity access to a broad set of resources. For example, granting a role to an entity at the project level gives that entity permissions that apply to all datasets throughout the project.
Granting a role at the dataset level specifies the operations an entity is allowed to perform on tables and views in that specific dataset, even if the entity does not have access at a higher level. For information on configuring dataset-level access controls, see Controlling access to datasets.
Granting a role at the table or view level specifies the operations an entity is allowed to perform on specific tables and views, even if the entity does not have access at a higher level. For information on configuring table-level access controls, see Controlling access to tables and views.
You can also create IAM custom roles. If you create a custom role, the permissions you grant depend on the specific operations you want the entity to be able to perform.
You can't set a "deny" permission on any resource protected by IAM.
For more information on roles and permissions, see:
- Understanding roles in the IAM documentation
- BigQuery Predefined roles and permissions
- Controlling access to datasets
- Controlling access to tables and views
- Restricting access with BigQuery Column-level security
Using partitioned tables
Getting information about partitioned tables
You can get information about tables in the following ways:
- Using the Cloud Console.
- Using the `bq show` command in the `bq` command-line tool.
- Calling the `tables.get` API method.
- Using the client libraries.
Required permissions
At a minimum, to get information about tables, you must be granted `bigquery.tables.get` permissions. The following predefined IAM roles include `bigquery.tables.get` permissions:
- `bigquery.metadataViewer`
- `bigquery.dataViewer`
- `bigquery.dataOwner`
- `bigquery.dataEditor`
- `bigquery.admin`
In addition, if a user has `bigquery.datasets.create` permissions, when that user creates a dataset, they are granted `bigquery.dataOwner` access to it. `bigquery.dataOwner` access gives the user the ability to retrieve table metadata.
For more information on IAM roles and permissions in BigQuery, see Access control.
Getting partitioned table information
To view information about a partitioned table:
Console
Open the BigQuery page in the Cloud Console.
In the Explorer panel, expand your project and dataset, then select the table.
In the details panel, click Details. This tab displays the table's description and table information.
Click the Schema tab to view the table's schema definition. Notice that partitioned tables do not include the `_PARTITIONTIME` pseudo column.
bq
Issue the `bq show` command to display all table information. Use the `--schema` flag to display only table schema information. The `--format` flag can be used to control the output.

If you are getting information about a table in a project other than your default project, add the project ID to the dataset in the following format: `project_id:dataset`.

```shell
bq show --schema --format=prettyjson project_id:dataset.table
```
Replace the following:
- `project_id` is your project ID.
- `dataset` is the name of the dataset.
- `table` is the name of the table.
Examples:
Enter the following command to display all information about `mytable` in `mydataset`. `mydataset` is in your default project.

```shell
bq show --format=prettyjson mydataset.mytable
```
Enter the following command to display all information about `mytable` in `mydataset`. `mydataset` is in `myotherproject`, not your default project.

```shell
bq show --format=prettyjson myotherproject:mydataset.mytable
```
The output should look like the following:
```json
{
  "creationTime": "1563236533535",
  "description": "This is my partitioned table",
  "etag": "/ABcDEo7f8GHijKL2mnOpQr==",
  "expirationTime": "1565828533000",
  "id": "myproject:mydataset.mypartitionedtable",
  "kind": "bigquery#table",
  "labels": {
    "org": "dev"
  },
  "lastModifiedTime": "1563236533576",
  "location": "US",
  "numBytes": "0",
  "numLongTermBytes": "0",
  "numRows": "0",
  "requirePartitionFilter": true,
  "schema": {
    "fields": [
      {
        "name": "ts",
        "type": "TIMESTAMP"
      },
      {
        "name": "column1",
        "type": "STRING"
      },
      {
        "name": "column2",
        "type": "INTEGER"
      },
      {
        "name": "column3",
        "type": "STRING"
      }
    ]
  },
  "selfLink": "https://bigquery.googleapis.com/bigquery/v2/projects/myproject/datasets/mydataset/tables/mypartitionedtable",
  "tableReference": {
    "datasetId": "mydataset",
    "projectId": "myproject",
    "tableId": "mypartitionedtable"
  },
  "timePartitioning": {
    "expirationMs": "86400000",
    "field": "ts",
    "requirePartitionFilter": true,
    "type": "DAY"
  },
  "type": "TABLE"
}
```
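Because `bq show --format=prettyjson` emits JSON, the partitioning settings can also be read programmatically. The sketch below extracts them from an abbreviated fragment of output like the above:

```python
import json

# Parse an abbreviated fragment of `bq show --format=prettyjson` output
# and summarize the time-partitioning settings.
show_output = """
{
  "id": "myproject:mydataset.mypartitionedtable",
  "timePartitioning": {
    "expirationMs": "86400000",
    "field": "ts",
    "requirePartitionFilter": true,
    "type": "DAY"
  }
}
"""

table = json.loads(show_output)
tp = table.get("timePartitioning", {})
summary = {
    "granularity": tp.get("type"),
    "column": tp.get("field"),
    # expirationMs is a string of milliseconds; convert to days.
    "partition_expiration_days": (
        int(tp["expirationMs"]) / 86400000 if "expirationMs" in tp else None
    ),
}
print(summary)  # {'granularity': 'DAY', 'column': 'ts', 'partition_expiration_days': 1.0}
```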
Enter the following command to display only schema information about `mytable` in `mydataset`. `mydataset` is in `myotherproject`, not your default project.
```shell
bq show --schema --format=prettyjson myotherproject:mydataset.mytable
```
The output should look like the following:
```json
[
  {
    "name": "ts",
    "type": "TIMESTAMP"
  },
  {
    "name": "column1",
    "type": "STRING"
  },
  {
    "name": "column2",
    "type": "INTEGER"
  },
  {
    "name": "column3",
    "type": "STRING"
  }
]
```
API
Call the `tables.get` method and provide any relevant parameters.
Listing partitioned tables in a dataset
You can list tables in datasets (including partitioned tables) in the following ways:
- Using the Cloud Console.
- Using the `bq ls` command in the `bq` command-line tool.
- Calling the `tables.list` API method.
- Using the client libraries.
Required permissions
At a minimum, to list tables in a dataset, you must be granted `bigquery.tables.list` permissions. The following predefined IAM roles include `bigquery.tables.list` permissions:
- `bigquery.user`
- `bigquery.metadataViewer`
- `bigquery.dataViewer`
- `bigquery.dataEditor`
- `bigquery.dataOwner`
- `bigquery.admin`
For more information on IAM roles and permissions in BigQuery, see Access control.
Listing partitioned tables
To list the tables in a dataset (including partitioned tables):
Console
Open the BigQuery page in the Cloud Console.
In the Explorer panel, expand your project and select a dataset.
Scroll through the list to see the tables in the dataset. Tables, partitioned tables, models, and views are identified by different icons.
bq
Issue the `bq ls` command. The `--format` flag can be used to control the output. If you are listing tables in a project other than your default project, add the project ID to the dataset in the following format: `project_id:dataset`.

```shell
bq ls --format=pretty project_id:dataset
```
Replace the following:
- `project_id` is your project ID.
- `dataset` is the name of the dataset.
When you run the command, the `Type` field displays either `TABLE` or `VIEW`. For partitioned tables, the `Time Partitioning` field displays `DAY`, the column used to create the partitions, and the partition expiration time in milliseconds, if an expiration time is specified.
For example:
```
+-------------------------+-------+----------------------+---------------------------------------------------+
|         tableId         | Type  |        Labels        |                 Time Partitioning                 |
+-------------------------+-------+----------------------+---------------------------------------------------+
| mytable                 | TABLE | department:shipping  | DAY (field: source_date, expirationMs: 86400000)  |
| myview                  | VIEW  |                      |                                                   |
+-------------------------+-------+----------------------+---------------------------------------------------+
```
Examples:
Enter the following command to list tables in dataset `mydataset` in your default project.

```shell
bq ls --format=pretty mydataset
```
Enter the following command to list tables in dataset `mydataset` in `myotherproject`.

```shell
bq ls --format=pretty myotherproject:mydataset
```
API
To list tables using the API, call the `tables.list` method.
Listing partitions in partitioned tables
You can list the partitions in a partitioned table by querying the `__PARTITIONS_SUMMARY__` meta-table using legacy SQL. You can run the query by using the Cloud Console, by using the `bq query` command, or by calling the `jobs.insert` method and configuring a query job.
Required permissions
At a minimum, to run a query job that uses the `__PARTITIONS_SUMMARY__` meta-table, you must be granted `bigquery.jobs.create` permissions. The following predefined IAM roles include `bigquery.jobs.create` permissions:
- `bigquery.user`
- `bigquery.jobUser`
- `bigquery.admin`
You must also be granted `bigquery.tables.getData` permissions. The following predefined IAM roles include `bigquery.tables.getData` permissions:
- `bigquery.dataViewer`
- `bigquery.dataEditor`
- `bigquery.dataOwner`
- `bigquery.admin`
For more information on IAM roles in BigQuery, see Access control.
Listing partitions in a partitioned table
To list the partitions in a partitioned table using legacy SQL:
Console
Open the BigQuery page in the Cloud Console.
Click the Compose new query button.
To query the `__PARTITIONS_SUMMARY__` meta-table, enter the following text into the Query editor:
#legacySQL
SELECT
  partition_id
FROM
  [dataset.table$__PARTITIONS_SUMMARY__]
Replace the following:
- `dataset` is the dataset that contains the table.
- `table` is the name of the table.
Click Run.
bq
Enter the following query using the `bq query` command:
bq --location=location query \
--use_legacy_sql=true \
'SELECT
  partition_id
FROM
  [dataset.table$__PARTITIONS_SUMMARY__]'
Replace the following:
- `location` is the name of your location. The `--location` flag is optional. For example, if you are using BigQuery in the Tokyo region, set the flag's value to `asia-northeast1`. You can set a default value for the location using the `.bigqueryrc` file.
- `dataset` is the dataset that contains the table.
- `table` is the name of the table.
API
Call the `jobs.insert` method and configure a query job that queries the table's `__PARTITIONS_SUMMARY__` meta-table.
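When calling `jobs.insert` directly, the query configuration goes in the request body. A minimal sketch of such a body follows; the dataset and table names are placeholders, and any fields not shown here take their defaults.

```json
{
  "configuration": {
    "query": {
      "query": "SELECT partition_id FROM [mydataset.mytable$__PARTITIONS_SUMMARY__]",
      "useLegacySql": true
    }
  }
}
```

Note that `useLegacySql` must be `true`: as described below, the `$` decorator in the meta-table name is not supported in standard SQL.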
Getting partitioned table metadata using meta-tables
You can get information about partitioned tables by using special tables called meta-tables. Meta-tables contain metadata such as the list of tables and views in a dataset. The meta-tables are read-only.
Currently, you cannot use the `INFORMATION_SCHEMA` service to get partitioned table metadata.
Getting partition metadata using meta-tables
The `__PARTITIONS_SUMMARY__` meta-table is a special table whose contents represent metadata about partitions in a time-partitioned table. The `__PARTITIONS_SUMMARY__` meta-table is read-only.
To access metadata about the partitions in a time-partitioned table, use the `__PARTITIONS_SUMMARY__` meta-table in a query's `SELECT` statement. You can run the query by:
- Using the Cloud Console
- Using the `bq` command-line tool's `bq query` command
- Calling the `jobs.insert` API method and configuring a query job
- Using the client libraries
Currently, standard SQL does not support the partition decorator separator (`$`), so you cannot query `__PARTITIONS_SUMMARY__` in standard SQL. A legacy SQL query that uses the `__PARTITIONS_SUMMARY__` meta-table looks like the following:
#legacySQL
SELECT
  column
FROM
  [dataset.table$__PARTITIONS_SUMMARY__]
Where:
- dataset is the name of your dataset.
- table is the name of the time-partitioned table.
- column is one of the following:
| Value | Description |
|---|---|
| `project_id` | Name of the project. |
| `dataset_id` | Name of the dataset. |
| `table_id` | Name of the time-partitioned table. |
| `partition_id` | Name (date) of the partition. |
| `creation_time` | The time at which the partition was created, in milliseconds since January 1, 1970 UTC. |
| `last_modified_time` | The time at which the partition was last modified, in milliseconds since January 1, 1970 UTC. |
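Because the meta-table is addressed with the `$` decorator inside legacy SQL brackets, it can be convenient to build the query string programmatically. The helper below is a hypothetical sketch, not part of any Google tool or library:

```python
# Hypothetical helper for building a legacy SQL query against the
# __PARTITIONS_SUMMARY__ meta-table; not part of any Google library.
META_TABLE = "__PARTITIONS_SUMMARY__"

def partitions_summary_query(dataset, table, columns=("partition_id",)):
    """Return a legacy SQL query selecting `columns` from the partition meta-table."""
    return "#legacySQL\nSELECT {cols} FROM [{ds}.{tbl}${meta}]".format(
        cols=", ".join(columns), ds=dataset, tbl=table, meta=META_TABLE)

print(partitions_summary_query("mydataset", "mytable",
                               ("partition_id", "last_modified_time")))
```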
Partition meta table permissions
At a minimum, to run a query job that uses the `__PARTITIONS_SUMMARY__` meta-table, you must be granted `bigquery.jobs.create` permissions. The following predefined IAM roles include `bigquery.jobs.create` permissions:
- `bigquery.user`
- `bigquery.jobUser`
- `bigquery.admin`
You must also be granted `bigquery.tables.getData` permissions. The following predefined IAM roles include `bigquery.tables.getData` permissions:
- `bigquery.dataViewer`
- `bigquery.dataEditor`
- `bigquery.dataOwner`
- `bigquery.admin`
Partition meta table examples
The following query retrieves all partition metadata for a time-partitioned table named `mydataset.mytable`.
Console
#legacySQL
SELECT
  *
FROM
  [mydataset.mytable$__PARTITIONS_SUMMARY__]
bq
bq query --use_legacy_sql=true '
SELECT
  *
FROM
  [mydataset.mytable$__PARTITIONS_SUMMARY__]'
The output looks like the following:
+------------+------------+----------+--------------+---------------+--------------------+
| project_id | dataset_id | table_id | partition_id | creation_time | last_modified_time |
+------------+------------+----------+--------------+---------------+--------------------+
| myproject  | mydataset  | mytable  | 20160314     | 1517190224120 | 1517190224997      |
| myproject  | mydataset  | mytable  | 20160315     | 1517190224120 | 1517190224997      |
+------------+------------+----------+--------------+---------------+--------------------+
The following query lists the times when the partitions in `mydataset.mytable` were last modified.
Console
#legacySQL
SELECT
  partition_id,
  last_modified_time
FROM
  [mydataset.mytable$__PARTITIONS_SUMMARY__]
bq
bq query --use_legacy_sql=true '
SELECT
  partition_id,
  last_modified_time
FROM
  [mydataset.mytable$__PARTITIONS_SUMMARY__]'
The output looks like the following:
+--------------+--------------------+
| partition_id | last_modified_time |
+--------------+--------------------+
| 20160102     | 1471632556179      |
| 20160101     | 1471632538142      |
| 20160103     | 1471632570463      |
+--------------+--------------------+
To display the `last_modified_time` field in human-readable format, use the `FORMAT_UTC_USEC` function. For example:
Console
#legacySQL
SELECT
  partition_id,
  FORMAT_UTC_USEC(last_modified_time*1000) AS last_modified
FROM
  [mydataset.mytable$__PARTITIONS_SUMMARY__]
bq
bq query --use_legacy_sql=true '
SELECT
  partition_id,
  FORMAT_UTC_USEC(last_modified_time*1000) AS last_modified
FROM
  [mydataset.mytable$__PARTITIONS_SUMMARY__]'
The output looks like the following:
+--------------+----------------------------+
| partition_id | last_modified              |
+--------------+----------------------------+
| 20160103     | 2016-08-19 18:49:30.463000 |
| 20160102     | 2016-08-19 18:49:16.179000 |
| 20160101     | 2016-08-19 18:48:58.142000 |
+--------------+----------------------------+
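If you post-process query results outside of SQL, the same conversion can be done client-side. This sketch converts one of the raw millisecond values shown in the earlier output to a UTC timestamp; it is a generic Python equivalent of `FORMAT_UTC_USEC(ms*1000)`, not a BigQuery API call.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    # timedelta keeps integer millisecond precision exactly, avoiding
    # the rounding that float division by 1000 can introduce
    return _EPOCH + timedelta(milliseconds=ms)

# last_modified_time value for partition 20160103 from the example output above
print(ms_to_utc(1471632570463).strftime("%Y-%m-%d %H:%M:%S.%f"))
# → 2016-08-19 18:49:30.463000
```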
Next steps
- For an overview of partitioned table support in BigQuery, see Introduction to partitioned tables.
- To learn how to create and use ingestion-time partitioned tables, see Creating and using ingestion-time partitioned tables.
- To learn how to create and use integer range partitioned tables, see Creating and using integer range partitioned tables.
- To learn how to manage and update partitioned tables, see Managing partitioned tables.
- For information on querying partitioned tables, see Querying partitioned tables.