This document describes how to create and use ingestion-time partitioned tables in BigQuery. For information on column-based time partitioned tables, see Creating and using column-based time partitioned tables. For information on integer range partitioned tables, see Creating and using integer range partitioned tables.
After creating an ingestion-time partitioned table, you can:
- Control access to your table data
- Get information about your partitioned tables
- List the partitioned tables in a dataset
- Get partitioned table metadata using meta-tables
For more information on managing partitioned tables including updating partitioned table properties, copying a partitioned table, and deleting a partitioned table, see Managing partitioned tables.
Creating ingestion-time partitioned tables
You can create an ingestion-time partitioned table in BigQuery:
- By using a DDL `CREATE TABLE` statement with a `partition_expression`
- Manually by using the Cloud Console
- By using the `bq` command-line tool's `bq mk` command
- Programmatically by calling the `tables.insert` API method
- By using the client libraries
- From query results
- When you load data
- By converting date-sharded tables into a partitioned table
Table naming
When you create a table in BigQuery, the table name must be unique per dataset. The table name can:
- Contain up to 1,024 characters.
- Contain Unicode characters in category L (letter), M (mark), N (number), Pc (connector, including underscore), Pd (dash), Zs (space). For more information, see General Category.
For example, the following are all valid table names: `table-01`, `ग्राहक`, `00_お客様`, `étudiant`.
Some table names and table name prefixes are reserved. If you receive an error saying that your table name or prefix is reserved, then select a different name and try again.
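The character rules above can be checked locally with Python's `unicodedata` module. The helper below is a hypothetical illustration of the documented rules only; it knows nothing about BigQuery's reserved names or prefixes:

```python
import unicodedata

def is_valid_table_name(name: str) -> bool:
    """Check a candidate table name against the documented character rules:
    up to 1,024 characters from Unicode categories L (letter), M (mark),
    N (number), Pc (connector), Pd (dash), and Zs (space)."""
    if not 0 < len(name) <= 1024:
        return False
    for ch in name:
        cat = unicodedata.category(ch)          # e.g. "Lu", "Nd", "Pc"
        if cat[0] not in "LMN" and cat not in {"Pc", "Pd", "Zs"}:
            return False
    return True

# The four example names from this document are all accepted:
for name in ["table-01", "ग्राहक", "00_お客様", "étudiant"]:
    assert is_valid_table_name(name)

assert not is_valid_table_name("my.table")      # "." is category Po, not allowed
```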
Required permissions
At a minimum, to create a table, you must be granted the following permissions:
- `bigquery.tables.create` permissions to create the table
- `bigquery.tables.updateData` to write data to the table by using a load job, a query job, or a copy job
- `bigquery.jobs.create` to run a query job, load job, or copy job that writes data to the table

Additional permissions such as `bigquery.tables.getData` might be required to access the data you're writing to the table.
The following predefined IAM roles include both `bigquery.tables.create` and `bigquery.tables.updateData` permissions:
- `bigquery.dataEditor`
- `bigquery.dataOwner`
- `bigquery.admin`
The following predefined IAM roles include `bigquery.jobs.create` permissions:
- `bigquery.user`
- `bigquery.jobUser`
- `bigquery.admin`
In addition, if a user has `bigquery.datasets.create` permissions, when that user creates a dataset, they are granted `bigquery.dataOwner` access to it. `bigquery.dataOwner` access gives the user the ability to create and update tables in the dataset.
For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Creating an empty ingestion-time partitioned table with a schema definition
When you create an empty time partitioned table with a schema definition, you can:
- Enter the schema by using the Cloud Console
- Provide the schema inline using the `bq` command-line tool
- Submit a JSON schema file using the `bq` command-line tool
- Provide the schema in a table resource when calling the API's `tables.insert` method
For more information on specifying a table schema, see Specifying a schema.
After the time partitioned table is created, you can:
- Load data into it
- Write query results to it
- Copy data into it
To create an empty time partitioned table with a schema definition:
Console
In the Explorer panel, expand your project and select a dataset.
On the Create table page, in the Source section, select Create empty table.
On the Create table page, in the Destination section:
For Dataset name, choose the appropriate dataset.
In the Table name field, enter the name of the table you're creating in BigQuery.
Verify that Table type is set to Native table.
In the Schema section, enter the schema definition. Enter schema information manually by doing one of the following:
- Enable Edit as text and enter the table schema as a JSON array.
- Click Add field and enter the schema information.
In the Partition and cluster settings section:
- For Partitioning, click No partitioning and select Partition by ingestion time.
- (Optional) For Partitioning filter, click the Require partition filter box to require users to include a `WHERE` clause that specifies the partitions to query. Requiring a partition filter may reduce cost and improve performance. For more information, see Querying partitioned tables.
- (Optional) Click Advanced options and for Encryption, click Customer-managed key to use a Cloud Key Management Service key. If you leave the Google-managed key setting, BigQuery encrypts the data at rest.
Click Create table.
After the table is created, you can update the partitioned table's table expiration, description, and labels. You cannot add a partition expiration after a table is created using the Cloud Console.
bq
Use the `mk` command with the `--table` flag (or `-t` shortcut) and the `--time_partitioning_type` flag, set to `DAY`, `HOUR`, `MONTH`, or `YEAR` as appropriate to your partitioning type. You can supply the table's schema definition inline or via a JSON schema file. Optional parameters include `--expiration`, `--description`, `--time_partitioning_expiration`, `--destination_kms_key`, and `--label`. If you are creating a table in a project other than your default project, add the project ID to the dataset in the following format: `project_id:dataset`.

`--destination_kms_key` is not demonstrated here. For more information on using this flag, see Protecting data with Cloud KMS keys.
Enter the following command to create an empty ingestion-time partitioned table with a schema definition:
bq mk --table \
--expiration integer1 \
--time_partitioning_type=unit_time \
--time_partitioning_expiration integer2 \
--description "description" \
--label key:value,key:value \
project_id:dataset.table \
schema
Where:
- integer1 is the default lifetime (in seconds) for the table. The minimum value is 3600 seconds (one hour). The expiration time evaluates to the current UTC time plus the integer value. If you set the table's expiration time when you create a partitioned table, the dataset's default table expiration setting is ignored. Setting this value deletes the table and all partitions after the specified time.
- unit_time is either `DAY`, `HOUR`, `MONTH`, or `YEAR`, depending on your partitioning type. This defaults to `DAY` when `time_partitioning_type` is not specified.
- integer2 is the default lifetime (in seconds) for the table's partitions. There is no minimum value. The expiration time evaluates to the partition's date plus the integer value. The partition expiration is independent of the table's expiration but does not override it. If you set a partition expiration that is longer than the table's expiration, the table expiration takes precedence.
- description is a description of the table in quotes.
- key:value is the key:value pair that represents a label. You can enter multiple labels using a comma-separated list.
- project_id is your project ID.
- dataset is a dataset in your project.
- table is the name of the time partitioned table you're creating.
- schema is an inline schema definition in the format column:data_type,column:data_type or the path to the JSON schema file on your local machine.
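To make the interaction between the two expiration flags concrete, the following Python sketch reproduces the documented arithmetic. It only does date math; it does not call BigQuery:

```python
from datetime import datetime, timedelta, timezone

def table_expiration(created_at: datetime, expiration_secs: int) -> datetime:
    # --expiration: current UTC time plus integer1 seconds.
    return created_at + timedelta(seconds=expiration_secs)

def partition_expiration(partition_date: datetime, expiration_secs: int) -> datetime:
    # --time_partitioning_expiration: the partition's date plus integer2 seconds.
    return partition_date + timedelta(seconds=expiration_secs)

created = datetime(2020, 1, 1, tzinfo=timezone.utc)
# A 30-day table expiration and a 3-day partition expiration:
print(table_expiration(created, 2_592_000).date())      # 2020-01-31
print(partition_expiration(created, 259_200).date())    # 2020-01-04
```

Because the partition expiration never overrides the table expiration, the earlier of the two resulting dates is when a given partition actually disappears.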
When you specify the schema on the command line, you cannot include a `RECORD` (`STRUCT`) type, you cannot include a column description, and you cannot specify the column's mode. All modes default to `NULLABLE`. To include descriptions, modes, and `RECORD` types, supply a JSON schema file instead.
Examples:
Enter the following command to create an ingestion-time partitioned table named `mytable` in `mydataset` in your default project. The partitioning type is set to `DAY`, the partition expiration is set to 259,200 seconds (3 days), the table's expiration is set to 2,592,000 seconds (30 days), the description is set to "This is my time partitioned table", and the label is set to `organization:development`. The command uses the `-t` shortcut instead of `--table`. The schema is specified inline as `qtr:STRING,sales:FLOAT,year:STRING`.
bq mk -t \
--expiration 2592000 \
--time_partitioning_type=DAY \
--time_partitioning_expiration 259200 \
--description "This is my time partitioned table" \
--label organization:development \
mydataset.mytable \
qtr:STRING,sales:FLOAT,year:STRING
Enter the following command to create a time partitioned table named `mytable` in `myotherproject`, not your default project. The partitioning type is set to `DAY`, the partition expiration is set to 259,200 seconds (3 days), the description is set to "This is my time partitioned table", and the label is set to `organization:development`. The command uses the `-t` shortcut instead of `--table`. This command does not specify a table expiration. If the dataset has a default table expiration, it is applied. If the dataset has no default table expiration, the table will never expire, but the partitions will expire in 3 days. The path to the schema file is `/tmp/myschema.json`.
bq mk -t \
--time_partitioning_type=DAY \
--time_partitioning_expiration 259200 \
--description "This is my time partitioned table" \
--label organization:development \
myotherproject:mydataset.mytable \
/tmp/myschema.json
After the table is created, you can update the ingestion-time partitioned table's table expiration, partition expiration, description, and labels.
API
Call the `tables.insert` method with a defined table resource that specifies the `timePartitioning` property and the `schema` property.
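As a sketch of what that request body looks like when you call the REST API directly, the following Python builds a minimal table resource. The field names follow the BigQuery REST representation (`tableReference`, `timePartitioning`, `schema`); the project, dataset, and schema values are illustrative assumptions:

```python
import json

def ingestion_partitioned_table_resource(project, dataset, table,
                                         fields, expiration_ms=None):
    """Build a tables.insert request body for an ingestion-time partitioned table."""
    time_partitioning = {"type": "DAY"}            # or HOUR, MONTH, YEAR
    if expiration_ms is not None:
        # expirationMs is serialized as a string in the REST representation.
        time_partitioning["expirationMs"] = str(expiration_ms)
    return {
        "tableReference": {"projectId": project, "datasetId": dataset, "tableId": table},
        "timePartitioning": time_partitioning,
        "schema": {"fields": fields},
    }

body = ingestion_partitioned_table_resource(
    "myproject", "mydataset", "mytable",
    fields=[{"name": "qtr", "type": "STRING", "mode": "NULLABLE"},
            {"name": "sales", "type": "FLOAT", "mode": "NULLABLE"}],
    expiration_ms=259_200_000,                     # 3-day partition expiration
)
print(json.dumps(body, indent=2))
```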
Creating an ingestion-time partitioned table from a query result
To create a partitioned table from a query result, write the results to a new destination table. You can create a new partitioned table by querying a partitioned table or a non-partitioned table. You cannot change an existing table to a partitioned table using query results.
For more information on querying time partitioned tables, see Querying partitioned tables.
Console
You cannot specify partitioning options for a destination table when you query data using the Cloud Console.
bq
Enter the `bq query` command, specify the `--destination_table` flag to create a permanent table based on the query results, and specify the `--time_partitioning_type` flag to create an ingestion-time partitioned destination table. `DAY`, `HOUR`, `MONTH`, and `YEAR` are the supported values for `--time_partitioning_type`, depending on the granularity of your partition.

Specify the `--use_legacy_sql=false` flag to use standard SQL syntax.

To write the query results to a table that is not in your default project, add the project ID to the dataset name in the following format: `project_id:dataset`.

(Optional) Supply the `--location` flag and set the value to your location.
Enter the following command to create a new, ingestion-time partitioned destination table from a query result:
bq --location=location query \
--destination_table project_id:dataset.table \
--time_partitioning_type=unit_time \
--use_legacy_sql=false \
'query'
Where:
- location is the name of your location. The `--location` flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to `asia-northeast1`. You can set a default value for the location using the .bigqueryrc file.
- project_id is your project ID.
- dataset is the name of the dataset that will contain the new time partitioned table.
- table is the name of the time partitioned table you're creating using the query results.
- unit_time is either `DAY`, `HOUR`, `MONTH`, or `YEAR`, depending on your partitioning type. This defaults to `DAY` when `time_partitioning_type` is unspecified.
- query is a query in standard SQL syntax.
Examples:
Enter the following command to write query results to an ingestion-time daily partitioned destination table named `mytable` in `mydataset`. The dataset is in your default project. The query retrieves data from a non-partitioned table, the USA Name Data public dataset.
bq query \
--destination_table mydataset.mytable \
--time_partitioning_type=DAY \
--use_legacy_sql=false \
'SELECT
name,
number
FROM
`bigquery-public-data`.usa_names.usa_1910_current
WHERE
gender = "M"
ORDER BY
number DESC'
Enter the following command to write query results to an ingestion-time partitioned destination table named `mytable` in `mydataset`, partitioned by the date of data ingestion. The dataset is in `myotherproject`, not your default project. The query retrieves data from a non-partitioned table, the USA Name Data public dataset.
bq query \
--destination_table myotherproject:mydataset.mytable \
--time_partitioning_type=DAY \
--use_legacy_sql=false \
'SELECT
name,
number
FROM
`bigquery-public-data`.usa_names.usa_1910_current
WHERE
gender = "M"
ORDER BY
number DESC'
API
To save query results to a permanent, ingestion-time partitioned table, call the `jobs.insert` method, configure a query job, and include a value for the `destinationTable` and `timePartitioning` properties.
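For direct REST calls, both properties sit under `configuration.query` in the job resource. A minimal Python sketch of that body (field names follow the REST representation; the query and table names are illustrative):

```python
def partitioned_query_job(project, dataset, table, query):
    """Build a jobs.insert request body for a query job that writes to an
    ingestion-time partitioned destination table."""
    return {
        "configuration": {
            "query": {
                "query": query,
                "useLegacySql": False,              # standard SQL
                "destinationTable": {
                    "projectId": project,
                    "datasetId": dataset,
                    "tableId": table,
                },
                "timePartitioning": {"type": "DAY"},
            }
        }
    }

job = partitioned_query_job(
    "myproject", "mydataset", "mytable",
    "SELECT name, number FROM `bigquery-public-data`.usa_names.usa_1910_current",
)
print(job["configuration"]["query"]["timePartitioning"])  # {'type': 'DAY'}
```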
Creating an ingestion-time partitioned table when loading data
You can create an ingestion-time partitioned table by specifying partitioning options when you load data into a new table. You do not need to create an empty partitioned table before loading data into it. You can create the partitioned table and load your data at the same time.
When you load data into BigQuery, you can supply the table schema, or for supported data formats, you can use schema auto-detection.
Partition decorators enable you to load data into a specific partition. To adjust for timezones, use a partition decorator to load data into a partition based on your preferred timezone. For example, if you are on Pacific Standard Time (PST), load all data that is generated on May 1, 2016 PST into the partition for that date by using the corresponding partition decorator:
table_name$20160501
For more information on loading data, see Introduction to loading data into BigQuery.
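The timezone adjustment described above boils down to formatting the event's local date into a `$YYYYMMDD` decorator. A minimal Python sketch, using a fixed UTC-8 offset as a stand-in for PST (real Pacific time handling would need a tz database):

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8))   # fixed offset; ignores daylight saving

def partition_decorator(table_name: str, event_time_utc: datetime) -> str:
    """Return table$YYYYMMDD for the partition matching the event's PST date."""
    local_date = event_time_utc.astimezone(PST).date()
    return f"{table_name}${local_date.strftime('%Y%m%d')}"

# An event at 03:00 UTC on May 2, 2016 is still May 1 in PST, so it
# belongs in the $20160501 partition:
event = datetime(2016, 5, 2, 3, 0, tzinfo=timezone.utc)
print(partition_decorator("mytable", event))   # mytable$20160501
```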
Converting date-sharded tables into ingestion-time partitioned tables
If you have previously created date-sharded tables, you can convert the entire set of related tables into a single ingestion-time partitioned table by using the `partition` command in the `bq` command-line tool. The date-sharded tables must use the following naming convention: `table_YYYYMMDD`. For example, `mytable_20160101`, ... , `mytable_20160331`.

Optional parameters include `--time_partitioning_expiration`, `--location`, and `--time_partitioning_type`. If unspecified, `--time_partitioning_type` defaults to `DAY`. If your source tables or your destination table are in a project other than your default project, add the project ID to the dataset name in the following format: `project_id:dataset`.
Enter the following command to convert a series of date-sharded tables to a single ingestion-time partitioned table, partitioned by date:
bq --location=location partition \
--time_partitioning_type=DAY \
--time_partitioning_expiration integer \
project_id:dataset.source_table_ \
project_id:dataset.destination_table
Where:
- location is the name of your location. The `--location` flag is optional. For example, if you are using BigQuery in the Tokyo region, set the flag's value to `asia-northeast1`. You can set a default value for the location using the .bigqueryrc file.
- integer is the default lifetime (in seconds) for the table's partitions. There is no minimum value. The expiration time evaluates to the partition's UTC date plus the integer value. The partition expiration is independent of the table's expiration but does not override it. If you set a partition expiration that is longer than the table's expiration, the table expiration takes precedence.
- project_id is your project ID.
- dataset is a dataset in your project.
- source_table is the prefix of your date-sharded tables.
- destination_table is the name of the partitioned table you're creating.
Examples:
Enter the following command to create an ingestion-time partitioned table named `mytable_partitioned` in `mydataset` in your default project. The table will be partitioned by date, with the partition expiration set to 259,200 seconds (3 days). The date-sharded source tables are prefixed with `sourcetable_`. The source tables are also in your default project.
bq partition \
--time_partitioning_type=DAY \
--time_partitioning_expiration 259200 \
mydataset.sourcetable_ \
mydataset.mytable_partitioned
When you run the command, a copy job is created that generates partitions from the sharded tables. For example, if the date-sharded tables were `sourcetable_20180126` and `sourcetable_20180127`, the copy job would create the following partitions: `mydataset.mytable_partitioned$20180126` and `mydataset.mytable_partitioned$20180127`.
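The shard-to-partition mapping the copy job performs can be sketched in Python: strip the `YYYYMMDD` suffix from each sharded table name and append it to the destination table as a partition decorator. This illustrates the naming relationship only; `bq partition` does the actual copying:

```python
import re

SHARD_SUFFIX = re.compile(r"_(\d{8})$")

def shard_to_partition(sharded_table: str, destination_table: str) -> str:
    """Map e.g. 'sourcetable_20180126' to 'mydataset.mytable_partitioned$20180126'."""
    match = SHARD_SUFFIX.search(sharded_table)
    if match is None:
        raise ValueError(f"{sharded_table!r} does not end in _YYYYMMDD")
    return f"{destination_table}${match.group(1)}"

for shard in ["sourcetable_20180126", "sourcetable_20180127"]:
    print(shard_to_partition(shard, "mydataset.mytable_partitioned"))
# mydataset.mytable_partitioned$20180126
# mydataset.mytable_partitioned$20180127
```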
Enter the following command to create an ingestion-time partitioned table named `mytable_partitioned` in `mydataset`. `mydataset` is in `myotherproject`, not your default project. The table will be partitioned by date, with the partition expiration set to 86,400 seconds (1 day). The date-sharded source tables are prefixed with `sourcetable_`. The source tables are in your default project.
bq partition \
--time_partitioning_type=DAY \
--time_partitioning_expiration 86400 \
mydataset.sourcetable_ \
myotherproject:mydataset.mytable_partitioned
Enter the following command to create an ingestion-time partitioned table named `mytable_partitioned` in `mydataset` in your default project. `mydataset` was created in the `asia-northeast1` region. The table will be partitioned by date, with the partition expiration set to 259,200 seconds (3 days). The date-sharded source tables are prefixed with `sourcetable_`. The source tables are also in your default project.
bq --location=asia-northeast1 partition \
--time_partitioning_type=DAY \
--time_partitioning_expiration 259200 \
mydataset.sourcetable_ \
mydataset.mytable_partitioned
Controlling access to time partitioned tables
To configure access to tables and views, you can grant an IAM role to an entity at the following levels, listed in order of range of resources allowed (largest to smallest):
- a high level in the Google Cloud resource hierarchy such as the project, folder, or organization level
- the dataset level
- the table/view level
Access with any resource protected by IAM is additive. For example, if an entity does not have access at a high level such as the project, you could grant the entity access at the dataset level, and then the entity has access to the tables and views in the dataset. Similarly, if the entity does not have access at the high level or the dataset level, you could grant the entity access at the table or view level.
Granting IAM roles at a higher level in the Google Cloud resource hierarchy such as the project, folder, or organization level gives the entity access to a broad set of resources. For example, granting a role to an entity at the project level gives that entity permissions that apply to all datasets throughout the project.
Granting a role at the dataset level specifies the operations an entity is allowed to perform on tables and views in that specific dataset, even if the entity does not have access at a higher level. For information on configuring dataset-level access controls, see Controlling access to datasets.
Granting a role at the table or view level specifies the operations an entity is allowed to perform on specific tables and views, even if the entity does not have access at a higher level. For information on configuring table-level access controls, see Controlling access to tables and views.
You can also create IAM custom roles. If you create a custom role, the permissions you grant depend on the specific operations you want the entity to be able to perform.
You can't set a "deny" permission on any resource protected by IAM.
For more information on roles and permissions, see:
- Understanding roles in the IAM documentation
- BigQuery Predefined roles and permissions
- Controlling access to datasets
- Controlling access to tables and views
- Restricting access with BigQuery Column-level security
Using ingestion-time partitioned tables
Getting information about ingestion-time partitioned tables
You can get information about tables by:
- Using the Cloud Console
- Using the `bq show` command in the `bq` command-line tool
- Calling the `tables.get` API method
- Using the client libraries
Required permissions
At a minimum, to get information about tables, you must be granted `bigquery.tables.get` permissions. The following predefined IAM roles include `bigquery.tables.get` permissions:
- `bigquery.metadataViewer`
- `bigquery.dataViewer`
- `bigquery.dataOwner`
- `bigquery.dataEditor`
- `bigquery.admin`

In addition, if a user has `bigquery.datasets.create` permissions, when that user creates a dataset, they are granted `bigquery.dataOwner` access to it. `bigquery.dataOwner` access gives the user the ability to get information about tables in a dataset.
For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Getting ingestion-time partitioned table information
To view information about an ingestion-time partitioned table:
Console
Open the BigQuery page in the Cloud Console.
In the Explorer panel, expand your project and dataset, then click the table name.
Click Details. This tab displays the table's description and table information.
Click the Schema tab to view the table's schema definition. Notice the additional pseudo column `_PARTITIONTIME` that contains the date-based timestamp for data that is loaded into the table.
bq
Issue the `bq show` command to display all table information. Use the `--schema` flag to display only table schema information. The `--format` flag can be used to control the output.
If you are getting information about a table in a project other than your default project, add the project ID to the dataset in the following format: project_id:dataset.
bq show --schema --format=prettyjson project_id:dataset.table
Where:
- project_id is your project ID.
- dataset is the name of the dataset.
- table is the name of the table.
Examples:
Enter the following command to display all information about `mytable` in `mydataset`. `mydataset` is in your default project.
bq show --format=prettyjson mydataset.mytable
Enter the following command to display all information about `mytable` in `mydataset`. `mydataset` is in `myotherproject`, not your default project.
bq show --format=prettyjson myotherproject:mydataset.mytable
Enter the following command to display only schema information about `mytable` in `mydataset`. `mydataset` is in `myotherproject`, not your default project.

When you show schema information for a time partitioned table, the `_PARTITIONTIME` pseudo column is not displayed.
bq show --schema --format=prettyjson myotherproject:mydataset.mytable
API
Call the `tables.get` method and provide any relevant parameters.
Listing ingestion-time partitioned tables in a dataset
You can list tables in datasets (including partitioned tables) by using the Cloud Console or the `bq ls` command in the `bq` command-line tool, or by calling the `tables.list` API method.
Required permissions
At a minimum, to list tables in a dataset, you must be granted `bigquery.tables.list` permissions. The following predefined IAM roles include `bigquery.tables.list` permissions:
- `bigquery.user`
- `bigquery.metadataViewer`
- `bigquery.dataViewer`
- `bigquery.dataOwner`
- `bigquery.dataEditor`
- `bigquery.admin`
For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Listing partitioned tables
To list the tables in a dataset (including partitioned tables):
Console
Open the BigQuery page in the Cloud Console.
In the Explorer panel, expand your project and select a dataset.
Scroll through the list to see the tables in the dataset. Tables, partitioned tables, models, and views are identified by different icons.
bq
Issue the `bq ls` command. The `--format` flag can be used to control the output. If you are listing tables in a project other than your default project, add the project ID to the dataset in the following format: `project_id:dataset`.
bq ls --format=pretty project_id:dataset
Where:
- project_id is your project ID.
- dataset is the name of the dataset.
When you run the command, the `Type` field displays either `TABLE` or `VIEW`. For ingestion-time partitioned tables, the `Time Partitioning` field displays `DAY` and the partition expiration time in milliseconds, if a partition expiration is specified.
For example:
+-------------------------+-------+----------------------+--------------------------------+
| tableId                 | Type  | Labels               | Time Partitioning              |
+-------------------------+-------+----------------------+--------------------------------+
| mytable                 | TABLE | department:shipping  | DAY (expirationMs: 259200000)  |
| myview                  | VIEW  |                      |                                |
+-------------------------+-------+----------------------+--------------------------------+
Examples:
Enter the following command to list tables in dataset `mydataset` in your default project.
bq ls --format=pretty mydataset
Enter the following command to list tables in dataset `mydataset` in `myotherproject`.
bq ls --format=pretty myotherproject:mydataset
API
To list tables using the API, call the `tables.list` method.
Listing partitions in ingestion-time partitioned tables
You can list the partitions in an ingestion-time partitioned table by querying the `_PARTITIONTIME` pseudo column in standard SQL or the `__PARTITIONS_SUMMARY__` meta table in legacy SQL.

You can run the query by using the Cloud Console, by using the `bq query` command in the `bq` command-line tool, or by calling the `jobs.insert` method and configuring a query job.
Required permissions
To run a query job that uses the `__PARTITIONS_SUMMARY__` meta table or the `_PARTITIONTIME` pseudo column, you must be granted `bigquery.jobs.create` permissions. The following predefined IAM roles include `bigquery.jobs.create` permissions:
- `bigquery.user`
- `bigquery.jobUser`
- `bigquery.admin`

You must also be granted `bigquery.tables.getData` permissions. The following predefined IAM roles include `bigquery.tables.getData` permissions:
- `bigquery.admin`
- `bigquery.dataEditor`
- `bigquery.dataOwner`
- `bigquery.dataViewer`
For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Listing partitions in an ingestion-time partitioned table
You can list partitions in an ingestion-time partitioned table using standard SQL (preferred) or legacy SQL. To list partitions:
Standard SQL:
Console
Open the BigQuery page in the Cloud Console.
Click the Compose new query button.
Enter the following text into the Query editor box to query the `_PARTITIONTIME` pseudo column:

SELECT _PARTITIONTIME as pt FROM `dataset.table` GROUP BY 1
Where:
- dataset is the dataset that contains the table.
- table is the name of the table.
(Optional) Click More and select Query settings.
Click Run.
bq
Enter the following query using the `bq query` command:

bq --location=location query --use_legacy_sql=false \
'SELECT _PARTITIONTIME as pt FROM `dataset.table` GROUP BY 1'
Where:
- location is the name of your location. The `--location` flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to `asia-northeast1`. You can set a default value for the location using the .bigqueryrc file.
- dataset is the dataset that contains the table.
- table is the name of the table.
API
Call the `jobs.insert` method and configure a query job that queries the table's `_PARTITIONTIME` pseudo column. Specify your location in the `location` property in the `jobReference` section of the job resource.
Legacy SQL:
Console
Open the BigQuery page in the Cloud Console.
Click the Compose new query button.
Enter the following text into the Query editor box to query the `__PARTITIONS_SUMMARY__` meta table:

#legacySQL
SELECT partition_id FROM [dataset.table$__PARTITIONS_SUMMARY__]
Where:
- dataset is the dataset that contains the table.
- table is the name of the table.
Click More and select Query settings.
For Processing location, click Auto-select and choose your data's location. You can leave the processing location set to auto-select if your data is in the `US` or `EU` multi-region location. When your data is in the `US` or the `EU`, the processing location is automatically detected.

Click Run.
bq
Enter the following query using the `bq query` command:

bq --location=location query --use_legacy_sql=true \
'SELECT partition_id FROM [dataset.table$__PARTITIONS_SUMMARY__]'
Where:
- location is the name of your location. The `--location` flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to `asia-northeast1`. You can set a default value for the location using the .bigqueryrc file.
- dataset is the dataset that contains the table.
- table is the name of the table.
API
Call the `jobs.insert` method and configure a query job that queries the `__PARTITIONS_SUMMARY__` meta table. Specify your location in the `location` property in the `jobReference` section of the job resource.
Getting table metadata using meta tables
You can get information about partitioned tables by using special tables called meta tables. Meta tables contain metadata such as the list of tables and views in a dataset. The meta tables are read only.
Getting partition metadata using meta tables
The `__PARTITIONS_SUMMARY__` meta table is a special table whose contents represent metadata about partitions in a time-partitioned table. The `__PARTITIONS_SUMMARY__` meta table is read-only.

To access metadata about the partitions in a time-partitioned table, use the `__PARTITIONS_SUMMARY__` meta-table in a query's `SELECT` statement. You can run the query by:
- Using the Cloud Console
- Using the `bq` command-line tool's `bq query` command
- Calling the `jobs.insert` API method and configuring a query job
- Using the client libraries
Currently, standard SQL does not support the partition decorator separator (`$`), so you cannot query `__PARTITIONS_SUMMARY__` in standard SQL. A legacy SQL query that uses the `__PARTITIONS_SUMMARY__` meta-table looks like the following:

#legacySQL
SELECT column FROM [dataset.table$__PARTITIONS_SUMMARY__]
Where:
- dataset is the name of your dataset.
- table is the name of the time-partitioned table.
- column is one of the following:

| Value | Description |
|---|---|
| `project_id` | Name of the project. |
| `dataset_id` | Name of the dataset. |
| `table_id` | Name of the time-partitioned table. |
| `partition_id` | Name (date) of the partition. |
| `creation_time` | The time at which the partition was created, in milliseconds since January 1, 1970 UTC. |
| `last_modified_time` | The time at which the partition was last modified, in milliseconds since January 1, 1970 UTC. |
Partition meta table permissions
At a minimum, to run a query job that uses the `__PARTITIONS_SUMMARY__` meta-table, you must be granted `bigquery.jobs.create` permissions. The following predefined IAM roles include `bigquery.jobs.create` permissions:
- `bigquery.user`
- `bigquery.jobUser`
- `bigquery.admin`

You must also be granted `bigquery.tables.getData` permissions. The following predefined IAM roles include `bigquery.tables.getData` permissions:
- `bigquery.dataViewer`
- `bigquery.dataEditor`
- `bigquery.dataOwner`
- `bigquery.admin`
For more information on IAM roles in BigQuery, see Access control.
Partition meta table examples
The following query retrieves all partition metadata for a time-partitioned
table named mydataset.mytable
.
Console
#legacySQL
SELECT * FROM [mydataset.mytable$__PARTITIONS_SUMMARY__]
bq
bq query --use_legacy_sql=true \
'SELECT * FROM [mydataset.mytable$__PARTITIONS_SUMMARY__]'
The output looks like the following:
+----------------+------------+----------------+--------------+---------------+--------------------+
| project_id     | dataset_id | table_id       | partition_id | creation_time | last_modified_time |
+----------------+------------+----------------+--------------+---------------+--------------------+
| myproject      | mydataset  | mytable        | 20160314     | 1517190224120 | 1517190224997      |
| myproject      | mydataset  | mytable        | 20160315     | 1517190224120 | 1517190224997      |
+----------------+------------+----------------+--------------+---------------+--------------------+
The following query lists the times when the partitions in mydataset.mytable were last modified.
Console

```sql
#legacySQL
SELECT partition_id, last_modified_time FROM [mydataset.mytable$__PARTITIONS_SUMMARY__]
```

bq

```shell
bq query --use_legacy_sql=true '
SELECT partition_id, last_modified_time FROM [mydataset.mytable$__PARTITIONS_SUMMARY__]'
```

The output looks like the following:

```
+--------------+--------------------+
| partition_id | last_modified_time |
+--------------+--------------------+
| 20160102     | 1471632556179      |
| 20160101     | 1471632538142      |
| 20160103     | 1471632570463      |
+--------------+--------------------+
```
To display the last_modified_time field in human-readable format, use the FORMAT_UTC_USEC function. For example:
Console

```sql
#legacySQL
SELECT partition_id, FORMAT_UTC_USEC(last_modified_time*1000) AS last_modified FROM [mydataset.mytable$__PARTITIONS_SUMMARY__]
```

bq

```shell
bq query --use_legacy_sql=true '
SELECT partition_id, FORMAT_UTC_USEC(last_modified_time*1000) AS last_modified FROM [mydataset.mytable$__PARTITIONS_SUMMARY__]'
```

The output looks like the following:

```
+--------------+----------------------------+
| partition_id | last_modified              |
+--------------+----------------------------+
| 20160103     | 2016-08-19 18:49:30.463000 |
| 20160102     | 2016-08-19 18:49:16.179000 |
| 20160101     | 2016-08-19 18:48:58.142000 |
+--------------+----------------------------+
```
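If you read the meta-table values from a script instead of a query, the same conversion is straightforward in the language of your choice. The sketch below is a minimal Python example (the ms_to_utc helper name is hypothetical, not part of any BigQuery API) that formats a millisecond epoch timestamp the way FORMAT_UTC_USEC does:

```python
from datetime import datetime, timedelta, timezone

# Unix epoch; creation_time and last_modified_time are offsets from it
# in milliseconds.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms_to_utc(ms: int) -> str:
    """Format a millisecond epoch timestamp as a human-readable UTC string.

    timedelta arithmetic on the integer value avoids the float rounding
    that datetime.fromtimestamp(ms / 1000) could introduce.
    """
    return (EPOCH + timedelta(milliseconds=ms)).strftime("%Y-%m-%d %H:%M:%S.%f")

# last_modified_time for partition 20160102 in the output above.
print(ms_to_utc(1471632556179))  # → 2016-08-19 18:49:16.179000
```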
Example: Creating an ingestion-time partitioned table using query results
In this example, you create an ingestion-time partitioned table by using the bq command-line tool, and you add data to three partitions using query results. The table contains weather data, partitioned by date, for the first three days of 2016.
This example queries the NOAA GSOD weather dataset for temperatures for the first three days of 2016 and writes the results to a partitioned table.
Step 1. Create an empty ingestion-time table, partitioned by date, named temps in a dataset named mydataset. There is no need to specify a schema for the table. The schema from the query results is used as the schema definition for the table.

```shell
bq mk --time_partitioning_type=DAY mydataset.temps
```

To view the configuration settings, use the bq show command:

```shell
bq show --format=prettyjson mydataset.temps
```

Look for the timePartitioning entry in the bq show command output:

```
{
  ...
  "timePartitioning": {
    "type": "DAY"
  },
  "type": "TABLE"
}
```
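Because --format=prettyjson emits JSON, a script can check the partitioning configuration programmatically. The following is a minimal sketch, assuming the bq show output has been captured into a string; the partition_type helper name is hypothetical, and the sample string is a trimmed copy of the output above:

```python
import json

def partition_type(show_json: str) -> str:
    """Return the timePartitioning type from bq show --format=prettyjson
    output, or "NONE" when the table is not time-partitioned."""
    table = json.loads(show_json)
    return table.get("timePartitioning", {}).get("type", "NONE")

# Trimmed sample of the bq show output above.
sample = '{"timePartitioning": {"type": "DAY"}, "type": "TABLE"}'
print(partition_type(sample))  # → DAY
```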
Step 2. Query the NOAA GSOD weather dataset for temperatures for the first three days of 2016 and write the results to the respective partitions in the temps table. The following queries use the --destination_table option and standard SQL syntax to write 100 rows of query results to a partition.

Run the following query to write January 1, 2016 temperatures to partition mydataset.temps$20160101:

```shell
bq --location=US query --use_legacy_sql=false \
--destination_table 'mydataset.temps$20160101' \
'SELECT stn, temp FROM `bigquery-public-data`.noaa_gsod.gsod2016 WHERE mo="01" AND da="01" LIMIT 100'
```

Run the following query to write January 2, 2016 temperatures to partition mydataset.temps$20160102:

```shell
bq --location=US query --use_legacy_sql=false \
--destination_table 'mydataset.temps$20160102' \
'SELECT stn, temp FROM `bigquery-public-data`.noaa_gsod.gsod2016 WHERE mo="01" AND da="02" LIMIT 100'
```

Run the following query to write January 3, 2016 temperatures to partition mydataset.temps$20160103:

```shell
bq --location=US query --use_legacy_sql=false \
--destination_table 'mydataset.temps$20160103' \
'SELECT stn, temp FROM `bigquery-public-data`.noaa_gsod.gsod2016 WHERE mo="01" AND da="03" LIMIT 100'
```
Step 3. Confirm that you have 300 rows in your table by using the bq show command.

```shell
bq show mydataset.temps
```

The results show the Schema and Total Rows.

```
   Last modified         Schema         Total Rows   Total Bytes     Expiration      Time Partitioning   Labels   kmsKeyName
 ----------------- ---------------- ------------ ------------- ----------------- ------------------- -------- ------------
  28 Jan 15:03:45   |- stn: string    300          4800          29 Jan 15:00:32   DAY
                    |- temp: float
```
Next steps
- For an overview of partitioned table support in BigQuery, see Introduction to partitioned tables.
- To learn how to manage and update partitioned tables, see Managing partitioned tables.
- For information on querying partitioned tables, see Querying partitioned tables.
- For information on column-based time partitioned tables, see Creating and using column-based time partitioned tables.
- For information on integer-range partitioned tables, see Creating and using integer-range partitioned tables.