Partitioned Tables

A partitioned table is a special table that is divided into segments, called partitions, that make it easier to manage and query your data. By dividing a large table into smaller partitions, you can improve query performance and reduce the number of bytes that are billed by restricting the amount of data that is scanned. BigQuery offers date-partitioned tables, which means that the table is divided into a separate partition for each date.

Schema of partitions

A partition does not have a schema by itself. All partitions in a partitioned table share the schema of the table. If you need to load data into a partition with a schema that is not the same as the schema of the table, you need to update the schema of the table before loading the data. Alternatively, you can use schema update options to update the schema of the table as part of a load or query job. See configuration.load.schemaUpdateOptions.

The _PARTITIONTIME pseudo column

Partitioned tables include a pseudo column named _PARTITIONTIME that contains a date-based timestamp for data that is loaded into the table. The timestamp is based on UTC time and represents the number of microseconds since the unix epoch. For example, if data is appended to a table on April 15, 2016, all of the rows of data that are appended on that day contain the value TIMESTAMP("2016-04-15") in the _PARTITIONTIME column.

The _PARTITIONTIME column name is reserved, which means that you cannot create a column by that name in any of your tables.

To select the value of _PARTITIONTIME, you must use an alias. For example, the following query selects _PARTITIONTIME by assigning the alias pt to the pseudo column:

SELECT
  _PARTITIONTIME AS pt,
  field1
FROM
  mydataset.table1

Data in the streaming buffer has a NULL value for the _PARTITIONTIME column.

The __UNPARTITIONED__ partition

The __UNPARTITIONED__ partition temporarily holds data that is streamed to a partitioned table while it is in the streaming buffer. Data that is streamed directly to a specific partition of a partitioned table does not use the __UNPARTITIONED__ partition. Instead, the data is streamed directly to the partition. See streaming into partitioned tables.

To query data in the __UNPARTITIONED__ partition, use the _PARTITIONTIME pseudo column with the NULL value.

SELECT
  field1
FROM
  [DATASET_ID].[PARTITIONED_TABLE]
WHERE
  _PARTITIONTIME IS NULL

Where [DATASET_ID] is the name of the dataset that stores the table and [PARTITIONED_TABLE] is the name of the partitioned table.

Addressing table partitions

With legacy SQL, you can use partition decorators to reference data from a specific partition. Partition decorators take the form:

[TABLE_NAME]$YYYYMMDD

where [TABLE_NAME] is the name of a partitioned table, and YYYYMMDD represents a date. For example, the partition for May 19, 2016 in a table named mydataset.table can be referenced using:

mydataset.table$20160519

The partition decorator separator ($) is a special variable in the unix shell. You might have to escape the decorator when you use the command-line tool. For example, the following commands escape the partition decorator:

bq query 'SELECT * from mydataset.table$20160519'
bq query "SELECT * from mydataset.table\$20160519"

BigQuery treats a table name with a partition decorator as a table, which means that a table name with a partition decorator can be used as a table in any api that receives a table name.

Partition expiration

Partition expiration is based on the elapsed time since the partition date. Partitions are deleted a set amount of time after the partition date, and the expiration happens regardless of when the partition was updated.

For example, if the partition expiration is set to 60 days and data is written to partition 20170101 on January 30, 2017, then the data expires on January 1, 2017 + 60 days.

For projects that have partitioned tables that were created before December 13, 2016, the partition expiration is based on the last date that the partition was modified. This behavior also applies to new tables created in these projects. To migrate your project to the new behavior, open a request in the BigQuery issue tracker.

For information on setting the expiration date, see Setting the expiration of a partition.

Partitioned table limits

Each partitioned table can have up to 2,500 partitions.

Daily limit: 2,000 partition updates per table, per day.

Rate limit: 50 partition updates every 10 seconds.

Long-term storage pricing

Each partition of a partitioned table is considered separately for long-term storage pricing. If a partition hasn't been modified in the last 90 days, the data in that partition is considered long term storage and is charged at the discounted price.

For more information on partitioned table pricing, see long term storage pricing.

What's next

Monitor your resources on the go

Get the Google Cloud Console app to help you manage your projects.

Send feedback about...