Querying Partitioned Tables

Before you begin

Create a partitioned table. The _PARTITIONTIME pseudo column is available only in tables that are created as partitioned tables.

Listing partitions in a table

legacy SQL

To list partitions in a table, query the table's summary partition by using the partition decorator separator ($) followed by __PARTITIONS_SUMMARY__. For example, the following command retrieves the partition IDs for table1:

 #legacySQL
 SELECT
   partition_id
 FROM
   [mydataset.table1$__PARTITIONS_SUMMARY__]

standard SQL

To list partitions in a table, use the _PARTITIONTIME pseudo column.

#standardSQL
SELECT
  _PARTITIONTIME as pt
FROM
  `[DATASET].[TABLE]`
GROUP BY 1

Where:

  • [DATASET] is a dataset in your project.
  • [TABLE] is the name of the table you're creating.

Checking when a partition was last modified

To list the times when the partitions in a table were last modified, use legacy SQL to select the last_modified_time field from the table's summary partition. Standard SQL does not support the partition decorator separator ($). For example, the following command lists the partition ID and last modified time for all partitions in table1 that is in a dataset named mydataset:

#legacySQL
SELECT
  partition_id,
  last_modified_time
FROM
  [mydataset.table1$__PARTITIONS_SUMMARY__]

Sample output:

+--------------+--------------------+
| partition_id | last_modified_time |
+--------------+--------------------+
| 20160102     |      1471632556179 |
| 20160101     |      1471632538142 |
| 20160103     |      1471632570463 |
+--------------+--------------------+

To display the last_modified_time field in human-readable format, use the FORMAT_UTC_USEC function:

#legacySQL
SELECT
  partition_id,
  FORMAT_UTC_USEC(last_modified_time*1000) AS last_modified
FROM
  [mydataset.table1$__PARTITIONS_SUMMARY__]

Sample output:

+--------------+----------------------------+
| partition_id |       last_modified        |
+--------------+----------------------------+
| 20160103     | 2016-08-19 18:49:30.463000 |
| 20160102     | 2016-08-19 18:49:16.179000 |
| 20160101     | 2016-08-19 18:48:58.142000 |
+--------------+----------------------------+

Limiting the number of partitions queried

Use the _PARTITIONTIME pseudo column to limit the number of partitions scanned during a query. For example, the following query scans only the partitions between the dates January 1, 2016 and January 2, 2016 from the partitioned table named temps:

# Query works in legacy and standard SQL
SELECT
  temp
FROM
  mydataset.temps
WHERE
  _PARTITIONTIME BETWEEN TIMESTAMP('2016-01-01')
  AND TIMESTAMP('2016-01-02');

In legacy SQL, to use the _PARTITIONTIME filter effectively to limit the number of scanned partitions, use a subquery on table1 with the filter in the subquery:

#legacySQL
# Scans only the specified partition
SELECT
  t1.field1,
  t2.field1
FROM (
  SELECT
    field1,
    field2
  FROM
    mydataset.table1
  WHERE
    _PARTITIONTIME = TIMESTAMP('2016-03-28')) t1
CROSS JOIN
  mydataset.table2 t2
WHERE
  t1.field2 = "one"

The following query limits some partitions that are based on part of the filter condition, _PARTITIONTIME BETWEEN TIMESTAMP('2017-01-01') AND TIMESTAMP('2017-03-01'). It does not limit it based on the condition that involves the subquery.

#legacySQL
# Scans partitions of table2 for dates between January 1, 2017 and March 1, 2017
SELECT
  field1
FROM
  mydataset.table2
WHERE
  _PARTITIONTIME BETWEEN TIMESTAMP('2017-01-01') AND TIMESTAMP('2017-03-01')
  AND _PARTITIONTIME = (SELECT MAX(timestamp) from mydataset.table1)

Scanning all partitions

The following examples use the _PARTITIONTIME pseudo column and scan all the partitions in a table.

In legacy SQL, the _PARTITIONTIME filter works only when the filter is specified as closely as possible to the table name. For example, the following query scans all partitions in table1 despite the presence of the _PARTITIONTIME filter:

#legacySQL
# Scans all partitions on t1
SELECT
  t1.field1,
  t2.field1
FROM
  mydataset.table2 t1
CROSS JOIN
  mydataset.table2 t2
WHERE
  t1._PARTITIONTIME = TIMESTAMP('2016-03-28')
  AND t1.field2 = "one"

Do not include any other columns in the _PARTITIONTIME filter. For example, the following query does not limit the partitions that are scanned because field2 is a column in the table and BigQuery can't determine in advance which partitions to select.

# Scans all partitions of table2
SELECT
  field1
FROM
  mydataset.table2
WHERE
  _PARTITIONTIME + field2 = TIMESTAMP('2016-03-28');

Filters on _PARTITIONTIME that include subqueries can't be used to limit the number of partitions scanned for a partitioned table. For example, the following query does not limit the partitions scanned on table mydataset.table2

# Scans all partitions of table2
SELECT
  field1
FROM
  mydataset.table2
WHERE
  _PARTITIONTIME = (SELECT MAX(timestamp) FROM mydataset.table1)

Next steps

Monitor your resources on the go

Get the Google Cloud Console app to help you manage your projects.

Send feedback about...