This page provides an overview of Data Manipulation Language (DML) support for partitioned tables.
For more information on DML, see:
Tables used in examples
The following JSON schema definitions represent the tables used in the examples on this page.
mytable
: an ingestion-time partitioned table
[ {"name": "field1", "type": "INTEGER"}, {"name": "field2", "type": "STRING"} ]
mytable2
: a standard (non-partitioned) table
[ {"name": "id", "type": "INTEGER"}, {"name": "ts", "type": "TIMESTAMP"} ]
mycolumntable
: a partitioned table
that is partitioned by using the ts
TIMESTAMP
column
[ {"name": "field1", "type": "INTEGER"}, {"name": "field2", "type": "STRING"} {"name": "field3", "type": "BOOLEAN"} {"name": "ts", "type": "TIMESTAMP"} ]
Inserting data
You use a DML INSERT
statement
to add rows to a partitioned table.
Inserting data into ingestion-time partitioned tables
When you use a DML statement to add rows to an ingestion-time partitioned table,
you can specify the partition to which the rows should be added. You reference
the partition using the _PARTITIONTIME
pseudo column.
For example, the following INSERT
statement adds a row to the May 1, 2017
partition of mytable
— “2017-05-01”
.
INSERT INTO project_id.dataset.mytable (_PARTITIONTIME, field1, field2) SELECT TIMESTAMP("2017-05-01"), 1, "one"
Only timestamps that correspond to exact date boundaries can be used. For example, the following DML statement returns an error:
INSERT INTO project_id.dataset.mytable (_PARTITIONTIME, field1, field2) SELECT TIMESTAMP("2017-05-01 21:30:00"), 1, "one"
Inserting data into partitioned tables
Inserting data into a partitioned table using DML is the same as inserting data into a non-partitioned table.
For example, the following INSERT
statement adds rows to partitioned table
mycolumntable
by selecting data from mytable2
(a non-partitioned table).
INSERT INTO project_id.dataset.mycolumntable (ts, field1) SELECT ts, id FROM project_id.dataset.mytable2
Deleting data
You use a DML DELETE
statement
to delete rows from a partitioned table.
Deleting data in ingestion-time partitioned tables
The following DELETE
statement deletes all rows from the June 1, 2017
partition ("2017-06-01"
) of mytable
where field1
is equal to 21
. You
reference the partition using the _PARTITIONTIME
pseudo column.
DELETE project_id.dataset.mytable WHERE field1 = 21 AND _PARTITIONTIME = "2017-06-01"
Deleting data in partitioned tables
Deleting data in a partitioned table using DML is the same as deleting data from a non-partitioned table.
For example, the following DELETE
statement deletes all rows from the
June 1, 2017 partition ("2017-06-01"
) of mycolumntable
where field1
is
equal to 21
.
DELETE project_id.dataset.mycolumntable WHERE field1 = 21 AND DATE(ts) = "2017-06-01"
Updating data
You use an UPDATE
statement
to update rows in a partitioned table.
Updating data in ingestion-time partitioned tables
The following UPDATE
statement moves rows from one partition to another.
Rows in the May 1, 2017 partition (“2017-05-01”
) of mytable
where field1
is equal to 21
are moved to the June 1, 2017 partition (“2017-06-01”
).
UPDATE project_id.dataset.mytable SET _PARTITIONTIME = "2017-06-01" WHERE _PARTITIONTIME = "2017-05-01" AND field1 = 21
Updating data in partitioned tables
Updating data in a partitioned table using DML is the same as updating data
from a non-partitioned table. For example, the following UPDATE
statement moves rows from one partition to another. Rows in the May 1, 2017
partition (“2017-05-01”
) of mytable
where field1
is equal to 21
are
moved to the June 1, 2017 partition (“2017-06-01”
).
UPDATE project_id.dataset.mycolumntable SET ts = "2017-06-01" WHERE DATE(ts) = "2017-05-01" AND field1 = 21
Using a MERGE
statement
You use a DML MERGE
statement
to combine INSERT
, UPDATE
, and DELETE
operations for a partitioned table
into one statement and perform them atomically.
Pruning partitions when using a MERGE
statement
When you run a MERGE
statement against a partitioned table, you can limit the
partitions involved in the statement by using the _PARTITIONTIME
pseudo column
(for ingestion-time partitioned tables) or by using the date or timestamp column
(for partitioned tables). Pruning partitions reduces cost and improves query
performance.
You can use partition pruning conditions in the following places: in a subquery
filter, a search_condition
filter, or a merge_condition
filter.
Each of the examples below queries an ingestion-time partitioned table using
the _PARTITIONTIME
pseudo column.
Using a subquery to filter source data
You can use a filter in a subquery to prune partitions. For example, in the
following MERGE
statement, only the rows in the '2018-01-01'
partition in
the source table are scanned.
MERGE dataset.target T USING (SELECT * FROM dataset.source WHERE _PARTITIONTIME = '2018-01-01') S ON T.c1 = S.c1 WHEN MATCHED THEN DELETE
Using a filter in the search_condition
of a when_clause
The query optimizer attempts to use a filter in a search_condition
to prune
partitions. For example, in the following MERGE
statement, only the rows in
the following partitions are scanned in the target table: '2018-01-01'
,
'2018-01-02'
, and '2018-01-03'
.
MERGE dataset.target T USING dataset.source S ON T.c1 = S.c1 WHEN MATCHED AND T._PARTITIONTIME = '2018-01-01' THEN UPDATE SET c1 = S.c1 WHEN MATCHED AND T._PARTITIONTIME = '2018-01-02' THEN UPDATE SET c1 = c1 + 10 WHEN NOT MATCHED BY SOURCE AND T._PARTITIONTIME = '2018-01-03' THEN DELETE
In the following example, the WHEN NOT MATCHED BY SOURCE
clause needs all data
from the target table. As a result, all partitions are scanned, and you are
charged for the bytes read in all partitions.
MERGE dataset.target T USING dataset.source S ON T.c1 = S.c1 WHEN MATCHED AND T._PARTITIONTIME = '2018-01-01' THEN UPDATE SET c1 = S.c1 WHEN NOT MATCHED BY SOURCE THEN UPDATE SET c1 = c1 + 1
In general, when you use the WHEN NOT MATCHED
and WHEN NOT MATCHED BY SOURCE
clauses together, BigQuery assumes a FULL OUTER JOIN
between the
source and target tables. Normally, partitions cannot be pruned in a FULL OUTER
JOIN
. However, if a constant false predicate
is used, the filter condition can be used for partition pruning. The following
query uses partition pruning to scan only the '2018-01-01'
partition in both
the target and source tables.
MERGE dataset.target T USING dataset.source S ON FALSE WHEN NOT MATCHED AND _PARTITIONTIME = '2018-01-01' THEN INSERT(c1) VALUES(c1) WHEN NOT MATCHED BY SOURCE AND _PARTITIONTIME = '2018-01-01' THEN DELETE
Using a filter in a merge_condition
The query optimizer attempts to use a filter in a merge_condition
to prune
partitions. For example, following query will only scan the '2018-01-01'
partition in both the target and source tables.
MERGE dataset.target T USING dataset.source S ON T.c1 = S.c1 AND T._PARTITIONTIME = '2018-01-01' AND S._PARTITIONTIME = '2018-01-01' WHEN MATCHED THEN UPDATE SET c1 = S.c1
In this example, the merge_condition
is used as a predicate to join the source
and target tables. The query optimizer may or may not be able to use predicate
push down (depending on the type of join).
In the following example, the MERGE
statement does not allow partition pruning
because the partition filter is a predicate in a join condition that cannot be
applied directly on the table.
MERGE dataset.target T USING dataset.source S ON T.c1 = S.c1 AND T._PARTITIONTIME = '2018-01-01' WHEN NOT MATCHED BY SOURCE THEN UPDATE SET c1 = S.c1
Limitations
For information on DML limitations, see Limitations on the Data Manipulation Language page.
Quotas
For information on DML quota information, see DML statements on the Quotas and limits page.
Pricing
For information on DML pricing, see DML pricing for partitioned tables.
Next steps
- Learn how to create and use ingestion-time partitioned tables
- Learn how to create and use partitioned tables
- Learn how to query partitioned tables
- Get an introduction to DML
- Learn how to compose DML statements using DML syntax