This page describes best practices for using Data Manipulation Language (DML) and Partitioned DML.
Use a WHERE clause to reduce the scope of locks
You execute DML statements inside read-write transactions. When Cloud Spanner reads data, it
acquires shared read locks on limited portions of the row ranges that you read. Specifically, it
acquires these locks only on the columns you access. The locks can include data that does not
match the filter condition of the
When Cloud Spanner modifies data using DML statements, it acquires exclusive locks on the specific data that you are modifying. In addition, it acquires shared locks in the same way as when you read data. If your request includes large row ranges, or an entire table, the shared locks might prevent other transactions from completing in parallel.
To modify data as efficiently as possible, use a
WHERE clause that enables
Cloud Spanner to read only the necessary rows. You can achieve this goal with a filter on the
primary key, or on the key of a secondary index. The
WHERE clause limits the scope of
the shared locks and enables Cloud Spanner to process the update more efficiently.
For example, suppose that one of the musicians in the
Singers table changes their
first name, and you need to update the name in your database. You could execute the following DML
statement, but it forces Cloud Spanner to scan the entire table and acquires shared locks that
cover the entire table. As a result, Cloud Spanner must read more data than necessary, and
concurrent transactions cannot modify the data in parallel:
-- ANTI-PATTERN: SENDING AN UPDATE WITHOUT THE PRIMARY KEY COLUMN -- IN THE WHERE CLAUSE UPDATE Singers SET FirstName = "Marcel" WHERE FirstName = "Marc" AND LastName = "Richards";
To make the update more efficient, include the
SingerId column in the
WHERE clause. The
SingerId column is the only primary key column for
-- RECOMMENDED: INCLUDING THE PRIMARY KEY COLUMN IN THE WHERE CLAUSE UPDATE Singers SET FirstName = "Marcel" WHERE FirstName = "Marc" AND LastName = "Richards" AND SingerId = 1;
Don't use DML statements and mutations in the same transaction
Cloud Spanner buffers insertions, updates, and deletions performed using DML statements on the server-side, and the results are visible to subsequent SQL and DML statements within the same transaction. This behavior is different from the Mutation API, where Cloud Spanner buffers the mutations on the client-side and sends the mutations server-side as part of the commit operation. As a result, mutations in the commit request are not visible to SQL or DML statements within the same transaction.
You might want to combine DML statements and mutations in the same
transaction, because some operations are only supported in the Mutation API. An
If a transaction contains both DML statements and mutations in the commit
request, Cloud Spanner executes the DML statements before the mutations. To
avoid having to account for the order of execution in your client library code,
you should use either DML statements or the mutations in a single
transaction, but not both. If you use both, you should buffer writes only at the
very end of the transaction.
Use the PENDING_COMMIT_TIMESTAMP function to write commit timestamps
You use the
PENDING_COMMIT_TIMESTAMP function to write the commit
timestamp in a DML statement. Cloud Spanner selects the commit timestamp when the transaction
Partitioned DML and date and timestamp functions
Use Batch DML to send multiple DMLs in a single request.
Use Batch DML to send multiple DMLs in a single request. This way, only one client-server round trip is needed, reducing latency.
Use a single DML statement with an array of data to insert or delete multiple rows of the same table
Consider the following scenarios. You want to
INSERT 1500 rows into the same
table in a single transaction. Or perhaps you want to
DELETE hundreds of rows
of a table in a single transaction. In both scenarios, you are working with a
single table, a single DML statement type and a single transaction.
In this case, make the change as a single DML statement with an array of data for better performance.
-- Inserting rows INSERT INTO Singers (SingerId, FirstName, LastName) SELECT * FROM UNNEST ([(4, 'Lea', 'Martin'), (5, 'David', 'Lomond'), (6, 'Elena', 'Campbell')]); -- Deleting rows DELETE FROM Albums WHERE (SingerId, AlbumId) IN UNNEST ([(6,8),(3,7)]);
For even better performance, define and use a parameter for the array data.
-- Inserting rows using a parameter -- @insertarray parameter = [(4, 'Lea', 'Martin'),(5, 'David', 'Lomond'),(6, 'Elena', 'Campbell')] INSERT INTO Singers (SingerId, FirstName, LastName) SELECT * FROM UNNEST ( @insertarray ); -- Deleting rows using a parameter -- @deletearray parameter = [(6,8),(3,7)] DELETE FROM Albums WHERE (SingerId, AlbumId) IN UNNEST ( @deletearray );