This topic describes how to write a commit timestamp for each insert and update operation that you perform with Spanner.
Overview of commit timestamps
The commit timestamp, based on TrueTime technology, is the time when a transaction is committed in the database. You can atomically store the commit timestamp of a transaction into a column. Using the commit timestamps stored in tables, you can determine the exact ordering of mutations and build features like changelogs.
To insert commit timestamps in your database, complete the following steps:
Create a column of type
SPANNER.COMMIT_TIMESTAMP
. For example:CREATE TABLE Performances ( ... LastUpdateTime SPANNER.COMMIT_TIMESTAMP NOT NULL, ... PRIMARY KEY (...) ) ;
If you are performing inserts or updates with DML, use the
SPANNER.PENDING_COMMIT_TIMESTAMP()
function to write the commit timestamp.If you are performing inserts or updates with prepared statements or mutations, use the placeholder string
SPANNER.COMMIT_TIMESTAMP()
for your commit timestamp column. You can also use the commit timestamp constant provided by the client library. For example, this constant in the Java client isValue.COMMIT_TIMESTAMP
.
When Spanner commits the transaction by using these placeholders as column values, the actual commit timestamp is written to the specified column. You can then use this column value to create a history of updates to the table.
Commit timestamp values are not guaranteed to be unique. Transactions that write to non-overlapping sets of fields might have the same timestamp. Transactions that write to overlapping sets of fields have unique timestamps.
Spanner commit timestamps have microsecond granularity,
and they are converted to nanoseconds when stored in SPANNER.COMMIT_TIMESTAMP
columns.
Keys and indexes
You can use a commit timestamp column as a primary key column or as a non-key
column. Primary keys can be defined as ASC
or DESC
.
ASC
(default) - Ascending keys are ideal for answering queries from a specific time forward.DESC
- Descending keys keep the latest rows at the top of the table. They provide quick access to the latest records.
Avoid hotspots
Using commit timestamps under the following scenarios creates hotspots, which reduce data performance:
Commit timestamp column as the first part of the primary key of a table.
CREATE TABLE Users ( LastAccess SPANNER.COMMIT_TIMESTAMP NOT NULL, UserId bigint NOT NULL, ... PRIMARY KEY (LastAccess, UserId) ) ;
Commit timestamp primary key column as the first part of a secondary index.
CREATE INDEX UsersByLastAccess ON Users(LastAccess)
or
CREATE INDEX UsersByLastAccessAndName ON Users(LastAccess, FirstName)
Hotspots reduce data performance, even with low write rates. There is no performance overhead if commit timestamps are enabled on non-key columns that are not indexed.
Add a commit timestamp column to an existing table
To add a commit timestamp column to an existing table, use the ALTER TABLE
statement. For example to add a LastUpdateTime
column to the Performances
table, use the following statement:
ALTER TABLE Performances ADD COLUMN LastUpdateTime SPANNER.COMMIT_TIMESTAMP;
Write a commit timestamp using a DML statement
You use the SPANNER.PENDING_COMMIT_TIMESTAMP()
function to write the commit
timestamp in a DML statement. Spanner selects the commit timestamp when the transaction
commits.
The following DML statement updates the LastUpdateTime
column in the
Performances
table with the commit timestamp:
UPDATE Performances SET LastUpdateTime = SPANNER.PENDING_COMMIT_TIMESTAMP()
WHERE SingerId=1 AND VenueId=2 AND EventDate="2015-10-21"
Insert a row using a mutation
When inserting a row, Spanner writes the commit timestamp value only
if you include the column in the column list and pass the
spanner.commit_timestamp()
placeholder string (or client library constant)
as its value. For example:
C++
C#
Go
Java
Node.js
PHP
Python
Ruby
If you have mutations on rows in multiple tables, you must specify
spanner.commit_timestamp()
(or client library constant) for the commit
timestamp column in each table.
Update a row using a mutation
When updating a row, Spanner writes the commit timestamp value only
if you include the column in the column list and pass the
spanner.commit_timestamp()
placeholder string (or client library constant)
as its value. You cannot update the primary key of a row. To update
the primary key, delete the existing row and create a new row.
For example, to update a commit timestamp column named LastUpdateTime
:
C++
C#
Go
Java
Node.js
PHP
Python
Ruby
If you have mutations on rows in multiple tables, you must specify
spanner.commit_timestamp()
(or the client library
constant) for the commit timestamp column in each table.
Query a commit timestamp column
The following example queries the commit timestamp column of the table.
C++
C#
Go
Java
Node.js
PHP
Python
Ruby
Provide your own value for the commit timestamp column
In your code, you can provide your own value for the commit timestamp column
instead of passing spanner.commit_timestamp()
(or the available client library
constant) as the
column value. The value must be a timestamp in the past. This restriction
ensures that writing timestamps is an inexpensive and fast operation. An
easy way to confirm that a value is in the past
is to compare it to the value returned by the CURRENT_TIMESTAMP
SQL function.
The server returns a FailedPrecondition
error if a future
timestamp is specified.
Create a changelog
Suppose that you want to create a changelog of every mutation that happens to a table and then use that changelog for auditing. An example would be a table that stores the history of changes to word processing documents. The commit timestamp makes creating the changelog easier, because the timestamps can enforce ordering of the changelog entries. You could build a changelog that stores the history of changes to a given document using a schema like the following example:
CREATE TABLE Documents (
UserId int8 NOT NULL,
DocumentId int8 NOT NULL,
Contents text NOT NULL,
PRIMARY KEY (UserId, DocumentId)
);
CREATE TABLE DocumentHistory (
UserId int8 NOT NULL,
DocumentId int8 NOT NULL,
Ts SPANNER.COMMIT_TIMESTAMP NOT NULL,
Delta text,
PRIMARY KEY (UserId, DocumentId, Ts)
) INTERLEAVE IN PARENT Documents;
To create a changelog, insert a new row in DocumentHistory
in the same
transaction in which you insert or update a row in Document
. In the insertion
of the new row in DocumentHistory
, use the placeholder
spanner.commit_timestamp()
(or client library constant) to tell
Spanner to write the commit timestamp into column Ts
. Interleaving
the DocumentsHistory
table with the Documents
table will allow for data
locality and more efficient inserts and updates. However, it also adds the
constraint that the parent and child rows
must be deleted together. To keep the rows in DocumentHistory
after rows
in Documents
are deleted, do not interleave the tables.
Optimize recent-data queries with commit timestamps
Commit timestamps enable a Spanner optimization that can reduce query I/O when retrieving data written after a particular time.
To activate this optimization, a query's WHERE
clause must include a
comparison between a table's commit timestamp column and a specific time
that you provide, with the following attributes:
Provide the specific time as a constant expression: a literal, a parameter, or a function whose own arguments evaluate to constants.
Compare whether the commit timestamp is more recent than the given time, through either the
>
or>=
operators.Optionally, add further restrictions to the
WHERE
clause withAND
. Extending the clause withOR
disqualifies the query from this optimization.
For example, consider the following Performances
table, which includes
a commit timestamp column:
CREATE TABLE Performances (
SingerId bigint NOT NULL,
VenueId bigint NOT NULL,
EventDate timestamp with time zone NOT NULL,
Revenue bigint,
LastUpdateTime spanner.commit_timestamp,
PRIMARY KEY(SingerId, VenueId, EventDate)
);
This query benefits from the commit-timestamp optimization described earlier, because it has a greater-than-or-equal-to comparison between the table's commit timestamp column and a constant expression—in this case, a literal:
SELECT * FROM Performances WHERE LastUpdateTime >= '2022-01-01';