Manage data retention with TTL

This page discusses how to use time to live (TTL) on Spanner tables. To learn more, see About TTL.

Before you begin

Before you begin, follow these best practices.

Enable backup and point-in-time recovery

Before adding TTL to your table, we recommend enabling Spanner backup and restore. This lets you fully restore a database in case you accidentally delete your data with the TTL policy.

If you've enabled point-in-time recovery, you can view and restore deleted data—without a full restore from backup—if it is within the configured version retention period. For information on reading data in the past, see Perform a stale read.

Clean up old data

If this is the first time you're using TTL and you expect the first run to delete many rows, consider first cleaning up old data manually using partitioned DML. This gives you more control over the resource usage, instead of leaving it to the TTL background process. TTL runs at a low priority, ideal for incremental clean-up. However, this will likely lengthen the time it takes to delete the initial set of rows in a busy database because Spanner's internal work scheduler will prioritize other work, such as user queries.

Verify your conditions

For GoogleSQL tables, if you want to verify the data that the row deletion policy will affect before enabling TTL, you can query your table using the same conditions. For example:

GoogleSQL

  SELECT COUNT(*)
  FROM CalculatedRoutes
  WHERE TIMESTAMP_ADD(CreatedAt, INTERVAL 30 DAY) < CURRENT_TIMESTAMP();

Required permissions

To change the database's schema, you must have the spanner.databases.updateDdl permission. For details, see Access control for Spanner.

Create a row deletion policy

GoogleSQL

To create a row deletion policy using GoogleSQL, you can define a ROW DELETION POLICY clause when you create a new table, or add a policy to an existing table. This clause contains an expression of a column and an interval.

To add a policy at the time of table creation:

CREATE TABLE MyTable(
Key INT64,
CreatedAt TIMESTAMP,
) PRIMARY KEY (Key),
ROW DELETION POLICY (OLDER_THAN(timestamp_column, INTERVAL num_days DAY));

Where:

  • timestamp_column must be an existing column with type TIMESTAMP. Columns with commit timestamps are valid, as are generated columns. However, you cannot specify a generated column that references a commit timestamp column.

  • num_days is the number of days past the timestamp in the timestamp_column in which the row is marked for deletion. The value must be a non-negative integer and DAY is the only supported unit.

To add a policy to an existing table, use the ALTER TABLE statement. A table can have at most one row deletion policy. Adding a row deletion policy to a table with an existing policy fails with an error. See TTL on generated columns to specify more sophisticated row deletion logic.

ALTER TABLE Albums
ADD ROW DELETION POLICY (OLDER_THAN(timestamp_column, INTERVAL num_days DAY));

PostgreSQL

To create a row deletion policy using PostgreSQL, you can define a TTL INTERVAL clause when you create a new table, or add a policy to an existing table.

To add a policy at the time of table creation:

CREATE TABLE mytable (
  key bigint NOT NULL,
  timestamp_column_name TIMESTAMPTZ,
  PRIMARY KEY(key)
) TTL INTERVAL interval_spec ON timestamp_column_name;

Where:

  • timestamp_column_name must be a column with data type TIMESTAMPTZ. You need to create this column in the CREATE TABLE statement. Columns with commit timestamps are valid, as are generated columns. However, you cannot specify a generated column that references a commit timestamp column.

  • interval_spec is the number of days past the timestamp in the timestamp_column_name on which the row is marked for deletion. The value must be a non-negative integer and it must evaluate to a whole number of days. For example, '3 days' is allowed, but '3 days - 2 minutes' returns an error.

To add a policy to an existing table, use the ALTER TABLE statement. A table can have at most one TTL policy. Adding a TTL policy to a table with an existing policy fails with an error. See TTL on generated columns to specify more sophisticated TTL logic.

To add a policy to an existing table:

ALTER TABLE albums
ADD COLUMN timestampcolumn TIMESTAMPTZ;

ALTER TABLE albums
ADD TTL INTERVAL '5 days' ON timestampcolumn;

Restrictions

You cannot create a row deletion policy:

  • On a table that is referenced by a foreign key.
  • On the parent of a table that is referenced by a foreign key.

In the following example, you cannot add a row deletion policy to the Customers table, because it is referenced by a foreign key in the Orders table and deleting customers could violate this constraint.

GoogleSQL

CREATE TABLE Customers (
  CustomerID INT64,
  CreatedAt TIMESTAMP
) PRIMARY KEY (CustomerID);

CREATE TABLE Orders (
  OrderID INT64,
  CustomerID INT64,
  CONSTRAINT FK_CustomerOrder FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
) PRIMARY KEY (OrderID)

PostgreSQL

CREATE TABLE customers (
  customerid   bigint NOT NULL,
  CreatedAt  timestamptz,
  PRIMARY KEY(customerid)
);

CREATE TABLE orders (
  orderid bigint NOT NULL,
  customerid bigint,
  PRIMARY KEY(orderid),
  CONSTRAINT fk_customerorder FOREIGN KEY (customerid) REFERENCES customers (customerid)
);

TTL on columns with default values

A row deletion policy can use a timestamp column with a default value. A typical default value is CURRENT_TIMESTAMP. If no value is explicitly assigned to the column, or if the column is set to its default value by an INSERT or UPDATE statement, the default value is used in the rule calculation.

In the following example, the default value for the column CreatedAt in table Customers is the timestamp at which the row is created.

GoogleSQL

CREATE TABLE Customers (
  CustomerID INT64,
  CreatedAt TIMESTAMP DEFAULT (CURRENT_TIMESTAMP())
) PRIMARY KEY (CustomerID);

For more information, see DEFAULT (expression) in "GoogleSQL data definition language."

PostgreSQL

CREATE TABLE customers (
  customerid bigint NOT NULL,
  createdat timestamptz DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY(customerid)
  );

For more information, see CREATE TABLE in "PostgreSQL data definition language."

TTL on generated columns

Row deletion policies can use generated columns to express more sophisticated rules. For example, you can define a row deletion policy on the greatest timestamp (GoogleSQL or PostgreSQL) of multiple columns, or map another value to a timestamp.

GoogleSQL

The following table named Orders tracks sales orders. The table owner wants to set up a row deletion policy that deletes cancelled orders after 30 days, and non-cancelled orders after 180 days.

Spanner TTL only allows one row deletion policy per table. To express the two criteria in a single column, you can use a generated column with an IF statement:

CREATE TABLE Orders (
  OrderId INT64 NOT NULL,
  OrderStatus STRING(30) NOT NULL,
  LastModifiedDate TIMESTAMP NOT NULL,
  ExpiredDate TIMESTAMP AS (IF(OrderStatus = 'Cancelled',
    TIMESTAMP_ADD(LastModifiedDate, INTERVAL 30 DAY),
    TIMESTAMP_ADD(LastModifiedDate, INTERVAL 180 DAY))) STORED,
) PRIMARY KEY(OrderId),
ROW DELETION POLICY (OLDER_THAN(ExpiredDate, INTERVAL 0 DAY));

The statement creates a column named ExpiredDate that adds either 30 days or 180 days to the LastModifiedDate depending on the order status. Then, it defines the row deletion policy to expire rows on the day stored in the ExpiredDate column by specifying INTERVAL 0 day.

PostgreSQL

The following table named Orders tracks sales orders. The table owner wants to set up a row deletion policy that deletes rows after 30 days of inactivity.

Spanner TTL only allows one row deletion policy per table. To express the two criteria in a single column, you can create a generated column:

CREATE TABLE orders (
    orderid bigint NOT NULL,
    orderstatus varchar(30) NOT NULL,
    createdate timestamptz NOT NULL,
    lastmodifieddate timestamptz,
    expireddate timestamptz GENERATED ALWAYS AS (GREATEST(createdate, lastmodifieddate)) STORED,
    PRIMARY KEY(orderid)
) TTL INTERVAL '30 days' ON expireddate;

The statement creates a generated column named ExpiredDate that evaluates the most recent of the two dates (LastModifiedDate or CreateDate). Then, it defines the row deletion policy to expire rows 30 days after the order was created, or if the order was modified within those 30 days, it'll extend the deletion by another 30 days.

TTL and interleaved tables

Interleaved tables are a performance optimization that associates related rows in a one-to-many child table with a row in a parent table. To add a row deletion policy on a parent table, all interleaved child tables must specify ON DELETE CASCADE, meaning the child rows will be deleted atomically with the parent row. This ensures referential integrity such that deletes on the parent table also delete the related child rows in the same transaction. Spanner TTL does not support ON DELETE NO ACTION.

Maximum transaction size

Spanner has a transaction size limit. Cascading deletes on large parent-child hierarchies with indexed columns could exceed these limits and cause one or more TTL operations to fail. For failed operations, TTL will retry with smaller batches, down to a single parent row. However, large child hierarchies for even a single parent row may still exceed the mutation limit.

Failed operations are reported in TTL metrics.

If a single row and its interleaved children is too large to delete, you can attach a row deletion policy directly on the child tables, in addition to the one on the parent table. The policy on child tables should be configured such that child rows are deleted prior to parent rows.

Consider attaching a row deletion policy to child tables when the following two statements apply:

  • The child table has any global indexes associated with it; and
  • You expect a large number of (>100) child rows per parent row.

Delete a row deletion policy

You can drop an existing row deletion policy from a table. This returns an error if there is no existing row deletion policy on the table.

GoogleSQL

ALTER TABLE MyTable
DROP ROW DELETION POLICY;

PostgreSQL

ALTER TABLE mytable
DROP TTL;

Deleting a row deletion policy immediately aborts any TTL processes running in the background. Any rows already deleted by the in-progress processes remain deleted.

Delete a column referenced by a row deletion policy

Spanner doesn't let you delete a column that is referenced by a row deletion policy. You must first delete the row deletion policy before deleting the column.

View the row deletion policy of a table

You can view the row deletion policies of your Spanner tables.

GoogleSQL

SELECT TABLE_NAME, ROW_DELETION_POLICY_EXPRESSION
FROM INFORMATION_SCHEMA.TABLES
WHERE ROW_DELETION_POLICY_EXPRESSION IS NOT NULL;

For more information, see Information schema for GoogleSQL-dialect databases.

PostgreSQL

SELECT table_name, row_deletion_policy_expression
FROM information_schema.tables
WHERE row_deletion_policy_expression is not null;

For more information, see Information schema for PostgreSQL-dialect databases.

Modify a row deletion policy

You can alter the column or the interval expression of an existing row deletion policy. The following example switches the column from CreatedAt to ModifiedAt and extends the interval from 1 DAY to 7 DAY. This returns an error if there is no existing row deletion policy on the table.

GoogleSQL

ALTER TABLE MyTable
REPLACE ROW DELETION POLICY (OLDER_THAN(ModifiedAt, INTERVAL 7 DAY));

PostgreSQL

ALTER TABLE mytable
ALTER TTL INTERVAL '7 days' ON timestampcolumn;