Create and manage change streams

This page describes how to create, modify, and view Spanner change streams. To learn more about change streams, see About change streams.

Because change streams are schema objects, you create and manage them through the same DDL-driven schema updates used for any other kind of database-definition work, such as creating tables or adding indexes.

Spanner begins a long-running operation after you submit a schema-changing DDL statement, including those used to create, alter, or delete change streams. A new or altered change stream will begin to watch the columns or tables specified by its new configuration upon the completion of this long-running operation.

Create a change stream

To create a change stream, you need to supply its name and the schema objects it will watch: either the entire database, or a list of specific tables and columns. You can optionally specify a retention period if you want to override the one-day default retention period, or a value capture type if you want to override the default value capture type OLD_AND_NEW_VALUES.

GoogleSQL

The DDL syntax for creating a change stream using GoogleSQL looks like this:

CREATE CHANGE STREAM change_stream_name
  [FOR column_or_table_watching_definition[, ... ] ]
  [
    OPTIONS (
      retention_period = timespan,
      value_capture_type = type
    ) 
  ]

PostgreSQL

The DDL syntax for creating a change stream using PostgreSQL looks like this:

CREATE CHANGE STREAM change_stream_name
  [FOR column_or_table_watching_definition[, ... ] ]
  [
    WITH (
      retention_period = timespan,
      value_capture_type = type
    )
  ]

A new change stream begins to watch its assigned schema objects as soon as the long-running operation that created it completes.

The following examples illustrate the creation of change streams with various configurations.

Watch an entire database

To create a change stream that watches every data change performed throughout a database's tables, use the ALL keyword:

CREATE CHANGE STREAM EverythingStream
FOR ALL;

The ALL configuration implicitly includes all of the database's future data tables and columns, as soon as they get created. It does not include views, information schema tables, or other objects aside from normal data tables.

Watch specific tables

To limit a change stream's scope to specific tables, rather than an entire database, specify a list of one or more tables:

CREATE CHANGE STREAM SingerAlbumStream
FOR Singers, Albums;

Spanner automatically updates change streams that watch whole tables to reflect any schema changes affecting those tables, such as added or dropped columns.

Watch specific columns

Use the table(column_1[, column_2, ...]) syntax to watch changes to one or more specific, non-key columns within tables that you name:

CREATE CHANGE STREAM NamesAndTitles
FOR Singers(FirstName, LastName), Albums(Title);

You can't specify primary key columns here because every change stream always tracks the primary keys of every table it watches. This allows every data change record to identify the changed row by its primary key.

Watch tables and columns in a single stream

You can combine table-watching and column-watching syntax from the previous two examples in a single change stream:

CREATE CHANGE STREAM NamesAndAlbums
FOR Singers(FirstName, LastName), Albums;

Specify a longer retention period

To specify a change stream data retention period longer than the default of one day, set the retention_period to a period of up to one week, expressed as either hours (h) or days (d).

Two examples:

GoogleSQL

CREATE CHANGE STREAM LongerDataRetention
FOR ALL
OPTIONS ( retention_period = '36h' );
CREATE CHANGE STREAM MaximumDataRetention
FOR ALL
OPTIONS ( retention_period = '7d' );

PostgreSQL

CREATE CHANGE STREAM LongerDataRetention
FOR ALL
WITH ( retention_period = '36h' );
CREATE CHANGE STREAM MaximumDataRetention
FOR ALL
WITH ( retention_period = '7d' );

Specify a different value capture type

To specify a change stream value capture type other than OLD_AND_NEW_VALUES, set the value_capture_type to either NEW_VALUES or NEW_ROW, as shown in the following examples:

GoogleSQL

CREATE CHANGE STREAM NewRowChangeStream
FOR ALL
OPTIONS ( value_capture_type = 'NEW_ROW' );
CREATE CHANGE STREAM NewValuesChangeStream
FOR ALL
OPTIONS ( value_capture_type = 'NEW_VALUES' );

PostgreSQL

CREATE CHANGE STREAM NewRowChangeStream
FOR ALL
WITH ( value_capture_type = 'NEW_ROW' );
CREATE CHANGE STREAM NewValuesChangeStream
FOR ALL
WITH ( value_capture_type = 'NEW_VALUES' );

Modify a change stream

To modify a change stream's configuration, use an ALTER CHANGE STREAM DDL statement. It uses syntax similar to CREATE CHANGE STREAM. You can change which columns a stream watches, or the length of its data retention period. You can also suspend its watching altogether while retaining its data change records.

Modify what a change stream watches

This example adds the entire Songs table to the NamesAndAlbums change stream configured earlier:

ALTER CHANGE STREAM NamesAndAlbums
SET FOR Singers(FirstName, LastName), Albums, Songs;

Spanner replaces the named change stream's behavior with the new configuration upon the completion of the long-running operation that updates the change stream's definition within the database's schema.

Modify a change stream's data retention period

To modify the length of time that a change stream retains its internal records, set the retention_period in an ALTER CHANGE STREAMS DDL statement.

This example adjusts the data retention period to the NamesAndAlbums change stream created previously:

GoogleSQL

ALTER CHANGE STREAM NamesAndAlbums
SET OPTIONS ( retention_period = '36h' );

PostgreSQL

ALTER CHANGE STREAM NamesAndAlbums
SET ( retention_period = '36h' );

Modify a change stream's value capture type

To modify the value capture type of a change stream, set the value_capture_type clause in an ALTER CHANGE STREAMS DDL statement.

This example adjusts the value capture type to NEW_VALUES.

GoogleSQL

ALTER CHANGE STREAM NamesAndAlbums
SET OPTIONS ( value_capture_type = 'NEW_VALUES' );

PostgreSQL

ALTER CHANGE STREAM NamesAndAlbums
SET ( value_capture_type = 'NEW_VALUES' );

Suspend a change stream

If you wish to have a change stream halt its activity but retain its internal records—at least for the life of its data retention period—you can alter it so that it watches nothing.

To do this, issue an ALTER CHANGE STREAM DDL statement that replaces the change stream's definition with the special phrase DROP FOR ALL. For example:

ALTER CHANGE STREAM MyStream DROP FOR ALL;

The stream continues to exist in the database, but watches no objects, and generates no further data change records. Its existing change records remain intact, subject to the stream's data retention policy.

To resume a suspended stream, issue another ALTER CHANGE STREAM statement with its previous configuration.

Delete a change stream

To permanently delete a change stream, issue a DROP CHANGE STREAM statement that includes the stream's name:

DROP CHANGE STREAM NamesAndAlbums;

Spanner immediately stops the stream, removes it from the database's schema, and deletes its data change records.

List and view change streams

The Google Cloud console provides a web interface for listing and reviewing a database's change stream definitions. You can also view the structure of change streams as their equivalent DDL statements, or by querying the database's information schema.

View change streams with Google Cloud console

To see a list of a database's change streams and review their definitions:

  1. Visit the Spanner instances page of the Google Cloud console.

    Open the instances page

  2. Navigate to the appropriate instance and database.

  3. Click Change streams in the navigational menu.

This displays a list of all of that database's change streams, and summarizes the configuration of each one. Clicking a stream's name reveals further details about the tables and columns that it watches.

View definitions of change streams as DDL

Viewing a database's schema as DDL includes descriptions of all its change streams, where they appear as CREATE CHANGE STREAM statements.

Query the information schema about change streams

You can directly query a database's information schema about its change streams. The following tables contain the metadata that defines change streams' names, the tables and columns that they watch, and their retention periods:

Change streams best practices

The following are some best practices for configuring and managing change streams.

Consider a separate metadata database

Change streams use a metadata database to maintain internal state. The metadata database can be the same as or different from the database that contains the change streams. We recommend creating a separate database for metadata storage.

The Spanner change streams connector needs read-write permissions to the metadata database. You do not need to prepare this database with a schema; the connector takes care of that.

Using a separate metadata database eliminates complexities that might arise from allowing the connector to write directly to your application database:

  • By separating the metadata database from the production database with the change stream, the connector needs only read permissions to the production database.

  • By restricting the traffic of the connector to a separate metadata database, writes performed by the connector itself are not included in the production change streams. This is especially relevant for change streams that watch the entire database.

If no separate database is being used to store the metadata, we recommend monitoring the CPU impact of the change streams connector on their instances.

Benchmark new change streams and resize if needed

Before adding new change streams to your production instance, consider benchmarking a realistic workload on a staging instance with change streams enabled. This lets you determine whether you need to add nodes to your instance, in order to increase its compute and storage capacities.

Run these tests until CPU and storage metrics stabilize. Optimally, the instance's CPU utilization should remain under the recommended maximums, and its storage usage should not exceed the instance's storage limit.

Use different regions to load-balance

When using change streams in a multi-region instance configuration, consider running their processing pipelines in a different region than the default leader region. This helps to spread the streaming load among non-leader replicas. If you need to prioritize the lowest possible streaming delay over load balancing, however, run the streaming load in the leader region.

What's next