Recover a stream

You can recover a permanently failed stream without having to create a new stream. You can do this by specifying the position from which Datastream attempts to resume reading changes from the source.

Stream recovery overview

A running stream can encounter some unrecoverable errors and change its state to FAILED_PERMANENTLY. Such errors prevent the stream from continuing to run, and might cause data loss.

You can recover a permanently failed stream by setting it to ignore the error and keep reading the ongoing events instead of recreating the stream and backfilling the historical data. To recover a permanently failed stream, you reset the replication to start reading from a different replication position. Each supported source type has its own definition of what a replication position is:

  • For Oracle sources, a replication position is a redo log file in the database and the system change number (SCN) in this file.
  • For MySQL sources, a replication position is the database binary log (binlog) file and the position in this file.
  • For PostgreSQL sources (including AlloyDB for PostgreSQL), a replication position is the log sequence number (LSN) in the replication slot. During recovery, the stream starts reading from the first LSN in the replication slot.

Recover a stream for a MySQL or an Oracle source

To recover a stream for a MySQL or Oracle source, you have the following options:

  • Retry from the current position (recommended): Select this option to attempt streaming from the current position, where the stream last failed. You need to fix the log file or recover it from backup first. This is the recommended option.

  • Skip the current position and stream from the next available position: If one or more log files are missing, select this option to skip these files and resume streaming from the first position in the subsequent available file. The changes from the missing log files are lost, but you can recover them by performing a backfill.

  • Skip the current position and stream from the most recent position: If one or more log files are missing, select this option to skip these files and resume streaming from the most recent position in the most up-to-date log file. The changes from the missing log files are lost, but you can recover them by performing a backfill.

  • Resume from your preferred streaming file and position: Select this option to resume the stream from a specific log file and log position. Some changes might be lost if the specified log position doesn't overlap with or immediately follow the lost log position. You can recover these changes by performing a backfill.

To recover a permanently failed stream for a MySQL or an Oracle source, perform the following steps:

  1. Go to the Streams page in Google Cloud.

    Go to the Streams page

  2. Select the checkbox to the left of the stream that you want to recover.

  3. Click Recover.

  4. The Choose a recovery strategy pane opens. Select an option. If you select Resume from your preferred streaming file and position, enter the following:

    • For a MySQL source: the log filename in the File name field and the log position in the Position field. If you don't specify the position, the stream resumes from the first position in the indicated log file.
    • For an Oracle source: the system change number (SCN) in the System Change Number (SCN) field. This field is mandatory.
  5. Click Apply.

  6. When the stream is recovered, a timestamp appears in the Recovered column in the Streams page.

Recover a stream for a PostgreSQL source

To recover a stream for a PostgreSQL source, you need to provide the replication slot name. The server uses this replication slot to send events to Datastream. The replication slot name can be the same as the slot used for the failed stream or different:

Any change events to the source that occurred between the log position loss and the first LSN in the new replication slot are lost. You can recover these changes by performing a backfill.

To recover a permanently failed stream for a PostgreSQL source, perform the following steps:

  1. Go to the Streams page in Google Cloud.

    Go to the Streams page

  2. Select the checkbox to the left of the stream that you want to recover.

  3. Click Recover.

  4. The Define a new replication slot pane opens.

  5. In the Replication slot name field, provide the name of a new replication slot from which the stream will attempt to recover. If you recreated the replication slot using the same name, or you want to reuse the slot you specified when you configured your source, you can leave this field empty.

  6. Click Apply.

  7. When the stream is recovered, a timestamp appears in the Recovered column in the Streams page.

You can also recover permanently failed streams from the Stream details page. To do this, click Recover stream when viewing detailed information about your stream.

Use stream recovery for a MySQL source in a manual failover scenario

You can perform a manual failover and use stream recovery to avoid recreating your streams from scratch during maintenance or primary instance failure. Generally, Datastream doesn't support failovers to replicas because they break the binlog continuity, but you can follow these steps to recover the stream and ensure your change data is captured:

  1. Stop all writes to the primary instance.
  2. Make sure that data freshness metric is set to 0. This means that Datastream captured all changes and there are no new events to read from the source. For more information, see Monitor a stream.
  3. Fail over to the new database instance.
  4. If needed, update the stream's connection profile to the new database instance (for example, you might need to change the database hostname or IP address). For more information, see Modify connection profiles.
  5. Recover the stream from a specific position on the failover instance to ensure CDC continuity.