Overview
A stream might encounter errors during runtime.
- Some errors, such as a bad password on the source database, are recoverable, meaning they can be fixed and the stream resumes automatically.
- Errors may impact a single object, such as an event containing unsupported data types. Other errors may impact several objects or the entire stream, such as when Datastream can't connect to the source database.
- Depending on the error, information is provided in the Streams or Stream details pages of the Datastream UI. You can also use Datastream's APIs to retrieve information about the error.
To troubleshoot an error, navigate to the stream to view the error, and follow the steps outlined in the error message.
This page contains information about configuration, connectivity, Oracle, and MySQL errors, along with steps for troubleshooting the errors.
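As a sketch of retrieving error information programmatically (the stream name and region are placeholders, and the projected field names follow the Datastream `Stream` resource), you can describe a stream with the gcloud CLI:

```shell
# Show the state of a stream and any errors it has reported.
gcloud datastream streams describe my-stream \
    --location=us-central1 \
    --format="yaml(state, errors)"
```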
Configuration and connectivity errors
Error | Troubleshooting steps |
---|---|
Failure connecting to the source database (generic). | This can happen for various reasons. To troubleshoot this error, perform the following actions:
1. Make sure that the source database is up and reachable.
2. Navigate to the source connection profile from the Streams or Connection profiles pages.
3. Verify that the connectivity information of the connection profile is correct.
4. Verify that the username and password match.
5. Verify that the username exists on the database and has the required privileges.
6. Save any changes that you made on the Connection profiles page.
The stream resumes automatically. |
Failure connecting to source database (IP allowlisting). | This can happen if the connectivity method chosen is IP allowlist, but one or more of the outgoing IP addresses of Datastream aren't added properly on the source database. Make sure that the outgoing IP addresses displayed in the Datastream connection profile are configured on the network firewall so that the source database server can accept connections from these IP addresses. After this is fixed, the stream resumes automatically. |
Failure connecting to the source database (Forward SSH tunnel). | This can happen if there's an issue with the Forward SSH tunnel. Check the status of the tunnel. If the tunnel is stopped, then it needs to be started. After this is fixed, the stream resumes automatically. |
Datastream can't connect to a bastion host through a Forward SSH tunnel. | Verify that the Forward SSH tunnel configuration is configured correctly in the source connection profile, and that the port is open on the SSH tunnel server. |
Failure connecting to the source database because of bad certificates. | This can happen if there's an issue with the certificates provided when defining the source connection profile. Navigate to the Connection profiles page, and then select the given connection profile. Verify that the certificates are set up correctly. After making your changes, save the connection profile, and the stream resumes automatically. |
Failure using private connectivity to connect to the source database. | Verify that the private connectivity configuration was created successfully and that it's associated with the correct VPC network. Confirm that the source database accepts connections from the IP address range reserved for the private connectivity configuration, and that firewall rules on the source network allow traffic from this range. After this is fixed, the stream resumes automatically. |
Connectivity type STATIC_SERVICE_IP_CONNECTIVITY isn't allowed while Org Policy constraints/datastream.disablePublicConnectivity is on. | You selected the public IP allowlist or Forward SSH tunnel network connectivity methods for the connection profile that you're creating. However, the Block Public Connectivity Methods organization policy for Datastream is enabled. Therefore, you can't select public connectivity methods for your connection profile. To resolve this issue, either select the private VPC peering network connectivity method or disable the organization policy. To disable the organization policy:
1. Go to the Organization policies page in the Google Cloud console.
2. Select the Datastream - Block Public Connectivity Methods organization policy.
3. Click Edit, and then set enforcement to Off.
4. Click Save.
5. Return to the connection profile that you're creating, and then click Create.
|
When configuring the source database for my stream, I can't find the tables and schemas I want to transfer in the list of objects to include. | This can happen if your database has thousands of tables and schemas. Some of them might not be included in the list of objects to pull when configuring the source for the stream in the Google Cloud console. Instead of selecting Specific schemas and tables in the Select objects to include section, select Custom. In the Object matching criteria field, enter the schemas and tables that you want Datastream to pull. |
I've added multiple tables to my stream using the Objects to include menu. However, when I look at the Objects tab in Stream details, I can see that some tables are missing. | Make sure that there's at least one CDC update to each of these tables so that Datastream can recognize the changes and automatically include the tables in the stream. |
Failure loading the list of objects when using the Objects to include menu in the Google Cloud console. | This can happen if your database has more than 5,000 schemas and tables. Use a different method to specify which objects to include, or use the Datastream API. For more information, see Configure source databases. |
Events dropped during streaming and not replicated in the destination. | Datastream might drop unsupported events during streaming, for example events that contain unsupported data types. You can perform the following actions to address the issue:
1. Check the stream logs to identify which events were dropped and why.
2. If possible, change the source data so that the events are supported, for example by removing or converting unsupported data types.
3. Trigger a backfill of the affected tables so that Datastream re-reads the current state of the data.
|
Oracle errors
Error | Troubleshooting steps |
---|---|
Supplemental logging is configured incorrectly on the source database. | An error fetching ongoing change data capture (CDC) data can happen if the supplemental logging configuration isn't correct on the source database. Verify that supplemental logging is configured correctly. Specifically, confirm that supplemental logging is turned on for the database tables that are being streamed from the source into the destination. The stream resumes automatically. |
Unable to resume replication as log position is lost. | This error can occur when the replication process is paused for a long time, which causes the log position to be lost. Streams shouldn't be paused for periods of time that approach the log retention period. Recreate the stream. |
Log files are missing, either partially or entirely. | The log files may have been deleted. Oracle purges log files as soon as it can unless you specify a minimum retention period to keep them around. On the Oracle server, set how long the log files should be retained. For example, set an RMAN retention policy that covers a recovery window of several days. For an RDS deployment, use the rdsadmin package to set the archivelog retention hours setting. |
The exclude list subsumes the include list. | The include list is contained completely within the exclude list, so the list of objects that Datastream pulls from the source is empty. Modify the object selection, and then try again. |
The logging mode for the Oracle database isn't set to ARCHIVELOG. | Change the logging mode to ARCHIVELOG, and then try again. |
Datastream returns an ORA-00942: table or view does not exist error message, but everything is configured properly. | This can be a result of caching on the Oracle server. Recreating the database user should fix the caching issue. |
Changes to an Oracle source aren't reflected in the destination when the stream is already running. | Because Datastream reads from archived redo log files, the changes you make to the source aren't reflected in the destination until the log is archived. To see the changes in the destination, change the log archive policy or manually force a log switch. For more information, see Work with Oracle database redo log files. |
There's an unexpected internal error. | For more details, contact Google Support. |
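Several of the Oracle rows above come down to a handful of server-side commands. The following is a hedged sketch: the schema and table names are placeholders, and the retention values are illustrative only.

```sql
-- Confirm the logging mode; Datastream requires ARCHIVELOG.
SELECT log_mode FROM v$database;

-- Enable supplemental logging for the database and for a streamed table
-- (hr.employees is a hypothetical table).
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER TABLE hr.employees ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- Force a log switch so that the current redo log is archived immediately.
ALTER SYSTEM SWITCH LOGFILE;

-- In RMAN (not SQL*Plus): keep logs needed to recover the last 4 days.
-- CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 4 DAYS;

-- Amazon RDS for Oracle: retain archived redo logs for 96 hours.
exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention hours', 96);
```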
MySQL errors
Error | Troubleshooting steps |
---|---|
Binlog is configured incorrectly on the source database. | This can happen for continuous MySQL streams if the binlog configuration is incorrect on the source database. To troubleshoot this error, perform the following actions:
1. Verify that binary logging is enabled on the source database.
2. Verify that the binlog format is set to ROW and that the binlog row image is set to FULL.
3. Verify that the binlog retention period is sufficient. A retention period of at least 7 days is recommended.
For more information, see Configure a source MySQL database. After this is fixed, the stream resumes automatically. |
Unable to resume replication as binlog position is lost. | This error can occur when the replication process is paused for a long time, which causes the binlog position to be lost. Streams shouldn't be paused for periods of time that approach the binlog retention period. Recreate the stream. |
Failure running the stream because of incompatible source database and destination versions. | This can happen when the source database doesn't adhere to the version support matrix. To troubleshoot this error, perform the following actions:
1. Check which version the source database is running.
2. Verify that this version is supported. For more information, see the supported versions in Sources.
3. If the version isn't supported, move the source database to a supported version, and then recreate the stream. |
AWS RDS MySQL source binlogs are missing, either partially or entirely. | The binlogs may have been deleted. AWS RDS purges binlogs as soon as it can unless you specify a minimum retention period to keep them around. In the source AWS RDS MySQL instance, set how long, in hours, the binlogs should be retained. For example, use mysql.rds_set_configuration('binlog retention hours', 168); to keep the binlogs for at least 7 days. |
The exclude list subsumes the include list. | The include list is contained completely within the exclude list, so the list of objects that Datastream pulls from the source is empty. Modify the object selection, and then try again. |
Datastream can't replicate a MySQL database. | Make sure that Datastream has permissions to replicate the database. |
When creating a connection profile for a MySQL source, multiple PEM-encoded SSL certificates aren't accepted in the Encryption type menu. | Datastream doesn't support SSL certificate chains in MySQL connection profiles. Only single x509 PEM-encoded certificates are supported. |
High latency when streaming from a MySQL source. | Increase Datastream's ability to read from the source database, for example by reducing the number of objects included in the stream, or by splitting the source into multiple streams so that the load is distributed. If the latency persists, contact Google Support. |
There's an unexpected internal error. | For more details, contact Google Support. |
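The binlog checks and the RDS retention setting above can be sketched as follows. The expected values reflect Datastream's documented MySQL requirements; the retention value is illustrative.

```sql
-- Verify the binlog configuration required for CDC.
SHOW VARIABLES LIKE 'log_bin';          -- expected: ON
SHOW VARIABLES LIKE 'binlog_format';    -- expected: ROW
SHOW VARIABLES LIKE 'binlog_row_image'; -- expected: FULL

-- Amazon RDS for MySQL: retain binlogs for at least 7 days (168 hours),
-- then verify the setting.
call mysql.rds_set_configuration('binlog retention hours', 168);
call mysql.rds_show_configuration;
```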
PostgreSQL errors
Error | Troubleshooting steps |
---|---|
Logical decoding is configured incorrectly on the source database. | Verify that logical decoding is configured correctly. See Configure a source PostgreSQL database. |
Replication slot does not exist. | An error fetching ongoing change data capture (CDC) data can happen if the replication slot doesn't exist on the database. Verify that the replication slot is configured correctly. See Configure a source PostgreSQL database. |
Replication slot is configured with a wrong plugin. | This error can occur if the replication slot is configured with a plugin other than pgoutput. Verify that the replication slot is configured correctly. See Source PostgreSQL database for more information. |
Replication slot is active in a different process. | This error can occur when the replication slot is being used by another process. Replication slots can only be used by a single process at a time. Make sure that you don't use the same replication slot in any other process except for Datastream. |
Publication is not configured properly. | This error can occur when the publication isn't configured to expose the tables that are included in the stream configuration. Verify that the publication is configured correctly. See Configure information about the source database for the stream. |
Publication does not exist. | This error can occur if the publication doesn't exist on the database. Verify that the publication is configured correctly. See Configure a source PostgreSQL database. |
Unable to resume replication as WAL files are lost. | This error can occur when the replication process is paused for a long time, which causes the WAL files to be lost. Streams shouldn't be paused for periods of time that approach the WAL files retention period. Recreate the stream. |
The exclude list subsumes the include list. | The include list is contained completely within the exclude list, so the list of objects that Datastream pulls from the source is empty. Modify the object selection, and then try again. |
Datastream can't replicate a PostgreSQL schema. | Make sure that Datastream has permissions to replicate the schema. |
Large transactions on the source database cause issues with data replication and synchronization. | If you insert, update or delete a significant number of records in the source database, the replication slot might get overloaded with the corresponding events. Datastream needs time to read and process these events. Because PostgreSQL replication slots are single-threaded, processing other changes in the replication slot, including changes to data in other tables, is delayed until Datastream catches up with all the changes in the replication slot. |
Large transactions on the source database cause low CDC throughput. | Datastream doesn't support multi-threaded CDC in PostgreSQL. To overcome this limitation, and to increase your CDC throughput, you can split the source into multiple streams, each with its own publication and replication slot. For example, you might want to create one stream for the largest table in your database and another for all the other tables, or one stream for your top-priority tables and another for the remaining ones. The use cases might vary, so you need to consider what makes the most sense in your specific CDC scenario. For information about creating a publication, see Configure a source PostgreSQL database. |
Unsupported events dropped with reason code: BIGQUERY_TOO_MANY_PRIMARY_KEYS . | When the PostgreSQL replica identity for a table is set to FULL , then Datastream treats all the columns in this table as primary keys. If there are more than 16 columns in the table, then this violates the BigQuery CDC limitation and causes the error. To solve the issue, complete the following steps:
1. Change the replica identity of the table to DEFAULT.
2. Remove the table from the stream's list of objects to include.
3. Delete the table from BigQuery.
4. In Datastream, add the table to the stream again by modifying your source configuration.
|
There's an unexpected internal error. | For more details, contact Google Support. |
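The PostgreSQL checks above map to a few catalog queries and commands. This is a hedged sketch: the table, publication, and slot names are placeholders.

```sql
-- Inspect replication slots: the slot that Datastream uses should use the
-- pgoutput plugin and should not be held active by another process.
SELECT slot_name, plugin, active, active_pid FROM pg_replication_slots;

-- List tables whose replica identity is FULL ('f'), which makes Datastream
-- treat every column as part of the primary key.
SELECT relname FROM pg_class WHERE relreplident = 'f' AND relkind = 'r';

-- Revert a table to the default replica identity (public.orders is hypothetical).
ALTER TABLE public.orders REPLICA IDENTITY DEFAULT;

-- Split the source into multiple streams: one publication and one replication
-- slot per stream (names here are placeholders).
CREATE PUBLICATION large_table_pub FOR TABLE public.orders;
SELECT pg_create_logical_replication_slot('large_table_slot', 'pgoutput');
```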
SQL Server errors
Error | Troubleshooting steps |
---|---|
CDC is disabled for the database DATABASE_NAME. | Change data capture (CDC) must be enabled for the database. Datastream needs direct read access to transaction logs in order to replicate real-time changes to the source database, and to get complete log information. Enable CDC for the database and try again. For information about enabling CDC for a database, see Configure a source SQL Server database. |
Tables with CDC disabled. | Change data capture (CDC) must be enabled for all the tables included in the stream. Datastream needs direct read access to transaction logs in order to replicate real-time changes to source tables, and to get complete log information. Enable CDC for the tables included in the stream and try again. For information about enabling CDC for source tables, see Configure a source SQL Server database. |
Missing permissions. | Datastream is missing the necessary permissions to read from the source. Grant the appropriate privileges to the user account that is used to connect to your database, and try again. |
The SQL Server EDITION_NAME edition is not supported. | Datastream doesn't support this SQL Server edition. For more information about supported editions of SQL Server, see Overview of SQL Server as a source. |
The SQL Server version VERSION_NAME of the Standard edition is not supported. | Datastream doesn't support this version of the SQL Server Standard edition. For more information about supported versions of SQL Server, see Overview of SQL Server as a source. |
SQL Server CDC configuration: failed. | The CDC method that you have selected doesn't comply with your database configuration. Change the CDC method and try again. |
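Enabling CDC for a SQL Server database and its tables uses the standard system stored procedures. A minimal sketch, assuming a hypothetical `dbo.orders` table (run in the context of the source database):

```sql
-- Enable CDC for the database.
EXEC sys.sp_cdc_enable_db;

-- Enable CDC for a table included in the stream (schema and table
-- names are placeholders).
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'orders',
    @role_name     = NULL;
```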
BigQuery errors
Error | Troubleshooting steps |
---|---|
BIGQUERY_UNSUPPORTED_PRIMARY_KEY_CHANGE, details: Failed to write to BigQuery due to an unsupported primary key change: adding primary keys to existing tables is not supported. | If the primary key changes in the source, you must drop the table in BigQuery and initiate backfill again. This is a limitation of BigQuery, because there's no way to ensure correct merging of new events with existing rows if the primary key is different. For more information, see Configure a BigQuery destination. |
The destination BigQuery table has significantly more records than the source table. | This can happen when the source table doesn't have a primary key. In such a case, Datastream processes the table in the append-only write mode, and each event for a given row appears as a separate row in BigQuery. |
Data gets duplicated when performing backfill in the Append-only write mode. | When you select the Append-only write mode for your stream, your data is appended in BigQuery as a stream of change events, and each event is written as a separate row. Running backfill more than once for the same table can introduce duplicate rows, because each backfill run generates a new set of events. To deduplicate the data, keep a single row per event identifier. For example: SELECT * FROM (SELECT *, row_number() OVER (PARTITION BY datastream_metadata.uuid) AS num FROM TABLE_NAME) WHERE num=1 |
Datastream is configured for the merge write mode, but changes aren't merged in BigQuery. | Verify that there's a primary key in your source table. BigQuery needs it to merge the changes into the destination table. If there's no primary key, consider adding one in either the source or destination table. To add a primary key in your destination BigQuery table, follow these steps:
1. Pause the stream.
2. Add the primary key to the table definition in BigQuery as an unenforced constraint.
3. Resume the stream, and then trigger a backfill of the table so that existing data is repopulated and subsequent changes are merged.
|
Unable to add a primary key, remove a primary key, or change the primary key definition for a table that's already replicated to BigQuery. | By default, Datastream doesn't support adding a primary key to a table that's already replicated to BigQuery without a primary key or removing a primary key from a table that's replicated to BigQuery with a primary key. You can, however, change the primary key definition for a source table replicated to BigQuery that already has a primary key:
1. Wait for Datastream to process all pending events for the table, so that the destination table is up to date.
2. Change the primary key definition of the source table.
3. Drop the corresponding table in BigQuery, and then trigger a backfill of the table. Datastream re-creates the table with the new primary key definition.
|
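The deduplication and primary-key fixes above can be sketched in BigQuery SQL. The dataset, table, and column names are placeholders; adapt them to your schema.

```sql
-- Deduplicate rows produced by repeated backfills in append-only mode,
-- keeping one row per Datastream event identifier.
SELECT * EXCEPT (num)
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY datastream_metadata.uuid) AS num
  FROM `my_dataset.my_table`
)
WHERE num = 1;

-- Declare an unenforced primary key so that BigQuery CDC can merge
-- change events by key.
ALTER TABLE `my_dataset.my_table` ADD PRIMARY KEY (id) NOT ENFORCED;
```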
What's next
- To learn how to look for potential issues with your stream, see Troubleshoot a stream.
- To learn how to configure your source database, see Sources.
- To learn how to configure your BigQuery or Cloud Storage destination, see Destinations.