Source MySQL database

This section contains information about:

  • The behavior of how Datastream handles data that's being pulled from a source MySQL database
  • The versions of MySQL database that Datastream supports
  • Known limitations for using MySQL database as a source
  • An overview of how to setup a source MySQL database so that data can be streamed from it to a destination

Behavior

The source MySQL database relies upon its binary log feature for exposing changes to the data.

  • All databases or specific databases from a given MySQL source, as well as all tables from the databases or specific tables, can be selected.
  • All historical data is replicated.
  • All data manipulation language (DML) changes, such as inserts, updates, and deletes from the specified databases and tables, are replicated.
  • Only committed changes are replicated.

Versions

Datastream supports the following versions of MySQL database:

  • MySQL 5.6
  • MySQL 5.7
  • MySQL 8.0

Datastream supports the following types of MySQL database:

Known limitations

Known limitations for using MySQL database as a source include:

  • Streams are limited to 10,000 tables.
  • Tables that have a primary key defined as INVISIBLE can't be backfilled.
  • A table that has more than 500 million rows can't be backfilled unless the following conditions are met:
    1. The table has a unique index.
    2. None of the columns of the index are nullable.
    3. The index isn't descending.
    4. All columns of the index are included in the stream.
  • Datastream periodically fetches the latest schema from the source as events are processed. If a schema changes, Datastream detects the schema change and triggers a schema fetch. However, some events might get processed incorrectly or get dropped between the schema fetches, which can cause data discrepancies.
  • Not all changes to the source schema can be detected automatically, in which case data corruption may occur. The following schema changes may cause data corruption or failure to process the events downstream:
    • Dropping columns
    • Adding columns to the middle of a table
    • Changing the data type of a column
    • Reordering columns
    • Dropping tables (relevant if the same table is then recreated with new data added)
    • Truncating tables
  • Datastream doesn't support replicating views.
  • Datastream doesn't support columns of spatial data types. The values in these columns are replaced with NULL values.
  • Datastream doesn't support the zero value (0000-00-00 00:00:00) in columns of the DATETIME, DATE, or TIMESTAMP data types. The zero value is replaced with the NULL value.
  • Datastream doesn't support replicating rows which include the following values in JSON columns: DECIMAL, NEWDECIMAL, TIME, TIME2 DATETIME, DATETIME2, DATE, TIMESTAMP or TIMESTAMP2. Events containing such values are discarded.
  • Datastream doesn't support binary log transaction compression.
  • Datastream doesn't support SSL certificate chains in the source MySQL connection profiles. Only single, x509 PEM-encoded certificates are supported.
  • Datastream doesn't support cascading deletes. Such events aren't written to the binary log, and as a result, aren't propagated to the destination.
  • Datastream doesn't support failovers to replicas. Because of this, we don't recommend using Datastream for replication from Cloud SQL for MySQL Enterprise Plus sources. Cloud SQL Enterprise Plus edition instances are subject to near-zero downtime maintenance and fail over to a replica during maintenance. This breaks the binlog continuity, and as a result, affected streams fail permanently.

What's next