Sources

Overview

Datastream can pull changes to data from an on-premises Oracle database, a cloud-hosted Oracle relational database service (RDS), or a source MySQL database.

Source Oracle database

This section contains information about:

  • The behavior of how Datastream handles data that's being pulled from a source Oracle database
  • The versions of Oracle database that Datastream supports
  • An overview of how to setup a source Oracle database so that data can be streamed from it to a destination
  • Known limitations for using Oracle database as a source

Behavior

The source Oracle database relies upon its Oracle Logminer feature for exposing changes to the data.

  • All schemas or specific schemas from a given database, as well as all tables from the schemas or specific tables, can be selected.
  • All historical data is replicated.
  • All data manipulation lanuguage (DML) changes, such as inserts, updates, and deletes from the specified databases and tables, are replicated.
  • Datastream replicates both committed and, in some cases, uncommitted changes into the destination. Datastream reads uncommitted changes. In case of a rollback, the Datastream output records also include the opposite operation. For example, if there's a rolled-back INSERT operation, then the output records will also contain a corresponding DELETE operation. In this case, the event will appear as a DELETE event with only the ROWID.

Versions

Datastream supports the following versions of Oracle database:

  • Oracle 11g, Version 11.2.0.4
  • Oracle 12c, Version 12.1.0.2
  • Oracle 12c, Version 12.2.0.1
  • Oracle 18c
  • Oracle 19c

Setup

To set up a source Oracle database so that data from it can be streamed into a destination, you must configure the database to grant access, set up logging, define a retention policy, and so on.

See Configure your source Oracle database to learn how to configure this database so that Datastream can pull data from it into a destination.

Known limitations

Known limitations for using Oracle database as a source include:

  • Tables that have more than 100 million rows can't be backfilled.
  • Streams are limited to 10,000 tables.
  • Oracle multi-tenant architecture (CDB/PDB) isn't supported.
  • Oracle Autonomous Database isn't supported.
  • Secure Sockets Layer (SSL) authentication isn't supported.
  • Events from tables that don't have a primary key won't contain the information required to perform an update on the consumer side.
  • Events have a size limitation of 3 MB.
  • Index-organized tables (IOTs) aren't supported.
  • Temporary tables aren't supported.
  • For columns of type BFILE, only the path to the file will be replicated. The contents of the file won't be replicated.
  • Columns of data types ANYDATA, BLOB, CLOB, LONG/LONG RAW, NCLOB, UDT, UROWID, XMLTYPE aren't supported, and will be replaced with NULL values.
  • For Oracle 11g, tables that have columns of data types ANYDATA or UDT aren't supported, and the entire table won't be replicated.
  • Oracle Label Security (OLS) isn't replicated.
  • Datastream periodically fetches the latest schema from the source as events are processed. If a schema changes, then some events from the new schema may be read while the old schema is still applied. In this case, Datastream will detect the schema change, trigger a schema fetch, and reprocess the failed events.
  • Not all changes to the source schema can be detected automatically, in which case data corruption may occur. The following schema changes may cause data corruption or failure to process the events downstream:
    • Dropping columns
    • Adding columns to the middle of a table
    • Changing the data type of a column
    • Reordering columns
    • Dropping tables (relevant if the same table is then recreated with new data added)
    • Truncating tables
  • Datastream doesn't support replicating views.
  • Datastream supports materialized views. However, new views created while the stream is running aren't backfilled automatically.

Source MySQL database

This section contains information about:

  • The behavior of how Datastream handles data that's being pulled from a source MySQL database
  • The versions of MySQL database that Datastream supports
  • An overview of how to setup a source MySQL database so that data can be streamed from it to a destination
  • Known limitations for using MySQL database as a source

Behavior

The source MySQL database relies upon its binary log feature for exposing changes to the data.

  • All databases or specific databases from a given MySQL source, as well as all tables from the databases or specific tables, can be selected.
  • All historical data is replicated.
  • All data manipulation lanuguage (DML) changes, such as inserts, updates, and deletes from the specified databases and tables, are replicated.
  • Only committed changes are replicated.

Versions

Datastream supports the following versions of MySQL database:

  • MySQL 5.6
  • MySQL 5.7
  • MySQL 8.0

Datastream supports the following types of MySQL database:

  • Self-hosted MySQL
  • Cloud SQL for MySQL
  • Amazon RDS for MySQL
  • Amazon Aurora MySQL
  • MariaDB

Known limitations

Known limitations for using MySQL database as a source include:

  • Streams are limited to 10,000 tables.
  • Events have a size limitation of 3 MB.
  • Tables that have more than 100 million rows and that don't have a numeric primary key can't be backfilled.
  • Tables that have a primary key defined as INVISIBLE can't be backfilled.
  • Datastream periodically fetches the latest schema from the source as events are processed. If a schema changes, then some events from the new schema may be read while the old schema is still applied. In this case, Datastream will detect the schema change, trigger a schema fetch, and reprocess the failed events.
  • Not all changes to the source schema can be detected automatically, in which case data corruption may occur. The following schema changes may cause data corruption or failure to process the events downstream:
    • Dropping columns
    • Adding columns to the middle of a table
    • Changing the data type of a column
    • Reordering columns
    • Dropping tables (relevant if the same table is then recreated with new data added)
    • Truncating tables
  • Datastream doesn't support replicating views.
  • Datastream doesn't support columns of the Polygon data type. The values in these columns will be replaced with NULL values.
  • Datastream doesn't support the zero value (0000-00-00 00:00:00) in columns of the DATETIME, DATE, or TIMESTAMP data types. The zero value will be replaced with the NULL value.