Datastream for BigQuery

Seamless replication from relational databases directly to BigQuery, enabling near real-time insights on operational data.

  • Low-latency replication to enable near real-time insights in BigQuery

  • Access to streaming data from MySQL, PostgreSQL, AlloyDB, SQL Server, and Oracle databases

  • Serverless platform that scales automatically, with no resources to provision or manage

  • Easy setup of ELT (extract, load, transform) pipelines with built-in secure connectivity

  • Used by thousands of customers to replicate their operational data to BigQuery

Benefits

Replicate operational data with minimal latency

Seamlessly replicate data from MySQL, PostgreSQL, AlloyDB, SQL Server, and Oracle databases directly into BigQuery, with low latency and without impacting source performance.

Scale up and down with a serverless architecture

Eliminate operational overhead with a serverless approach that scales automatically with no infrastructure for you to manage.

Get up and running in minutes

A simplified setup experience allows you to start replicating data from your operational databases to BigQuery in just a few steps.

Key features

Replication of operational data into BigQuery

Datastream uses BigQuery’s Change Data Capture (CDC) functionality and Storage Write API to efficiently replicate updates directly from source systems in near real time. You no longer need replication solutions that waste valuable resources on complex data pipelines, self-managed staging tables, tricky merge logic, or manual data type conversion.
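
To make the "tricky merge logic" concrete, here is a minimal in-memory sketch of what applying a stream of change events to a replica involves. It is not Datastream's implementation: the `ChangeEvent` shape and the `apply_changes` helper are hypothetical names invented for illustration, and the sketch assumes each event carries the primary key and a full row image.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One change read from the source database (hypothetical shape)."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # full row image; None for deletes


def apply_changes(table: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Apply events in order, keeping one current row per primary key."""
    for event in events:
        if event.op in ("insert", "update"):
            # Inserts and updates are both treated as upserts on the key.
            table[event.key] = dict(event.row or {})
        elif event.op == "delete":
            # Deleting a key that is already absent is a no-op.
            table.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
    return table


events = [
    ChangeEvent("insert", 1, {"id": 1, "status": "new"}),
    ChangeEvent("insert", 2, {"id": 2, "status": "new"}),
    ChangeEvent("update", 1, {"id": 1, "status": "paid"}),
    ChangeEvent("delete", 2),
]
replica = apply_changes({}, events)
print(replica)  # {1: {'id': 1, 'status': 'paid'}}
```

A self-managed pipeline has to get this ordering and keying right itself, typically through staging tables and merge statements; the point of the managed service is that you do not write or operate this logic.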

Simplified setup

Datastream allows you to start replicating data into BigQuery in a few steps. Just configure your source database, connection type, and destination in BigQuery, and you’re all set. Datastream for BigQuery will backfill historical data and continuously replicate new changes as they happen.
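
The three choices named above (source database, connection type, and BigQuery destination) can be pictured as a small configuration record. The sketch below is purely illustrative: `StreamPlan`, its field names, and the connection type labels are assumptions made for this example, not the Datastream API or its actual option names.

```python
from dataclasses import dataclass
from typing import List

# Source types taken from the list of supported databases on this page.
SUPPORTED_SOURCES = {"mysql", "postgresql", "alloydb", "sqlserver", "oracle"}
# Illustrative labels only; consult the Datastream documentation for real options.
CONNECTION_TYPES = {"ip_allowlist", "ssh_tunnel", "private_connectivity"}


@dataclass(frozen=True)
class StreamPlan:
    """Hypothetical summary of the choices made during setup."""
    source_type: str
    connection_type: str
    destination_dataset: str
    backfill: bool = True  # replicate historical data before streaming changes

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the plan looks sane."""
        issues = []
        if self.source_type not in SUPPORTED_SOURCES:
            issues.append(f"unsupported source type: {self.source_type}")
        if self.connection_type not in CONNECTION_TYPES:
            issues.append(f"unknown connection type: {self.connection_type}")
        if not self.destination_dataset:
            issues.append("destination dataset must not be empty")
        return issues


plan = StreamPlan("postgresql", "private_connectivity", "analytics_replica")
print(plan.problems())  # []
```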

Streaming data from relational databases

Datastream reads and delivers every change (insert, update, and delete) from your MySQL, PostgreSQL, AlloyDB, SQL Server, and Oracle databases into BigQuery with minimal latency. The source database can be hosted on-premises, on Google Cloud services such as Cloud SQL or Bare Metal Solution for Oracle, or on any other cloud. Datastream is an agentless, Google-native service built specifically for BigQuery, and it reliably streams every event as it happens.

Schema drift resolution

As source schemas change, Datastream seamlessly handles schema drift and automatically replicates new columns and tables added in the source to BigQuery.
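
As a rough illustration of additive schema drift, the sketch below compares a source schema with a destination schema and adds any columns the destination is missing. The function names, the dictionary-based schema representation, and the example column names are assumptions for this example; it covers only new columns and says nothing about how Datastream treats other kinds of schema change.

```python
from typing import Dict, List


def new_columns(source_schema: Dict[str, str],
                destination_schema: Dict[str, str]) -> List[str]:
    """Return columns present in the source but missing from the destination."""
    return [name for name in source_schema if name not in destination_schema]


def evolve(destination_schema: Dict[str, str],
           source_schema: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the destination schema with missing columns added."""
    evolved = dict(destination_schema)
    for name in new_columns(source_schema, destination_schema):
        evolved[name] = source_schema[name]
    return evolved


source = {"id": "INT64", "email": "STRING", "loyalty_tier": "STRING"}
destination = {"id": "INT64", "email": "STRING"}

print(new_columns(source, destination))  # ['loyalty_tier']
destination = evolve(destination, source)
```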

Security by design

Datastream supports multiple secure, private connectivity methods to protect data in transit. Data is also encrypted at rest.

With Datastream, we have a single tool to perform seamless, near real-time replication of our operational data to BigQuery. Datastream helps us get much quicker insights on our operational data, deliver more stable data products, and better address our business needs.

René Delgado, Head of Data Solutions at Falabella

Use cases

Serverless replication to BigQuery

Datastream reads change events (inserts, updates, and deletes) from source databases and writes them in BigQuery tables in near real time. This enables you to enrich existing BigQuery data warehouses and ML models with transactional data, such as retail purchases, to build a more complete end-to-end picture of data. Datastream will backfill historical data, continuously replicate new changes as they happen, and seamlessly handle schema changes.

Compare features

Compare options for streaming data from operational databases into BigQuery

Datastream for BigQuery

Fully managed solution for replicating data from transactional databases into BigQuery

Key benefits

  • Easiest option for replicating operational data to BigQuery

  • Serverless architecture that automatically scales up and down

  • Single interface for end-to-end visibility and monitoring of replication pipelines

Datastream and Dataflow

Customizable solution for replicating changes in data sources

Key benefits

  • Customizable solution with additional flexibility

  • Pre-built templates supported by Google for a range of destinations

  • Integration of additional features, such as data quality and data masking

Datastream and Data Fusion

Code-free wizard that is part of a fully managed ETL service

Key benefits

  • Simple interface for ETL developers and data analysts

  • Identification of potential issues and gaps in replication in advance

  • Near real-time insights into replication performance

You can also stream data from operational databases into BigQuery with partner ETL/ELT solutions, Kafka, or batch jobs. Compared to these options, Datastream typically has the advantages of serverless architecture, ease of integration, and low latency.

Pricing

Datastream pricing

Datastream pricing is based on actual data processed. Volume-based tiered pricing is available, which makes it more affordable if you're moving larger volumes of data. Additional pricing details are available on the Datastream pricing page.
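
To show why volume-based tiers lower the effective rate at higher volumes, here is a small worked calculation. The tier boundaries and per-GiB rates are invented placeholders, not Datastream's actual prices, and `tiered_cost` is a hypothetical helper; refer to the Datastream pricing page for real figures.

```python
from typing import List, Optional, Tuple

# (upper bound of the tier in GiB, price per GiB); None means "no upper bound".
# These numbers are invented placeholders, NOT Datastream's actual rates.
EXAMPLE_TIERS: List[Tuple[Optional[float], float]] = [
    (1000.0, 2.0),
    (5000.0, 1.5),
    (None, 1.0),
]


def tiered_cost(gib: float, tiers: List[Tuple[Optional[float], float]]) -> float:
    """Charge each slice of volume at the rate of the tier it falls into."""
    if gib < 0:
        raise ValueError("volume must be non-negative")
    cost = 0.0
    lower = 0.0
    for upper, rate in tiers:
        if upper is None or gib <= upper:
            # The remaining volume ends inside this tier.
            return cost + (gib - lower) * rate
        cost += (upper - lower) * rate
        lower = upper
    raise ValueError("the last tier must be unbounded")


total = tiered_cost(3000.0, EXAMPLE_TIERS)
print(total)           # 5000.0
print(total / 3000.0)  # effective rate per GiB, below the first-tier rate of 2.0
```

With these placeholder tiers, the first 1,000 GiB cost 2,000 and the next 2,000 GiB cost 3,000, so the effective rate falls as volume grows.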

Additional resources such as BigQuery, Cloud Storage, and Dataflow are billed per those services' pricing.
