Export data to Spanner (reverse ETL)

To provide feedback or request support for this feature, send email to bq-cloud-spanner-federation-preview@google.com.

This document describes how you can set up a reverse extract, transform, and load (reverse ETL) workflow from BigQuery to Spanner. You can do this by using the EXPORT DATA statement to export data from a BigQuery table to a Spanner table.

This reverse ETL workflow combines analytic capabilities in BigQuery with low latency and high throughput in Spanner. This workflow lets you serve data to application users without exhausting quotas and limits on BigQuery.

Before you begin

Required roles

To get the permissions that you need to export BigQuery data to Spanner, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access.

You might also be able to get the required permissions through custom roles or other predefined roles.

Limitations

The following BigQuery data types don't have equivalents in Spanner and are not supported:

Spanner database dialect Unsupported BigQuery types
All dialects
  • STRUCT
  • GEOGRAPHY
  • DATETIME
  • RANGE
  • TIME
GoogleSQL
  • BIGNUMERIC: The supported NUMERIC type is not wide enough. Consider adding explicit casts to the NUMERIC type in the query.

The maximum size of an exported row cannot exceed 1 MiB.

Configure exports with spanner_options option

You can use the spanner_options option to specify a destination Spanner database and table. The configuration is expressed in the form of a JSON string, as the following example shows:

EXPORT DATA OPTIONS(
   uri="https://spanner.googleapis.com/projects/PROJECT_ID/instances/INSTANCE_ID/databases/DATABASE_ID",
  format='CLOUD_SPANNER',
   spanner_options = """{
      "table": "TABLE_NAME",
      "priority": "PRIORITY",
      "tag": "TAG",
   }"""
)

Replace the following:

  • PROJECT_ID: the name of your Google Cloud project.
  • INSTANCE_ID: the name of your database instance.
  • DATABASE_ID: the name of your database.
  • TABLE_NAME: the name of an existing destination table.
  • PRIORITY (optional): priority of the write requests. Allowed values: LOW, MEDIUM, HIGH. Default value: MEDIUM.
  • TAG (optional): request tag to help identify exporter traffic in Spanner monitoring. Default value: bq_export.

Export query requirements

To export query results to Spanner, the results must meet the following requirements:

  • All columns in the result set must exist in the destination table, and their types must match.
  • The result set must contain all NOT NULL columns for the destination table.
  • Column values must not exceed Spanner data size limits within tables.
  • Any unsupported column types must be converted to one of the supported types before exporting to Spanner.

Export data

You can use the EXPORT DATA statement to export data from a BigQuery table into a Spanner table.

The following example exports selected fields from a table that's named mydataset.table1:

EXPORT DATA OPTIONS (
  uri = 'https://spanner.googleapis.com/PROJECT_ID/project/INSTANCE_ID/instance/databases/DATABASE_ID',
  format='CLOUD_SPANNER',
  spanner_options="""{ "table": "TABLE_NAME" }"""
)
AS SELECT * FROM mydataset.table1;

Replace the following:

  • PROJECT_ID: the name of your Google Cloud project
  • INSTANCE_ID: the name of your database instance
  • DATABASE_ID: the name of your database
  • TABLE_NAME: the name of an existing destination table

Export optimization

To optimize the export of records from BigQuery to Spanner, you can try the following:

  • Increasing the number of nodes in the Spanner destination instance. The exporter automatically scales the number of BigQuery export slots in proportion to the number of Spanner nodes and processing units. For additional details about maximizing write throughput, see Performance overview.

  • Avoid ordering the query results. If the result set contains all primary key columns, then the exporter automatically sorts the primary keys of the destination table to streamline writes and minimize contention.

    If the destination table's primary key includes generated columns, then you should add the generated columns' expressions to the query to ensure that the exported data is sorted properly.

Pricing

For information on data export pricing, see BigQuery pricing.

After the data is exported, you're charged for storing the data in Spanner. For more information, see Spanner pricing.