Stay organized with collections Save and categorize content based on your preferences.

Cloud Spanner federated queries

This page describes how to query data in Cloud Spanner from BigQuery using federated queries.


BigQuery Spanner federation enables BigQuery to query data residing in Spanner in real-time, without copying or moving data.

After the initial one-time set up, you can write a query with the SQL EXTERNAL_QUERY function.


To create the connection to the Spanner database, you must have the permissions described in Working with connections.

To query the Spanner database, you must have the following permissions:

  • spanner.instances.get
  • spanner.sessions.create

To perform parallel reads, you must have the spanner.databases.partitionQuery permission.

Before you begin

Enable the BigQuery connection service

  1. Open the BigQuery Connection API page in the API library.
  2. From the drop-down menu, select the project that contains your external data source.
  3. Click Enable.

    BigQuery Connection API

Setting up Spanner database connections

Once the BigQuery Connection API is enabled, create a connection to the Spanner database.


  1. To create a connection resource, go to the BigQuery page in the Google Cloud console.

    Go to the BigQuery page

  2. In the Add data menu, select External data source.

    Create connection resource.

  3. In the External data source pane, enter the following information:

    • For Connection type, select Spanner.
    • For Connection ID, enter an identifier for the connection resource. Letter, numbers, and underscores are allowed.
    • For Connection location, select a BigQuery location (or region) that is compatible with your external data source region.
    • Optional: For Friendly name, enter a user-friendly name for the connection, such as My connection resource. The friendly name can be any value that helps you identify the connection resource if you need to modify it later.
    • Optional: For Description, enter a description for this connection resource.
    • For Database name, enter the name of the Spanner in the following format: "projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE"
    • Optional: To perform parallel reads, select Read data in parallel. For more information, see Read data in parallel
  4. Click Create connection.


To create the connection, use the bq mk command with the --connection flag.

bq mk --connection \
  --connection_type=CLOUD_SPANNER \
  --properties='PROPERTIES' \
  --location=LOCATION \
  --display_name='FRIENDLY_NAME' \
  --description 'DESCRIPTION' \

Replace the following:

  • PROPERTIES: A JSON object with the following fields:

    • "database": The Spanner database for the connection. Specify as a string with the following format: "projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE".
    • "use_parallelism": Optional. If true, this connection performs parallel reads. The default value is false. For more information, see Read data in parallel
  • LOCATION: A BigQuery location that is compatible with your external data source region.

  • FRIENDLY_NAME: Optional. A user-friendly name for the connection.

  • DESCRIPTION: Optional. A description for this connection.

  • CONNECTION_ID: Optional. An identifier for the connection resource. The connection ID can contain letters, numbers and underscores. If you don't provide a connection ID, BigQuery automatically generates a unique ID.

The following example creates a new connection resource named my_connection_id.

bq mk --connection \
  --connection_type='CLOUD_SPANNER' \
  --properties='{"database":"projects/my_project/instances/my_instance/databases/database1"}' \
  --project_id=federation-test \
  --location=us \


Within the BigQuery Connection API, call the CreateConnection method within the ConnectionService service to create a connection resource.

Query data in Spanner

To send a federated query to Spanner from a Google Standard SQL query, use the EXTERNAL_QUERY function.

Formulate your Spanner query in either Google Standard SQL or PostgreSQL, depending on the specified dialect of the database.

The following example makes a federated query to a Spanner database named orders and joins the results with a BigQuery table named mydataset.customers.

SELECT c.customer_id,, rq.first_order_date
FROM mydataset.customers AS c
  '''SELECT customer_id, MIN(order_date) AS first_order_date
  FROM orders
  GROUP BY customer_id''') AS rq 
  ON rq.customer_id = c.customer_id
GROUP BY c.customer_id,, rq.first_order_date;

Read data in parallel

Spanner can divide certain queries into smaller pieces, or partitions, and fetch the partitions in parallel. For more information, see Read data in parallel in the Spanner documentation.

To enable parallel reads in federated queries, configure this setting when you create the connection resource. This option divides the SQL query into smaller partitions and fetches each partition in parallel. However, this option is restricted to queries whose first operator in the execution plan is a distributed union operator. Other queries return an error. To view the query execution plan for a Spanner query, see Understand how Cloud Spanner executes queries.

Manage query execution priority

You can assign priority (high, medium, or low) to individual queries, by specifying the query_execution_priority option, as shown below:

  '''SELECT customer_id, MIN(order_date) AS first_order_date
  FROM orders
  GROUP BY customer_id''',

The default priority is medium.

Queries with priority high will compete with transactional traffic. Queries with priority low are best-effort, and might get preempted by background load, for example scheduled backups.

View a Spanner table schema

You can use the EXTERNAL_QUERY function to query information_schema views to access database metadata, such as listing all tables in the database or showing a table schema. The following example returns information about the columns in the table MyTable:

Google SQL database

    '''SELECT t.column_name, t.spanner_type, t.is_nullable
      FROM information_schema.columns AS t
        t.table_catalog = ''
        AND t.table_schema = ''
        AND t.table_name = 'MyTable'
      ORDER BY t.ordinal_position

PostgreSQL database

  '''SELECT t.column_name, t.data_type, t.is_nullable
    FROM information_schema.columns AS t
      t.table_schema = 'public' and t.table_name='MyTable'
    ORDER BY t.ordinal_position

For more information, see the following information schema references in the Spanner documentation:

Data type mappings

When you execute a Spanner federated query, the data from Spanner is converted to Google Standard SQL types.

Spanner Google Standard SQL type Spanner PostgreSQL type BigQuery type
FLOAT64 float8 FLOAT64
INT64 bigint INT64
STRUCT - Not supported for Spanner federated queries
TIMESTAMP timestamptz TIMESTAMP with nanoseconds truncated

* PostgreSQL numeric values with a precision that is greater than the precision that BigQuery supports are rounded. Values that are larger than the maximum value generate an Invalid NUMERIC value error.

If your external query contains a data type that is unsupported for federated queries, the query fails immediately. You can cast the unsupported data type to a supported data type.

Supported regions

Federated queries are only supported in locations that support both Spanner and BigQuery.

There are two types of locations:

  • A region is a specific geographic place, such as London.

  • A multi-region is a large geographic area, such as the United States, that contains two or more geographic places.

You can create a connection and execute a federated query across regions according to the following rules.


A BigQuery multi-region can query any data source region in the same large geographic area (US, EU), for example:

  • The BigQuery US multi-region can query any single region in the US geographic area, such as us-central1, us-east4, us-west2, and so on.
  • The BigQuery EU multi-region can query any single region in member states of the European Union, such as europe-north1, europe-west3, and so on.
  • The connection used in the query must reside in the same location as the query location. For example queries executed from the US multi-region must reference a connection located in the US multi-region.

The query processing location is the multi-region location, either US or EU.

Single regions

A BigQuery single region can only query a resource in the same region. For example:

  • The BigQuery single region us-east4 can only query Spanner in us-east4.

In this example, the query processing location is the BigQuery single region.


Spanner federated queries are subject to the following limitations:

  • Performance. A federated query is likely to not be as fast as querying only BigQuery storage. BigQuery needs to wait for the source database to execute the external query and temporarily move data from the external data source to BigQuery. Also, the source database might not be optimized for complex analytical queries.

  • Federated queries are read-only. The external query that is executed in the source database must be read-only. Therefore, DML or DDL statements are not supported.

  • Unsupported data types. If your external query contains a data type that is unsupported in BigQuery, the query fails immediately. You can cast the unsupported data type to a different supported data type.


This section helps you troubleshoot issues you might encounter when sending a federated query to Spanner.

Issue: Query is not root partitionable.
Resolution: If you configure the connection to read data in parallel, the first operator in the query execution plan must be Distributed Union. To resolve this error, view the query execution plan and rewrite the query. For more information, see Understand how Cloud Spanner executes queries.
Issue: Deadline exceeded.
Resolution: Select the option to read data in parallel and rewrite the query to be root partitionable. For more information, see Understand how Cloud Spanner executes queries.