Connect to Spanner
As a BigQuery administrator, you can create a connection to access Spanner data. This connection enables data analysts to query data in Spanner.
Before you begin
- Enable the BigQuery Connection API.
-
To get the permissions that you need to connect to Spanner, ask your administrator to grant you the BigQuery Connection Admin (
roles/bigquery.connectionAdmin
) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.You might also be able to get the required permissions through custom roles or other predefined roles.
Create Spanner connections
Select one of the following options:
Console
Go to the BigQuery page.
In the Explorer pane, click
Add, and then select Connections to external data source.In the External data source pane, enter the following information:
- For Connection type, select Cloud Spanner.
- For Connection ID, enter an identifier for the connection resource. Letter, numbers, and underscores are allowed.
- For Location type, select a BigQuery location (or region) that is compatible with your external data source region.
- Optional: For Friendly name, enter a user-friendly name for the
connection, such as
My connection resource
. The friendly name can be any value that helps you identify the connection resource if you need to modify it later. - Optional: For Description, enter a description for this connection resource.
- For Database name, enter the name of the Spanner
database in the following format:
"projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE"
- Optional: To perform parallel reads, select Read data in parallel. Spanner can divide certain queries into smaller pieces, or partitions, and fetch the partitions in parallel. For more information, see Read data in parallel in the Spanner documentation. This option is restricted to queries whose first operator in the execution plan is a distributed union operator. Other queries return an error. To view the query execution plan for a Spanner query, see Understand how Spanner executes queries.
- Optional: For Database role, enter the name of a
Spanner database role. If not empty, this
connection queries Spanner using this database role
by default. Spanner fine-grained access control users who submit
queries through this connection
must have been granted access to this role by their administrator,
and the database role must have the
SELECT
privilege on all schema objects specified in external queries. For information about fine-grained access control, see About fine-grained access control. - Optional: To enable Data Boost, select Use Spanner Data Boost. Data Boost lets you execute analytics queries and data exports with near-zero impact to existing workloads on the provisioned BigQuery instance. To enable Data Boost, select Data Boost and Read data in parallel.
Click Create connection.
bq
To create the connection, use the
bq mk
command
with the --connection
flag.
bq mk --connection \ --connection_type=CLOUD_SPANNER \ --properties='PROPERTIES' \ --location=LOCATION \ --display_name='FRIENDLY_NAME' \ --description 'DESCRIPTION' \ CONNECTION_ID
Replace the following:
PROPERTIES
: a JSON object with the following fields:"database"
: the Spanner database for the connectionSpecify as a string with the following format:
"projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE"
."use_parallelism"
: (Optional) iftrue
, this connection performs parallel readsThe default value is
false
. Spanner can divide certain queries into smaller pieces, or partitions, and fetch the partitions in parallel. For more information, see Read data in parallel in the Spanner documentation. This option is restricted to queries whose first operator in the execution plan is a distributed union operator. Other queries return an error. To view the query execution plan for a Spanner query, see Understand how Spanner executes queries."database_role"
: (Optional) If not empty, this connection queries Spanner using this database role by default. Spanner fine-grained access control users who submit queries through this connection must have been granted access to this role by their administrator, and the database role must have theSELECT
privilege on all schema objects specified in external queries.If not specified, the connection authenticates with IAM predefined roles for Spanner, and the principal running queries with this connection must have been granted the
roles/spanner.databaseReader
IAM role.For information about fine-grained access control, see About fine-grained access control.
"useDataBoost"
: (Optional) Iftrue
, this connection lets users use Data Boost. Data Boost lets users run federated queries in separate, independent, compute capacity distinct from provisioned instances to avoid impacting existing workloads. To enable Data Boost, set"useDataBoost"
totrue
and"use_parallelism"
totrue
.In order to use Data Boost, the principal running queries with this connection must have been granted the
spanner.databases.useDataBoost
permission. This permission is included by default in theroles/spanner.admin
androles/spanner.databaseAdmin
roles.
LOCATION
: a BigQuery location that is compatible with your external data source region.CONNECTION_ID
: an identifier for the connection resourceThe connection ID can contain letters, numbers and underscores. If you don't provide a connection ID, BigQuery automatically generates a unique ID.
The following example creates a new connection resource named
my_connection_id
.bq mk --connection \ --connection_type='CLOUD_SPANNER' \ --properties='{"database":"projects/my_project/instances/my_instance/databases/database1"}' \ --project_id=federation-test \ --location=us \ my_connection_id
API
Call the CreateConnection
method within
the ConnectionService
service.
Share connections with users
You can grant the following roles to let users query data and manage connections:
roles/bigquery.connectionUser
: enables users to use connections to connect with external data sources and run queries on them.roles/bigquery.connectionAdmin
: enables users to manage connections.
For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.
Select one of the following options:
Console
Go to the BigQuery page.
Connections are listed in your project, in a group called External connections.
In the Explorer pane, click your project name > External connections > connection.
In the Details pane, click Share to share a connection. Then do the following:
In the Connection permissions dialog, share the connection with other principals by adding or editing principals.
Click Save.
bq
You cannot share a connection with the bq command-line tool. To share a connection, use the Google Cloud console or the BigQuery Connections API method to share a connection.
API
Use the
projects.locations.connections.setIAM
method
in the BigQuery Connections REST API reference section, and
supply an instance of the policy
resource.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
What's next
- Learn about different connection types.
- Learn about managing connections.
- Learn about federated queries.
- Learn how to query Spanner data.