Connect to Spanner

As a BigQuery administrator, you can create a connection to access Spanner data. This connection enables data analysts to query data in Spanner.

Before you begin

Enable the BigQuery Connection API.

Enable the API
To get the permissions that you need to connect to Spanner, ask your administrator to grant you the BigQuery Connection Admin (roles/bigquery.connectionAdmin) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create Spanner connections

Select one of the following options:

Console

Go to the BigQuery page.

Go to BigQuery
In the Explorer pane, click Add data.

The Add data dialog opens.
In the Filter By pane, in the Data Source Type section, select Databases.

Alternatively, in the Search for data sources field, you can enter Spanner.
In the Featured data sources section, click Google Cloud Spanner.
Click the Google Cloud Spanner: BigQuery Federation solution card.
In the External data source pane, enter the following information:
- For Connection type, select Cloud Spanner.
- For Connection ID, enter an identifier for the connection resource. Letter, numbers, and underscores are allowed.
- For Location type, select a BigQuery location (or region) that is compatible with your external data source region.
- Optional: For Friendly name, enter a user-friendly name for the connection, such as My connection resource. The friendly name can be any value that helps you identify the connection resource if you need to modify it later.
- Optional: For Description, enter a description for this connection resource.
- For Database name, enter the name of the Spanner database in the following format: "projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE"
- Optional: To perform parallel reads, select Read data in parallel. Spanner can divide certain queries into smaller pieces, or partitions, and fetch the partitions in parallel. For more information, see Read data in parallel in the Spanner documentation. This option is restricted to queries whose first operator in the execution plan is a distributed union operator. Other queries return an error. To view the query execution plan for a Spanner query, see Understand how Spanner executes queries.
- Optional: For Database role, enter the name of a Spanner database role. If not empty, this connection queries Spanner using this database role by default. Spanner fine-grained access control users who submit queries through this connection must have been granted access to this role by their administrator, and the database role must have the SELECT privilege on all schema objects specified in external queries. For information about fine-grained access control, see About fine-grained access control.
- Optional: To enable Data Boost, select Use Spanner Data Boost. Data Boost lets you execute analytics queries and data exports with near-zero impact to existing workloads on the provisioned BigQuery instance. To enable Data Boost, select Data Boost and Read data in parallel.
Click Create connection.

bq

To create the connection, use the bq mk command with the --connection flag.

bq mk --connection \
    --connection_type=CLOUD_SPANNER \
    --properties='PROPERTIES' \
    --location=LOCATION \
    --display_name='FRIENDLY_NAME' \
    --description 'DESCRIPTION' \
    CONNECTION_ID

Replace the following:

PROPERTIES: a JSON object with the following fields:
- "database": the Spanner database for the connection
  
  Specify as a string with the following format: "projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE".
- "use_parallelism": (Optional) if true, this connection performs parallel reads
  
  The default value is false. Spanner can divide certain queries into smaller pieces, or partitions, and fetch the partitions in parallel. For more information, see Read data in parallel in the Spanner documentation. This option is restricted to queries whose first operator in the execution plan is a distributed union operator. Other queries return an error. To view the query execution plan for a Spanner query, see Understand how Spanner executes queries.
- "database_role": (Optional) If not empty, this connection queries Spanner using this database role by default. Spanner fine-grained access control users who submit queries through this connection must have been granted access to this role by their administrator, and the database role must have the SELECT privilege on all schema objects specified in external queries.
  
  If not specified, the connection authenticates with IAM predefined roles for Spanner, and the principal running queries with this connection must have been granted the roles/spanner.databaseReader IAM role.
  
  For information about fine-grained access control, see About fine-grained access control.
- "useDataBoost": (Optional) If true, this connection lets users use Data Boost. Data Boost lets users run federated queries in separate, independent, compute capacity distinct from provisioned instances to avoid impacting existing workloads. To enable Data Boost, set "useDataBoost" to true and "use_parallelism" to true.
  
  In order to use Data Boost, the principal running queries with this connection must have been granted the spanner.databases.useDataBoost permission. This permission is included by default in the roles/spanner.admin and roles/spanner.databaseAdmin roles.
LOCATION: a BigQuery location that is compatible with your external data source region.
CONNECTION_ID: an identifier for the connection resource

The connection ID can contain letters, numbers and underscores. If you don't provide a connection ID, BigQuery automatically generates a unique ID.

The following example creates a new connection resource named my_connection_id.
```
bq mk --connection \
  --connection_type='CLOUD_SPANNER' \
  --properties='{"database":"projects/my_project/instances/my_instance/databases/database1"}' \
  --project_id=federation-test \
  --location=us \
  my_connection_id
```

API

Call the CreateConnection method within the ConnectionService service.

Share connections with users

You can grant the following roles to let users query data and manage connections:

roles/bigquery.connectionUser: enables users to use connections to connect with external data sources and run queries on them.
roles/bigquery.connectionAdmin: enables users to manage connections.

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Select one of the following options:

Console

Go to the BigQuery page.

Go to BigQuery

Connections are listed in your project, in a group called Connections.
In the left pane, click Explorer:

If you don't see the left pane, click Expand left pane to open the pane.
Click your project, click Connections, and then select a connection.
In the Details pane, click Share to share a connection. Then do the following:
1. In the Connection permissions dialog, share the connection with other principals by adding or editing principals.
2. Click Save.

bq

You cannot share a connection with the bq command-line tool. To share a connection, use the Google Cloud console or the BigQuery Connections API method to share a connection.

API

Use the projects.locations.connections.setIAM method in the BigQuery Connections REST API reference section, and supply an instance of the policy resource.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.api.resourcenames.ResourceName;
import com.google.cloud.bigquery.connection.v1.ConnectionName;
import com.google.cloud.bigqueryconnection.v1.ConnectionServiceClient;
import com.google.iam.v1.Binding;
import com.google.iam.v1.Policy;
import com.google.iam.v1.SetIamPolicyRequest;
import java.io.IOException;

// Sample to share connections
public class ShareConnection {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String location = "MY_LOCATION";
    String connectionId = "MY_CONNECTION_ID";
    shareConnection(projectId, location, connectionId);
  }

  static void shareConnection(String projectId, String location, String connectionId)
      throws IOException {
    try (ConnectionServiceClient client = ConnectionServiceClient.create()) {
      ResourceName resource = ConnectionName.of(projectId, location, connectionId);
      Binding binding =
          Binding.newBuilder()
              .addMembers("group:example-analyst-group@google.com")
              .setRole("roles/bigquery.connectionUser")
              .build();
      Policy policy = Policy.newBuilder().addBindings(binding).build();
      SetIamPolicyRequest request =
          SetIamPolicyRequest.newBuilder()
              .setResource(resource.toString())
              .setPolicy(policy)
              .build();
      client.setIamPolicy(request);
      System.out.println("Connection shared successfully");
    }
  }
}

What's next

Learn about different connection types.
Learn about managing connections.
Learn about federated queries.
Learn how to query Spanner data.