Working with connections

Overview

The BigQuery Connection API enables users to set up a connection between BigQuery and an external data source. You can use the connection to submit queries against the external data source from BigQuery without moving or copying data into BigQuery.

You must complete an initial one-time setup to create a connection resource in BigQuery. After that you can perform the following tasks:

Before you begin

Enable the BigQuery connection service

  1. Open the BigQuery Connection API page in the API library.
  2. From the drop-down menu, select the project that contains your external data source.
  3. Click the ENABLE button.

    BigQuery Connection API

Service Account

BigQuery uses a service account to connect to your external data source. When you enable the BigQuery Connection API, a Google Cloud–managed Identity and Access Management (IAM) service account is automatically created on your behalf.

Permissions

  • To create and maintain a connection resource, the user must have the bigquery.admin predefined IAM role.

  • The bigquery.admin role includes the following BigQuery connection service permissions:

    • bigquery.connections.create
    • bigquery.connections.get
    • bigquery.connections.list
    • bigquery.connections.update
    • bigquery.connections.use
    • bigquery.connections.delete

To grant permissions to another user so they can use the connection resource, see Share a connection resource.

Granting bigquery.admin access

To grant the bigquery.admin role:

Console

  1. Open the IAM page in the Cloud Console

    Open the IAM page

  2. Click Select a project.

  3. Select a project and click Open.

  4. Click Add to add new members to the project and set their permissions.

  5. In the Add members dialog:

    • For Members, enter the email address of the user or group.
    • In the Select a role drop-down, click BigQuery > BigQuery Admin.
    • Click Add.

      Grant admin

gcloud

You can use the gcloud command-line tool to grant a user or group the bigquery.admin role.

To add a single binding to your project's IAM policy, type the following command. To add a user, supply the --member flag in the format user:user@example.com. To add a group, supply the --member flag in the format group:group@example.com.

gcloud projects add-iam-policy-binding project_id \
--member group/user:address \
--role roles/bigquery.admin

Where:

  • project_id is your project ID.
  • group/user is either group or user.
  • address is the user or group's email address.

For example:

gcloud projects add-iam-policy-binding myproject \
--member group:group@example.com \
--role roles/bigquery.admin

The command outputs the updated policy:

    bindings:
    - members:
      - group:group@example.com
        role: roles/bigquery.admin
    

For more information on IAM roles in BigQuery, see Predefined roles and permissions.

Create a connection resource

To avoid writing database credentials as cleartext in a federated query, you need to first create a database connection resource per database in BigQuery and then reference the connection resource in your federated query.

The connection resource has a set of IAM permissions, which you can grant to other users. The connection resource is encrypted and stored securely in the BigQuery connection service, and it can only be used for federated queries.

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Console

  1. To create a connection resource, go to the BigQuery page in the Cloud Console.

    Go to the BigQuery page

  2. In the Add data menu, select External data source.

    Create connection resource.

  3. In the External data source pane, enter the following information:

    • For Connection type, select the type of source, for example MySQL or Postgres.
    • For Connection ID, enter an identifier for the connection resource. Letter, numbers, and underscores are allowed.
    • For Connection location, select a BigQuery location (or region) that is compatible with your externa data source region.
    • (Optional): For Friendly name, enter a user-friendly name for the connection, such as My connection resource. The friendly name can be any value that helps you identify the connection resource if you need to modify it later.
    • (Optional): For Description, enter a description for this connection resource.
    • If you chose Cloud SQL MySQL or Postgres for the connection type, for Cloud SQL instance ID, enter the full name of the Cloud SQL instance, usually in the format project-id:location-id:instance-id. You can find the instance ID on the detail page of the Cloud SQL instance you want to query.
    • For Database name, enter the name of the database.
    • For Username, enter the username for the database.
    • For Password, enter the password for the database.

      • (Optional) Check Show password to reveal the password.

      New connection resource.

  4. Click Create connection.

bq

Enter the bq mk command and supply the connection flag: --connection. The following flags are also required:

  • --connection_type
  • --properties
  • --connection_credential
  • --project_id
  • --location

The following flags are optional:

  • --display_name The friendly name for the connection.
  • --description A description of the connection.

The connection id is an optional parameter that can be added as the last argument of the command which is used for storage internally. If a connection id is not provided a unique id is automatically generated. The connection id can contain letters, numbers and underscores.

    bq mk --connection --display_name='friendly name' --connection_type=TYPE \
      --properties=PROPERTIES --connection_credential=CREDENTIALS \
      --project_id=PROJECT_ID --location=LOCATION \
      CONNECTION_ID

Replace the following:

  • TYPE: the type of the external data source.
  • PROPERTIES: the parameters for the created connection in JSON format. For example: --properties='{"param":"param_value"}'. For creating a connection resource, you must supply the instanceID, database, and type parameters.
  • CREDENTIALS: the parameters username and password.
  • PROJECT_ID: your project ID.
  • LOCATION: the region your Cloud SQL instance is located in.
  • CONNECTION_ID: the connection identifier.

For example, the following command creates a new connection resource named my_new_connection (friendly name: "My new connection") in a project with the ID federation-test.

bq mk --connection --display_name='friendly name' --connection_type='CLOUD_SQL' \
  --properties='{"instanceId":"federation-test:us-central1:mytestsql","database":"mydatabase","type":"MYSQL"}' \
  --connection_credential='{"username":"myusername", "password":"mypassword"}' \
  --project_id=federation-test --location=us my_connection_id

API

Within the BigQuery Connection API, you can invoke CreateConnection within the ConnectionService to instantiate a connection. See the client library page for more details.

You can also create a connection using the programming language specific client libraries. See the connection type topics for language specific examples.

Query a connection resource

Once a connection is established, you can use the EXTERNAL_QUERY() function to execute a federated query. See Federated query syntax for more information and sample queries.

Query another database through the same connection resource

When you create a connection to an external data source, you specify credentials for that data source. If the same user credentials are valid for other data in the source, the same connection resource can be used.

Get information about a connection resource

After you create a connection resource, you can get information about the connection resource's configuration. The configuration includes the values you supplied when you created the transfer.

Required permissions

Getting information about a connection resource requires the bigquery.connections.get permission. The following predefined Identity and Access Management (IAM) roles already include the bigquery.connections.get permission:

  • bigquery.admin
  • bigquery.connection.admin
  • bigquery.connection.user

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Viewing a connection resource

Console

  1. To view the status of your connection resources, go to the BigQuery page in the Cloud Console.

    Go to BigQuery

  2. Connection resources will be listed at the top level under your project, under a group called External connections. Click on a connection to see information about that connection resource, such as connection ID and Cloud SQL instance ID.

    View connection resources

bq

Enter the bq show command and supply the connection flag: --connection. The fully qualified connection_id is required.

    bq show --connection project.location.connection_id

For example, the following command gets information about a connection resource named my_new_connection in a project with the ID federation-test located in the region us.

    bq show --connection federation-test.us.my_new_connection

API

Use the projects.locations.connections.get method in the REST API reference section.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.connection.v1.Connection;
import com.google.cloud.bigquery.connection.v1.ConnectionName;
import com.google.cloud.bigquery.connection.v1.GetConnectionRequest;
import com.google.cloud.bigqueryconnection.v1.ConnectionServiceClient;
import java.io.IOException;

// Sample to get connection
public class GetConnection {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String location = "MY_LOCATION";
    String connectionId = "MY_CONNECTION_ID";
    getConnection(projectId, location, connectionId);
  }

  public static void getConnection(String projectId, String location, String connectionId)
      throws IOException {
    try (ConnectionServiceClient client = ConnectionServiceClient.create()) {
      ConnectionName name = ConnectionName.of(projectId, location, connectionId);
      GetConnectionRequest request =
          GetConnectionRequest.newBuilder().setName(name.toString()).build();
      Connection response = client.getConnection(request);
      System.out.println("Connection info retrieved successfully :" + response.getName());
    }
  }
}

List all connection resources

Required permissions

Listing all the connection resources in a project requires the bigquery.connections.list permissions. The following predefined IAM roles already include the bigquery.connections.list permission:

  • bigquery.admin
  • bigquery.connection.admin
  • bigquery.connection.user

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Listing connection resources

To list all the connection resources in a project:

Console

  1. To view the status of a connection resource, go to the BigQuery page in the Cloud Console.

    Go to BigQuery

  2. Connection resources will be listed at the top level under your project, under a group called External connections.

    View connection resources

bq

Enter the bq show command and supply the connection flag: --connection. The fully qualified connection_id is required.

    bq ls --connection --project_id=[project_id] --location=[location]

For example, the following command lists the connection resources in a project with the ID federation-test located in the region us.

    bq ls --connection --project_id=bigquery-federation-test --location=us

API

Use the projects.locations.connections.list method in the REST API reference section.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.connection.v1.ListConnectionsRequest;
import com.google.cloud.bigquery.connection.v1.LocationName;
import com.google.cloud.bigqueryconnection.v1.ConnectionServiceClient;
import java.io.IOException;

// Sample to get list of connections
public class ListConnections {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String location = "MY_LOCATION";
    listConnections(projectId, location);
  }

  public static void listConnections(String projectId, String location) throws IOException {
    try (ConnectionServiceClient client = ConnectionServiceClient.create()) {
      LocationName parent = LocationName.of(projectId, location);
      int pageSize = 10;
      ListConnectionsRequest request =
          ListConnectionsRequest.newBuilder()
              .setParent(parent.toString())
              .setPageSize(pageSize)
              .build();
      client
          .listConnections(request)
          .iterateAll()
          .forEach(con -> System.out.println("Connection Id :" + con.getName()));
    }
  }
}

Share a connection resource

Required permissions

To allow another user to use a connection resource for Cloud SQL federated queries, the bigquery.admin user can grant the two different roles in IAM.

BigQuery Connection User

The bigquery.connection.user role is intended for users who need to run queries with the connection resource.

This IAM role includes the following permissions:

  • bigquery.connections.get
  • bigquery.connections.list
  • bigquery.connections.use
  • bigquery.connections.getIamPolicy

BigQuery Connection Admin

The bigquery.connection.admin role is intended for users who need to manage connection resources.

This IAM role includes all the same permissions as bigquery.connection.user, plus additional permissions to create, update, and delete existing connection resources, as well as to set IAM policy on connection resources.

  • bigquery.connections.create
  • bigquery.connections.update
  • bigquery.connections.setIamPolicy
  • bigquery.connections.delete

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Sharing a connection resource

Console

  1. To share a connection resource, go to the BigQuery page in the Cloud Console.

    Go to BigQuery

  2. Connection resources will be listed at the top level under your project, under a group called External connections. Click on a connection to see information about that connection resource.

    View connection resources

  3. Click on the SHARE CONNECTION button to see the permissions pane for that connection resource.

    Share connection button

  4. In the Connection permissions pane, users with the role BigQuery Admin or BigQuery Connection Admin can share the connection resource with other users by adding or editing other users' permissions.

    Share connection resources

  5. Click Done.

bq

Currently, you cannot share a resource with the bq command-line tool. To share a connection resource, use the Cloud Console or the BigQuery Connections API method to share a connection.

API

See the projects.locations.connections.setIAM method in the BigQuery Connections REST API reference section, and supply an instance of the policy resource.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.api.resourcenames.ResourceName;
import com.google.cloud.bigquery.connection.v1.ConnectionName;
import com.google.cloud.bigqueryconnection.v1.ConnectionServiceClient;
import com.google.iam.v1.Binding;
import com.google.iam.v1.Policy;
import com.google.iam.v1.SetIamPolicyRequest;
import java.io.IOException;

// Sample to share connections
public class ShareConnection {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String location = "MY_LOCATION";
    String connectionId = "MY_CONNECTION_ID";
    shareConnection(projectId, location, connectionId);
  }

  public static void shareConnection(String projectId, String location, String connectionId)
      throws IOException {
    try (ConnectionServiceClient client = ConnectionServiceClient.create()) {
      ResourceName resource = ConnectionName.of(projectId, location, connectionId);
      Binding binding =
          Binding.newBuilder()
              .addMembers("group:example-analyst-group@google.com")
              .setRole("roles/bigquery.connectionUser")
              .build();
      Policy policy = Policy.newBuilder().addBindings(binding).build();
      SetIamPolicyRequest request =
          SetIamPolicyRequest.newBuilder()
              .setResource(resource.toString())
              .setPolicy(policy)
              .build();
      client.setIamPolicy(request);
      System.out.println("Connection shared successfully");
    }
  }
}

Edit or update a connection resource

Once the connection resource is added, you can edit the connection resource. You can edit most of the fields populated during connection resource creation, including the username credentials.

A connection uses the credentials of the user who created it. If you need to change the user attached to a connection resource, you can update the user's credentials. This is useful if the user who created the connection is no longer with your organization.

You cannot edit the following elements of a connection resource:

  • Connection type
  • Connection ID
  • Location

Required permissions

Updating a connection resource requires the bigquery.connections.update permission.

The following predefined IAM roles already include the bigquery.connections.update permission:

  • bigquery.admin
  • bigquery.connection.admin

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Editing a connection resource

To edit a connection:

Console

  1. In the Cloud Console, go to the BigQuery page.

    Go to BigQuery

  2. Connection resources will be listed at the top level under your project, under a group called External connections. Click on a connection to see information about that connection resource.

    View connection resources

  3. Click on the EDIT CONNECTION button to see the edit pane for that connection resource.

    Edit connection button

  4. Edit any field shown, including the user credentials.

    Edit connection resource

  5. Click Save.

bq

Enter the bq update command and supply the connection flag: --connection. The fully qualified connection_id is required.

    bq update --connection --connection_type='CLOUD_SQL'
   --properties='{"instanceId" : "instance",
   "database" : "db", "type" : "MYSQL" }'
   --connection_credential='{"username":"u", "password":"p"}'
   project.location.connection_id

For example, the following command updates the connection resources in a project with the ID federation-test and connection ID test-mysql.

    bq update --connection --connection_type='CLOUD_SQL'
   --properties='{"instanceId" : "federation-test:us-central1:new-mysql",
   "database" : "imdb2", "type" : "MYSQL" }'
   --connection_credential='{"username":"my_username",
   "password":"my_password"}' federation-test.us.test-mysql

API

See the projects.locations.connections.patch method in the REST API reference section, and supply an instance of the connection resource.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.connection.v1.Connection;
import com.google.cloud.bigquery.connection.v1.ConnectionName;
import com.google.cloud.bigquery.connection.v1.UpdateConnectionRequest;
import com.google.cloud.bigqueryconnection.v1.ConnectionServiceClient;
import com.google.protobuf.FieldMask;
import com.google.protobuf.util.FieldMaskUtil;
import java.io.IOException;

// Sample to update connection
public class UpdateConnection {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String location = "MY_LOCATION";
    String connectionId = "MY_CONNECTION_ID";
    String description = "MY_DESCRIPTION";
    Connection connection = Connection.newBuilder().setDescription(description).build();
    updateConnection(projectId, location, connectionId, connection);
  }

  public static void updateConnection(
      String projectId, String location, String connectionId, Connection connection)
      throws IOException {
    try (ConnectionServiceClient client = ConnectionServiceClient.create()) {
      ConnectionName name = ConnectionName.of(projectId, location, connectionId);
      FieldMask updateMask = FieldMaskUtil.fromString("description");
      UpdateConnectionRequest request =
          UpdateConnectionRequest.newBuilder()
              .setName(name.toString())
              .setConnection(connection)
              .setUpdateMask(updateMask)
              .build();
      Connection response = client.updateConnection(request);
      System.out.println("Connection updated successfully :" + response.getDescription());
    }
  }
}

Delete a connection resource

You can delete a connection resource if you have the correct permissions to do so.

Required permissions

Deleting a connection resource requires the bigquery.connections.delete permission.

The following predefined IAM roles already include the bigquery.connections.delete permission:

  • bigquery.admin
  • bigquery.connection.admin

For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Deleting a connection resource

To delete a connection:

Console

  1. In the Google Cloud Console, go to the BigQuery page.

    Go to BigQuery

  2. Connection resources will be listed at the top level under your project, under a group called External connections. Click on a connection to see information about that connection resource.

    View connection resources

  3. Click the DELETE CONNECTION button to see the delete pane for that connection resource.

    Delete connection button

  4. In the Delete connection? dialog box, enter delete to confirm your intention.

    Delete connection resource

  5. Click DELETE.

bq

Enter the bq rm command and supply the connection flag: --connection. The fully qualified connection_id is required.

    bq rm --connection project.location.connection_id

For example, the following command updates the connection resources in a project with the ID federation-test and connection ID test-mysql.

    bq rm --connection federation-test.us.test-mysql

API

See the projects.locations.connections.delete method in the REST API reference section.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.connection.v1.ConnectionName;
import com.google.cloud.bigquery.connection.v1.DeleteConnectionRequest;
import com.google.cloud.bigqueryconnection.v1.ConnectionServiceClient;
import java.io.IOException;

// Sample to delete a connection
public class DeleteConnection {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String location = "MY_LOCATION";
    String connectionName = "MY_CONNECTION_NAME";
    deleteConnection(projectId, location, connectionName);
  }

  public static void deleteConnection(String projectId, String location, String connectionName)
      throws IOException {
    try (ConnectionServiceClient client = ConnectionServiceClient.create()) {
      ConnectionName name = ConnectionName.of(projectId, location, connectionName);
      DeleteConnectionRequest request =
          DeleteConnectionRequest.newBuilder().setName(name.toString()).build();
      client.deleteConnection(request);
      System.out.println("Connection deleted successfully");
    }
  }
}

Audit logging

For audit logging of connection resources, see BigQuery audit logs overview.

Troubleshooting

This section helps you troubleshoot issues you might encounter when setting up a new connection. This section does not encompass all possible error messages or issues.

When diagnosing general connection issues, verify the following:

  • You have completed all the steps in the Before you begin section.
  • The connection configuration properties are correct.
  • You have the appropriate permissions to create a connection.

If your connection properties are correct, and the appropriate permissions are granted, refer to the following for solutions to common issues.

Issue: BigQuery and the external data source are not colocated.
Resolution: Federated querying is only supported in regions that support both BigQuery and the external data source. The BigQuery dataset and the data source instance must be in the same region, or, the BigQuery dataset must be in a multi-region location such as US and EU in the same geographic area that is compatible with the data source region. See Supported regions for more information about regions and region compatibility.
Issue: Performance is slower than expected.
Resolution: Performance of federated queries is not as high as querying data stored in BigQuery because the federated query has to externally query the source data, return the data to a temporary BigQuery table, map the data to a BigQuery data type, then execute the query in BigQuery. Although the query performance is not as high, the data does not need to be copied, moved, or stored again.
Issue: How to format the connection name.
Resolution: The connection name should include project, location, and connection ID. The connection ID should conform to this pattern: project_id.location_id.connection_id, for example, federation-test.us.my_new_connection