Google BigQuery

Overview

This page explains how to set up a connection in Looker to Google BigQuery Standard SQL or Google BigQuery Legacy SQL.

The general steps for setting up a Google BigQuery Standard SQL or Google BigQuery Legacy SQL connection are:

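  1. On your BigQuery database, configure the authentication that Looker will use to access your BigQuery database. Looker supports two authentication options for BigQuery: a BigQuery service account (see the Authentication with BigQuery service accounts section on this page) and OAuth (see the Authentication with OAuth section on this page).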
  2. On your BigQuery database, if you want to use persistent derived tables (PDTs) on the connection, create a temporary dataset that Looker can use to create PDTs on your database. See the section Creating a temporary dataset for persistent derived tables on this page for the procedure.

  3. In Looker, set up the Looker connection to your BigQuery database. See the section Connecting Looker to BigQuery on this page for the procedure.

  4. In Looker, test the connection between Looker and your BigQuery database. See the section Testing the connection on this page for the procedure.

Authentication with BigQuery service accounts

One way that Looker can authenticate into your BigQuery database is with a BigQuery service account. You create the service account on your BigQuery database using the API Manager in the Google Cloud console. You must have Google Cloud admin permissions to create the service account. Google has documentation on creating a service account and generating a private key.

Creating a service account and downloading the JSON credentials certificate

Follow these steps to create a BigQuery service account:

  1. Open the credentials page in the API Manager in the Google Cloud console and select your project.

  2. Click CREATE CREDENTIALS and choose Service account.

  3. Enter a name for the new service account, optionally add a description, and click CREATE.

  4. Your service account requires two Google BigQuery predefined roles:

    • BigQuery > BigQuery Data Editor
    • BigQuery > BigQuery Job User

    Select the first role in the Select a role field, then click ADD ANOTHER ROLE and select the second role.

    After selecting both roles, click CONTINUE.

  5. Click CREATE KEY.

  6. Select JSON under Key type and click CREATE.

  7. The JSON key will be saved to your computer. BE SURE TO REMEMBER WHERE IT IS SAVED. YOU WILL NOT BE ABLE TO DOWNLOAD THE KEY AGAIN. After noting the download location, click CLOSE.

  8. Click DONE.

  9. Find the email address corresponding to the service account. You will need this to configure the Looker connection to BigQuery.

  10. Once you create the service account on your BigQuery database, you will enter this service account information and the certificate file details in the Service Account Email, Service Account JSON/P12 File, and Password fields of Looker's Connections window when you set up the Looker connection to BigQuery.
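If you have misplaced the service account email address, it is also stored in the downloaded JSON key. The following is a minimal sketch, assuming the standard layout of a Google service account key file (a "type" field set to "service_account" and a "client_email" field). The function name read_service_account_email is hypothetical and is not part of Looker or any Google tool.

```python
import json
from pathlib import Path


def read_service_account_email(key_path):
    """Return the client_email stored in a service account JSON key file.

    Hypothetical helper: assumes the standard key layout with "type" and
    "client_email" fields.
    """
    key = json.loads(Path(key_path).read_text(encoding="utf-8"))
    if key.get("type") != "service_account":
        raise ValueError("not a service account key file")
    return key["client_email"]
```

The returned value is what you would enter in the Service Account Email field. Treat the key file itself as a secret, because it also contains the private key.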

Authentication with OAuth

Looker supports OAuth for Google BigQuery connections, meaning that each Looker user authenticates into Google with their own Google OAuth credentials and authorizes Looker to access the database.

OAuth lets database administrators perform the following functions:

  • Audit which Looker users are running queries against the database
  • Enforce role-based access controls using Google permissions
  • Use OAuth tokens for all processes and actions that access Google BigQuery, instead of embedding BigQuery IDs and passwords in multiple places

For BigQuery connections with OAuth:

  • If a database administrator changes the BigQuery OAuth client credentials, any schedules or alerts that a user owns will be affected. Users must log in again if their administrator changes the BigQuery OAuth credentials. Users can also go to their Looker Account page from their user profile account page to log in to Google.
  • Since BigQuery connections that use OAuth are "per user," caching policies are also per user, and not just per query. So instead of using cached results whenever the same query is run within the caching period, Looker will use cached results only if the same user has run the same query within the caching period. For further information on caching, see the Caching queries and rebuilding PDTs with datagroups documentation page.
  • If you want to use persistent derived tables (PDTs) on a BigQuery connection with OAuth, you must create an additional service account for Looker to access your database for PDT processes. See the Persistent derived tables on a BigQuery connection section on this page for information.
  • Admins, when they sudo as another user, will use that user's OAuth authorization token. See the Users documentation page for information on using the sudo command.
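The per-user caching behavior described above can be pictured as a cache whose key includes the user as well as the query. The following is an illustrative sketch only, not Looker's implementation. The class name PerUserQueryCache and its parameters are hypothetical.

```python
import time


class PerUserQueryCache:
    """Illustrative cache keyed by (user_id, sql) rather than by sql alone."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self._entries = {}  # (user_id, sql) -> (stored_at, rows)

    def put(self, user_id, sql, rows, now=None):
        now = time.time() if now is None else now
        self._entries[(user_id, sql)] = (now, rows)

    def get(self, user_id, sql, now=None):
        """Return cached rows only for the same user within the caching period."""
        now = time.time() if now is None else now
        entry = self._entries.get((user_id, sql))
        if entry is None:
            return None
        stored_at, rows = entry
        if now - stored_at > self.max_age_seconds:
            return None  # older than the caching period
        return rows
```

With this keying, a second user who runs the same SQL gets a cache miss, which matches the behavior described for OAuth connections.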

Configuring a BigQuery database project for OAuth

The following sections describe how to generate OAuth credentials and how to configure an OAuth consent screen. If you've already configured an OAuth consent screen for another application in your project, you won't need to create another — you configure only one consent screen for all applications in a project.

OAuth credentials and the OAuth consent screen must be configured in the Google Cloud console. Google provides a general description of this process on the Google Cloud support site and on the Google Dev console site.

Depending on the type of users accessing BigQuery data in Looker and whether your BigQuery data is public or private, OAuth may not be the most appropriate authentication method. Likewise, the type of data requested from the user and degree of access needed to that user's data when they're authenticating into Google to use Looker may require verification by Google. See more about verification in the Generating Google OAuth credentials section on this page.

Generating Google OAuth credentials

  1. Go to the Google Cloud console.

  2. In the Select a project drop-down, navigate to your BigQuery project. This should take you to your project dashboard.

  3. In the left menu, select the APIs & Services page. Then click Credentials. On the Credentials page, click the down arrow in the Create credentials button, and select OAuth client ID from the drop-down menu.

  4. Google requires that you configure an OAuth consent screen, which lets your users choose how to grant access to their private data, before you can generate your OAuth credentials. To configure your OAuth consent screen, see the Configuring an OAuth consent screen section on this page.

    If you've already configured an OAuth consent screen, Google displays the Create OAuth client ID page, which lets you create an OAuth client ID and secret to use in your BigQuery connection to Looker.

  5. Select Web application as the application type and, when the page expands, enter a name for the app, such as Looker, in the Name field.

  6. In the Authorized JavaScript origins field, enter the URL to your Looker instance, including the https://. For example:

    • If Looker hosts your instance: https://<instancename>.looker.com
    • If you have a customer-hosted Looker instance: https://looker.<mycompany>.com
    • If your Looker instance requires a port number: https://looker.<mycompany>.com:9999
  7. In the Authorized redirect URIs field, enter the URL to your Looker instance, followed by /external_oauth/redirect. For example: https://<instancename>.looker.com/external_oauth/redirect or https://looker.<mycompany>.com:9999/external_oauth/redirect.

  8. Click Create. Google displays your client ID and your client secret.

  9. Copy your client ID and your client secret values. You will need them to configure the OAuth for the BigQuery connection in Looker.
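The values for the Authorized JavaScript origins and Authorized redirect URIs fields in the steps above both derive from the URL of your Looker instance. The following sketch shows one way to compute them. The function name looker_oauth_urls is hypothetical, and the only path it assumes is the /external_oauth/redirect suffix given in the procedure.

```python
from urllib.parse import urlsplit

# Redirect path taken from the procedure above.
REDIRECT_PATH = "/external_oauth/redirect"


def looker_oauth_urls(instance_url):
    """Return (javascript_origin, redirect_uri) for a Looker instance URL."""
    parts = urlsplit(instance_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("expected a URL that starts with https://")
    # The origin is the scheme plus host (and port, if any), with no path.
    origin = f"{parts.scheme}://{parts.netloc}"
    return origin, origin + REDIRECT_PATH
```

For example, looker_oauth_urls("https://looker.example.com:9999/") keeps the port number in both values and drops the trailing slash from the origin.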

Configuring an OAuth consent screen

Google requires that you configure an OAuth consent screen, which lets your users choose how to grant access to their private data and provides a link to your organization's terms of service and privacy policy.

In the left menu, select the OAuth consent screen page. Before you can configure your OAuth consent screen, you must choose the type of users to whom you're making this app available. Depending on your selection, your app may require verification by Google.

Make your selection and click Create. Google displays the OAuth consent screen page. You can configure this screen for all applications in your project, including both internal and public applications.

Google will perform a verification for public applications if any of these are true:

  • The application uses Google APIs that use restricted or sensitive scopes.
  • The OAuth consent screen includes an application logo.
  • The project has exceeded the domain threshold.

Do the following to configure your OAuth consent screen:

  1. In the Application name field, put the name of the application that the user is granting access to — in this case, Looker.

  2. Enter the support email that users should contact with login issues.

  3. Looker requires only the default scopes, so no additional scope configuration is required.

  4. In the Authorized domains field, enter the domain of the URL to your Looker instance. For example, if Looker hosts your instance at https://<instance_name>.looker.com, the domain is looker.com. For customer-hosted Looker deployments, enter the domain on which you host Looker.

    The remaining fields are optional but can be used to further customize your consent screen.

  5. Once you've configured your OAuth consent screen, click Save. You can now continue the procedure for generating your OAuth credentials.
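As a rough illustration of the Authorized domains step, the following sketch derives a domain from an instance URL by keeping the last two labels of the host name. This is a deliberately naive assumption: it gives looker.com for a Looker-hosted instance, but it is wrong for multi-part suffixes such as co.uk, so check the result by hand. The function name naive_authorized_domain is hypothetical.

```python
from urllib.parse import urlsplit


def naive_authorized_domain(instance_url):
    """Return the last two labels of the URL's host name.

    Naive assumption: the registrable domain is exactly two labels long.
    This does not hold for suffixes such as co.uk.
    """
    host = urlsplit(instance_url).hostname
    if host is None:
        raise ValueError("URL has no host")
    return ".".join(host.split(".")[-2:])
```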

For more information about configuring the Google OAuth consent screen, see the Google support documentation.

Configuring the Looker connection for BigQuery with OAuth

To enable OAuth for your BigQuery connection, select the Use OAuth checkbox on the Looker Connection page when you are setting up the Looker connection to BigQuery. When you click the checkbox, Looker will display the OAuth Client ID and OAuth Client Secret fields. Paste in the Client ID and Client Secret values that you obtained as a step in the Generating Google OAuth credentials procedure on this page.

Complete the rest of the procedure as described in the Connecting Looker to BigQuery section of this page.

How Looker users authenticate into BigQuery with OAuth

Once the Looker connection to BigQuery is set up for OAuth, users can use Looker to perform their initial authentication into your BigQuery database in either of two ways: from a query or from the user account page.

Authenticating into Google from a query

Once the Looker connection to BigQuery is set up for OAuth, Looker will prompt users to log in with their Google account before running queries that use the BigQuery connection. Looker shows this prompt from Explores, dashboards, Looks, and SQL Runner.

The user must click Log In and authenticate with OAuth. After the user authenticates into BigQuery, the user can click the Run button in the Explore and Looker will load the data into the Explore.

Authenticating into Google from the user account page

Once the Looker connection to BigQuery is set up for OAuth, a user can authenticate into their Google account from the Looker user account page:

  1. From Looker, click the profile icon and select Account from the user menu.
  2. Scroll down to the OAuth Connection Credentials section and click the Log In button for the desired BigQuery database connection.
  3. Select the appropriate account from the Sign in with Google page.
  4. Click Allow on the OAuth consent screen to allow Looker to view and manage your data in Google BigQuery.

Once you authenticate into Google through Looker, you can log out or reauthorize your credentials at any time through your Account page, as described on the Personalizing your user account documentation page. Although Google BigQuery tokens do not expire, a user may click Reauthorize to log in with a different Google account.

Revoking OAuth tokens

Users can revoke the access that applications like Looker have to their Google account by visiting their Google account settings.

Google BigQuery tokens do not expire; however, if a database admin changes the database connection's OAuth credentials in a way that invalidates the existing credentials, users will have to log in with their Google account again before running any queries that use that connection.

Persistent derived tables on a BigQuery connection

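If you want to use persistent derived tables (PDTs) for your BigQuery connection, you may need to do the following, depending on your connection configuration: create a temporary dataset that Looker can use to write PDTs and, for connections that use OAuth, set up a service account for PDT processes. The following sections describe both procedures.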

Creating a temporary dataset for persistent derived tables

To enable persistent derived tables (PDTs) for your BigQuery connection, click the Persistent Derived Tables checkbox on the Looker Connection page when you are setting up the Looker connection to BigQuery. When you click the checkbox, Looker will display the Temp Dataset field. In this field, you'll enter the name of the dataset that Looker can use to create PDTs. You should configure this dataset ahead of time, with the appropriate write permissions.

You can set up a temporary dataset using the Google Cloud BigQuery console:

  1. Open the Google Cloud BigQuery console and select your project.

  2. Click the three-dot menu and select Create dataset.

  3. Enter a Dataset ID (typically looker_scratch) and select your Data location (optional), Default table expiration, and encryption key management solution. Click CREATE DATASET to finish.

Now that you have created the dataset, you can specify the name of the dataset in the Temp Dataset field of Looker's Connections window when you set up the Looker connection to BigQuery.
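Before you enter a name in the Temp Dataset field, you may want to check that it is a usable dataset ID. The following sketch assumes BigQuery's documented dataset naming rules (letters, numbers, and underscores, up to 1,024 characters). The function name is_valid_dataset_id is hypothetical.

```python
import re

# Assumed rule: letters, digits, and underscores only, 1 to 1,024 characters.
_DATASET_ID = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_valid_dataset_id(dataset_id):
    """Return True if dataset_id satisfies the assumed naming rule."""
    return _DATASET_ID.fullmatch(dataset_id) is not None
```

Under this rule, looker_scratch is accepted, while a name with a dash, such as looker-scratch, is rejected.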

Enabling PDTs for Looker connections to BigQuery with OAuth

For BigQuery connections that use OAuth, your users authenticate into Looker with their OAuth credentials. Looker supports PDTs for BigQuery connections with OAuth, but Looker itself cannot use OAuth, so you must set up a BigQuery service account specifically to allow Looker to access your database for PDT processes.

You can set up a PDT service account on your BigQuery database using the Google Cloud API Manager. See the Creating a service account and downloading the JSON credentials certificate section on this page.

Once you create the service account on your BigQuery database, you will enter this service account information and the certificate file details in the PDT Service Account Email, PDT Service Account JSON/P12 File, and Password fields of Looker's Connections window when you set up the Looker connection to BigQuery.

Connecting Looker to BigQuery

In the Admin section of Looker, select Connections to open the Connections page, then do one of the following:

  • To create a new connection, click the Add Connection button.
  • To edit an existing connection, find the connection from the Databases table, then click the Edit button in the connection's listing.

Fill out the connection details. The majority of these settings are common to most database dialects and are described on the Connecting Looker to your database documentation page. The settings below are mentioned to highlight them, or to clarify how they apply specifically to BigQuery connections:

  • Dialect: Select Google BigQuery Standard SQL or Google BigQuery Legacy SQL.
  • Project ID: The Google Cloud project ID.
  • Dataset: The name of the default dataset that you plan to use. If a table doesn't have a dataset specified, then it is assumed to be in this dataset. (You can also model other datasets in this project.) This must match the name of a dataset in your BigQuery database.
  • Use OAuth: Select this box to enable each Looker user to authenticate into Google BigQuery and authorize Looker to access the database with the user's BigQuery account. See the Authentication with OAuth section on this page for more information on implementing OAuth for your BigQuery connection.
  • OAuth Client ID: The OAuth client ID. You get this information from the Google Cloud console as a step in the Generating Google OAuth credentials procedure. The OAuth Client ID field applies only to BigQuery connections that use OAuth for user authentication. For BigQuery connections that use a BigQuery service account, this field does not apply.
  • OAuth Client Secret: The OAuth client secret. You get this information from the Google Cloud console as a step in the Generating Google OAuth credentials procedure. The OAuth Client Secret field applies only to BigQuery connections that use OAuth for user authentication. For BigQuery connections that use a BigQuery service account, this field does not apply.
  • Service Account Email: The email address for the BigQuery service account. You get this email from the Google Cloud API Manager as a step in the Creating a service account and downloading the JSON credentials certificate procedure. The Service Account Email field applies only to BigQuery connections that use a BigQuery service account for user authentication. For BigQuery connections that use OAuth, this field does not apply.
  • Service Account JSON/P12 File: The certificate file for the BigQuery service account. You download this file from the Google Cloud API Manager as a step in the Creating a service account and downloading the JSON credentials certificate procedure. The Service Account JSON/P12 File field applies only to BigQuery connections that use a BigQuery service account for user authentication. For BigQuery connections that use OAuth, this field does not apply.
  • Password: The password for the P12 credentials file for the BigQuery service account. The Password field applies only if you selected the P12 key type in the Creating a service account and downloading the JSON credentials certificate procedure. If you are using a JSON credentials file, leave the Password field empty. The Password field is not available for BigQuery connections that use OAuth.
  • Persistent Derived Tables: Click to enable Persistent Derived Tables (PDTs) on the connection. You will need to specify the temporary dataset on your database that Looker will use to write PDTs. See the Creating a temporary dataset for persistent derived tables section on this page for the procedure.
  • PDT Service Account Email: (This field is displayed only for BigQuery connections that are enabled for both OAuth and PDTs.) The email address for the service account. This is the email address that was automatically created by the BigQuery database when you created the service account for Looker to use for PDT processes. See Enabling PDTs for Looker connections to BigQuery with OAuth for more information.
  • PDT Service Account JSON/P12 File: (This field is displayed only for BigQuery connections that are enabled for both OAuth and PDTs.) Click the Choose File button to upload the certificate file for the service account that Looker will use for PDT processes. This is the private key JSON file that you downloaded as part of the procedure for creating the service account. See Enabling PDTs for Looker connections to BigQuery with OAuth for more information.
  • Password (for PDT service account): If you opted to use a legacy .p12 credentials file instead of a JSON file as part of the procedure for creating the service account for Looker to use for PDT processes, enter the password to the .p12 credentials file. If you're using a JSON credentials file, leave this field empty.
  • Additional Params: Add any additional JDBC parameters, such as BigQuery labels. (See the Job labels and context comments for BigQuery connections section on this page for more information.) These are some of the other supported parameters:
    • connectTimeout: Number of milliseconds to wait for a connection. Defaults to 240000.
    • readTimeout: Number of milliseconds to wait for a read. Defaults to 240000.
    • rootUrl: If you have a BigQuery instance in a private network, specify an alternate endpoint to connect to BigQuery other than the default public endpoint.
  • Temp Dataset: The BigQuery dataset that you created in the Google Cloud BigQuery console to allow Looker to write persistent derived tables to your database. See the Creating a temporary dataset for persistent derived tables section for the procedure.
  • Max Billing Gigabytes: Leave blank for no limit. See Google's BigQuery pricing documentation for details about pricing.
  • Max Connections: Can be left at the default value initially. Read more about this setting in the Max Connections section of the Connecting Looker to your database documentation page.
  • Connection Pool Timeout: Can be left at the default value initially. Read more about this setting in the Connection Pool Timeout section of the Connecting Looker to your database documentation page.
  • SQL Runner Precache: If you want SQL Runner to not preload table information and, instead, to load table information only when a table is selected, clear this option. Read more about this setting in the SQL Runner Precache section of the Connecting Looker to your database documentation page.
  • Disable Context Comment: This option disables context comments on a BigQuery connection. Context comments are disabled by default for BigQuery connections because the context comments invalidate Google BigQuery's ability to cache and can negatively impact cache performance. For BigQuery connections, it is recommended that you use job labels instead of SQL query comments. See the Job labels and context comments for BigQuery connections section on this page for more information.
  • Database Time Zone: The default time zone for BigQuery is UTC. The time zone setting you specify here needs to match your BigQuery time zone setting.
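Which authentication fields apply depends on whether the connection uses OAuth, whether PDTs are enabled, and which key type the service account uses. The following sketch summarizes the rules from the list above as a small function. It is only a reading aid: the function name applicable_auth_fields and its parameters are hypothetical and do not correspond to any Looker API.

```python
def applicable_auth_fields(use_oauth, enable_pdts, key_type="JSON"):
    """List the authentication-related connection fields that apply.

    key_type is "JSON" or "P12" and describes the service account key.
    """
    fields = []
    if use_oauth:
        fields += ["OAuth Client ID", "OAuth Client Secret"]
    else:
        fields += ["Service Account Email", "Service Account JSON/P12 File"]
        if key_type == "P12":
            fields.append("Password")
    if enable_pdts:
        fields.append("Temp Dataset")
        if use_oauth:
            # OAuth connections need a separate service account for PDTs.
            fields += [
                "PDT Service Account Email",
                "PDT Service Account JSON/P12 File",
            ]
            if key_type == "P12":
                fields.append("Password (for PDT service account)")
    return fields
```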

Once you fill in all the applicable fields for the connection, you can test your connection as needed.

Testing the connection

You can test your connection in either of two places:

  • Click the Test These Settings button at the bottom of the Connections Settings page, as described on the Connecting Looker to your database documentation page.
  • Click the Test button by the connection's listing on the Connections admin page, as described on the Connections documentation page.

For new connections, if you see Can connect, then press Add connection. This runs the rest of the connection tests to verify that the service account was set up correctly and with the proper roles.

Testing a connection that uses OAuth

  1. In Looker, go into Development Mode.
  2. For an existing BigQuery connection that uses OAuth, navigate to the project files for a Looker project that uses your BigQuery connection. For new BigQuery connections that use OAuth, open a model file and replace the model's connection value with the name of your new BigQuery connection, then save the model file.
  3. Open one of the model's Explores or dashboards and run a query. When you try to run a query, Looker will prompt you to log in with your Google account. Follow the Google OAuth login prompts.

Job labels and context comments for BigQuery connections

For BigQuery connections, Looker sends query context in the form of BigQuery job labels. By default, Looker sends the following context label keys for BigQuery connections:

  • looker-context-user_id: The unique identifier for each user on the Looker instance. You can match this user ID to the user IDs on the Users page in the Admin menu.
  • looker-context-history_slug: The unique identifier for each query that is run on the database by the Looker instance.

  • looker-context-instance_slug: The ID number of the Looker instance that issued the query. Looker support can use this information to help you troubleshoot, if necessary.

You can configure additional job labels for Looker to send with every query on the BigQuery connection by using the Additional Params text field of the Connections page. In the Additional Params field, add an additional JDBC parameter, labels, and provide a comma-separated list of URL-encoded key=value pairs. For example, if you include this in the Additional Params field:

labels=this%3Dconnection-label,that%3Danother-connection-label

The %3D is the URL-encoding for =, so this would add the following two labels to every query that Looker sends to the BigQuery database, in addition to the default Looker context labels:

  • this: connection-label
  • that: another-connection-label
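To avoid encoding the pairs by hand, you can generate the value for the Additional Params field. The following sketch uses Python's standard urllib.parse.quote to URL-encode each key=value pair and joins the pairs with commas, as described above. The function name build_labels_param is hypothetical.

```python
from urllib.parse import quote


def build_labels_param(labels):
    """Build the labels parameter from a dict of label keys and values."""
    # safe="" makes quote() encode "=" as %3D along with other reserved characters.
    pairs = [quote(f"{key}={value}", safe="") for key, value in labels.items()]
    return "labels=" + ",".join(pairs)
```

Calling build_labels_param({"this": "connection-label", "that": "another-connection-label"}) reproduces the example string shown above.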

Note that BigQuery has restrictions on job labels:

  • Any connection label that has the same key as a context label will be ignored.
  • If the union of connection labels and context labels exceeds the maximum of 64 total labels, context labels are the first to be dropped, followed by connection labels, until the total number of labels is at most 64.
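The two rules above can be sketched as a merge function. This is an illustration of the described behavior, not Looker's code. The text does not say which labels within each group are dropped first, so this sketch assumes the most recently added label goes first. The names merge_labels and MAX_LABELS are hypothetical.

```python
MAX_LABELS = 64  # maximum total number of labels, per the text above


def merge_labels(context_labels, connection_labels, limit=MAX_LABELS):
    """Combine context and connection labels following the two rules above."""
    # Rule 1: ignore connection labels whose key matches a context label.
    connection = {
        key: value
        for key, value in connection_labels.items()
        if key not in context_labels
    }
    context = dict(context_labels)
    # Rule 2: drop context labels first, then connection labels.
    # Assumption: within a group, the last-added label is dropped first.
    while len(context) + len(connection) > limit and context:
        context.popitem()
    while len(context) + len(connection) > limit and connection:
        connection.popitem()
    return {**context, **connection}
```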

Looker ensures that context labels conform to all BigQuery's label validity requirements, but does not check connection labels for validity. Configuring invalid connection labels may cause queries to fail.
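Because Looker does not check connection labels, you may want to check them yourself before saving the connection. The following sketch assumes BigQuery's documented label requirements: keys start with a lowercase letter, keys and values contain only lowercase letters, digits, underscores, and dashes, keys are at most 63 characters, and values are at most 63 characters and may be empty. It handles ASCII only, so it is stricter than BigQuery, which also accepts international characters. The function name invalid_labels is hypothetical.

```python
import re

# ASCII-only simplification of the assumed label rules.
_KEY = re.compile(r"[a-z][a-z0-9_-]{0,62}")
_VALUE = re.compile(r"[a-z0-9_-]{0,63}")


def invalid_labels(labels):
    """Return the keys of labels that fail the assumed validity rules."""
    return [
        key
        for key, value in labels.items()
        if not (_KEY.fullmatch(key) and _VALUE.fullmatch(value))
    ]
```

An empty result suggests that the labels are safe to use. Any key returned is worth correcting before you add it to the Additional Params field.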

The BigQuery job labels that Looker sends by default (looker-context-user_id, looker-context-history_slug, and looker-context-instance_slug) correspond to the SQL context comments that Looker attaches to SQL queries for database dialects other than BigQuery. For BigQuery connections, context comments are disabled by default because they invalidate BigQuery's ability to cache and can negatively impact cache performance.

You can enable context comments for a BigQuery connection by deselecting the Disable Context Comment setting for the connection. It is recommended that you keep the default setting for Disable Context Comment so that you can use BigQuery's cache. If you do deselect the option, Looker sends both SQL context comments and BigQuery job labels to your database, and the two contain the same information.

Feature support

For Looker to support some features, your database dialect must also support them.

In the latest release of Looker, Google BigQuery Legacy SQL supports the following Looker features:

In the latest release of Looker, Google BigQuery Standard SQL supports the following Looker features:

Next steps

After you've connected your database to Looker, configure sign-in options for your users.