Use R with BigQuery

This page describes how to load data from BigQuery into an R data frame, using the bigrquery R package.

Before you begin

Before you begin, create a new AI Platform Notebooks instance for R.

Open a JupyterLab notebook

Follow these steps to open an AI Platform Notebooks instance.

  1. Go to the AI Platform Notebooks page in the Google Cloud Platform Console.

  2. Select Open JupyterLab for the R instance that you want to open.

  3. Select File -> New -> Notebook, and then select the R kernel.

Install and load R packages

Follow these steps to install and load the R packages that you need.

  1. In the notebook's first code cell, enter the following:

    # Install the packages
    install.packages("tidyverse")
    install.packages("httpuv")
    install.packages("gargle")
    install.packages("bigrquery")
    
  2. Click the run button to run the command. R installs the packages.

  3. To add a code cell to the notebook, click the notebook's + button.

  4. In the new code cell, add the following code and run it.

    # Load the packages
    library(tidyverse)
    library(httpuv)
    library(gargle)
    library(bigrquery)
    

Authenticate your BigQuery request

Use the following steps to authenticate your BigQuery request so that you can load data into an R data frame. You can authenticate in one of two ways: with your Google Cloud Platform account's credentials or with a service account key. Learn about the differences between a GCP user account and a service account.

Use GCP credentials

To authenticate your BigQuery access through your GCP credentials, follow these steps:

  1. In a new code cell, enter the following code.

    # Provide authentication
    bq_auth(use_oob = TRUE)
    
  2. Run the code to authenticate your access to BigQuery.
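
When you run this cell, bigrquery starts an out-of-band OAuth flow and prompts you for an authorization code. If your environment has access to more than one Google account, you can optionally pin authentication to a specific account with bq_auth()'s email argument. The address below is a placeholder, not a value from this guide:

    # Optional: pin authentication to a specific account
    # (replace the placeholder address with your own)
    bq_auth(email = "you@example.com", use_oob = TRUE)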

Use a service account key

Create a service account key

To create a BigQuery service account key:

  1. Go to the APIs & Services Credentials page in the Google Cloud Platform Console.

  2. Select Create credentials -> Service account key.

  3. In the Service account drop-down menu, select an existing credential that has BigQuery access, or select New service account to create a new one.

    To create a new service account, enter a Service account name. In the Role drop-down menu, select BigQuery, and then select the role that you want to assign, such as BigQuery Admin.

  4. Select JSON as the key type.

  5. Click Create. Your browser downloads a JSON file that contains the service account key. This file allows access to your resources, so store it securely.

  6. Click Close to close the Private key saved to your computer message.

Authenticate your BigQuery service account key

To authenticate your BigQuery access through your service account key, follow these steps:

  1. In your JupyterLab notebook, click the Upload Files button.

  2. Browse to the JSON file that you created, select it, and then click Open to upload it.

  3. In a new code cell, enter the following code. Replace path-name.json with the path to your notebook instance's service account key JSON file.

    To copy the path for your JSON file, right-click (Command + click on Mac) the filename in the JupyterLab file browser, and then select Copy Path.

    # Provide authentication through the JSON service account key
    bq_auth(path = "path-name.json")
    
  4. Run the code to authenticate your access to BigQuery.
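
As an optional sanity check (not part of the original steps), you can confirm that a token is in place and that the account can see your projects. This assumes a recent bigrquery version that exports these helpers:

    # Optional check: confirm a cached token and list accessible projects
    bq_has_token()
    bq_projects()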

Load data from BigQuery

Follow these steps to load BigQuery data into a data frame using the bigrquery R package.

  1. In a new code cell, enter the following. Replace project-id with your GCP project ID.

    To get your project ID, click the drop-down arrow next to your project name in the Google Cloud Platform Console.

    # Store the project ID
    projectid <- "project-id"
    
    # Set your query
    sql <- "SELECT * FROM `bigquery-public-data.usa_names.usa_1910_current` LIMIT 10"
    
    # Run the query and store the data in a data frame
    df <- query_exec(sql, projectid, use_legacy_sql = FALSE)
    
    # Print the query result
    df
    
  2. Run the cell to view 10 rows of data from one of BigQuery's public datasets.
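
Note that query_exec() belongs to bigrquery's legacy interface, and newer versions of the package deprecate it. If you see a deprecation warning, the equivalent query in the current bq_* interface looks roughly like this, reusing the projectid and sql values from the previous cell:

    # Newer bigrquery interface: run the query as a job, then download the result
    tb <- bq_project_query(projectid, sql)
    df <- bq_table_download(tb)
    df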

What's next

Read the bigrquery documentation to learn more about how you can use BigQuery data in your R notebooks.
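
bigrquery also provides a DBI back end, so you can work with BigQuery tables through DBI or dplyr verbs instead of raw SQL strings. A minimal sketch, assuming the projectid value from earlier is used as the billing project:

    # Connect to a public dataset through bigrquery's DBI interface
    library(DBI)
    con <- dbConnect(
      bigrquery::bigquery(),
      project = "bigquery-public-data",
      dataset = "usa_names",
      billing = projectid
    )
    dbListTables(con)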
