Controlling access to datasets

This document describes how to control access to datasets in BigQuery.

You can also do the following:

Overview

Dataset-level permissions determine the users, groups, and service accounts allowed to access the tables, views, and table data in a specific dataset. For example, if you grant the bigquery.dataOwner IAM role to a user on a specific dataset, that user can create, update, and delete tables and views in the dataset.

You can apply access controls during dataset creation by calling the datasets.insert API method.

Access controls can't be applied during dataset creation in the Cloud Console or the bq command-line tool.

You can apply access controls to a dataset after it is created in the following ways:

  • Using the Cloud Console.
  • Using the bq update command in the bq command-line tool.
  • Calling the datasets.patch API method.
  • Using the client libraries.

Required permissions

At a minimum, to assign or update dataset access controls, you must be granted bigquery.datasets.update and bigquery.datasets.get permissions. The following predefined IAM roles include bigquery.datasets.update and bigquery.datasets.get permissions:

  • bigquery.dataOwner
  • bigquery.admin

In addition, if a user has bigquery.datasets.create permissions, when that user creates a dataset, they are granted bigquery.dataOwner access to it. bigquery.dataOwner access gives users the ability to update datasets they create.

For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.

Controlling access to a dataset

To assign access controls to a dataset:

Console

  1. Select a dataset from Resources, then click Share dataset near the right side of the window.

    Add people to dataset

  2. In the Share dataset panel, in the Dataset permissions tab, enter the entity that you want to add into the Add members field. You can add any of the following entities:

    • Google account email: Grants an individual Google account access to the dataset.
    • Google Group: Grants all members of a Google group access to the dataset.
    • Google Apps Domain: Grants all users and groups in a Google domain access to the dataset.
    • Service account: Grants a service account access to the dataset.
    • Anybody: Enter allUsers to grant access to the general public.
    • All Google accounts: Enter allAuthenticatedUsers to grant access to any user signed in to a Google Account.
  3. For Select a role, select BigQuery and choose an appropriate predefined IAM role for the new members. For more information on the permissions assigned to each predefined BigQuery role, see the Roles section of the access control page.

  4. Click Done.

bq

  1. Write the existing dataset information (including access controls) to a JSON file using the show command. If the dataset is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

    bq show \
    --format=prettyjson \
    project_id:dataset > path_to_file
    

    Replace the following:

    • project_id is your project ID.
    • dataset is the name of your dataset.
    • path_to_file is the path to the JSON file on your local machine.

    Examples:

    Enter the following command to write the access controls for mydataset to a JSON file. mydataset is in your default project.

      bq show --format=prettyjson mydataset > /tmp/mydataset.json
    

    Enter the following command to write the access controls for mydataset to a JSON file. mydataset is in myotherproject.

      bq show --format=prettyjson \
      myotherproject:mydataset > /tmp/mydataset.json
    
  2. Make your changes to the "access" section of the JSON file. You can add or remove any of the specialGroup entries: projectOwners, projectWriters, projectReaders, and allAuthenticatedUsers. You can also add, remove, or modify any of the following: userByEmail, groupByEmail, and domain.

    For example, the access section of a dataset's JSON file would look like the following:

    {
     "access": [
      {
       "role": "READER",
       "specialGroup": "projectReaders"
      },
      {
       "role": "WRITER",
       "specialGroup": "projectWriters"
      },
      {
       "role": "OWNER",
       "specialGroup": "projectOwners"
      },
      {
       "role": "READER",
       "specialGroup": "allAuthenticatedUsers"
      },
      {
       "role": "READER",
       "domain": "domain_name"
      },
      {
       "role": "WRITER",
       "userByEmail": "user_email"
      },
      {
       "role": "READER",
       "groupByEmail": "group_email"
      }
     ],
     ...
    }
    

  3. When your edits are complete, use the update command and include the JSON file using the --source flag. If the dataset is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.

    bq update \
    --source path_to_file \
    project_id:dataset
    

    Replace the following:

    • path_to_file is the path to the JSON file on your local machine.
    • project_id is your project ID.
    • dataset is the name of your dataset.

    Examples:

    Enter the following command to update the access controls for mydataset. mydataset is in your default project.

        bq update --source /tmp/mydataset.json mydataset
    

    Enter the following command to update the access controls for mydataset. mydataset is in myotherproject.

        bq update --source /tmp/mydataset.json myotherproject:mydataset
    
  4. To verify your access control changes, enter the show command again without writing the information to a file.

    bq show --format=prettyjson dataset
    

    or

    bq show --format=prettyjson project_id:dataset
    

API

Call datasets.insert with a defined dataset resource to apply access controls when the dataset is created. Call datasets.patch and use the access property in the dataset resource to update your access controls.

Because the datasets.update method replaces the entire dataset resource, datasets.patch is the preferred method for updating access controls.

Go

Before trying this sample, follow the Go setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Go API reference documentation.

Set the dataset.access_entries property with the access controls for a dataset. Then call the client.update_dataset() function to update the property.

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
)

// updateDatasetAccessControl demonstrates how the access control policy of a dataset
// can be amended by adding an additional entry corresponding to a specific user identity.
func updateDatasetAccessControl(projectID, datasetID string) error {
	// projectID := "my-project-id"
	// datasetID := "mydataset"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %v", err)
	}
	defer client.Close()

	ds := client.Dataset(datasetID)
	meta, err := ds.Metadata(ctx)
	if err != nil {
		return err
	}
	// Append a new access control entry to the existing access list.
	update := bigquery.DatasetMetadataToUpdate{
		Access: append(meta.Access, &bigquery.AccessEntry{
			Role:       bigquery.ReaderRole,
			EntityType: bigquery.UserEmailEntity,
			Entity:     "sample.bigquery.dev@gmail.com"},
		),
	}

	// Leverage the ETag for the update to assert there's been no modifications to the
	// dataset since the metadata was originally read.
	if _, err := ds.Update(ctx, update, meta.ETag); err != nil {
		return err
	}
	return nil
}

Python

Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.

Set the dataset.access_entries property with the access controls for a dataset. Then call the client.update_dataset() function to update the property.
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set dataset_id to the ID of the dataset to fetch.
# dataset_id = 'your-project.your_dataset'

dataset = client.get_dataset(dataset_id)  # Make an API request.

entry = bigquery.AccessEntry(
    role="READER",
    entity_type="userByEmail",
    entity_id="sample.bigquery.dev@gmail.com",
)

entries = list(dataset.access_entries)
entries.append(entry)
dataset.access_entries = entries

dataset = client.update_dataset(dataset, ["access_entries"])  # Make an API request.

full_dataset_id = "{}.{}".format(dataset.project, dataset.dataset_id)
print(
    "Updated dataset '{}' with modified user permissions.".format(full_dataset_id)
)

Next steps