Authorized datasets
This document describes how to use authorized datasets in BigQuery. An authorized dataset lets you authorize all of the views in a specified dataset to access the data in a second dataset. With an authorized dataset, you don't need to configure individual authorized views.
You can create an authorized dataset in BigQuery by using the following:
- The Google Cloud console
- The
bq update
command - The
datasets.update
API method
Overview
A view in BigQuery is a virtual table defined by a SQL query. For example, a view's query might return only a subset of the columns of a table, excluding columns that contain personal identifiable information (PII). To query a view, a user needs to have access to the resources that are accessed by the view's query.
If you want to let users query a view, without giving them direct access to the resources referenced by the view, you can use an authorized view. An authorized view lets you, for example, share more limited data in a view with specified groups or users (principals), without giving the principals access to all of the underlying data. You do this by giving the principals access to the view, and by giving the view access to the dataset that contains the underlying data.
If you want to give a collection of views access to a dataset, without having to authorize each individual view, you can group the views together into a dataset, and then give the dataset that contains the views access to the dataset that contains the data. You can then give principals access to the dataset with the group of views, or to individual views in the dataset, as needed. A dataset that has access to another dataset is called an authorized dataset. The dataset that authorizes another dataset to access its data is called the shared dataset.
Required permissions and roles
To authorize a dataset, or to revoke a dataset's authorization, you must have the following Identity and Access Management (IAM) permissions, which let you update the access control list of the dataset you are sharing.
After a dataset is authorized, you need these same permissions if you want to create or update views in the authorized dataset. For more information, see Create or update a view in an authorized dataset.
Permission | Resource |
---|---|
bigquery.datasets.get
|
The dataset you are sharing. |
bigquery.datasets.update
|
The dataset you are sharing. |
The following predefined IAM roles provide the required permissions.
Role | Description |
---|---|
bigquery.dataOwner
|
BigQuery Data Owner |
bigquery.admin
|
BigQuery Admin |
Quotas and limits
Authorized datasets are subject to dataset limits. For more information, see Dataset limits.
Authorize a dataset
You can authorize a dataset's current and future views to access another dataset by adding the dataset you want to authorize to the access list of the dataset you want to share, as follows:
Console
In the Google Cloud console, go to the BigQuery page.
In the Explorer panel, expand your project and select the dataset that contains the data you want to share.
Click the
Actions icon and select Open.In the details pane that appears, click Sharing and select the Authorize Datasets option.
In the Authorized dataset pane that appears, enter the Dataset ID of the dataset that you want to authorize, in the following format:
PROJECT.AUTHORIZED_DATASET
For example:
myProject.myDataset
Click Add Authorization and then click Close.
bq
Open the Cloud Shell:
Write the existing metadata (including the access control list) for the dataset you want to share into a JSON file by using the
bq show
command.bq show --format=prettyjson PROJECT:SHARED_DATASET > FILE_PATH
Use a text editor to add the dataset that you want to authorize into the existing
access
section of the JSON file that was created at FILE_PATH.For example:
"access": [ ... { "dataset": { "dataset": { "project_id": "PROJECT", "dataset_id": "AUTHORIZED_DATASET" }, "target_types": "VIEWS" } } ]
Update the shared dataset by using the
bq update
command. For example:bq update --source FILE_PATH PROJECT:SHARED_DATASET
To verify that the authorized dataset has been added, enter the
bq show
command again. For example:bq show --format=prettyjson PROJECT:SHARED_DATASET
API
Get the current metadata for the dataset you want to share by calling the
datasets.get
method, as follows:GET https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
The response body returns a
Dataset
resource that contains JSON metadata for the dataset.Add the dataset that you want authorize into the
access
section of the JSON metadata that was returned in theDataset
resource as follows:"access": [ ... { "dataset": { "dataset": { "project_id": "PROJECT", "dataset_id": "AUTHORIZED_DATASET" }, "target_types": "VIEWS" } } ]
Use the
datasets.update
method to update the dataset with the added authorization:PUT https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
Include the updated
Dataset
resource in the request body.You can verify that the authorized dataset has been added by calling the
datasets.get
method again.
Revoke a dataset's authorization
When you delete a dataset authorized to access another source dataset, it can take up to 24 hours for the change to fully reflect in the source dataset's access control lists (ACLs). During this time:
- You won't be able to access the source data through the deleted dataset.
- The deleted dataset might still appear in the source dataset's ACL and count towards any authorized dataset limits. This could prevent you from creating new authorized datasets until the ACL is updated.
To revoke the access granted to the views in an authorized dataset, remove the authorized dataset from the shared dataset's access list, as follows:
Console
In the Google Cloud console, go to the BigQuery page.
In the Explorer panel, expand your project and select the shared dataset.
Click the
Actions icon and select Open.In the details pane that appears, click Sharing and select the Authorize Datasets option.
In the Authorized dataset pane that appears, find the entry for the authorized dataset in the Currently authorized datasets section.
Click the delete icon next to the authorized dataset you want to remove, and then click Close.
bq
Open the Cloud Shell:
Write the existing metadata (including the access control list) for the shared dataset into a JSON file by using the
bq show
command.bq show --format=prettyjson PROJECT:SHARED_DATASET > FILE_PATH
Use a text editor to remove the authorized dataset from the
access
section of the JSON file that was created at FILE_PATH, as follows:{ "dataset": { "dataset": { "project_id": "PROJECT", "dataset_id": "AUTHORIZED_DATASET" }, "target_types": "VIEWS" } }
Update the shared dataset by using the
bq update
command. For example:bq update --source FILE_PATH PROJECT:SHARED_DATASET
To verify that the authorized dataset has been removed, enter the
bq show
command again. For example:bq show --format=prettyjson PROJECT:SHARED_DATASET
API
Get the current metadata for the shared dataset by calling the
datasets.get
method, as follows:GET https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
The response body returns a
Dataset
resource that contains JSON metadata for the dataset.Remove the authorized dataset from the
access
section of the JSON that was returned in theDataset
resource, for example:{ "dataset": { "dataset": { "project_id": "PROJECT", "dataset_id": "AUTHORIZED_DATASET" }, "target_types": "VIEWS" } }
Use the
datasets.update
method to update the dataset with the removed authorization:PUT https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT/datasets/SHARED_DATASET
Include the updated
Dataset
resource in the request body.You can verify that the authorized dataset has been removed by calling the
datasets.get
method again.
Create or update a view in an authorized dataset
To create or update a view that is in an authorized dataset, you must have the permissions for the shared dataset that are listed in Required permissions and roles, in addition to the permissions that are required to create or update a view in a standard dataset.
The following table summarizes the necessary Identity and Access Management (IAM) permissions to create or update a view that is in an authorized dataset:
Permission | Resource |
---|---|
bigquery.datasets.get
|
The dataset you are sharing. |
bigquery.tables.getData
|
Any tables or views from the shared dataset that are referenced in the new view you are creating or updating. |
bigquery.tables.create
|
The authorized dataset in which you are creating a view. |
bigquery.tables.update
|
The authorized dataset in which you are updating a view. |
You don't need any additional permissions to delete a view from an authorized dataset.
Query a view in an authorized dataset
To query a view in an authorized dataset, a user needs to have access to the view, but access to the shared dataset is not required.
For more information, see Authorized views.
Authorized dataset example
The following example describes how to create and use an authorized dataset.
Assume you have two datasets, named private_dataset
and public_dataset
.
The private_dataset
dataset contains a table named private_table
. The
public_dataset
dataset contains a view named private_table_filtered
. The
private_table_filtered
view is based on a query that returns some, but not
all, of the fields in the private_table
table.
You can give a user access to
the data returned by the private_table_filtered
view, but not all of the data
in the private_table
table, as follows:
Grant the
bigquery.dataViewer
role to the user for thepublic_dataset
dataset. This role includes thebigquery.tables.getData
permission, which lets the user query the views in thepublic_dataset
dataset. For information about how to grant a role to a user for a dataset, see Controlling access to datasets.The user now has permission to query views in the
public_dataset
, but they still cannot access theprivate_table
table inprivate_dataset
. If the user tries to query theprivate_table
table directly, or if they try to access theprivate_table
table indirectly by querying theprivate_table_filtered
view, they get an error message similar to the following:Access Denied: Table PROJECT:private_dataset.private_table: User does not have permission to query table PROJECT:private_dataset.private_table.
In the BigQuery page of the Google Cloud console, open the
private_dataset
dataset, click Sharing, and then select Authorize Datasets.In the Authorized dataset pane that appears, enter
PROJECT.public_dataset
in the Dataset ID field, and then click Add Authorization.The
public_dataset
dataset is added to the access control list of theprivate_dataset
dataset, authorizing the views in thepublic_dataset
dataset to query the data in theprivate_dataset
dataset.The user can now query the
private_table_filtered
view in thepublic_dataset
dataset, which indirectly accesses theprivate_dataset
dataset, without having any permissions to directly access data in theprivate_dataset
dataset.
Limitations
- You can create authorized datasets in different regions, but BigQuery doesn't support cross-region queries. Therefore, we recommend that you create datasets in the same region.
What's next
For information about authorizing an individual view to access data in a dataset, see Authorized views.
For information about authorizing a table function or a user-defined function to access data in a dataset, see Authorized functions.