BigQuery is a petabyte-scale analytics data warehouse that you can use to run SQL queries over vast amounts of data in near real time.
Giving a view access to a dataset is also known as creating an authorized view in BigQuery. An authorized view lets you share query results with particular users and groups without giving them access to the underlying tables. You can also use the view's SQL query to restrict the columns (fields) the users are able to query. In this tutorial, you create an authorized view.
Objectives
This tutorial shows you how to complete the following tasks:
- Create datasets and apply access controls to them
- Assign access controls to your project
- Create an authorized view that restricts the data users are able to query
Costs
BigQuery is a paid product and you incur BigQuery usage in this tutorial. BigQuery offers some resources free of charge up to a specific limit. For more information, see BigQuery free operations and free tier.
Before you begin
Before you begin this tutorial, use the Google Cloud console to create or select a project.
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- BigQuery is automatically enabled in new projects.
To activate BigQuery in a preexisting project, go to
Enable the BigQuery API.
- Optional: Enable billing for the project. If you don't want to enable billing or provide a credit card, the steps in this document still work. BigQuery provides you a sandbox to perform the steps. For more information, see Enable the BigQuery sandbox.
Create a source dataset
You begin by creating a dataset to store your source data. For this tutorial, you populate a table in your source dataset by querying the GitHub public dataset. The data in your source dataset contains information you do not want your data analysts to see. You restrict access to the data using an authorized view.
To create your source dataset:
Console
In the Google Cloud console, open the BigQuery page.
In the Explorer pane, select the project where you want to create the dataset.
Expand the
Actions option and click Create dataset.For Dataset ID, enter
github_source_data
.Leave the other default settings in place and click Create dataset.
SQL
Use the CREATE SCHEMA
DDL statement:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
CREATE SCHEMA github_source_data;
Click
Run.
For more information about how to run queries, see Running interactive queries.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
After creating the source dataset, you populate a table in it using a SQL query. This query retrieves data from the GitHub public dataset.
Console
In the Google Cloud console, open the BigQuery page.
Copy and paste the following query into the Editor pane.
SELECT commit, author, committer, repo_name FROM `bigquery-public-data.github_repos.commits` LIMIT 1000;
Click More and select Query settings.
For Destination, select Set a destination table for query results.
For Dataset, enter
PROJECT_ID.github_source_data
. Replace PROJECT_ID with your project ID.For Table Id, enter
github_contributors
.Click Save.
Click Run.
When the query completes, click github_contributors and then click Preview to verify the data was written to the table.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Create a dataset where you can store your view
After creating your source dataset, you create a new, separate dataset to store the authorized view that you share with your data analysts. In a later step, you grant the authorized view access to the data in the source dataset. Your data analysts then have access to the authorized view, but not direct access to the source data.
Authorized views should be created in a different dataset from the source data. That way, data owners can give users access to the authorized view without simultaneously granting access to the underlying data. The source data dataset and authorized view dataset must be in the same regional location.
To create a dataset to store your view:
Console
In the Google Cloud console, open the BigQuery page.
In the Explorer panel, select the project where you want to create the dataset.
Expand the
Actions option and click Create dataset.For Dataset ID, enter
shared_views
.Leave the other default settings in place and click Create dataset.
SQL
Use the CREATE SCHEMA
DDL statement:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
CREATE SCHEMA shared_views;
Click
Run.
For more information about how to run queries, see Running interactive queries.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Create the view in the new dataset
In the new dataset, you create the view you intend to authorize. This is the view you share with your data analysts. This view is created using a SQL query that excludes the columns you do not want the data analysts to see.
For this tutorial, your shared view excludes the author's information except for the author's name, and excludes the committer's information except for the committer's name.
To create the view in the new dataset:
Console
In the Google Cloud console, open the BigQuery page.
Copy and paste the following query into the Editor pane. Replace
PROJECT_ID
with your project ID.SELECT commit, author.name AS author, committer.name AS committer, repo_name FROM `PROJECT_ID.github_source_data.github_contributors`;
Click Save > Save view.
In the Save view dialog:
- For Project, verify your project is selected.
- For Dataset, enter
shared_views
. - For Table, enter
github_analyst_view
. - Click Save.
SQL
Use the CREATE VIEW
DDL statement:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
CREATE VIEW shared_views.github_analyst_view AS ( SELECT commit, author.name AS author, committer.name AS committer, repo_name FROM `PROJECT_ID.github_source_data.github_contributors` );
Replace
PROJECT_ID
with your project ID.Click
Run.
For more information about how to run queries, see Running interactive queries.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Assign a project-level IAM role to your data analysts
To query the view, your data analysts need permission to run query
jobs. The bigquery.user
role includes permissions to run jobs, including
query jobs, within the project. If you grant a user or group the
bigquery.user
role at the project level, the user can create datasets and can
run query jobs against tables in those datasets. The bigquery.user
role does
not give users permission to query data, view table data, or view table schema
details for datasets the user did not create.
Assigning your data analysts the project-level bigquery.user
role does not
give them the ability to view or query table data in the dataset containing the
tables queried by the view. The bigquery.user
role also does not grant users
the ability to update your views.
Most individuals (data scientists, business
intelligence analysts, data analysts) in an enterprise should be assigned the
project-level bigquery.user
role.
When you add a group to an IAM role, the email address and domain must be associated with an active Google Account or Google Apps account.
To assign the data analysts group to the bigquery.user
role at the project
level:
Console
Open the IAM page in the Google Cloud console.
Ensure your project is selected in the project selector in the top bar.
Click
Grant access.In the Grant access to dialog:
- In the New principals box, enter the group that contains your data
analysts.
For example,
data_analysts@example.com
. - In the Select a role box, search for the BigQuery User role and select it.
- Click Save.
- In the New principals box, enter the group that contains your data
analysts.
For example,
Assign access controls to the dataset containing the view
For your data analysts to query the view, they need to be granted the
bigquery.dataViewer
role on the dataset containing the view. The
bigquery.user
role gives your data analysts the permissions required to create
query jobs. However, they cannot successfully query the view unless they also
have bigquery.dataViewer
access to the dataset that contains the view.
To give your data analysts bigquery.dataViewer
access to the dataset:
Console
In the Google Cloud console, open the BigQuery page.
In the Explorer pane, select the
shared_views
dataset.Click > Permissions.
SharingIn the Dataset permissions pane, click Add principal.
In the New principals box, enter the group that contains your data analysts (for example,
data_analysts@example.com
).Click Select a role and select BigQuery > BigQuery Data Viewer.
Click Save.
Click Close.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Authorize the view to access the source dataset
Once you create access controls for the dataset containing the view, you add the view as an authorized view in the source dataset. This authorization gives the view, but not your data analysts group, access to the source data.
To authorize the view to access the source data:
Console
In the Google Cloud console, open the BigQuery page.
In the Explorer pane, select the
github_source_data
dataset.Click
Sharing, and then select Authorize views.In the Authorized views pane that opens, enter
github_analyst_view
view in the Authorized views field.Click Add Authorization.
The github_analyst_view
view is now authorized to access data in the source
dataset.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Verify the configuration
When your configuration is complete, a member of your data analysts group (for
example, data_analysts
) can verify the configuration by querying the view.
To verify the configuration:
SQL
Have a member of your data analysts group do the following:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
SELECT * FROM `PROJECT_ID.shared_views.github_analyst_view`;
Replace
PROJECT_ID
with your project ID.Click
Run.
For more information about how to run queries, see Running interactive queries.
Complete source code
Here is the complete source code for the tutorial for your reference.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
What's next
- Read Predefined roles and permissions to learn about access controls in BigQuery.
- Read Introduction to views to learn about BigQuery views.
- Read Creating an authorized view to learn more about authorized views.
- Read IAM overview to learn the basic IAM concepts.
- Read Managing policies to learn how to manage access control.