BigQuery is a petabyte-scale analytics data warehouse that you can use to run SQL queries over vast amounts of data in near realtime.
Giving a view access to a dataset is also known as creating an authorized view in BigQuery. An authorized view lets you share query results with particular users and groups without giving them access to the underlying tables. You can also use the view's SQL query to restrict the columns (fields) the users are able to query. In this tutorial, you create an authorized view.
Objectives
In this tutorial you:
- Create datasets and apply access controls to them
- Assign access controls to your project
- Create an authorized view that restricts the data users are able to query
Costs
BigQuery is a paid product and you will incur BigQuery usage in this tutorial. BigQuery offers some resources free of charge up to a specific limit. For more information, see BigQuery free operations and free tier.
Before you begin
Before you begin this tutorial, use the Google Cloud Console to create or select a project.
-
Sign in to your Google Account.
If you don't already have one, sign up for a new account.
-
In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
- BigQuery is automatically enabled in new projects. To activate BigQuery in a preexisting project, go to Enable the BigQuery API.
- BigQuery provides a sandbox if you do not want to provide a credit card or enable billing for your project. The steps in this topic work for a project whether or not your project has billing enabled. If you optionally want to enable billing, see Learn how to enable billing.
Introduction
In this tutorial, you create two datasets: one dataset for your source data and a second dataset for your authorized view. You populate the source dataset with data from the GitHub public dataset. You then create a view that queries a table in the source dataset.
After you create your datasets and your view, you assign access controls to the project, to the dataset containing the view, and to the dataset containing the source data.
Giving a view access to the source dataset is also referred to as creating an authorized view. When you create an authorized view, follow these steps:
- Create a separate dataset to store the view
- Create the view in the new dataset
- Assign access controls to the project
- Assign access controls to the dataset containing the view
- Authorize the view to access the source dataset
Create a source dataset
You begin by creating a dataset to store your source data. For this tutorial, you populate a table in your source dataset by querying the GitHub public dataset. The data in your source dataset contains information you do not want your data analysts to see. You restrict access to the data using an authorized view.
To create your source dataset:
Console
Open the BigQuery page in the Cloud Console.
In the Explorer panel, select the project where you want to create the dataset.
In the details panel, click Create dataset.
For Dataset ID type
github_source_data
.Leave all of the other default settings in place and click Create dataset.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
After creating the source dataset, you populate a table in it using a SQL query. This query retrieves data from the GitHub public dataset.
Console
Open the BigQuery page in the Cloud Console.
Click Compose new query.
Copy and paste the following query into the Query editor text area.
SELECT commit, author, committer, repo_name FROM `bigquery-public-data.github_repos.commits` LIMIT 1000
Click More and select Query settings.
For Destination check the box for Set a destination table for query results.
- For Project name, verify your project is selected.
- For Dataset name, verify
github_source_data
is selected. - For Table name type:
github_contributors
. - Click Save.
Click Run.
When the query completes, click github_contributors and then click Preview to verify the data was written to the table.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
Create a separate dataset to store your view
After creating your source dataset, you will create a new, separate dataset to store the authorized view that you will share with your data analysts. In a later step, you will grant the authorized view access to the data in the source dataset. Your data analysts will have access to the authorized view, but not direct access to the source data.
Authorized views that are access controlled at the dataset level should be created in a different dataset from the source data, so that data owners can give users access to the authorized view without simultaneously granting access to the underlying data. The source data dataset and authorized view dataset must be in the same regional location.
To create a dataset to store your view:
Console
Open the BigQuery page in the Cloud Console.
In the Explorer panel, select the project where you want to create the dataset.
In the details panel, click Create dataset.
For Dataset ID type
shared_views
.Leave all of the other default settings in place and click Create dataset.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
Create the view in the new dataset
In the new dataset, you create the view you intend to authorize. This is the view you share with your data analysts. This view is created using a SQL query that excludes the columns you do not want the data analysts to see.
For this tutorial, your shared view excludes all of the author's information except for the author's name, and all of the committer's information except for the committer's name.
To create the view in the new dataset:
Console
Open the BigQuery page in the Cloud Console.
Click Compose new query.
Copy and paste the following query into the Query editor text area. Replace
project_id
with your project ID.SELECT commit, author.name as author, committer.name as committer, repo_name FROM `project_id.github_source_data.github_contributors`
Click More and select Query settings.
Under SQL dialect select Standard. Click Save to update query settings.
In the Save drop-down list, select Save view.
In the Save view dialog:
- For Project name, verify your project is selected.
- For Dataset name, verify
shared_views
is selected. - For Table name type:
github_analyst_view
. - Click Save.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
Assign a project-level IAM role to your data analysts
To query the view, your data analysts need permission to run query
jobs. The bigquery.user
role includes permissions to run jobs, including
query jobs, within the project. If you grant a user or group the
bigquery.user
role at the project level, the user can create datasets and can
run query jobs against tables in those datasets. The bigquery.user
role does
not give users permission to query data, view table data, or view table schema
details for datasets the user did not create.
Assigning your data analysts the project-level bigquery.user
role does not
give them the ability to view or query table data in the dataset containing the
tables queried by the view. Most individuals (data scientists, business
intelligence analysts, data analysts) in an enterprise should be assigned the
project-level bigquery.user
role.
When you add a group to an IAM role, the email address and domain must be associated with an active Google Account or Google Apps account.
To assign the data analysts group to the bigquery.user
role at the project
level:
Console
Open the IAM page in the Google Cloud Console.
Click Select a project.
Select your project and click Open.
On the IAM page, click Add.
In the Add members dialog:
- In the Members box, enter the group that contains your data
analysts (for example,
data_analysts@example.com
). - In the Select a role box, search for the BigQuery User role and select it.
- Click Save.
- In the Members box, enter the group that contains your data
analysts (for example,
Assign access controls to the dataset containing the view
For your data analysts to query the view, they need to be granted the
bigquery.dataViewer
role on the dataset containing the view. The
bigquery.user
role gives your data analysts the permissions required to create
query jobs, but they cannot successfully query the view unless they also have at
least bigquery.dataViewer
access to the dataset that contains the view.
To give your data analysts bigquery.dataViewer
access to the dataset:
Console
In the Explorer panel, select the
shared_views
dataset.Click Share dataset.
In the Add members text box, enter the group that contains your data analysts (for example,
data_analysts@example.com
).Click Select role and select BigQuery > BigQuery Data Viewer.
Click Add.
Click Done.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
Authorize the view to access the source dataset
Once you create access controls for the dataset containing the view, you add the view as an authorized view in the source dataset. This gives the view access to the source data, not your data analysts group.
To authorize the view to access the source data:
Console
In the Explorer panel, select the
github_source_data
dataset.Click Share dataset.
In the Dataset permissions panel, click the Authorized views tab.
Under Share authorized view:
- For Select project, verify your project is selected.
- For Select dataset, choose
shared_views
. - For Select view, type the view name:
github_analyst_view
. - Click Add.
Click Done.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
Verify the configuration
When your configuration is complete, a member of your data analysts group (for
example, data_analysts
) can verify the configuration by querying the view.
To verify the configuration:
Console
Have a member of your data analysts group go to the BigQuery page in the Cloud Console.
Click Compose new query.
Copy and paste the following query into the Query editor text area. Replace
project_id
with your project ID.SELECT * FROM `project_id.shared_views.github_analyst_view`
Complete source code
Here is the complete source code for the tutorial for your reference.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.
Cleaning up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
- In the Cloud Console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
What's next
- Read Predefined roles and permissions to learn about access controls in BigQuery.
- Read Introduction to views to learn about BigQuery views.
- Read Creating an authorized view to learn more about authorized views.
- Read IAM overview to learn the basic IAM concepts.
- Read Managing policies to learn how to manage access control.