Create an authorized view


In this tutorial, you create an authorized view in BigQuery for your data analysts to use. Authorized views let you share query results with particular users and groups without giving them access to the underlying source data. Instead of granting a user or group access to the source data, you grant access to the view itself. You can also use the view's SQL query to exclude columns and fields from the query results.

An alternative approach to using an authorized view would be to set up column-level access controls on the source data and then give your users access to a view that queries the access-controlled data. For more information on column-level access controls, see Introduction to column-level access control.

If you have multiple authorized views that access the same source dataset, you can authorize the dataset that contains the views instead of authorizing an individual view.

Objectives

  • Create a dataset to contain your source data.
  • Run a query to load data into a destination table in the source dataset.
  • Create a dataset to contain your authorized view.
  • Create an authorized view from a SQL query that restricts the columns that your data analysts can see in the query results.
  • Grant your data analysts permission to run query jobs.
  • Grant your data analysts access to the dataset that contains the authorized view.
  • Grant the authorized view access to the source dataset.

Costs

In this document, you use the following billable component of Google Cloud:

  • BigQuery

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the BigQuery API.

    Enable the API

  5. Ensure that you have the necessary permissions to perform the tasks in this document.

Required roles

If you create a new project, you are the project owner, and you are granted all of the required IAM permissions that you need to complete this tutorial.

If you are using an existing project, you need a role that grants the permissions listed in Required permissions, such as the BigQuery Studio Admin predefined role.

Make sure that you have that role, or roles that grant equivalent permissions, on the project:

Check for the roles

  1. In the Google Cloud console, go to the IAM page.

    Go to IAM
  2. Select the project.
  3. In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.

  4. For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.

Grant the roles

  1. In the Google Cloud console, go to the IAM page.

    Go to IAM
  2. Select the project.
  3. Click Grant access.
  4. In the New principals field, enter your user identifier. This is typically the email address for a Google Account.

  5. In the Select a role list, select a role.
  6. To grant additional roles, click Add another role and add each additional role.
  7. Click Save.

For more information about roles in BigQuery, see Predefined IAM roles.

Required permissions

To create the resources used in this tutorial, the following permissions are required. The BigQuery Studio Admin predefined role grants all of these permissions.

  • bigquery.datasets.create to create the source dataset and the dataset that contains the authorized view.
  • bigquery.tables.create to create the table that stores the source data and to create the authorized view.
  • bigquery.jobs.create to run the query job that loads data into the source table.
  • bigquery.datasets.getIamPolicy and bigquery.datasets.get to get IAM permissions for the source dataset and the dataset that contains the authorized view.
  • bigquery.datasets.setIamPolicy and bigquery.datasets.update to update IAM permissions for the source dataset and the dataset that contains the authorized view.

For more information about IAM permissions in BigQuery, see BigQuery permissions.

Create a dataset to store your source data

You begin by creating a dataset to store your source data.

To create your source dataset, choose one of the following options:

Console

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, beside the project where you want to create the dataset, click View actions > Create dataset.

  3. On the Create dataset page, do the following:

    1. For Dataset ID, enter github_source_data.

    2. For Location type, verify that Multi-region is selected.

    3. For Multi-region, choose US or EU. All the resources you create in this tutorial should be in the same multi-region location.

    4. Click Create dataset.

SQL

Use the CREATE SCHEMA DDL statement:

  1. In the Google Cloud console, go to the BigQuery Studio page.

    Go to BigQuery Studio

  2. In the query editor, enter the following statement:

    CREATE SCHEMA github_source_data;

  3. Click Run.

For more information about how to run queries, see Run an interactive query.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Create a source dataset to store your table.
Dataset sourceDataset = bigquery.create(DatasetInfo.of(sourceDatasetId));

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

from google.cloud import bigquery

client = bigquery.Client()
source_dataset_id = "github_source_data"
source_dataset_id_full = "{}.{}".format(client.project, source_dataset_id)


source_dataset = bigquery.Dataset(source_dataset_id_full)
# Specify the geographic location where the dataset should reside.
source_dataset.location = "US"
source_dataset = client.create_dataset(source_dataset)  # API request

Create a table and load your source data

After you create the source dataset, you populate a table in it by saving the results of a SQL query to a destination table. The query retrieves data from the GitHub public dataset.

Console

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following query:

    SELECT
      commit,
      author,
      committer,
      repo_name
    FROM
      `bigquery-public-data.github_repos.commits`
    LIMIT
      1000;
    
  3. Click More and select Query settings.

  4. For Destination, select Set a destination table for query results.

  5. For Dataset, enter PROJECT_ID.github_source_data.

    Replace PROJECT_ID with your project ID.

  6. For Table Id, enter github_contributors.

  7. Click Save.

  8. Click Run.

  9. When the query completes, in the Explorer pane, expand github_source_data, and then click github_contributors.

  10. To verify the data was written to the table, click the Preview tab.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Populate a source table
String tableQuery =
    "SELECT commit, author, committer, repo_name"
        + " FROM `bigquery-public-data.github_repos.commits`"
        + " LIMIT 1000";
QueryJobConfiguration queryConfig =
    QueryJobConfiguration.newBuilder(tableQuery)
        .setDestinationTable(TableId.of(sourceDatasetId, sourceTableId))
        .build();
bigquery.query(queryConfig);

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

source_table_id = "github_contributors"
job_config = bigquery.QueryJobConfig()
job_config.destination = source_dataset.table(source_table_id)
sql = """
    SELECT commit, author, committer, repo_name
    FROM `bigquery-public-data.github_repos.commits`
    LIMIT 1000
"""
query_job = client.query(
    sql,
    # Location must match that of the dataset(s) referenced in the query
    # and of the destination table.
    location="US",
    job_config=job_config,
)  # API request - starts the query

query_job.result()  # Waits for the query to finish

Create a dataset to store your authorized view

After creating your source dataset, you create a new, separate dataset to store the authorized view that you share with your data analysts. In a later step, you grant the authorized view access to the data in the source dataset. Your data analysts then have access to the authorized view, but not direct access to the source data.

Authorized views should be created in a dataset that is separate from the source data. That way, data owners can give users access to the authorized view without simultaneously granting access to the underlying data. The source dataset and the dataset that contains the authorized view must be in the same location.

To create a dataset to store your view, choose one of the following options:

Console

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer panel, select the project where you want to create the dataset.

  3. Expand the View actions option and click Create dataset.

  4. On the Create dataset page, do the following:

    1. For Dataset ID, enter shared_views.

    2. For Location type, verify that Multi-region is selected.

    3. For Multi-region, choose US or EU. All the resources you create in this tutorial should be in the same multi-region location.

    4. Click Create dataset.

SQL

Use the CREATE SCHEMA DDL statement:

  1. In the Google Cloud console, go to the BigQuery Studio page.

    Go to BigQuery Studio

  2. In the query editor, enter the following statement:

    CREATE SCHEMA shared_views;

  3. Click Run.

For more information about how to run queries, see Run an interactive query.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Create a separate dataset to store your view
Dataset sharedDataset = bigquery.create(DatasetInfo.of(sharedDatasetId));

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

shared_dataset_id = "shared_views"
shared_dataset_id_full = "{}.{}".format(client.project, shared_dataset_id)


shared_dataset = bigquery.Dataset(shared_dataset_id_full)
shared_dataset.location = "US"
shared_dataset = client.create_dataset(shared_dataset)  # API request

Create the authorized view in the new dataset

In the new dataset, you create the view you intend to authorize. This is the view you share with your data analysts. This view is created using a SQL query that excludes the columns you don't want the data analysts to see.

The github_contributors source table contains two fields of type RECORD: author and committer. For this tutorial, your authorized view excludes all of the author data except for the author's name, and it excludes all of the committer data except for the committer's name.

To create the view in the new dataset, choose one of the following options:

Console

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following query:

    SELECT
      commit,
      author.name AS author,
      committer.name AS committer,
      repo_name
    FROM
      `PROJECT_ID.github_source_data.github_contributors`;

    Replace PROJECT_ID with your project ID.

  3. Click Save > Save view.

  4. In the Save view dialog, do the following:

    1. For Project, verify your project is selected.

    2. For Dataset, enter shared_views.

    3. For Table, enter github_analyst_view.

    4. Click Save.

SQL

Use the CREATE VIEW DDL statement:

  1. In the Google Cloud console, go to the BigQuery Studio page.

    Go to BigQuery Studio

  2. In the query editor, enter the following statement:

    CREATE VIEW shared_views.github_analyst_view
    AS (
      SELECT
        commit,
        author.name AS author,
        committer.name AS committer,
        repo_name
      FROM
        `PROJECT_ID.github_source_data.github_contributors`
    );

    Replace PROJECT_ID with your project ID.

  3. Click Run.

For more information about how to run queries, see Run an interactive query.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Create the view in the new dataset
String viewQuery =
    String.format(
        "SELECT commit, author.name as author, committer.name as committer, repo_name FROM %s.%s.%s",
        projectId, sourceDatasetId, sourceTableId);

ViewDefinition viewDefinition = ViewDefinition.of(viewQuery);

Table view =
    bigquery.create(TableInfo.of(TableId.of(sharedDatasetId, sharedViewId), viewDefinition));

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

shared_view_id = "github_analyst_view"
view = bigquery.Table(shared_dataset.table(shared_view_id))
sql_template = """
    SELECT
        commit, author.name as author,
        committer.name as committer, repo_name
    FROM
        `{}.{}.{}`
"""
view.view_query = sql_template.format(
    client.project, source_dataset_id, source_table_id
)
view = client.create_table(view)  # API request

Grant your data analysts permission to run query jobs

To query the view, your data analysts need the bigquery.jobs.create permission so that they can run query jobs. The bigquery.studioUser role includes the bigquery.jobs.create permission, but it doesn't give users permission to view or query the authorized view. In a later step, you grant your data analysts access to the view.

To grant the data analysts group the bigquery.studioUser role at the project level, do the following:

  1. In the Google Cloud console, go to the IAM page.

    Go to IAM

  2. Ensure your project is selected in the project selector.

  3. Click Grant access.

  4. In the Grant access to dialog, do the following:

    1. In the New principals field, enter the group that contains your data analysts. For example, data_analysts@example.com.

    2. In the Select a role field, search for the BigQuery Studio User role and select it.

    3. Click Save.
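
If you manage project-level access programmatically, the following Python sketch grants the same role with the Resource Manager client library. This is a minimal sketch, assuming the google-cloud-resource-manager package is installed and that data_analysts@example.com is the group that contains your analysts; granting the role in the console, as described previously, has the same effect.

from google.cloud import resourcemanager_v3
from google.iam.v1 import policy_pb2

project_id = "PROJECT_ID"  # Replace with your project ID.
analyst_group = "group:data_analysts@example.com"  # Assumed analyst group.

client = resourcemanager_v3.ProjectsClient()
resource = f"projects/{project_id}"

# Read the current project-level IAM policy, append a binding for the
# BigQuery Studio User role, and write the modified policy back.
policy = client.get_iam_policy(request={"resource": resource})
policy.bindings.append(
    policy_pb2.Binding(role="roles/bigquery.studioUser", members=[analyst_group])
)
client.set_iam_policy(request={"resource": resource, "policy": policy})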

Grant your data analysts permission to query the authorized view

For your data analysts to query the view, they need to be granted the bigquery.dataViewer role at either the dataset level or the view level. Granting this role at the dataset level gives your analysts access to all tables and views in the dataset. Because the dataset created in this tutorial contains a single authorized view, you're granting access at the dataset level. If you have a collection of authorized views that you need to grant access to, consider using an authorized dataset instead.

The bigquery.studioUser role you granted to your data analysts previously gives them the permissions required to create query jobs. However, they cannot successfully query the view unless they also have bigquery.dataViewer access to the authorized view or to the dataset that contains the view.

To give your data analysts bigquery.dataViewer access to the dataset that contains the authorized view, do the following:

Console

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, select the shared_views dataset.

  3. Click Sharing > Permissions.

  4. In the Share permissions pane, click Add principal.

  5. For New principals, enter the group that contains your data analysts—for example, data_analysts@example.com.

  6. Click Select a role and select BigQuery > BigQuery Data Viewer.

  7. Click Save.

  8. Click Close.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Assign access controls to the dataset containing the view
List<Acl> viewAcl = new ArrayList<>(sharedDataset.getAcl());
viewAcl.add(Acl.of(new Acl.Group("example-analyst-group@google.com"), Acl.Role.READER));
sharedDataset.toBuilder().setAcl(viewAcl).build().update();

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

# analyst_group_email = 'data_analysts@example.com'
access_entries = shared_dataset.access_entries
access_entries.append(
    bigquery.AccessEntry("READER", "groupByEmail", analyst_group_email)
)
shared_dataset.access_entries = access_entries
shared_dataset = client.update_dataset(
    shared_dataset, ["access_entries"]
)  # API request

Authorize the view to access the source dataset

After you create access controls for the dataset that contains the authorized view, you grant the authorized view access to the source dataset. This authorization gives the view, but not your data analysts group, access to the source data.

To grant the authorized view access to the source data, choose one of the following options:

Console

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, select the github_source_data dataset.

  3. Click Sharing > Authorize views.

  4. In the Authorized views pane, for Authorized view enter PROJECT_ID.shared_views.github_analyst_view.

    Replace PROJECT_ID with your project ID.

  5. Click Add authorization.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Authorize the view to access the source dataset
List<Acl> srcAcl = new ArrayList<>(sourceDataset.getAcl());
srcAcl.add(Acl.of(new Acl.View(view.getTableId())));
sourceDataset.toBuilder().setAcl(srcAcl).build().update();

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

access_entries = source_dataset.access_entries
access_entries.append(
    bigquery.AccessEntry(None, "view", view.reference.to_api_repr())
)
source_dataset.access_entries = access_entries
source_dataset = client.update_dataset(
    source_dataset, ["access_entries"]
)  # API request

Verify the configuration

When your configuration is complete, a member of your data analysts group (for example, a user in data_analysts@example.com) can verify the configuration by querying the view.

To verify the configuration, a data analyst should run the following query:

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, enter the following statement:

    SELECT
      *
    FROM
      `PROJECT_ID.shared_views.github_analyst_view`;

    Replace PROJECT_ID with your project ID.

  3. Click Run.

The query results are similar to the following. Only the author name and committer name are visible in the results.

[Image: The query results after querying the authorized view.]

For more information about how to run queries, see Run an interactive query.
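
If a data analyst prefers to verify the configuration with the client library instead of the console, the following Python sketch runs the same query. It is a minimal sketch, assuming the analyst has set up Application Default Credentials; replace PROJECT_ID with your project ID.

from google.cloud import bigquery

# Run this as a member of the data analysts group.
client = bigquery.Client()

query = """
    SELECT *
    FROM `PROJECT_ID.shared_views.github_analyst_view`
"""  # Replace PROJECT_ID with your project ID.

# Only the commit, author name, committer name, and repo_name columns are
# returned, because the authorized view excludes the other source fields.
for row in client.query(query).result():
    print(row.author, row.committer, row.repo_name)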

Complete source code

For reference, here is the complete source code for the tutorial.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Create a source dataset to store your table.
Dataset sourceDataset = bigquery.create(DatasetInfo.of(sourceDatasetId));

// Populate a source table
String tableQuery =
    "SELECT commit, author, committer, repo_name"
        + " FROM `bigquery-public-data.github_repos.commits`"
        + " LIMIT 1000";
QueryJobConfiguration queryConfig =
    QueryJobConfiguration.newBuilder(tableQuery)
        .setDestinationTable(TableId.of(sourceDatasetId, sourceTableId))
        .build();
bigquery.query(queryConfig);

// Create a separate dataset to store your view
Dataset sharedDataset = bigquery.create(DatasetInfo.of(sharedDatasetId));

// Create the view in the new dataset
String viewQuery =
    String.format(
        "SELECT commit, author.name as author, committer.name as committer, repo_name FROM %s.%s.%s",
        projectId, sourceDatasetId, sourceTableId);

ViewDefinition viewDefinition = ViewDefinition.of(viewQuery);

Table view =
    bigquery.create(TableInfo.of(TableId.of(sharedDatasetId, sharedViewId), viewDefinition));

// Assign access controls to the dataset containing the view
List<Acl> viewAcl = new ArrayList<>(sharedDataset.getAcl());
viewAcl.add(Acl.of(new Acl.Group("example-analyst-group@google.com"), Acl.Role.READER));
sharedDataset.toBuilder().setAcl(viewAcl).build().update();

// Authorize the view to access the source dataset
List<Acl> srcAcl = new ArrayList<>(sourceDataset.getAcl());
srcAcl.add(Acl.of(new Acl.View(view.getTableId())));
sourceDataset.toBuilder().setAcl(srcAcl).build().update();

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

# Create a source dataset
from google.cloud import bigquery

client = bigquery.Client()
source_dataset_id = "github_source_data"
source_dataset_id_full = "{}.{}".format(client.project, source_dataset_id)


source_dataset = bigquery.Dataset(source_dataset_id_full)
# Specify the geographic location where the dataset should reside.
source_dataset.location = "US"
source_dataset = client.create_dataset(source_dataset)  # API request

# Populate a source table
source_table_id = "github_contributors"
job_config = bigquery.QueryJobConfig()
job_config.destination = source_dataset.table(source_table_id)
sql = """
    SELECT commit, author, committer, repo_name
    FROM `bigquery-public-data.github_repos.commits`
    LIMIT 1000
"""
query_job = client.query(
    sql,
    # Location must match that of the dataset(s) referenced in the query
    # and of the destination table.
    location="US",
    job_config=job_config,
)  # API request - starts the query

query_job.result()  # Waits for the query to finish

# Create a separate dataset to store your view
shared_dataset_id = "shared_views"
shared_dataset_id_full = "{}.{}".format(client.project, shared_dataset_id)


shared_dataset = bigquery.Dataset(shared_dataset_id_full)
shared_dataset.location = "US"
shared_dataset = client.create_dataset(shared_dataset)  # API request

# Create the view in the new dataset
shared_view_id = "github_analyst_view"
view = bigquery.Table(shared_dataset.table(shared_view_id))
sql_template = """
    SELECT
        commit, author.name as author,
        committer.name as committer, repo_name
    FROM
        `{}.{}.{}`
"""
view.view_query = sql_template.format(
    client.project, source_dataset_id, source_table_id
)
view = client.create_table(view)  # API request

# Assign access controls to the dataset containing the view
# analyst_group_email = 'data_analysts@example.com'
access_entries = shared_dataset.access_entries
access_entries.append(
    bigquery.AccessEntry("READER", "groupByEmail", analyst_group_email)
)
shared_dataset.access_entries = access_entries
shared_dataset = client.update_dataset(
    shared_dataset, ["access_entries"]
)  # API request

# Authorize the view to access the source dataset
access_entries = source_dataset.access_entries
access_entries.append(
    bigquery.AccessEntry(None, "view", view.reference.to_api_repr())
)
source_dataset.access_entries = access_entries
source_dataset = client.update_dataset(
    source_dataset, ["access_entries"]
)  # API request

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

Console

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

gcloud

Delete a Google Cloud project:

    gcloud projects delete PROJECT_ID

Delete individual resources

Alternatively, to remove the individual resources used in this tutorial, do the following:

  1. Delete the authorized view.

  2. Delete the dataset that contains the authorized view.

  3. Delete the table in the source dataset.

  4. Delete the source dataset.

Because you created the resources used in this tutorial, no additional permissions are required to delete them.
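
If you prefer to script the cleanup instead of using the console, the following Python sketch deletes the same resources in the order listed above. It is a minimal sketch, assuming the dataset and table names used in this tutorial.

from google.cloud import bigquery

client = bigquery.Client()
project = client.project

# Delete the authorized view, then the dataset that contains it.
client.delete_table(f"{project}.shared_views.github_analyst_view", not_found_ok=True)
client.delete_dataset(f"{project}.shared_views", delete_contents=True, not_found_ok=True)

# Delete the source table, then the source dataset.
client.delete_table(f"{project}.github_source_data.github_contributors", not_found_ok=True)
client.delete_dataset(
    f"{project}.github_source_data", delete_contents=True, not_found_ok=True
)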

What's next