Using policy tags in BigQuery

This page describes best practices for using policy tags in BigQuery. Use policy tags to define access to your data, for example, when you use BigQuery column-level security.

Build a hierarchy of data classes

Build a hierarchy of data classes that makes sense for your business.

First, consider what kinds of data the organization processes. An organization usually manages only a small number of data classes. For example, an organization could have data classes such as:

  • PII data
  • Financial data
  • Customer order history

You can apply a single data class to multiple data columns by using a policy tag. Use this level of abstraction to manage many columns with only a few policy tags.
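As a rough illustration of this abstraction, the following Python sketch models columns that each reference one policy tag, with access decided per tag rather than per column. It is a plain in-memory model, not the BigQuery or Data Catalog API. The column names, group names, and the `can_read` helper are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical columns, each assigned one of the data classes listed above.
column_policy_tags = {
    "customers.phone_number": "PII data",
    "customers.home_address": "PII data",
    "billing.quarterly_revenue": "Financial data",
    "orders.order_date": "Customer order history",
    "orders.order_total": "Customer order history",
}

# Access is recorded once per policy tag, not once per column.
# The group names are placeholders.
tag_readers = {
    "PII data": {"group:support@example.com"},
    "Financial data": {"group:finance@example.com"},
    "Customer order history": {"group:finance@example.com"},
}


def can_read(principal, column):
    """Return True if the principal may read the column through its policy tag."""
    tag = column_policy_tags[column]
    return principal in tag_readers.get(tag, set())


# Group the columns by tag to see how few tags cover all of the columns.
columns_by_tag = defaultdict(list)
for column, tag in column_policy_tags.items():
    columns_by_tag[tag].append(column)

for tag, columns in columns_by_tag.items():
    print(f"{tag}: {len(columns)} column(s)")
```

In this model, five columns are governed by three access decisions. Adding another PII column requires only tagging it, not a new access grant.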

Second, consider whether there are groups of people who need different access to different data classes. For example, one group needs access to business-sensitive data such as revenues and customer history. Another group needs access to personally identifiable information (PII) such as phone numbers and addresses.

Keep in mind that you can group policy tags together in a tree. Sometimes it is helpful to create a root policy tag that contains all of the other policy tags.

The following figure shows an example taxonomy. This hierarchy groups all data classes into three top-level policy tags: High, Medium, and Low.

[Figure: Data hierarchy]

Each of the top-level policy tags contains leaf policy tags. For example, the High policy tag contains the Credit card, Government ID, and Biometric policy tags. The Medium and Low policy tags similarly contain leaf policy tags.

This structure has several benefits:

  • You can grant access to an entire group of policy tags at once. For example, you can grant the Data Catalog Fine-Grained Reader role on the Low tier.

  • You can move policy tags from one tier to another. For example, you can move Address from the Low tier to the Medium tier to further restrict access to it, without needing to reclassify all Address columns.

  • With this fine-grained access, you can manage access to many columns by controlling only a small number of data classification policy tags.
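The benefits listed above can be sketched with a small tree model in Python. This is an illustrative in-memory structure, not the BigQuery or Data Catalog API. It assumes that a grant on a parent policy tag also covers its child tags, which is how the tier-level grant above is described. The `PolicyTag` class, the column names, and the group name are hypothetical, and the Medium tier is left without leaf tags because the text does not list them.

```python
class PolicyTag:
    """A node in a policy-tag tree (illustrative model only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        self.readers = set()  # principals granted read access directly on this tag
        if parent is not None:
            self.move_to(parent)

    def move_to(self, new_parent):
        """Re-parent this tag; columns that reference it are left untouched."""
        if self.parent is not None:
            self.parent.children.remove(self)
        self.parent = new_parent
        new_parent.children.append(self)

    def can_read(self, principal):
        """A grant on this tag or on any ancestor allows access."""
        node = self
        while node is not None:
            if principal in node.readers:
                return True
            node = node.parent
        return False


# Build the example taxonomy: a root tag with three tiers.
root = PolicyTag("All data")
high = PolicyTag("High", root)
medium = PolicyTag("Medium", root)  # leaf tags not listed in the text
low = PolicyTag("Low", root)
credit_card = PolicyTag("Credit card", high)
government_id = PolicyTag("Government ID", high)
biometric = PolicyTag("Biometric", high)
address = PolicyTag("Address", low)

# Hypothetical columns; each one references a leaf policy tag, not a tier.
column_tags = {
    "customers.home_address": address,
    "customers.billing_address": address,
    "payments.card_number": credit_card,
}

# Grant read access on the whole Low tier at once (placeholder group name).
analysts = "group:analysts@example.com"
low.readers.add(analysts)


def readable_columns(principal):
    """List the columns the principal can read, in sorted order."""
    return sorted(col for col, tag in column_tags.items() if tag.can_read(principal))


before = readable_columns(analysts)
address.move_to(medium)  # tighten access without retagging any column
after = readable_columns(analysts)

print(before)  # ['customers.billing_address', 'customers.home_address']
print(after)   # []
```

Moving the Address tag changes what the analysts group can read, while the `column_tags` mapping stays exactly as it was.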


Test in monitor mode

Before enforcing access policies for your organization, you can run in monitor-only mode. In monitor-only mode, you do not yet enforce access control, but you audit the effects of your policy tags.

This best practice assumes:

  • You already have a set of users authorized to access your data.
  • You want to find out whether enforcing new column-level security changes would unexpectedly prevent those users from accessing data.

To use monitor-only mode, create a taxonomy and policy tags and assign the policy tags to columns, but do not yet enforce access control. Then, have your previously authorized users continue to use the system. As they use the system, an audit trail is generated. You can scan the audit logs to determine whether any unexpected PERMISSION_DENIED errors were encountered. After you are satisfied that column-level security is set up properly, enforce access control.
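The log-scanning step might look something like the following Python sketch. It assumes the audit log entries have already been exported as JSON objects in the general Cloud Audit Logs shape, with `protoPayload.status.code` and `protoPayload.authenticationInfo.principalEmail` fields. The sample entries, user names, and resource names are invented for illustration, and `unexpected_denials` is a hypothetical helper, not part of any Google Cloud library.

```python
PERMISSION_DENIED = 7  # numeric value of google.rpc.Code.PERMISSION_DENIED


def unexpected_denials(entries, authorized_principals):
    """Return (principal, resource) pairs where an authorized user was denied."""
    denials = []
    for entry in entries:
        payload = entry.get("protoPayload", {})
        status = payload.get("status", {})
        principal = payload.get("authenticationInfo", {}).get("principalEmail")
        if status.get("code") == PERMISSION_DENIED and principal in authorized_principals:
            denials.append((principal, payload.get("resourceName")))
    return denials


# Invented sample entries standing in for exported audit logs.
sample_entries = [
    {   # successful read by an authorized user
        "protoPayload": {
            "authenticationInfo": {"principalEmail": "alice@example.com"},
            "resourceName": "projects/example-project/datasets/sales/tables/orders",
            "status": {},
        }
    },
    {   # denial for an authorized user: this one needs investigating
        "protoPayload": {
            "authenticationInfo": {"principalEmail": "alice@example.com"},
            "resourceName": "projects/example-project/datasets/sales/tables/customers",
            "status": {"code": 7, "message": "Access Denied"},
        }
    },
    {   # denial for a user who was never authorized: expected
        "protoPayload": {
            "authenticationInfo": {"principalEmail": "mallory@example.com"},
            "resourceName": "projects/example-project/datasets/sales/tables/customers",
            "status": {"code": 7, "message": "Access Denied"},
        }
    },
]

authorized = {"alice@example.com", "bob@example.com"}

for principal, resource in unexpected_denials(sample_entries, authorized):
    print(f"Unexpected denial: {principal} on {resource}")
```

An empty result over a representative period of normal use is one signal that enforcement is unlikely to lock out previously authorized users.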