Introduction to data governance in BigQuery

This document introduces data governance in BigQuery and explains how you can use BigQuery features to implement and enforce data governance policies. For a more comprehensive overview of data governance in Google Cloud, see What is data governance?

Data governance is the management of the security and quality of data throughout its lifecycle, so that access to the data and the accuracy of the data comply with organizational policies and regulations. These data governance priorities can be broken down into three categories:

  • Access control
  • Data stewardship
  • Data quality

The following sections define these data governance categories, discuss how BigQuery features support them, and recommend next steps for you.

Access control

Data access management is the process of defining, enforcing, and monitoring the rules and policies governing who has access to data. Access management ensures that data is only accessible to those who are authorized to access it. BigQuery provides the following features to help you with data access:

  • Identity and Access Management (IAM). IAM lets you control who has access to your BigQuery resources such as projects, datasets, tables, and views. You can grant IAM roles to users, groups, and service accounts. These roles define what they can do with your resources.
  • Column-level access controls and row-level access controls. Column-level and row-level access controls let you restrict access to specific columns and rows in a table, based on user attributes or data values. These controls let you implement fine-grained access to help protect sensitive data from unauthorized access.
  • Data transfer management. VPC Service Controls let you create perimeters around Google Cloud resources and control access to those resources based on your organization's policies.
  • Audit logs. Audit logs provide you with a detailed record of user activity and system events in your organization. These logs help you enforce data governance policies and identify potential security risks.
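
As an illustration of the IAM and row-level controls listed above, the following minimal Python sketch builds the text of a BigQuery GRANT statement and a CREATE ROW ACCESS POLICY statement. The helper names, the project, dataset, and table path, the principals, and the filter expression are all hypothetical, and the sketch only assembles SQL strings without validating or running them.

```python
from typing import Sequence


def grant_role_sql(role: str, resource_type: str, resource: str,
                   principals: Sequence[str]) -> str:
    """Build a GRANT statement for a dataset (SCHEMA), TABLE, or VIEW.

    Hypothetical helper: it performs no escaping, so pass trusted input only.
    """
    if resource_type not in {"SCHEMA", "TABLE", "VIEW"}:
        raise ValueError(f"unsupported resource type: {resource_type}")
    quoted = ", ".join(f'"{p}"' for p in principals)
    return f"GRANT `{role}` ON {resource_type} `{resource}` TO {quoted};"


def row_access_policy_sql(policy_name: str, table: str,
                          grantees: Sequence[str], filter_expr: str) -> str:
    """Build a CREATE ROW ACCESS POLICY statement (hypothetical helper)."""
    grantee_list = ", ".join(f"'{g}'" for g in grantees)
    return (
        f"CREATE ROW ACCESS POLICY {policy_name}\n"
        f"ON `{table}`\n"
        f"GRANT TO ({grantee_list})\n"
        f"FILTER USING ({filter_expr});"
    )


if __name__ == "__main__":
    # All identifiers below are made-up examples.
    print(grant_role_sql("roles/bigquery.dataViewer", "TABLE",
                         "my-project.sales.orders",
                         ["group:analysts@example.com"]))
    print(row_access_policy_sql("apac_only", "my-project.sales.orders",
                                ["group:apac-analysts@example.com"],
                                "region = 'APAC'"))
```

With the first statement, members of the example group could read the whole table. The second statement would then limit the members of its grantee group to rows where `region` is `'APAC'`.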

Next steps for access control

The following table outlines next steps that you can take to learn more about access control features:

Experience level Learning path
New cloud users
Experienced cloud users

Data stewardship

Data stewardship helps safeguard sensitive data by appropriately categorizing, masking, redacting, or encrypting it during querying, transit, or storage. This approach enhances data protection and organization. BigQuery provides the following features to help you with data stewardship:

  • Data masking. Data masking lets you obscure sensitive data in a table while still permitting authorized users to access the surrounding data. It can also mask data that matches sensitive data patterns, safeguarding against accidental data disclosure.
  • Encryption. BigQuery automatically encrypts all data at rest and in transit, while letting you customize your encryption settings to meet your specific needs and requirements.
  • Metadata management. Metadata management lets you tag resources, which in turn helps you with data search, organization, and categorization.
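
To make the idea of data masking more concrete, the following Python sketch applies three simple masking functions (hash, keep the last four characters, and nullify) to a row, depending on whether the reader is allowed to see raw values. This is a local illustration of the concept only, not BigQuery's own masking implementation; the column names, the rule mapping, and the `can_see_raw` flag are assumptions made for the example.

```python
import hashlib
from typing import Optional


def mask_hash(value: str) -> str:
    """Replace the value with its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask_last_four(value: str) -> str:
    """Keep the last four characters and replace the rest with 'X'."""
    if len(value) <= 4:
        return "X" * len(value)
    return "X" * (len(value) - 4) + value[-4:]


def mask_nullify(value: str) -> Optional[str]:
    """Hide the value entirely."""
    return None


# Hypothetical mapping of sensitive columns to masking functions.
MASKING_RULES = {
    "email": mask_hash,
    "card_number": mask_last_four,
    "ssn": mask_nullify,
}


def read_row(row: dict, can_see_raw: bool) -> dict:
    """Return the row as an authorized or a masked reader would see it."""
    if can_see_raw:
        return dict(row)
    masked = {}
    for column, value in row.items():
        rule = MASKING_RULES.get(column)
        # Columns without a rule, and NULL values, pass through unchanged.
        masked[column] = rule(value) if rule and value is not None else value
    return masked


if __name__ == "__main__":
    row = {"name": "Ana", "email": "ana@example.com",
           "card_number": "4111111111111111", "ssn": "123-45-6789"}
    print(read_row(row, can_see_raw=False))
```

The masked reader still sees the non-sensitive `name` column, which mirrors how masking lets users work with the surrounding data.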

Next steps for data stewardship

The following table outlines next steps that you can take to learn more about data stewardship features:

Experience level Learning path
New cloud users
Experienced cloud users
  • Add column-level data masking to your table to make it easier to share information across your organization without revealing sensitive data.
  • Use Sensitive Data Protection to scan your data for sensitive and high-risk information, such as personally identifiable information (PII), financial data, and health information.

Data quality

Data quality management is the process of tracing data lineage and ensuring that data meets your standards for accuracy, completeness, and consistency. BigQuery provides the following features to help you with data quality:

  • Data lineage. Data lineage lets you track the flow of your data over time, providing insights into the data's origin, how it changes over time, and its final destination within your system.
  • Data profile scans. Data profile scans let you analyze the statistical characteristics of your data, such as average values and the number of unique values.
  • Data quality scans. Data quality scans let you perform data checks, validate your data against defined rules, and troubleshoot data quality issues.
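
The following Python sketch illustrates, on a small in-memory sample, the kind of statistics a data profile gathers and the kind of rule checks a data quality scan performs. It is a conceptual stand-in, not the managed scan service: the function names, the sample rows, and the two rules are assumptions made for the example.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional


def profile_column(values: List[Optional[float]]) -> dict:
    """Compute simple profile statistics for one numeric column."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "null_ratio": (len(values) - len(non_null)) / len(values) if values else 0.0,
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "mean": mean(non_null) if non_null else None,
    }


def check_rules(rows: List[dict],
                rules: Dict[str, Callable[[dict], bool]]) -> Dict[str, List[int]]:
    """Return, for each rule, the indexes of the rows that fail it."""
    return {
        name: [i for i, row in enumerate(rows) if not rule(row)]
        for name, rule in rules.items()
    }


if __name__ == "__main__":
    # Made-up sample data and rules.
    rows = [{"amount": 10.0}, {"amount": 25.0}, {"amount": None}, {"amount": -5.0}]
    rules = {
        "amount_not_null": lambda r: r["amount"] is not None,
        "amount_non_negative": lambda r: r["amount"] is not None and r["amount"] >= 0,
    }
    print(profile_column([r["amount"] for r in rows]))
    print(check_rules(rows, rules))
```

Returning the failing row indexes, rather than a single pass or fail flag, is a design choice that makes it easier to troubleshoot which records violate a rule.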

Next steps for data quality

The following table outlines next steps that you can take to learn more about data quality features:

Experience level Learning path
New cloud users
  • Run a data profile scan to gain insights about your data, such as its minimum, maximum, or average values.
Experienced cloud users

What's next