Work with data profiles

This page provides an overview of the data profiling service of Cloud Data Loss Prevention (Cloud DLP). It also lists the procedures that guide you through working with data profiles.

Overview

The Cloud DLP profiler analyzes all BigQuery tables across an organization, folder, or project, and generates data profiles at the project, table, and column levels. Each data profile contains metrics about your data and provides insight into the type of information stored in your BigQuery tables. Use data profiles to determine which tables and columns need further protection through security features like BigQuery policy tags and de-identification techniques like masking and pseudonymization.

To start profiling data in an organization, folder, or project, you create a scan configuration. The scan configuration dictates which resource to profile, which inspection template to use (if any), and what to do with the scan output.
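As a rough illustration of the three decisions that a scan configuration captures, the following Python sketch models them as a plain data class. The names `ScanConfig`, `scope`, `resource_id`, `inspect_template`, and `output_actions` are hypothetical and chosen for this example only. They are not Cloud DLP API types or fields.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical model for illustration only. These names are assumptions,
# not part of the Cloud DLP API.
VALID_SCOPES = {"organization", "folder", "project"}


@dataclass
class ScanConfig:
    scope: str                               # which kind of resource to profile
    resource_id: str                         # ID of that organization, folder, or project
    inspect_template: Optional[str] = None   # inspection template to use, if any
    output_actions: List[str] = field(default_factory=list)  # what to do with the scan output

    def __post_init__(self) -> None:
        # Reject scopes other than the three resource levels named in the text.
        if self.scope not in VALID_SCOPES:
            raise ValueError(f"unsupported scope: {self.scope!r}")


# A project-level configuration with no inspection template and no output actions.
config = ScanConfig(scope="project", resource_id="example-project")
```

The inspection template is optional in this sketch because the text describes it as something you specify "if any".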

Shortly after you configure the profiler, Cloud DLP starts profiling all BigQuery tables in the resource that you specified. As long as your scan configuration is active, Cloud DLP automatically profiles new tables in that resource and periodically reprofiles tables that have schema changes.
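The behavior described above (profile new tables, reprofile tables whose schema changed) can be pictured with a minimal Python sketch. This is an assumption-laden model, not how Cloud DLP is implemented: the function `tables_to_profile` and the idea of comparing schema fingerprints as strings are hypothetical, and the service's actual scheduling and change detection are not described on this page.

```python
from typing import Dict, List


def tables_to_profile(
    current_schemas: Dict[str, str],
    profiled_schemas: Dict[str, str],
) -> List[str]:
    """Return tables that are new or whose schema fingerprint has changed.

    Both arguments map a table name to a hypothetical schema fingerprint.
    """
    return sorted(
        table
        for table, schema in current_schemas.items()
        # A table with no stored fingerprint is new; a different one means a schema change.
        if profiled_schemas.get(table) != schema
    )


# Hypothetical table names and fingerprints.
current = {"sales.orders": "v2", "sales.customers": "v1", "sales.returns": "v1"}
profiled = {"sales.orders": "v1", "sales.customers": "v1"}
pending = tables_to_profile(current, profiled)
```

In this example, `pending` contains `sales.orders` (schema changed) and `sales.returns` (new table), while the unchanged `sales.customers` is skipped.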

The following image shows a list of column data profiles.

Screenshot of column data profiles

For more information about data profiles, see Data profiles for BigQuery data.

Work with data profiles

The workflow for using data profiles is as follows:

  1. Confirm that you have the required user roles
  2. Profile a single project
  3. Profile an organization or folder
  4. Organization or folder scans only: grant profiling access to the service agent
  5. View the data profiles
  6. Analyze the data profiles
  7. Remediate the findings

What's next