Enable sensitive data discovery in the Enterprise tier

This page describes how to enable sensitive data discovery using default settings. You can customize the settings at any time after you enable discovery.

If you're a Security Command Center Enterprise customer, the Sensitive Data Protection discovery service is included in your Enterprise subscription. For more information, see Discovery capacity allocation on this page.

During the Security Command Center Enterprise tier activation process, the Sensitive Data Protection discovery service is automatically enabled for all supported resource types. This automatic enablement process is a one-time operation that applies only to resource types that are supported at the time of the Enterprise tier activation. If Sensitive Data Protection later adds discovery support for new resource types, you need to enable those discovery types manually by following these instructions.

Benefits

This feature offers the following benefits:

You can use Sensitive Data Protection findings to identify and remediate vulnerabilities and misconfigurations in your resources that can expose sensitive data to the public or to malicious actors.
You can use these findings to add context to the triage process and prioritize threats that target resources containing sensitive data.
You can configure Security Command Center to automatically prioritize resources for the attack path simulation feature according to the sensitivity of the data that the resources contain. For more information, see Set resource priority values automatically by data sensitivity.

How it works

The Sensitive Data Protection discovery service helps you protect data across your organization by identifying where sensitive and high-risk data reside. In Sensitive Data Protection, the service generates data profiles, which provide metrics and insights about your data at various levels of detail. In Security Command Center, the service does the following:

Generates observation findings in Security Command Center that show the calculated sensitivity and data risk levels of your data. You can use these findings to inform your response when you encounter threats and vulnerabilities related to your data assets. For a list of finding types generated, see Observation findings from the discovery service.

These findings can inform the automatic designation of high-value resources based on data sensitivity. For more information, see Use discovery insights to identify high-value resources on this page.
Generates vulnerability findings in Security Command Center when Sensitive Data Protection detects the presence of highly sensitive data that is not protected. For a list of finding types generated, see Vulnerability findings from the Sensitive Data Protection discovery service.

Finding generation latency

Depending on the size of your organization, Sensitive Data Protection findings can start appearing in Security Command Center within a few minutes after you enable sensitive data discovery. For larger organizations or organizations with specific configurations that affect finding generation, it can take up to 12 hours before initial findings appear in Security Command Center.

Subsequently, Sensitive Data Protection generates findings in Security Command Center within a few minutes after the discovery service scans your resources.

Before you begin

Complete the following tasks before you continue with the rest of this page.

Activate the Security Command Center Enterprise tier

Complete step 1 and step 2 of the setup guide to activate the Security Command Center Enterprise tier. For more information, see Activate the Security Command Center Enterprise tier.

Make sure Sensitive Data Protection is enabled as an integrated service

By default, Sensitive Data Protection is enabled in Security Command Center as an integrated service. If Sensitive Data Protection isn't already enabled, you must enable it. For more information, see Add a Google Cloud integrated service.

Set up permissions

To get the permissions that you need to configure sensitive data discovery, ask your administrator to grant you the following IAM roles on the organization:

Purpose	Predefined role	Relevant permissions
Create a discovery scan configuration and view data profiles	DLP Administrator (`roles/dlp.admin`)	dlp.columnDataProfiles.list, dlp.fileStoreProfiles.list, dlp.inspectTemplates.create, dlp.jobs.create, dlp.jobs.list, dlp.jobTriggers.create, dlp.jobTriggers.list, dlp.projectDataProfiles.list, dlp.tableDataProfiles.list
Create a project to be used as the service agent container¹	Project Creator (`roles/resourcemanager.projectCreator`)	resourcemanager.organizations.get, resourcemanager.projects.create
Grant discovery access²	One of the following: Organization Administrator (`roles/resourcemanager.organizationAdmin`) or Security Admin (`roles/iam.securityAdmin`)	resourcemanager.organizations.getIamPolicy, resourcemanager.organizations.setIamPolicy

¹ If you don't have the Project Creator (`roles/resourcemanager.projectCreator`) role, you can still create a scan configuration, but the service agent container that you use must be an existing project.

² If you don't have the Organization Administrator (`roles/resourcemanager.organizationAdmin`) or Security Admin (`roles/iam.securityAdmin`) role, you can still create a scan configuration. After you create the scan configuration, someone in your organization who has one of these roles must grant discovery access to the service agent.

For more information about granting roles, see Manage access.

You might also be able to get the required permissions through custom roles or other predefined roles.
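Because custom roles or other predefined roles can also supply these permissions, it can help to compare the permissions that a principal actually holds against the lists in the preceding table. The following Python snippet is a minimal sketch of that comparison. The permission names are transcribed from the table; the purpose keys and the `missing_permissions` helper are hypothetical names chosen for illustration, and the set of granted permissions is assumed to come from whatever IAM tooling you already use.

```python
# Permissions per purpose, transcribed from the table on this page.
# The dictionary keys are illustrative labels, not official identifiers.
REQUIRED_PERMISSIONS = {
    "create_scan_configuration": {
        "dlp.columnDataProfiles.list",
        "dlp.fileStoreProfiles.list",
        "dlp.inspectTemplates.create",
        "dlp.jobs.create",
        "dlp.jobs.list",
        "dlp.jobTriggers.create",
        "dlp.jobTriggers.list",
        "dlp.projectDataProfiles.list",
        "dlp.tableDataProfiles.list",
    },
    "create_service_agent_container": {
        "resourcemanager.organizations.get",
        "resourcemanager.projects.create",
    },
    "grant_discovery_access": {
        "resourcemanager.organizations.getIamPolicy",
        "resourcemanager.organizations.setIamPolicy",
    },
}


def missing_permissions(granted, purpose):
    """Return the permissions for `purpose` that are absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS[purpose] - set(granted))


# Example: a principal that can read the organization but cannot create projects.
granted = ["resourcemanager.organizations.get"]
print(missing_permissions(granted, "create_service_agent_container"))
```

An empty result for a purpose means the principal holds every permission listed for it. A non-empty result for `create_service_agent_container` or `grant_discovery_access` does not block creating a scan configuration; it corresponds to the situations described in footnotes ¹ and ².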

Enable discovery with default settings

To enable discovery, you create a discovery configuration for each data source that you want to scan. This procedure lets you create those discovery configurations automatically using default settings. You can customize the settings at any time after you perform this procedure.

If you want to customize the settings from the start, see the following pages instead:

To enable discovery with default settings, follow these steps:

In the Google Cloud console, go to the Sensitive Data Protection Enable discovery page.

Verify that you are viewing the organization on which you activated Security Command Center.
In the Service agent container field, set the project to be used as a service agent container. Within this project, the system creates a service agent and automatically grants the required discovery permissions to it.

If you previously used the discovery service for your organization, you might already have a service agent container project that you can reuse.
- To automatically create a project to use as your service agent container, review the suggested project ID and edit it as needed. Then, click Create. It can take a few minutes for the permissions to be granted to the new project's service agent.
- To select an existing project, click the Service agent container field and select the project.
To review the default settings, click the expand icon.
In the Enable discovery section, for each discovery type that you want to enable, click Enable. Enabling a discovery type does the following:
- BigQuery: Creates a discovery configuration for profiling BigQuery tables across the organization. Sensitive Data Protection starts profiling your BigQuery data and sends the profiles to Security Command Center.
- Cloud SQL: Creates a discovery configuration for profiling Cloud SQL tables across the organization. Sensitive Data Protection starts creating default connections for each of your Cloud SQL instances. This process can take a few hours. When the default connections are ready, you must give Sensitive Data Protection access to your Cloud SQL instances by updating each connection with the proper database user credentials.
- Secrets/credentials vulnerabilities: Creates a discovery configuration for detecting and reporting unencrypted secrets in Cloud Run environment variables. Sensitive Data Protection starts scanning your environment variables.
- Cloud Storage: Creates a discovery configuration for profiling Cloud Storage buckets across the organization. Sensitive Data Protection starts profiling your Cloud Storage data and sends the profiles to Security Command Center.
- Vertex AI datasets: Creates a discovery configuration for profiling Vertex AI datasets across the organization. Sensitive Data Protection starts profiling your Vertex AI datasets and sends the profiles to Security Command Center.
- Amazon S3: Creates a discovery configuration for profiling all Amazon S3 data that your AWS connector has access to.
  
  Note: This feature requires you to first create an AWS connector that has the AWS permissions needed for Sensitive Data Protection discovery.
- Azure Blob Storage: Creates a discovery configuration for profiling all Azure Blob Storage data that your Azure connector has access to.
  
  Note: This feature requires you to first create an Azure connector that has the Azure permissions needed for Sensitive Data Protection discovery.
To view the newly created discovery configurations, click Go to discovery configuration.

If you enabled Cloud SQL discovery, the discovery configuration is created in paused mode with errors indicating the absence of credentials. See Manage connections for use with discovery to grant the required IAM roles to your service agent and to provide database user credentials for each Cloud SQL instance.
Close the pane.
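
Some discovery types need extra work before or after you click Enable. The following Python snippet is a small planning sketch that collects those extra tasks for a chosen set of discovery types. The task descriptions paraphrase the procedure above; the `Task` type, the `EXTRA_TASKS` mapping, and the `extra_tasks` helper are hypothetical names used only for illustration and are not part of any Google Cloud API.

```python
from typing import NamedTuple


class Task(NamedTuple):
    phase: str           # "before" or "after" clicking Enable
    discovery_type: str
    action: str


# Discovery types listed in the procedure on this page.
DISCOVERY_TYPES = [
    "BigQuery",
    "Cloud SQL",
    "Secrets/credentials vulnerabilities",
    "Cloud Storage",
    "Vertex AI datasets",
    "Amazon S3",
    "Azure Blob Storage",
]

# Extra work that the procedure calls out; types not listed here need none.
EXTRA_TASKS = {
    "Cloud SQL": [
        Task("after", "Cloud SQL", "Wait for the default connections (can take a few hours)"),
        Task("after", "Cloud SQL", "Update each connection with database user credentials"),
    ],
    "Amazon S3": [
        Task("before", "Amazon S3", "Create an AWS connector with the required AWS permissions"),
    ],
    "Azure Blob Storage": [
        Task("before", "Azure Blob Storage", "Create an Azure connector with the required Azure permissions"),
    ],
}


def extra_tasks(selected):
    """Return the extra tasks for the selected discovery types, prerequisites first."""
    tasks = []
    for discovery_type in selected:
        if discovery_type not in DISCOVERY_TYPES:
            raise ValueError(f"Unknown discovery type: {discovery_type}")
        tasks.extend(EXTRA_TASKS.get(discovery_type, []))
    # sorted() is stable, so tasks keep their relative order within each phase.
    return sorted(tasks, key=lambda task: task.phase != "before")


for task in extra_tasks(["Cloud SQL", "Amazon S3"]):
    print(f"[{task.phase}] {task.discovery_type}: {task.action}")
```

Types such as BigQuery and Cloud Storage return no extra tasks in this sketch because the procedure describes no additional steps for them.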

To view the findings generated by Sensitive Data Protection, see Review Sensitive Data Protection findings in the Google Cloud console.

Use discovery insights to identify high-value resources

You can have Security Command Center automatically designate a resource that contains high-sensitivity or medium-sensitivity data as a high-value resource by enabling the Sensitive Data Protection discovery insights option when you create a resource value configuration for the attack path simulation feature.

For high-value resources, Security Command Center provides attack exposure scores and attack path visualizations, which you can use to prioritize the security of your resources that contain sensitive data. For more information, see Set resource priority values automatically by data sensitivity.

Customize the scan configurations

Each discovery type that is enabled has a discovery scan configuration that you can customize. For example, you can do the following:

Adjust the scan frequencies.
Specify filters for data assets that you don't want to reprofile.
Change the inspection template, which defines the information types that Sensitive Data Protection scans for.
Publish the generated data profiles to other Google Cloud services.
Change the service agent container.

Discovery capacity allocation

If your sensitive data discovery needs exceed the capacity allocated for Security Command Center Enterprise customers, then Sensitive Data Protection might increase your capacity temporarily. However, this increase is not guaranteed and depends on whether compute resources are available. If you require more discovery capacity, contact your account representative or a Google Cloud sales specialist. For more information, see Monitor utilization in the Sensitive Data Protection documentation.