Sensitive Data Protection can detect and classify sensitive data within text content. Given text input, DLP API returns details about any infoTypes found in the text, a likelihood value, and offset information.
Best Practices
Identify and prioritize scanning
It's important to identify your resources and specify which have the highest priority for scanning. When just getting started you may have a large backlog of data that needs classification, and it'll be impossible to scan it all immediately. Choose data initially that poses the highest risk—for example, data that is frequently accessed, widely accessible, or unknown.
Reduce latency
Latency is affected by several factors: the amount of data to scan, the storage repository being scanned, and the type and number of infoTypes that are enabled.
To help reduce job latency, you can try the following:
- Enable sampling.
- Avoid enabling infoTypes you don't need. While useful in
certain scenarios, some infoTypes—including
PERSON_NAME
,FEMALE_NAME
,MALE_NAME
,FIRST_NAME
,LAST_NAME
,DATE_OF_BIRTH
,LOCATION
,STREET_ADDRESS
,ORGANIZATION_NAME
—can make requests run much more slowly than requests that do not include them. - Always specify infoTypes explicitly. Do not use an empty infoTypes list.
- Consider organizing the data to be inspected into a table with rows and columns, if possible, to reduce network roundtrips.
Limit the scope of your first scans
For best results, limit the scope of your first scans instead of scanning all of
your data. Start with a few requests. Your findings will be more meaningful
when you fine-tune what detectors to enable and what
exclusion rules
might be needed to reduce false positives. Avoid turning on all infoTypes if you
don't need them all, as false positives or unusable findings may make it harder
to assess your risk. While useful in certain scenarios, some infoTypes such as
DATE
, TIME
, DOMAIN_NAME
, and URL
detect a broad range of findings and
may not be useful to turn on.
On-premises, hybrid, and multi-cloud scans
If the data to be scanned resides on-premises or outside of Google Cloud,
use the API methods
content.inspect
and
content.deidentify
to scan the content to classify findings and pseudonymize content without
persisting the content outside of your local storage.
Inspecting a text string
Following is sample JSON and code in several languages that demonstrate how to use the DLP API to inspect strings of text for sensitive data.
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
PHP
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Ruby
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
JSON Input:
POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:inspect?key={YOUR_API_KEY}
{
"item":{
"value":"My phone number is (415) 555-0890"
},
"inspectConfig":{
"includeQuote":true,
"minLikelihood":"POSSIBLE",
"infoTypes":{
"name":"PHONE_NUMBER"
}
}
}
JSON Output:
{
"result":{
"findings":[
{
"quote":"(415) 555-0890",
"infoType":{
"name":"PHONE_NUMBER"
},
"likelihood":"VERY_LIKELY",
"location":{
"byteRange":{
"start":"19",
"end":"33"
},
"codepointRange":{
"start":"19",
"end":"33"
}
},
"createTime":"2018-11-13T19:29:15.412Z"
}
]
}
}
Inspecting a text file
The code samples below demonstrate how to check a text file for sensitive content.
C#
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Ruby
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
What's next
- Work through the Redacting Sensitive Data with Sensitive Data Protection codelab.
- Learn how to inspect images for sensitive data.