This page contains code samples for Cloud Data Loss Prevention. To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.
Compute l-diversity
Compute l-diversity with Cloud DLP. L-diversity, which is an extension of k-anonymity, measures the diversity of sensitive values for each column in which they occur. A dataset has l-diversity if, for every set of rows with identical quasi-identifiers, there are at least l distinct values for each sensitive attribute.
View in documentation
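For illustration, a minimal Python sketch of such a risk-analysis job using the google-cloud-dlp client; the project, dataset, table, and column names are placeholders, and the published samples additionally attach a Pub/Sub action so they can wait for the job to finish:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

# BigQuery table to analyze (all names here are placeholders).
source_table = {
    "project_id": project_id,
    "dataset_id": "my_dataset",
    "table_id": "my_table",
}

# l-diversity: group rows by the quasi-identifiers and measure how many
# distinct values of the sensitive attribute appear in each group.
risk_job = {
    "privacy_metric": {
        "l_diversity_config": {
            "quasi_ids": [{"name": "zip_code"}, {"name": "age_bracket"}],
            "sensitive_attribute": {"name": "diagnosis"},
        }
    },
    "source_table": source_table,
    # A Pub/Sub action is usually added here so a caller can be notified
    # when the job completes; it is omitted in this sketch.
}

job = dlp.create_dlp_job(request={"parent": parent, "risk_job": risk_job})
print(f"Started risk analysis job: {job.name}")
```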
Compute numerical statistics
You can determine minimum, maximum, and quantile values for an individual BigQuery column. To calculate these values, you configure a DlpJob, setting the NumericalStatsConfig privacy metric to the name of the column to scan. When you run the job, Cloud DLP computes statistics for the given column, returning its results in the NumericalStatsResult object.
View in documentation
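Assuming the same job-creation boilerplate as the l-diversity sketch above, only the privacy metric changes; the column name here is a placeholder:

```python
# Minimal privacy metric for a numerical-statistics risk job; pass it as
# the "privacy_metric" of a risk_job in create_dlp_job, as sketched above.
privacy_metric = {
    "numerical_stats_config": {
        "field": {"name": "purchase_amount"}  # placeholder column name
    }
}
# When the job finishes, the minimum, maximum, and quantile values are
# available in the job's risk_details.numerical_stats_result.
```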
Compute k-anonymity
K-anonymity is a property of a dataset that indicates the re-identifiability of its records. A dataset is k-anonymous if the quasi-identifiers for each person in the dataset are identical to those of at least k – 1 other people in the dataset. This sample demonstrates how to use Cloud DLP to compute a k-anonymity value.
View in documentation
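Likewise, a k-anonymity job differs from the sketches above only in its privacy metric; the quasi-identifier columns are placeholders:

```python
# Minimal privacy metric for a k-anonymity risk job; use it as the
# "privacy_metric" of a risk_job, as in the l-diversity sketch above.
privacy_metric = {
    "k_anonymity_config": {
        "quasi_ids": [{"name": "zip_code"}, {"name": "birth_year"}]
    }
}
# Results arrive in the job's risk_details.k_anonymity_result as histogram
# buckets of equivalence-class sizes.
```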
Create an inspection job
Creates an inspection job with the Cloud Data Loss Prevention API.
View in documentation
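A minimal Python sketch of creating an inspection job; the project ID and Cloud Storage bucket are placeholders, and real jobs usually add actions (for example, saving findings to BigQuery):

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

# Inspect a Cloud Storage path for a couple of common infoTypes.
inspect_job = {
    "storage_config": {
        "cloud_storage_options": {
            "file_set": {"url": "gs://my-bucket/*"}  # placeholder bucket
        }
    },
    "inspect_config": {
        "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
        "min_likelihood": "POSSIBLE",
    },
}

job = dlp.create_dlp_job(request={"parent": parent, "inspect_job": inspect_job})
print(f"Created inspection job: {job.name}")
```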
Create an inspection template
Use templates to create and persist configuration information for use with Cloud DLP. Templates are useful for decoupling configuration information—such as what you inspect for and how you de-identify it—from the implementation of your requests. Templates provide a way to re-use configuration and enable consistency across users and datasets. In addition, whenever you update a template, it's updated for any job trigger that uses it.
View in documentation
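A minimal sketch of creating a template with the Python client; the display name, description, and infoTypes are example values:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

inspect_template = {
    "display_name": "Basic PII template",            # example metadata
    "description": "Email addresses and phone numbers",
    "inspect_config": {
        "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
        "min_likelihood": "LIKELY",
    },
}

template = dlp.create_inspect_template(
    request={"parent": parent, "inspect_template": inspect_template}
)
# The returned resource name (projects/.../inspectTemplates/...) can be
# referenced from inspection requests and job triggers instead of inline
# configuration.
print(template.name)
```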
De-identify data: Redacting matched input values
Uses the Data Loss Prevention API to de-identify sensitive data in a string by redacting matched input values.
View in documentation
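A minimal Python sketch of this transformation; the project ID and input string are placeholders:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

inspect_config = {"info_types": [{"name": "PHONE_NUMBER"}]}

# redact_config removes each matched value entirely.
deidentify_config = {
    "info_type_transformations": {
        "transformations": [{"primitive_transformation": {"redact_config": {}}}]
    }
}

response = dlp.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": inspect_config,
        "item": {"value": "My phone number is 206-555-0123."},
    }
)
print(response.item.value)  # e.g. "My phone number is ."
```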
De-identify sensitive data by replacing with infoType
Uses the Data Loss Prevention API to de-identify sensitive data in a string by replacing it with the infoType.
View in documentation
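Relative to the redaction sketch above, only the primitive transformation changes; matches are replaced with their infoType name:

```python
# Replace each match with its infoType name, e.g.
# "My phone number is [PHONE_NUMBER]."
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {"primitive_transformation": {"replace_with_info_type_config": {}}}
        ]
    }
}
```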
De-identify sensitive data: Replacing matched input values
Uses the Data Loss Prevention API to de-identify sensitive data in a string by replacing matched input values with a value that you specify.
View in documentation
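Again only the primitive transformation changes; the replacement string is whatever value you choose ("[REDACTED]" here is just an example):

```python
# Replace each match with a fixed value of your choosing.
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "primitive_transformation": {
                    "replace_config": {"new_value": {"string_value": "[REDACTED]"}}
                }
            }
        ]
    }
}
```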
De-identify table data with format-preserving encryption
Demonstrates encrypting sensitive data in a table while maintaining format.
De-identify table data with infoTypes
Transform findings in table columns. You can transform findings that make up all or part of a cell's content. In this example, all instances of PERSON_NAME are de-identified.
View in documentation
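A minimal Python sketch of de-identifying an inline table; the column names and row values are made up for illustration:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

# A small inline table with illustrative contents.
table_item = {
    "table": {
        "headers": [{"name": "name"}, {"name": "comment"}],
        "rows": [
            {
                "values": [
                    {"string_value": "Alicia Abernathy"},
                    {"string_value": "Spoke with Alicia about the refund."},
                ]
            }
        ],
    }
}

# Replace every PERSON_NAME finding, whether it fills a cell or only part of one.
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "info_types": [{"name": "PERSON_NAME"}],
                "primitive_transformation": {"replace_with_info_type_config": {}},
            }
        ]
    }
}

response = dlp.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": {"info_types": [{"name": "PERSON_NAME"}]},
        "item": table_item,
    }
)
print(response.item.table)
```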
De-identify table data: Suppress a row based on the content of a column
Suppress a row based on the content of a column. You can remove a row entirely based on the content that appears in any column. This example suppresses the record for "Charles Dickens," as this patient is over 89 years old.
View in documentation
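A minimal Python sketch of record suppression on an inline table, assuming an AGE column; the table values are illustrative:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

table_item = {
    "table": {
        "headers": [{"name": "PATIENT"}, {"name": "AGE"}],
        "rows": [
            {"values": [{"string_value": "Charles Dickens"}, {"integer_value": 95}]},
            {"values": [{"string_value": "Jane Austen"}, {"integer_value": 21}]},
        ],
    }
}

# Drop any row whose AGE column is greater than 89.
deidentify_config = {
    "record_transformations": {
        "record_suppressions": [
            {
                "condition": {
                    "expressions": {
                        "conditions": {
                            "conditions": [
                                {
                                    "field": {"name": "AGE"},
                                    "operator": "GREATER_THAN",
                                    "value": {"integer_value": 89},
                                }
                            ]
                        }
                    }
                }
            }
        ]
    }
}

response = dlp.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "item": table_item,
    }
)
print(response.item.table)  # the "Charles Dickens" row is removed
```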
Delete an inspection template
Delete an inspection template from Cloud DLP.
View in documentation
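A one-call sketch with the Python client; the template resource name is a placeholder:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

# Full resource name of the template to delete (placeholder IDs).
template_name = "projects/my-project/locations/global/inspectTemplates/my-template-id"
dlp.delete_inspect_template(request={"name": template_name})
```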
Format-preserving encryption (FPE)
Demonstrates encrypting sensitive characters while maintaining format.
View in documentation
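A hedged Python sketch of the transformation configuration; the KMS key name and wrapped key are placeholders that you must supply, and the infoType and surrogate name are example choices:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

# FPE needs an AES key wrapped by a Cloud KMS key that you supply;
# both values below are placeholders.
crypto_key = {
    "kms_wrapped": {
        "wrapped_key": b"...",  # base64-decoded wrapped key bytes (placeholder)
        "crypto_key_name": "projects/my-project/locations/global/keyRings/my-ring/cryptoKeys/my-key",
    }
}

deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "info_types": [{"name": "US_SOCIAL_SECURITY_NUMBER"}],
                "primitive_transformation": {
                    "crypto_replace_ffx_fpe_config": {
                        "crypto_key": crypto_key,
                        # Keep digits as digits so the output has the same shape.
                        "common_alphabet": "NUMERIC",
                        # Optional surrogate annotation enables later re-identification.
                        "surrogate_info_type": {"name": "SSN_TOKEN"},
                    }
                },
            }
        ]
    }
}

response = dlp.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": {"info_types": [{"name": "US_SOCIAL_SECURITY_NUMBER"}]},
        "item": {"value": "My SSN is 372-819-2510"},
    }
)
print(response.item.value)
```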
Inspect a local file
Demonstrates finding sensitive data in a local text or image file.
View in documentation
Inspect a string for sensitive data by using multiple rules
Illustrates applying both exclusion and hotword rules. This snippet's rule set includes hotword rules along with dictionary and regex exclusion rules. Notice that the four rules are specified in an array within the rules element.
View in documentation
Inspect a string for sensitive data, excluding a custom substring
Illustrates how to use an InspectConfig to instruct Cloud DLP to avoid matching on the name "Jimmy" in a scan that uses the specified custom regular expression detector.
View in documentation
Inspect a string for sensitive data by using a custom hotword
Increases the likelihood of a PERSON_NAME match when the hotword "patient" appears nearby. Illustrates using the InspectConfig property to scan a medical database for patient names. You can use Cloud DLP's built-in PERSON_NAME infoType detector, but that causes Cloud DLP to match on all names of people, not just names of patients. To fix this, you can include a hotword rule that looks for the word "patient" within a certain character proximity of the first character of potential matches. You can then assign findings that match this pattern a likelihood of VERY_LIKELY, since they correspond to your special criteria. Setting the minimum likelihood to VERY_LIKELY within InspectConfig ensures that only matches to this configuration are returned in findings.
View in documentation
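A minimal Python sketch of such a hotword rule; the proximity window, input string, and project ID are example values:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

inspect_config = {
    "info_types": [{"name": "PERSON_NAME"}],
    # Only findings boosted to VERY_LIKELY by the hotword rule are returned.
    "min_likelihood": "VERY_LIKELY",
    "rule_set": [
        {
            "info_types": [{"name": "PERSON_NAME"}],
            "rules": [
                {
                    "hotword_rule": {
                        "hotword_regex": {"pattern": "patient"},
                        # Look for the hotword within 50 characters before
                        # the start of a potential match (example window).
                        "proximity": {"window_before": 50},
                        "likelihood_adjustment": {"fixed_likelihood": "VERY_LIKELY"},
                    }
                }
            ],
        }
    ],
    "include_quote": True,
}

response = dlp.inspect_content(
    request={
        "parent": parent,
        "inspect_config": inspect_config,
        "item": {"value": "Patient Jane Austen was admitted on Monday."},
    }
)
for finding in response.result.findings:
    print(finding.quote, finding.likelihood)
```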
Inspect an image file for sensitive data
Uses Cloud DLP to inspect an image for sensitive data.
View in documentation
Inspect an image for sensitive data with listed infoTypes
If you want to inspect an image for only certain sensitive data types, specify their corresponding built-in infoTypes.
View in documentation
Inspect BigQuery for sensitive data with sampling
The following examples demonstrate using the Cloud Data Loss Prevention API to scan a 1000-row subset of a BigQuery table. The scan starts from a random row.
View in documentation
Inspect data for phone numbers
Demonstrates a simple scan request to the Cloud DLP API. Notice that the PHONE_NUMBER detector is specified in inspectConfig, which instructs Cloud DLP to scan the given string for a phone number.
View in documentation
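A minimal Python sketch of that request; the project ID and input string are placeholders:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

# The PHONE_NUMBER detector tells Cloud DLP what to look for.
inspect_config = {
    "info_types": [{"name": "PHONE_NUMBER"}],
    "include_quote": True,
}

response = dlp.inspect_content(
    request={
        "parent": parent,
        "inspect_config": inspect_config,
        "item": {"value": "Please call me at (415) 555-0176 after lunch."},
    }
)
for finding in response.result.findings:
    print(finding.info_type.name, finding.quote, finding.likelihood)
```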
Inspect data with a custom regex
Regex example: Matching medical record numbers. The following sample uses a regular expression custom infoType detector that instructs Cloud DLP to match a medical record number (MRN) in the input text "Patient's MRN 444-5-22222," and then assigns each match a likelihood of POSSIBLE.
View in documentation
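A minimal Python sketch of a custom regex detector for the MRN format above; the detector name C_MRN and the regex pattern are example choices:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

inspect_config = {
    "custom_info_types": [
        {
            "info_type": {"name": "C_MRN"},               # example detector name
            "regex": {"pattern": "[1-9]{3}-[1-9]{1}-[1-9]{5}"},
            "likelihood": "POSSIBLE",
        }
    ],
    "include_quote": True,
}

response = dlp.inspect_content(
    request={
        "parent": parent,
        "inspect_config": inspect_config,
        "item": {"value": "Patient's MRN 444-5-22222"},
    }
)
for finding in response.result.findings:
    print(finding.info_type.name, finding.quote, finding.likelihood)
```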
Inspect image for sensitive data with infoTypes
To inspect an image for sensitive data, you submit a base64-encoded image to the Cloud DLP API's content.inspect method. Unless you specify information types (infoTypes) to search for, Cloud DLP searches for the most common infoTypes.
View in documentation
Inspect storage with sampling
The following examples demonstrate using the Cloud DLP API to scan a 90% subset of a Cloud Storage bucket for person names. The scan starts from a random location in the dataset and only includes text files under 200 bytes.
View in documentation
Redact data from an image with color-coded infoTypes
Demonstrates redacting infoTypes from an image, marking each infoType with a different color.
View in documentation
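A minimal Python sketch using redact_image; the file names, infoTypes, and colors are example choices:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

project_id = "my-project"  # placeholder project ID
parent = f"projects/{project_id}/locations/global"

with open("input.png", "rb") as f:  # placeholder input file
    image_bytes = f.read()

# Draw a different-colored box over each infoType's matches.
image_redaction_configs = [
    {
        "info_type": {"name": "PHONE_NUMBER"},
        "redaction_color": {"red": 1.0, "green": 0.0, "blue": 0.0},
    },
    {
        "info_type": {"name": "EMAIL_ADDRESS"},
        "redaction_color": {"red": 0.0, "green": 0.0, "blue": 1.0},
    },
]

response = dlp.redact_image(
    request={
        "parent": parent,
        "inspect_config": {
            "info_types": [{"name": "PHONE_NUMBER"}, {"name": "EMAIL_ADDRESS"}]
        },
        "image_redaction_configs": image_redaction_configs,
        "byte_item": {
            "type_": dlp_v2.ByteContentItem.BytesType.IMAGE_PNG,
            "data": image_bytes,
        },
    }
)

with open("redacted.png", "wb") as f:  # placeholder output file
    f.write(response.redacted_image)
```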
Redact only certain sensitive data from an image using infoTypes
Demonstrates redacting only the specified infoTypes from an image.
View in documentation