This topic covers the available de-identification techniques, or transformations, in Sensitive Data Protection.
Types of de-identification techniques
The de-identification transformation that you choose depends on the kind of data you want to de-identify and your purpose for de-identifying it. The de-identification techniques that Sensitive Data Protection supports fall into the following general categories:
- Redaction: Deletes all or part of a detected sensitive value.
- Replacement: Replaces a detected sensitive value with a specified surrogate value.
- Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*).
- Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Sensitive Data Protection supports several types of tokenization, including transformations that can be reversed, or "re-identified."
- Bucketing: "Generalizes" a sensitive value by replacing it with a range of values. (For example, replacing a specific age with an age range, or temperatures with ranges corresponding to "Hot," "Medium," and "Cold.")
- Date shifting: Shifts sensitive date values by a random amount of time.
- Time extraction: Extracts or preserves specified portions of date and time values.
The remainder of this topic covers each type of de-identification transformation and provides examples of its use.
Transformation methods
The following table lists the transformations that Sensitive Data Protection provides to de-identify sensitive data:
Transformation | Object | Description | Can Reverse¹ | Referential Integrity² | Input Type |
---|---|---|---|---|---|
Redaction | RedactConfig | Redacts a value by removing it. | | | Any |
Replacement | ReplaceValueConfig | Replaces each input value with a given value. | | | Any |
Replace with dictionary | ReplaceDictionaryConfig | Replaces an input value with a value that is randomly selected from a word list. | | | Any |
Replace with infoType | ReplaceWithInfoTypeConfig | Replaces an input value with the name of its infoType. | | | Any |
Mask with character | CharacterMaskConfig | Masks a string either fully or partially by replacing a given number of characters with a specified fixed character. | | | Any |
Pseudonymization by replacing input value with cryptographic hash | CryptoHashConfig | Replaces input values with a 32-byte hexadecimal string generated using a given data encryption key. See pseudonymization conceptual documentation to learn more. | | ✔ | Strings or integers |
Pseudonymization by replacing with cryptographic format preserving token | CryptoReplaceFfxFpeConfig | Replaces an input value with a token, or surrogate value, of the same length using format-preserving encryption (FPE) with the FFX mode of operation. This allows the output to be used in systems that have format validation on length. This is useful for legacy systems where string length must be maintained. Important: For input that varies in length or has length greater than 32 bytes, use CryptoDeterministicConfig. To remain secure the following limits are recommended by the National Institute of Standards and Technology: | ✔ | ✔ | Strings or integers with a limited number of characters and of uniform length. The alphabet must be made up of at least 2 characters and contain no more than 95. |
Pseudonymization by replacing with cryptographic token | CryptoDeterministicConfig | Replaces an input value with a token, or surrogate value, using AES in Synthetic Initialization Vector mode (AES-SIV). This transformation method, unlike format-preserving tokenization, has no limitation on supported string character sets, generates identical tokens for each instance of an identical input value, and uses surrogates to enable re-identification given the original encryption key. | ✔ | ✔ | Any |
Bucket values based on fixed size ranges | FixedSizeBucketingConfig | Masks input values by replacing them with buckets, or ranges within which the input value falls. | | | Any |
Bucket values based on custom size ranges | BucketingConfig | Buckets input values based on user-configurable ranges and replacement values. | | | Any |
Date Shifting | DateShiftConfig | Shifts dates by a random number of days, with the option to be consistent for the same context. | | ✔ Preserves sequence and duration | Dates/Times |
Extract time data | TimePartConfig | Extracts or preserves a portion of Date, Timestamp, and TimeOfDay values. | | | Dates/Times |
Footnotes
¹ Reversible transformations can be re-identified using the content.reidentify method.
Redaction
If you want to simply remove sensitive data from your input content, Sensitive Data Protection supports a redaction transformation (RedactConfig in the DLP API).
For example, suppose you want to perform a simple redaction of all EMAIL_ADDRESS infoTypes, and the following string is sent to Sensitive Data Protection:
My name is Alicia Abernathy, and my email address is aabernathy@example.com.
The returned string will be the following:
My name is Alicia Abernathy, and my email address is .
The following JSON example shows how to form the API request and what the DLP API returns.
To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.
To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
HTTP method and URL
POST https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:deidentify
Replace PROJECT_ID
with the project ID.
JSON input
{
"item":{
"value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
},
"deidentifyConfig":{
"infoTypeTransformations":{
"transformations":[
{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
],
"primitiveTransformation":{
"redactConfig":{
}
}
}
]
}
},
"inspectConfig":{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
]
}
}
JSON output
{
"item":{
"value":"My name is Alicia Abernathy, and my email address is ."
},
"overview":{
"transformedBytes":"22",
"transformationSummaries":[
{
"infoType":{
"name":"EMAIL_ADDRESS"
},
"transformation":{
"redactConfig":{
}
},
"results":[
{
"count":"1",
"code":"SUCCESS"
}
],
"transformedBytes":"22"
}
]
}
}
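The request body shown above can also be assembled programmatically. The following Python sketch builds the same JSON payload using only the standard library. The helper name `build_redact_request` is hypothetical, and the sketch deliberately stops before sending the request, which would additionally need an authenticated HTTP client.

```python
import json


def build_redact_request(text, info_type_names):
    """Build a content:deidentify request body that redacts the given infoTypes.

    Hypothetical helper: it mirrors the JSON input shown above and does not
    call the DLP API.
    """
    info_types = [{"name": name} for name in info_type_names]
    return {
        "item": {"value": text},
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    {
                        "infoTypes": info_types,
                        # An empty redactConfig object selects redaction.
                        "primitiveTransformation": {"redactConfig": {}},
                    }
                ]
            }
        },
        "inspectConfig": {"infoTypes": info_types},
    }


request_body = build_redact_request(
    "My name is Alicia Abernathy, and my email address is aabernathy@example.com.",
    ["EMAIL_ADDRESS"],
)
print(json.dumps(request_body, indent=2))
```

Keeping the infoType list in one variable avoids the inspection and transformation sections drifting apart when more infoTypes are added.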
Replacement
The replacement transformations replace each input value with a value that you specify, a value randomly selected from a word list, or the name of its infoType.
Basic replacement
The basic replacement transformation (ReplaceValueConfig in the DLP API) replaces detected sensitive data values with a value that you specify. For example, suppose you've told Sensitive Data Protection to use "[email-address]" to replace all detected EMAIL_ADDRESS infoTypes, and the following string is sent to Sensitive Data Protection:
My name is Alicia Abernathy, and my email address is aabernathy@example.com.
The returned string is the following:
My name is Alicia Abernathy, and my email address is [email-address].
The following JSON example shows how to form the API request and what the DLP API returns.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
HTTP method and URL
POST https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:deidentify
Replace PROJECT_ID
with the project ID.
JSON input
{
"item":{
"value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
},
"deidentifyConfig":{
"infoTypeTransformations":{
"transformations":[
{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
],
"primitiveTransformation":{
"replaceConfig":{
"newValue":{
"stringValue":"[email-address]"
}
}
}
}
]
}
},
"inspectConfig":{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
]
}
}
JSON output
{
"item":{
"value":"My name is Alicia Abernathy, and my email address is [email-address]."
},
"overview":{
"transformedBytes":"22",
"transformationSummaries":[
{
"infoType":{
"name":"EMAIL_ADDRESS"
},
"transformation":{
"replaceConfig":{
"newValue":{
"stringValue":"[email-address]"
}
}
},
"results":[
{
"count":"1",
"code":"SUCCESS"
}
],
"transformedBytes":"22"
}
]
}
}
Dictionary replacement
Dictionary replacement (ReplaceDictionaryConfig) replaces each piece of detected sensitive data with a value that Sensitive Data Protection randomly selects from a list of words that you provide. This transformation method is useful if you want to use realistic surrogate values.
Suppose you want Sensitive Data Protection to detect email addresses and replace each detected value with one of three surrogate email addresses.
You send the following input string to Sensitive Data Protection along with the list of surrogate email addresses:
Input string
My name is Alicia Abernathy, and my email address is aabernathy@example.com.
Word list
- izumi@example.com
- alex@example.com
- tal@example.com
The returned string can be any of the following:
My name is Alicia Abernathy, and my email address is izumi@example.com.
My name is Alicia Abernathy, and my email address is alex@example.com.
My name is Alicia Abernathy, and my email address is tal@example.com.
The following JSON example shows how to form the API request and what the DLP API returns.
See the JSON quickstart for more information about using the DLP API with JSON.
HTTP method and URL
POST https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:deidentify
Replace PROJECT_ID
with the project ID.
JSON input
{
"item": {
"value": "My name is Alicia Abernathy, and my email address is aabernathy@example.com."
},
"deidentifyConfig": {
"infoTypeTransformations": {
"transformations": [
{
"infoTypes": [
{
"name": "EMAIL_ADDRESS"
}
],
"primitiveTransformation": {
"replaceDictionaryConfig": {
"wordList": {
"words": [
"izumi@example.com",
"alex@example.com",
"tal@example.com"
]
}
}
}
}
]
}
},
"inspectConfig": {
"infoTypes": [
{
"name": "EMAIL_ADDRESS"
}
]
}
}
JSON output
{
"item": {
"value": "My name is Alicia Abernathy, and my email address is izumi@example.com."
},
"overview": {
"transformedBytes": "22",
"transformationSummaries": [
{
"infoType": {
"name": "EMAIL_ADDRESS"
},
"transformation": {
"replaceDictionaryConfig": {
"wordList": {
"words": [
"izumi@example.com",
"alex@example.com",
"tal@example.com"
]
}
}
},
"results": [
{
"count": "1",
"code": "SUCCESS"
}
],
"transformedBytes": "22"
}
]
}
}
InfoType replacement
You can also specify an infoType replacement (ReplaceWithInfoTypeConfig in the DLP API). This transformation does the same thing as the basic replacement transformation, but it replaces each detected sensitive data value with the infoType of the detected value.
For example, suppose you've told Sensitive Data Protection to detect both email addresses and last names, and to replace each detected value with the value's infoType. You send the following string to Sensitive Data Protection:
My name is Alicia Abernathy, and my email address is aabernathy@example.com.
The returned string is the following:
My name is Alicia LAST_NAME, and my email address is EMAIL_ADDRESS.
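A request for this scenario might look like the following Python sketch, which assembles the JSON body with the standard library and does not call the API. It assumes that the REST field for this transformation is named `replaceWithInfoTypeConfig`, following the pattern of the other request bodies on this page, and that it takes an empty object.

```python
import json

text = "My name is Alicia Abernathy, and my email address is aabernathy@example.com."

# Detect both email addresses and last names, as in the scenario above.
info_types = [{"name": "EMAIL_ADDRESS"}, {"name": "LAST_NAME"}]

infotype_request = {
    "item": {"value": text},
    "deidentifyConfig": {
        "infoTypeTransformations": {
            "transformations": [
                {
                    "infoTypes": info_types,
                    # Assumed field name; no options are set for this transformation.
                    "primitiveTransformation": {"replaceWithInfoTypeConfig": {}},
                }
            ]
        }
    },
    "inspectConfig": {"infoTypes": info_types},
}

print(json.dumps(infotype_request, indent=2))
```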
Masking
You can configure Sensitive Data Protection to completely or partially mask a detected sensitive value (CharacterMaskConfig in the DLP API) by replacing each character with a fixed single masking character such as an asterisk (*) or hash (#). Masking can start from the beginning or end of the string. This transformation also works with number types such as long integers.
Sensitive Data Protection's masking transformation has the following options you can specify:
- Masking character (the maskingCharacter argument in the DLP API): The character to use to mask each character of a sensitive value. For example, you could specify an asterisk (*) or dollar sign ($) to mask a series of numbers such as those in a credit card number.
- The number of characters to mask (numberToMask): If you don't specify this value, all characters will be masked.
- Whether to reverse the order (reverseOrder): Whether to mask characters in reverse order. Reversing the order causes characters in matched values to be masked from the end toward the beginning of the value.
- Characters to ignore (charactersToIgnore): One or more characters to skip when masking values. For example, you can tell Sensitive Data Protection to leave hyphens in place when masking a telephone number. You can also specify a group of common characters (CharsToIgnore) to ignore when masking.
Suppose you send the following string to Sensitive Data Protection and instruct it to use the character masking transformation on email addresses:
My name is Alicia Abernathy, and my email address is aabernathy@example.com.
With the masking character set to '#', the characters to ignore set to the common character set, and otherwise default settings, Sensitive Data Protection returns the following:
My name is Alicia Abernathy, and my email address is ##########@#######.###.
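To make the interaction of these options concrete, the following Python sketch mimics them locally on a single detected value. It is an illustration only, not the service's implementation: the function name `mask_value` is hypothetical, and it assumes that a `number_to_mask` of 0 means "mask everything" and that skipped characters do not count toward `number_to_mask`.

```python
def mask_value(value, masking_character="*", number_to_mask=0,
               reverse_order=False, characters_to_skip=""):
    """Locally mimic the masking options described above (illustration only)."""
    chars = list(value)
    # Visit positions from the end when reverse_order is set.
    if reverse_order:
        positions = range(len(chars) - 1, -1, -1)
    else:
        positions = range(len(chars))
    masked = 0
    for i in positions:
        if number_to_mask and masked >= number_to_mask:
            break  # the requested number of characters has been masked
        if chars[i] in characters_to_skip:
            continue  # leave ignored characters in place
        chars[i] = masking_character
        masked += 1
    return "".join(chars)


# Mask everything except "." and "@", as in the email example above.
print(mask_value("aabernathy@example.com", "#", characters_to_skip=".@"))
# ##########@#######.###

# Mask only the last four digits of a made-up phone number, keeping hyphens.
print(mask_value("555-010-1234", "*", number_to_mask=4, reverse_order=True,
                 characters_to_skip="-"))
# 555-010-****
```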
The following JSON example demonstrates how the masking transformation works.
REST
See the JSON quickstart for more information about using the DLP API with JSON.
HTTP method and URL
POST https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:deidentify
Replace PROJECT_ID
with the project ID.
JSON input
{
"item":{
"value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
},
"deidentifyConfig":{
"infoTypeTransformations":{
"transformations":[
{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
],
"primitiveTransformation":{
"characterMaskConfig":{
"maskingCharacter":"#",
"reverseOrder":false,
"charactersToIgnore":[
{
"charactersToSkip":".@"
}
]
}
}
}
]
}
},
"inspectConfig":{
"infoTypes":[
{
"name":"EMAIL_ADDRESS"
}
]
}
}
JSON output
{
"item":{
"value":"My name is Alicia Abernathy, and my email address is ##########@#######.###."
},
"overview":{
"transformedBytes":"22",
"transformationSummaries":[
{
"infoType":{
"name":"EMAIL_ADDRESS"
},
"transformation":{
"characterMaskConfig":{
"maskingCharacter":"#",
"charactersToIgnore":[
{
"charactersToSkip":".@"
}
]
}
},
"results":[
{
"count":"1",
"code":"SUCCESS"
}
],
"transformedBytes":"22"
}
]
}
}
Crypto-based tokenization transformations
Crypto-based tokenization (also referred to as "pseudonymization") transformations are de-identification methods that replace the original sensitive data values with encrypted values. Sensitive Data Protection supports the following types of tokenization, including transformations that can be reversed and allow for re-identification:
- Cryptographic hashing: Given a CryptoKey, Sensitive Data Protection uses a SHA-256-based message authentication code (HMAC-SHA-256) on the input value, and then replaces the input value with the hashed value encoded in base64. Unlike other types of crypto-based transformations, this type of transformation isn't reversible.
- Format-preserving encryption: Replaces an input value with a token that has been generated using format-preserving encryption (FPE) with the FFX mode of operation. This transformation method produces a token that is limited to the same alphabet as the input value and is the same length as the input value. FPE also supports re-identification given the original encryption key.
- Deterministic encryption: Replaces an input value with a token that has been generated using AES in Synthetic Initialization Vector mode (AES-SIV). This transformation method has no limitation on supported string character sets, generates identical tokens for each instance of an identical input value, and uses surrogates to enable re-identification given the original encryption key.
Cryptographic hashing
The cryptographic hashing transformation (CryptoHashConfig in the DLP API) takes an input value (a piece of sensitive data that Sensitive Data Protection has detected) and replaces it with a hashed value. The hash value is generated by using a SHA-256-based message authentication code (HMAC-SHA-256) on the input value with a CryptoKey.
Sensitive Data Protection outputs a base64-encoded representation of the hashed input value in the place of the original value.
Before using the cryptographic hashing transformation, keep in mind the following:
- The input value is not encrypted but hashed.
- This transformation can't be reversed. That is, given the hashed output value of the transformation and the original cryptographic key, there is no way to restore the original value.
- Currently, only string and integer values can be hashed.
- The hashed output of the transformation is always the same length, depending on the size of the cryptographic key. For example, if you use the cryptographic hashing transformation on 10-digit phone numbers, each phone number will be replaced by a fixed-length base64-encoded hash value.
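The properties listed above can be observed with Python's standard library. The following sketch applies HMAC-SHA-256 to two made-up phone numbers and base64-encodes the result. It illustrates the underlying primitive only: the key is a placeholder, the helper name `hmac_sha256_token` is hypothetical, and the tokens are not expected to match what Sensitive Data Protection returns, because the service's exact input encoding is not described here.

```python
import base64
import hashlib
import hmac


def hmac_sha256_token(value, key):
    """Return the base64-encoded HMAC-SHA-256 of value under key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder 32-byte key for illustration only; never hard-code real keys.
demo_key = bytes(range(32))

token_a = hmac_sha256_token("4155550100", demo_key)
token_b = hmac_sha256_token("4155550100", demo_key)
token_c = hmac_sha256_token("4155550199", demo_key)

print(token_a == token_b)  # True: identical inputs map to identical tokens
print(token_a == token_c)  # False: different inputs map to different tokens
print(len(token_a), len(token_c))  # 44 44: a 32-byte digest is 44 base64 characters
```

Because the token is a keyed hash rather than a ciphertext, nothing in the output can be decrypted back to the phone number, which is why this transformation is listed as not reversible.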
Format-preserving encryption
The format-preserving encryption (FPE) transformation method (CryptoReplaceFfxFpeConfig in the DLP API) takes an input value (a piece of sensitive data that Sensitive Data Protection has detected), encrypts it using format-preserving encryption in FFX mode and a CryptoKey, and then replaces the original value with the encrypted value, or token.
The input value:
- Must be at least two characters long (or the empty string).
- Must be encoded as ASCII.
- Must consist of characters specified by an "alphabet," which is the set of between 2 and 95 allowed characters in the input value. For more information, see the alphabet field in CryptoReplaceFfxFpeConfig.
The generated token:
- Is the encrypted input value.
- Preserves the character set ("alphabet") and length of the input value post-encryption.
- Is computed using format-preserving encryption in FFX mode keyed on the specified cryptographic key.
- Isn't necessarily unique, as each instance of the same input value de-identifies to the same token. This enables referential integrity, and therefore enables more efficient searching of de-identified data. You can change this behavior by using context "tweaks," as described in Contexts.
If there are multiple instances of an input value in the source content, each one is de-identified to the same token. You can change this behavior by adding a context "tweak," which can improve security: with a tweak, Sensitive Data Protection can de-identify multiple instances of the same input value to different tokens. FPE preserves both length and alphabet space (the character set), which is limited to 95 characters. If you don't need to preserve the length and alphabet space of the original values, use deterministic encryption, described below.
Sensitive Data Protection computes the replacement token using a cryptographic key. You provide this key in one of three ways:
- By embedding it unencrypted in the API request. This is not recommended.
- By requesting that Sensitive Data Protection generate it.
- By embedding it encrypted in the API request.
If you choose to embed the encrypted key in the API request, you need to create a key and wrap (encrypt) it using a Cloud Key Management Service (Cloud KMS) key. For more information, see Create a wrapped key. The wrapped key is returned as a base64-encoded string by default. To set this value in Sensitive Data Protection, you must decode it into a byte string. The following code snippets highlight how to do this in several languages. End-to-end examples are provided following these snippets.
Java
KmsWrappedCryptoKey.newBuilder()
.setWrappedKey(ByteString.copyFrom(BaseEncoding.base64().decode(wrappedKey)))
Python
# The wrapped key is base64-encoded, but the library expects a binary
# string, so decode it here.
import base64
wrapped_key = base64.b64decode(wrapped_key)
PHP
// Create the wrapped crypto key configuration object
$kmsWrappedCryptoKey = (new KmsWrappedCryptoKey())
->setWrappedKey(base64_decode($wrappedKey))
->setCryptoKeyName($keyName);
C#
WrappedKey = ByteString.FromBase64(wrappedKey)
For more information about encrypting and decrypting data using Cloud KMS, see Encrypting and Decrypting Data.
Format-preserving encryption examples: de-identification
This example uses the
CryptoReplaceFfxFpeConfig
transformation method to de-identify sensitive data. For more information, see
Format-preserving encryption on this page.
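As a rough illustration, the following Python sketch assembles a content:deidentify request body that applies CryptoReplaceFfxFpeConfig with a Cloud KMS-wrapped key. It uses only the standard library and does not call the API. The helper name `build_fpe_request`, the sample text, and the key values are placeholders. The sketch also assumes that the JSON body carries the wrapped key in its base64 form, whereas the client-library snippets on this page decode it to bytes.

```python
import json


def build_fpe_request(text, info_type_names, wrapped_key_b64, kms_key_name,
                      common_alphabet="NUMERIC"):
    """Build a content:deidentify body that tokenizes findings with FPE (FFX)."""
    info_types = [{"name": name} for name in info_type_names]
    return {
        "item": {"value": text},
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    {
                        "infoTypes": info_types,
                        "primitiveTransformation": {
                            "cryptoReplaceFfxFpeConfig": {
                                "cryptoKey": {
                                    "kmsWrapped": {
                                        "wrappedKey": wrapped_key_b64,
                                        "cryptoKeyName": kms_key_name,
                                    }
                                },
                                # The alphabet must cover every character of the input.
                                "commonAlphabet": common_alphabet,
                            }
                        },
                    }
                ]
            }
        },
        "inspectConfig": {"infoTypes": info_types},
    }


fpe_request = build_fpe_request(
    "My SSN is 372819127",  # made-up sample value
    ["US_SOCIAL_SECURITY_NUMBER"],
    wrapped_key_b64="WRAPPED_KEY_BASE64",  # placeholder
    kms_key_name="projects/PROJECT_ID/locations/global/keyRings/KEY_RING/cryptoKeys/KEY_NAME",  # placeholder
)
print(json.dumps(fpe_request, indent=2))
```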
Format-preserving encryption examples: de-identification with surrogate type
This example uses the
CryptoReplaceFfxFpeConfig
transformation method to de-identify sensitive data. For more information, see
Format-preserving encryption on this page.
Format-preserving encryption examples: de-identification of sensitive data in tables
This example uses the
CryptoReplaceFfxFpeConfig
transformation method to de-identify sensitive data in tables. For more
information, see Format-preserving encryption on this page.
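For tabular data, the transformation is attached to named columns instead of infoTypes. The following Python sketch shows one way such a request body could be assembled with the standard library, without calling the API. The helper name `build_table_fpe_request`, the column names, the sample rows, and the key values are all placeholders, and the body layout (`item.table` plus `recordTransformations.fieldTransformations`) is an assumption about the REST shape, not something taken from this page.

```python
import json


def build_table_fpe_request(headers, rows, field_names, wrapped_key_b64, kms_key_name):
    """Build a content:deidentify body that applies FPE to selected table columns."""
    table = {
        "headers": [{"name": header} for header in headers],
        "rows": [
            {"values": [{"stringValue": str(cell)} for cell in row]}
            for row in rows
        ],
    }
    return {
        "item": {"table": table},
        "deidentifyConfig": {
            "recordTransformations": {
                "fieldTransformations": [
                    {
                        # Only the listed columns are transformed.
                        "fields": [{"name": name} for name in field_names],
                        "primitiveTransformation": {
                            "cryptoReplaceFfxFpeConfig": {
                                "cryptoKey": {
                                    "kmsWrapped": {
                                        "wrappedKey": wrapped_key_b64,
                                        "cryptoKeyName": kms_key_name,
                                    }
                                },
                                "commonAlphabet": "NUMERIC",
                            }
                        },
                    }
                ]
            }
        },
    }


table_request = build_table_fpe_request(
    headers=["employee_id", "name"],          # made-up columns
    rows=[["11111", "Alex"], ["22222", "Tal"]],  # made-up rows
    field_names=["employee_id"],
    wrapped_key_b64="WRAPPED_KEY_BASE64",  # placeholder
    kms_key_name="projects/PROJECT_ID/locations/global/keyRings/KEY_RING/cryptoKeys/KEY_NAME",  # placeholder
)
print(json.dumps(table_request, indent=2))
```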
Format-preserving encryption examples: re-identification
Following is sample code in several languages that demonstrates how to use Sensitive Data Protection to re-identify sensitive data that was de-identified through the CryptoReplaceFfxFpeConfig transformation method. For more information, see Format-preserving encryption on this page.
Format-preserving encryption examples: re-identification of text
Following is sample code in several languages that demonstrates how to use Sensitive Data Protection to re-identify sensitive text that was de-identified through the CryptoReplaceFfxFpeConfig transformation method. For more information, see Format-preserving encryption on this page.
Format-preserving encryption examples: re-identification with surrogate type
Following is sample code in several languages that demonstrates how to use Sensitive Data Protection to re-identify sensitive data that was de-identified through the CryptoReplaceFfxFpeConfig transformation method. For more information, see Format-preserving encryption on this page.
Format-preserving encryption examples: re-identification of sensitive data in tables
Following is sample code in several languages that demonstrates how to use Sensitive Data Protection to re-identify sensitive data in tables that were de-identified through the CryptoReplaceFfxFpeConfig transformation method. For more information, see Format-preserving encryption on this page.
Deterministic encryption
The deterministic encryption transformation method (CryptoDeterministicConfig in the DLP API) takes an input value (a piece of sensitive data that Sensitive Data Protection has detected), encrypts it using AES-SIV with a CryptoKey, and then replaces the original value with a base64-encoded representation of the encrypted value.
Using the deterministic encryption transformation enables more efficient searching of encrypted data.
The input value:
- Must be at least 1 character long.
- Has no character set limitations.
The generated token:
- Is a base64-encoded representation of the encrypted value.
- Does not preserve the character set ("alphabet") or length of the input value post-encryption.
- Is computed using AES encryption in SIV mode (AES-SIV) with a CryptoKey.
- Isn't necessarily unique, as each instance of the same input value de-identifies to the same token. This enables more efficient searching of encrypted data. You can change this behavior by using context "tweaks," as described in Contexts.
- Is generated with a prefix added, in the form [SURROGATE_TYPE]([LENGTH]):, where [SURROGATE_TYPE] represents a surrogate infoType describing the input value, and [LENGTH] indicates its character length. The surrogate enables the token to be re-identified using the original encryption key used for de-identification.
Following is an example JSON configuration for de-identification using deterministic encryption. Note that we've chosen to use "PHONE_SURROGATE" as our descriptive surrogate type since we're de-identifying telephone numbers. [CRYPTO_KEY] represents an unwrapped cryptographic key obtained from Cloud KMS.
{
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"PHONE_NUMBER"
            }
          ],
          "primitiveTransformation":{
            "cryptoDeterministicConfig":{
              "cryptoKey":{
                "unwrapped":{
                  "key":"[CRYPTO_KEY]"
                }
              },
              "surrogateInfoType":{
                "name":"PHONE_SURROGATE"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"PHONE_NUMBER"
      }
    ]
  },
  "item":{
    "value":"My phone number is 206-555-0574, call me"
  }
}
De-identifying the string "My phone number is 206-555-0574, call me" using this transformation results in a de-identified string such as the following:
My phone number is PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me
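The surrogate prefix makes tokens straightforward to locate in free text. The following minimal Python sketch isn't part of the DLP API; it assumes the SURROGATE_TYPE(LENGTH): form described above and that, as in this example, the number equals the character length of the base64 token that follows the colon. The function name find_surrogate_tokens is hypothetical.

```python
import re

# Matches a surrogate prefix such as "PHONE_SURROGATE(36):".
# Assumption: surrogate names use only letters, digits, and underscores.
_PREFIX = re.compile(r"([A-Za-z0-9_]+)\((\d+)\):")


def find_surrogate_tokens(text):
    """Return (surrogate_type, token) pairs found in text."""
    found = []
    for match in _PREFIX.finditer(text):
        length = int(match.group(2))
        token = text[match.end():match.end() + length]
        # Skip matches where fewer than `length` characters remain.
        if len(token) == length:
            found.append((match.group(1), token))
    return found


sample = (
    "My phone number is "
    "PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me"
)
print(find_surrogate_tokens(sample))
```

Reading the length from the prefix, rather than guessing where the token ends, keeps trailing punctuation such as the comma out of the extracted token.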
To re-identify this string, you can use a JSON request like the following, where [CRYPTO_KEY] is the same cryptographic key used to de-identify the content.
{
  "reidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"PHONE_SURROGATE"
            }
          ],
          "primitiveTransformation":{
            "cryptoDeterministicConfig":{
              "cryptoKey":{
                "unwrapped":{
                  "key":"[CRYPTO_KEY]"
                }
              },
              "surrogateInfoType":{
                "name":"PHONE_SURROGATE"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "customInfoTypes":[
      {
        "infoType":{
          "name":"PHONE_SURROGATE"
        },
        "surrogateType":{}
      }
    ]
  },
  "item":{
    "value":"My phone number is PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me"
  }
}
Re-identifying this string results in the original string:
My phone number is 206-555-0574, call me
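The de-identify and re-identify requests differ in only a few places. As a sketch of that relationship, the following Python builds both request bodies as plain dictionaries that mirror the JSON above. It doesn't call the API; the helper names are hypothetical, and the key is the same [CRYPTO_KEY] placeholder used in the JSON.

```python
import json


def _crypto_config(key_b64, surrogate_name):
    """The cryptoDeterministicConfig block shared by both requests."""
    return {
        "cryptoKey": {"unwrapped": {"key": key_b64}},
        "surrogateInfoType": {"name": surrogate_name},
    }


def _transformations(info_type, key_b64, surrogate_name):
    return {"infoTypeTransformations": {"transformations": [{
        "infoTypes": [{"name": info_type}],
        "primitiveTransformation": {
            "cryptoDeterministicConfig": _crypto_config(key_b64, surrogate_name)
        },
    }]}}


def build_deidentify_body(text, info_type, surrogate_name, key_b64):
    """Inspect for a built-in infoType and tokenize each finding."""
    return {
        "deidentifyConfig": _transformations(info_type, key_b64, surrogate_name),
        "inspectConfig": {"infoTypes": [{"name": info_type}]},
        "item": {"value": text},
    }


def build_reidentify_body(text, surrogate_name, key_b64):
    """Inspect for the surrogate as a custom infoType and reverse it."""
    return {
        "reidentifyConfig": _transformations(surrogate_name, key_b64, surrogate_name),
        "inspectConfig": {"customInfoTypes": [{
            "infoType": {"name": surrogate_name},
            "surrogateType": {},
        }]},
        "item": {"value": text},
    }


deid = build_deidentify_body(
    "My phone number is 206-555-0574, call me",
    "PHONE_NUMBER", "PHONE_SURROGATE", "[CRYPTO_KEY]",
)
reid = build_reidentify_body(
    "My phone number is PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me",
    "PHONE_SURROGATE", "[CRYPTO_KEY]",
)
print(json.dumps(reid, indent=2))
```

Both bodies carry the same key and surrogate name; the re-identify body targets the surrogate infoType instead of PHONE_NUMBER and declares it under customInfoTypes.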
If you want to use a wrapped (encrypted) CryptoKey instead for better security, see Quickstart: De-identifying and re-identifying sensitive text for an example. When you're ready to use a client library to de-identify content, remember to decode the wrapped key (which is a base64-encoded string by default), as demonstrated in Format-preserving encryption on this page.
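As a small illustration of that decoding step, the following Python sketch turns a base64-encoded wrapped key string into raw bytes using only the standard library. The helper name is hypothetical and the sample value is a made-up placeholder, not a real key.

```python
import base64
import binascii


def decode_wrapped_key(wrapped_key_b64):
    """Decode a base64 string into the raw bytes of a wrapped key."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(wrapped_key_b64, validate=True)
    except binascii.Error as err:
        raise ValueError("wrapped key is not valid base64") from err


# Placeholder standing in for a wrapped key string.
placeholder = base64.b64encode(b"not-a-real-wrapped-key").decode("ascii")
raw = decode_wrapped_key(placeholder)
print(len(raw), "bytes")
```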
Deterministic encryption examples: de-identification
This example uses the CryptoDeterministicConfig transformation method to de-identify sensitive data. For more information, see Deterministic encryption on this page.
Deterministic encryption examples: re-identification
This example demonstrates how to re-identify sensitive data that was de-identified through the CryptoDeterministicConfig transformation method. For more information, see Deterministic encryption on this page.
Bucketing
The bucketing transformations serve to de-identify numerical data by "bucketing" it into ranges. The resulting number range is a hyphenated string consisting of a lower bound, a hyphen, and an upper bound.
Fixed-size bucketing
Sensitive Data Protection can bucket numerical input values based on fixed-size ranges (FixedSizeBucketingConfig in the DLP API). You specify the following to configure fixed-size bucketing:
- The lower bound value of all of the buckets. Any values less than the lower bound are grouped together in a single bucket.
- The upper bound value of all of the buckets. Any values greater than the upper bound are grouped together in a single bucket.
- The size of each bucket other than the minimum and maximum buckets.
For example, if the lower bound is set to 10, the upper bound is set to 89, and the bucket size is set to 10, then the following buckets would be used: -10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-89, 89+.
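The bucket assignment in this example can be sketched in a few lines of Python. This is an illustration of the arithmetic, not the service's implementation. It assumes integer inputs, that each bucket includes its lower edge and excludes its upper edge, and that a value equal to the upper bound falls in the top bucket; those boundary choices and the function name are assumptions.

```python
def fixed_size_bucket(value, lower, upper, size):
    """Return the label of the fixed-size bucket that contains value."""
    if size <= 0 or lower >= upper:
        raise ValueError("need size > 0 and lower < upper")
    if value < lower:
        return f"-{lower}"   # everything below the lower bound
    if value >= upper:
        return f"{upper}+"   # everything at or above the upper bound
    # Start of the bucket: step up from the lower bound in whole sizes.
    start = lower + ((value - lower) // size) * size
    # The last regular bucket is clipped at the upper bound.
    end = min(start + size, upper)
    return f"{start}-{end}"


# Collect the distinct labels for 0..99 in order of first appearance.
labels = []
for v in range(100):
    label = fixed_size_bucket(v, lower=10, upper=89, size=10)
    if label not in labels:
        labels.append(label)
print(labels)
```

With a lower bound of 10, an upper bound of 89, and a size of 10, the labels produced are the same ten buckets listed above, including the shortened 80-89 bucket.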
For more information about the concept of bucketing, see Generalization and Bucketing.
Customizable bucketing
Customizable bucketing (BucketingConfig in the DLP API) offers more flexibility than fixed-size bucketing. Instead of specifying upper and lower bounds and an interval value with which to create equal-sized buckets, you specify the maximum and minimum values for each bucket you want created. Each maximum and minimum value pair must have the same type.
You set up customizable bucketing by specifying individual buckets. Each bucket has the following properties:
- The lower bound of the bucket's range. Omit this value to create a bucket that has no lower bound.
- The upper bound of the bucket's range. Omit this value to create a bucket that has no upper bound.
- The replacement value for this bucket range. This is the value with which to replace all detected values that fall within the lower and upper bounds. If you don't provide a replacement value, a hyphenated min-max range will be generated instead.
For example, consider the following JSON configuration for this bucketing transformation:
"bucketingConfig":{
  "buckets":[
    {
      "min":{
        "integerValue":"1"
      },
      "max":{
        "integerValue":"30"
      },
      "replacementValue":{
        "stringValue":"LOW"
      }
    },
    {
      "min":{
        "integerValue":"31"
      },
      "max":{
        "integerValue":"65"
      },
      "replacementValue":{
        "stringValue":"MEDIUM"
      }
    },
    {
      "min":{
        "integerValue":"66"
      },
      "max":{
        "integerValue":"100"
      },
      "replacementValue":{
        "stringValue":"HIGH"
      }
    }
  ]
}
This defines the following behavior:
- Integer values falling between 1 and 30 are masked by being replaced with LOW.
- Integer values falling between 31 and 65 are masked by being replaced with MEDIUM.
- Integer values falling between 66 and 100 are masked by being replaced with HIGH.
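The lookup that this configuration describes can be sketched in Python as follows. The sketch treats both bounds as inclusive, which matches the wording above; the API's exact boundary handling (in particular whether the maximum is exclusive) should be checked against the Bucket reference. The dictionary layout, the function name, and returning None for unmatched values are choices made for this illustration.

```python
# Buckets mirroring the JSON configuration above.
BUCKETS = [
    {"min": 1, "max": 30, "replacement": "LOW"},
    {"min": 31, "max": 65, "replacement": "MEDIUM"},
    {"min": 66, "max": 100, "replacement": "HIGH"},
]


def apply_buckets(value, buckets):
    """Return the replacement for the first bucket that contains value."""
    for bucket in buckets:
        low = bucket.get("min")    # None means no lower bound
        high = bucket.get("max")   # None means no upper bound
        if low is not None and value < low:
            continue
        if high is not None and value > high:
            continue
        if "replacement" in bucket:
            return bucket["replacement"]
        # No replacement value: fall back to a hyphenated min-max range.
        # Rendering a missing bound as an empty string is this sketch's choice.
        low_text = "" if low is None else str(low)
        high_text = "" if high is None else str(high)
        return f"{low_text}-{high_text}"
    return None  # value falls outside every bucket


for age in (15, 31, 100, 0):
    print(age, apply_buckets(age, BUCKETS))
```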
For more information about the concept of bucketing, see Generalization and Bucketing.
Date shifting
When you use the date shifting transformation (DateShiftConfig in the DLP API) on a date input value, Sensitive Data Protection shifts the dates by a random number of days.
Date shifting techniques randomly shift a set of dates but preserve the sequence and duration of a period of time. Shifting dates is usually done in the context of an individual or an entity. That is, you want to shift all of the dates for a specific individual using the same shift differential but use a separate shift differential for each other individual.
For more information about date shifting, see Date shifting.
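One way to get that per-individual consistency is to derive the shift from a keyed hash of an identifier. The sketch below does this with the Python standard library. It illustrates the idea only and isn't how Sensitive Data Protection computes its shift; the function names, the default range of -100 to 100 days, and the sample key are assumptions.

```python
import hashlib
import hmac
from datetime import date, timedelta


def shift_days_for(entity_id, key, lower, upper):
    """Derive a repeatable shift in [lower, upper] days for one entity."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    digest = hmac.new(key, entity_id.encode("utf-8"), hashlib.sha256).digest()
    span = upper - lower + 1
    return lower + int.from_bytes(digest[:8], "big") % span


def shift_dates(dates, entity_id, key, lower=-100, upper=100):
    """Shift every date for an entity by the same number of days."""
    delta = timedelta(days=shift_days_for(entity_id, key, lower, upper))
    return [d + delta for d in dates]


key = b"example-key-for-illustration-only"
visits = [date(2020, 1, 1), date(2020, 1, 15), date(2020, 3, 1)]
shifted = shift_dates(visits, "patient-1", key)
print(shifted)
```

Because every date for "patient-1" moves by the same offset, the order of the visits and the number of days between them are unchanged, while a different identifier generally gets a different offset.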
Following is sample code in several languages that demonstrates how to use the Cloud DLP API to de-identify dates using date shifting.
Time extraction
Performing time extraction (TimePartConfig in the DLP API) on a date, a time, or a timestamp preserves a portion of the matched value. You specify to Sensitive Data Protection what kind of time value you want to extract, including year, month, day of the month, and so on (enumerated in the TimePart object).
For example, suppose you've configured a timePartConfig transformation by setting the time part to extract to YEAR. After sending the data in the first column below to Sensitive Data Protection, you'd end up with the transformed values in the second column:
Original values | Transformed values |
---|---|
9/21/1976 | 1976 |
6/7/1945 | 1945 |
1/20/2009 | 2009 |
7/4/1776 | 1776 |
8/1/1984 | 1984 |
4/21/1982 | 1982 |
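The same extraction can be reproduced locally to check expectations. The following Python sketch parses month/day/year strings like those in the table and returns one part of each date. It isn't the service's implementation; the function name and the assumed input format are illustrative, and only three time parts are covered.

```python
from datetime import datetime

# Time parts handled by this sketch, keyed by name.
_EXTRACTORS = {
    "YEAR": lambda d: d.year,
    "MONTH": lambda d: d.month,
    "DAY_OF_MONTH": lambda d: d.day,
}


def extract_time_part(value, part, fmt="%m/%d/%Y"):
    """Parse value with fmt and return the requested part as a string."""
    if part not in _EXTRACTORS:
        raise ValueError(f"unsupported time part: {part}")
    parsed = datetime.strptime(value, fmt)
    return str(_EXTRACTORS[part](parsed))


originals = [
    "9/21/1976", "6/7/1945", "1/20/2009",
    "7/4/1776", "8/1/1984", "4/21/1982",
]
years = [extract_time_part(v, "YEAR") for v in originals]
print(years)
```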