Transformation reference

This topic covers the available de-identification tokenization techniques, or transformations, in Cloud DLP.

Types of tokenization techniques

Choosing the tokenization transformation you want to use depends on the kind of data you want to de-identify and for what purpose you're de-identifying the data. The tokenization techniques that Cloud DLP supports fall into the following general categories:

  • Redaction: Deletes all or part of a detected sensitive value.
  • Replacement: Replaces a detected sensitive value with a specified surrogate value.
  • Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*).
  • Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Cloud DLP supports several types of tokenization, including transformations that can be reversed, or "re-identified."
  • Bucketing: "Generalizes" a sensitive value by replacing it with a range of values. (For example, replacing a specific age with an age range, or temperatures with ranges corresponding to "Hot," "Medium," and "Cold.")
  • Date shifting: Shifts sensitive date values by a random amount of time.
  • Time extraction: Extracts or preserves specified portions of date and time values.

The remainder of this topic covers each different type of tokenization transformation and provides examples of their use.

Transformation methods

The following table lists the tokenization transformations that Cloud DLP provides to de-identify sensitive data:

Transformation Object Description Can Reverse1 Referential Integrity2 Input Type
Redaction RedactConfig Redacts a value by removing it. Any
Replacement ReplaceValueConfig Replaces each input value with a given value. Any
Replace with infoType ReplaceWithInfoTypeConfig Replaces an input value with the name of its infoType. Any
Mask with character CharacterMaskConfig Masks a string either fully or partially by replacing a given number of characters with a specified fixed character. Any
Pseudonymization by replacing input value with cryptographic hash CryptoHashConfig Replaces input values with a 32-byte hexadecimal string generated using a given data encryption key. See pseudonymization conceptual documentation to learn more. Strings or integers
Pseudonymization by replacing with cryptographic format preserving token CryptoReplaceFfxFpeConfig Replaces an input value with a token, or surrogate value, of the same length using format-preserving encryption (FPE) with the FFX mode of operation. This allows the output to be used in systems that have format validation or need to appear as real even though the actual information is not revealed. See pseudonymization conceptual documentation to learn more. Strings or integers with a limited number of characters. The alphabet must be made up of at least 2 characters and contain no more than 62.
Pseudonymization by replacing with cryptographic token CryptoDeterministicConfig Replaces an input value with a token, or surrogate value, of the same length using AES in Synthetic Initialization Vector mode (AES-SIV). This transformation method, unlike format-preserving tokenization, has no limitation on supported string character sets, generates identical tokens for each instance of an identical input value, and uses [surrogates](reference/rest/v2/InspectConfig#surrogatetype) to enable re-identification given the original encryption key. Any
Bucket values based on fixed size ranges FixedSizeBucketingConfig Masks input values by replacing them with buckets, or ranges within which the input value falls. Any
Bucket values based on custom size ranges BucketingConfig Buckets input values based on user-configurable ranges and replacement values. Any
Date Shifting DateShiftConfig Shifts dates by a random number of days, with the option to be consistent for the same context.
Preserves sequence and duration
Dates/Times
Extract time data TimePartConfig Extracts or preserves a portion of Date, Timestamp, and TimeOfDay values. Dates/Times

Footnotes

1 Reversible transformations can be reversed to re-identify the sensitive data using the content.reidentify method.
2 Referential integrity allows for records to maintain their relationship to one another while still de-identifying the data. For example, given the same crypto key and context, the data will be replaced with the same obfuscuted form each time it is transformed allowing for connections between records to be preserved.

Redaction

If you want to simply remove sensitive data from your input content, Cloud DLP supports a redaction transformation (RedactConfig in the DLP API).

For example, suppose you want to perform a simple redaction of all EMAIL_ADDRESS infoTypes, and the following string is sent to Cloud DLP:

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

The returned string will be the following:

My name is Alicia Abernathy, and my email address is .

The following JSON example shows how to form the API request and what the Cloud DLP API returns:

JSON Input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify?key={YOUR_API_KEY}

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
  },
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "primitiveTransformation":{
            "redactConfig":{

            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"EMAIL_ADDRESS"
      }
    ]
  }
}

JSON Output:

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is ."
  },
  "overview":{
    "transformedBytes":"22",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "redactConfig":{

          }
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"22"
      }
    ]
  }
}

Replacement

The replacement transformations replace each input value with either a given token value or with the name of its infoType.

Basic replacement

The basic replacement transformation (ReplaceValueConfig in the DLP API) replaces detected sensitive data values with a value that you specify. For example, suppose you've told Cloud DLP to use "[fake@example.com]" to replace all detected EMAIL_ADDRESS infoTypes, and the following string is sent to Cloud DLP:

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

The returned string is the following:

My name is Alicia Abernathy, and my email address is [fake@example.com].

The following JSON example shows how to form the API request and what the Cloud DLP API returns:

JSON Input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify?key={YOUR_API_KEY}

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
  },
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "primitiveTransformation":{
            "replaceConfig":{
              "newValue":{
                "stringValue":"[email-address]"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"EMAIL_ADDRESS"
      }
    ]
  }
}

JSON Output:

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is [email-address]."
  },
  "overview":{
    "transformedBytes":"22",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "replaceConfig":{
            "newValue":{
              "stringValue":"[email-address]"
            }
          }
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"22"
      }
    ]
  }
}

InfoType replacement

You can also specify an infoType replacement (ReplaceWithInfoTypeConfig in the DLP API). This transformation does the same thing as the basic replacement transformation, but it replaces each detected sensitive data value with the infoType of the detected value.

For example, suppose you've told Cloud DLP to detect both email addresses and last names, and to replace each detected value with the value's infoType. You send the following string to Cloud DLP:

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

The returned string is the following:

My name is Alicia LAST_NAME, and my email address is EMAIL_ADDRESS.

Masking

You can configure Cloud DLP to completely or partially mask a detected sensitive value (CharacterMaskConfig in the DLP API) by replacing each character with a fixed single masking character such as an asterisk (*) or hash (#). Masking can start from the beginning or end of the string. This transformation also works with number types such as long integers.

Cloud DLP's masking transformation has the following options you can specify:

  • Masking character (The maskingCharacter argument in the DLP API): The character to use to mask each character of a sensitive value. For example, you could specify an asterisk (*) or dollar sign ($) to mask a series of numbers such as those in a credit card number.
  • The number of characters to mask (numberToMask): If you don't specify this value, all characters will be masked.
  • Whether to reverse the order (reverseOrder): Whether to mask characters in reverse order. Reversing the order causes characters in matched values to be masked from the end toward the beginning of the value.
  • Characters to ignore (charactersToIgnore): One or more characters to skip when masking values. For example, you can tell Cloud DLP to leave hyphens in place when masking a telephone number. You can also specify a group of common characters (CharsToIgnore) to ignore when masking.

Suppose you send the following string to Cloud DLP and instruct it to use the character masking transformation on email addresses:

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

With the masking character sent to '#,' the characters to ignore set to the common character set, and otherwise default settings, Cloud DLP returns the following:

My name is Alicia Abernathy, and my email address is ##########@#######.###.

The following JSON and code examples demonstrate how the masking transformation works.

Protocol

JSON Input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify?key={YOUR_API_KEY}

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
  },
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "primitiveTransformation":{
            "characterMaskConfig":{
              "maskingCharacter":"#",
              "reverseOrder":false,
              "charactersToIgnore":[
                {
                  "charactersToSkip":".@"
                }
              ]
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"EMAIL_ADDRESS"
      }
    ]
  }
}

JSON Output:

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is ##########@#######.###."
  },
  "overview":{
    "transformedBytes":"22",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "characterMaskConfig":{
            "maskingCharacter":"#",
            "charactersToIgnore":[
              {
                "charactersToSkip":".@"
              }
            ]
          }
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"22"
      }
    ]
  }
}

Java

/**
 * Deidentify a string by masking sensitive information with a character using the DLP API.
 *
 * @param string The string to deidentify.
 * @param maskingCharacter (Optional) The character to mask sensitive data with.
 * @param numberToMask (Optional) The number of characters' worth of sensitive data to mask.
 *     Omitting this value or setting it to 0 masks all sensitive chars.
 * @param projectId ID of Google Cloud project to run the API under.
 */
private static void deIdentifyWithMask(
    String string,
    List<InfoType> infoTypes,
    Character maskingCharacter,
    int numberToMask,
    String projectId) {

  // instantiate a client
  try (DlpServiceClient dlpServiceClient = DlpServiceClient.create()) {

    ContentItem contentItem = ContentItem.newBuilder().setValue(string).build();

    CharacterMaskConfig characterMaskConfig =
        CharacterMaskConfig.newBuilder()
            .setMaskingCharacter(maskingCharacter.toString())
            .setNumberToMask(numberToMask)
            .build();

    // Create the deidentification transformation configuration
    PrimitiveTransformation primitiveTransformation =
        PrimitiveTransformation.newBuilder().setCharacterMaskConfig(characterMaskConfig).build();

    InfoTypeTransformation infoTypeTransformationObject =
        InfoTypeTransformation.newBuilder()
            .setPrimitiveTransformation(primitiveTransformation)
            .build();

    InfoTypeTransformations infoTypeTransformationArray =
        InfoTypeTransformations.newBuilder()
            .addTransformations(infoTypeTransformationObject)
            .build();

    InspectConfig inspectConfig =
        InspectConfig.newBuilder()
            .addAllInfoTypes(infoTypes)
            .build();

    DeidentifyConfig deidentifyConfig =
        DeidentifyConfig.newBuilder()
            .setInfoTypeTransformations(infoTypeTransformationArray)
            .build();

    // Create the deidentification request object
    DeidentifyContentRequest request =
        DeidentifyContentRequest.newBuilder()
            .setParent(ProjectName.of(projectId).toString())
            .setInspectConfig(inspectConfig)
            .setDeidentifyConfig(deidentifyConfig)
            .setItem(contentItem)
            .build();

    // Execute the deidentification request
    DeidentifyContentResponse response = dlpServiceClient.deidentifyContent(request);

    // Print the character-masked input value
    // e.g. "My SSN is 123456789" --> "My SSN is *********"
    String result = response.getItem().getValue();
    System.out.println(result);
  } catch (Exception e) {
    System.out.println("Error in deidentifyWithMask: " + e.getMessage());
  }
}

Node.js

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const callingProjectId = process.env.GCLOUD_PROJECT;

// The string to deidentify
// const string = 'My SSN is 372819127';

// (Optional) The maximum number of sensitive characters to mask in a match
// If omitted from the request or set to 0, the API will mask any matching characters
// const numberToMask = 5;

// (Optional) The character to mask matching sensitive data with
// const maskingCharacter = 'x';

// Construct deidentification request
const item = {value: string};
const request = {
  parent: dlp.projectPath(callingProjectId),
  deidentifyConfig: {
    infoTypeTransformations: {
      transformations: [
        {
          primitiveTransformation: {
            characterMaskConfig: {
              maskingCharacter: maskingCharacter,
              numberToMask: numberToMask,
            },
          },
        },
      ],
    },
  },
  item: item,
};

try {
  // Run deidentification request
  const [response] = await dlp.deidentifyContent(request);
  const deidentifiedItem = response.item;
  console.log(deidentifiedItem.value);
} catch (err) {
  console.log(`Error in deidentifyWithMask: ${err.message || err}`);
}

Python

def deidentify_with_mask(project, string, info_types, masking_character=None,
                         number_to_mask=0):
    """Uses the Data Loss Prevention API to deidentify sensitive data in a
    string by masking it with a character.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        item: The string to deidentify (will be treated as text).
        masking_character: The character to mask matching sensitive data with.
        number_to_mask: The maximum number of sensitive characters to mask in
            a match. If omitted or set to zero, the API will default to no
            maximum.
    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Import the client library
    import google.cloud.dlp

    # Instantiate a client
    dlp = google.cloud.dlp.DlpServiceClient()

    # Convert the project id into a full resource id.
    parent = dlp.project_path(project)

    # Construct inspect configuration dictionary
    inspect_config = {
        'info_types': [{'name': info_type} for info_type in info_types]
    }

    # Construct deidentify configuration dictionary
    deidentify_config = {
        'info_type_transformations': {
            'transformations': [
                {
                    'primitive_transformation': {
                        'character_mask_config': {
                            'masking_character': masking_character,
                            'number_to_mask': number_to_mask
                        }
                    }
                }
            ]
        }
    }

    # Construct item
    item = {'value': string}

    # Call the API
    response = dlp.deidentify_content(
        parent, inspect_config=inspect_config,
        deidentify_config=deidentify_config, item=item)

    # Print out the results.
    print(response.item.value)

Go

// mask deidentifies the input by masking all provided info types with maskingCharacter
// and prints the result to w.
func mask(w io.Writer, client *dlp.Client, project, input string, infoTypes []string, maskingCharacter string, numberToMask int32) {
	// Convert the info type strings to a list of InfoTypes.
	var i []*dlppb.InfoType
	for _, it := range infoTypes {
		i = append(i, &dlppb.InfoType{Name: it})
	}
	// Create a configured request.
	req := &dlppb.DeidentifyContentRequest{
		Parent: "projects/" + project,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: i,
		},
		DeidentifyConfig: &dlppb.DeidentifyConfig{
			Transformation: &dlppb.DeidentifyConfig_InfoTypeTransformations{
				InfoTypeTransformations: &dlppb.InfoTypeTransformations{
					Transformations: []*dlppb.InfoTypeTransformations_InfoTypeTransformation{
						{
							InfoTypes: []*dlppb.InfoType{}, // Match all info types.
							PrimitiveTransformation: &dlppb.PrimitiveTransformation{
								Transformation: &dlppb.PrimitiveTransformation_CharacterMaskConfig{
									CharacterMaskConfig: &dlppb.CharacterMaskConfig{
										MaskingCharacter: maskingCharacter,
										NumberToMask:     numberToMask,
									},
								},
							},
						},
					},
				},
			},
		},
		// The item to analyze.
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_Value{
				Value: input,
			},
		},
	}
	// Send the request.
	r, err := client.DeidentifyContent(context.Background(), req)
	if err != nil {
		log.Fatal(err)
	}
	// Print the result.
	fmt.Fprint(w, r.GetItem().GetValue())
}

PHP

use Google\Cloud\Dlp\V2\CharacterMaskConfig;
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\PrimitiveTransformation;
use Google\Cloud\Dlp\V2\DeidentifyConfig;
use Google\Cloud\Dlp\V2\InfoTypeTransformations\InfoTypeTransformation;
use Google\Cloud\Dlp\V2\InfoTypeTransformations;
use Google\Cloud\Dlp\V2\ContentItem;

/**
 * Deidentify sensitive data in a string by masking it with a character.
 * @param string $callingProjectId The GCP Project ID to run the API call under
 * @param string $string The string to deidentify
 * @param int $numberToMask (Optional) The maximum number of sensitive characters to mask in a match
 * @param string $maskingCharacter (Optional) The character to mask matching sensitive data with
 */
function deidentify_mask(
  $callingProjectId,
  $string,
  $numberToMask = 0,
  $maskingCharacter = 'x'
) {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // The infoTypes of information to mask
    $ssnInfoType = (new InfoType())
        ->setName('US_SOCIAL_SECURITY_NUMBER');
    $infoTypes = [$ssnInfoType];

    // Create the masking configuration object
    $maskConfig = (new CharacterMaskConfig())
        ->setMaskingCharacter($maskingCharacter)
        ->setNumberToMask($numberToMask);

    // Create the information transform configuration objects
    $primitiveTransformation = (new PrimitiveTransformation())
        ->setCharacterMaskConfig($maskConfig);

    $infoTypeTransformation = (new InfoTypeTransformation())
        ->setPrimitiveTransformation($primitiveTransformation)
        ->setInfoTypes($infoTypes);

    $infoTypeTransformations = (new InfoTypeTransformations())
        ->setTransformations([$infoTypeTransformation]);

    // Create the deidentification configuration object
    $deidentifyConfig = (new DeidentifyConfig())
        ->setInfoTypeTransformations($infoTypeTransformations);

    $item = (new ContentItem())
        ->setValue($string);

    $parent = $dlp->projectName($callingProjectId);

    // Run request
    $response = $dlp->deidentifyContent($parent, [
        'deidentifyConfig' => $deidentifyConfig,
        'item' => $item
    ]);

    // Print the results
    $deidentifiedValue = $response->getItem()->getValue();
    print($deidentifiedValue);
}

C#

public static object DeidMask(
    string projectId,
    string dataValue,
    IEnumerable<InfoType> infoTypes,
    string maskingCharacter,
    int numberToMask,
    bool reverseOrder)
{
    var request = new DeidentifyContentRequest
    {
        ParentAsProjectName = new ProjectName(projectId),
        InspectConfig = new InspectConfig
        {
            InfoTypes = { infoTypes }
        },
        DeidentifyConfig = new DeidentifyConfig
        {
            InfoTypeTransformations = new InfoTypeTransformations
            {
                Transformations = {
                    new InfoTypeTransformations.Types.InfoTypeTransformation
                    {
                        PrimitiveTransformation = new PrimitiveTransformation
                        {
                            CharacterMaskConfig = new CharacterMaskConfig
                            {
                                MaskingCharacter = maskingCharacter,
                                NumberToMask = numberToMask,
                                ReverseOrder = reverseOrder
                            }
                        }
                    }
                }
            }
        },
        Item = new ContentItem
        {
            Value = dataValue
        }
    };

    DlpServiceClient dlp = DlpServiceClient.Create();
    var response = dlp.DeidentifyContent(request);

    Console.WriteLine($"Deidentified content: {response.Item.Value}");
    return 0;
}

Crypto-based tokenization

Crypto-based tokenization (also referred to as "pseudonymization") transformations replace the original sensitive data values with encrypted values. Cloud DLP supports the following types of tokenization, including transformations that can be reversed and allow for "re-identification:"

  • Cryptographic hashing: Given a CryptoKey, Cloud DLP uses a SHA-256-based message authentication code (HMAC-SHA-256) on the input value, and then replaces the input value with the hashed value encoded in Base64.
  • Format preserving encryption: Replaces an input value with a token that has been generated using format-preserving encryption (FPE) with the FFX mode of operation. This transformation method produces a token that is limited to the same alphabet as the input value and is the same length as the input value. FPE also supports re-identification given the original encryption key.
  • Deterministic encryption: Replaces an input value with a token that has been generated using AES in Synthetic Initialization Vector mode (AES-SIV). This transformation method has no limitation on supported string character sets, generates identical tokens for each instance of an identical input value, and uses surrogates to enable re-identification given the original encryption key.

Cryptographic hashing

The cryptographic hashing tokenization transformation (CryptoHashConfig in the DLP API) takes an input value (a piece of sensitive data that Cloud DLP has detected) and replaces it with a hashed value. The hash value is generated by using a SHA-256-based message authentication code (HMAC-SHA-256) ) on the input value with a CryptoKey.

Cloud DLP outputs a Base64-encoded representation of the hashed input value in the place of the original value.

Before using the cryptographic hashing transformation, keep in mind the following:

  • The input value is not encrypted but hashed.
  • This transformation can't be reversed. That is, given the hashed output value of the transformation and the original cryptographic key, there is no way to restore the original value.
  • Currently, only string and integer values can be hashed.
  • The hashed output of the transformation is always the same length, depending on the size of the cryptographic key. For example, if you use the cryptographic hashing transformation on 10-digit phone numbers, each phone number will be replaced by a fixed-length Base64-encoded hash value.

Format-preserving encryption

The format-preserving encryption transformation method (CryptoReplaceFfxFpeConfig in the DLP API) takes an input value (a piece of sensitive data that Cloud DLP has detected), encrypts it using format-preserving encryption in FFX mode and a CryptoKey, and then replaces the original value with the encrypted value, or token.

The input value:

  • Must be at least two characters long (or the empty string).
  • Must be encoded as ASCII.
  • Comprised of the characters specified by an "alphabet," which is the set of between 2 and 64 allowed characters in the input value. For more information, see the alphabet field in CryptoReplaceFfxFpeConfig.

The generated token:

  • Is the encrypted input value.
  • Preserves the character set ("alphabet") and length of the input value post-encryption.
  • Is computed using format-preserving encryption (FPE) in FFX mode keyed on the specified cryptographic key.
  • Isn't necessarily unique, as each instance of the same input value de-identifies to the same token. This enables referential integrity, and therefore enables more efficient searching of de-identified data. You can change this behavior by using context "tweaks," as described in Contexts.

If there are multiple instances of an input value in the source content, each one will be de-identified to the same token. FPE preserves both length and alphabet space (the character set), which is limited to 62 characters. This behavior is in contrast to the non-deterministic tokenization method, which de-identifies each instance of the same input value to a unique token, and can be used with any character set. You can change this behavior by using context "tweaks," which can improve security. The addition of a context tweak to the transformation enables Cloud DLP to de-identify multiple instances of the same input value to different tokens. If you don't need to preserve the length and alphabet space of the original values, use deterministic encryption, described below.

Cloud DLP computes the replacement token using a cryptographic key. You provide this key in one of three ways:

  1. By embedding it unencrypted in the API request.
  2. By requesting that the Cloud DLP generate it.
  3. By embedding it encrypted in the API request. For this option, the key is wrapped (encrypted) by a Cloud Key Management Service (Cloud KMS) key.

To create a Cloud KMS wrapped key, send a request containing a 16-, 24-, or 32-byte plaintext field value to the Cloud KMS projects.locations.keyRings.cryptoKeys.encrypt method. The wrapped key is the value in the ciphertext field of the method's response.

The value is a Base64-encoded string by default. To set this value in Cloud DLP, it must be decoded into a byte string. The following code snippets highlight how to do this in several languages. End-to-end examples are provided following these snippets.

Java

KmsWrappedCryptoKey.newBuilder()
    .setWrappedKey(ByteString.copyFrom(BaseEncoding.base64().decode(wrappedKey)))

Python

# The wrapped key is Base64-encoded, but the library expects a binary
# string, so decode it here.
import base64
wrapped_key = base64.b64decode(wrapped_key)

PHP

// Create the wrapped crypto key configuration object
$kmsWrappedCryptoKey = (new KmsWrappedCryptoKey())
    ->setWrappedKey(base64_decode($wrappedKey))
    ->setCryptoKeyName($keyName);

C#

WrappedKey = ByteString.FromBase64(wrappedKey)

For more information about encrypting and decrypting data using Cloud KMS, see Encrypting and Decrypting Data.

Following is sample code in several languages that demonstrates how to use Cloud DLP to de-identify sensitive data by replacing an input value with a token.

Java

/**
 * Deidentify a string by encrypting sensitive information while preserving format.
 *
 * @param string The string to deidentify.
 * @param alphabet The set of characters to use when encrypting the input. For more information,
 *     see cloud.google.com/dlp/docs/reference/rest/v2/content/deidentify
 * @param keyName The name of the Cloud KMS key to use when decrypting the wrapped key.
 * @param wrappedKey The encrypted (or "wrapped") AES-256 encryption key.
 * @param projectId ID of Google Cloud project to run the API under.
 */
private static void deIdentifyWithFpe(
    String string,
    List<InfoType> infoTypes,
    FfxCommonNativeAlphabet alphabet,
    String keyName,
    String wrappedKey,
    String projectId,
    String surrogateType) {
  // instantiate a client
  try (DlpServiceClient dlpServiceClient = DlpServiceClient.create()) {
    ContentItem contentItem = ContentItem.newBuilder().setValue(string).build();

    // Create the format-preserving encryption (FPE) configuration
    KmsWrappedCryptoKey kmsWrappedCryptoKey =
        KmsWrappedCryptoKey.newBuilder()
            .setWrappedKey(ByteString.copyFrom(BaseEncoding.base64().decode(wrappedKey)))
            .setCryptoKeyName(keyName)
            .build();

    CryptoKey cryptoKey = CryptoKey.newBuilder().setKmsWrapped(kmsWrappedCryptoKey).build();

    CryptoReplaceFfxFpeConfig cryptoReplaceFfxFpeConfig =
        CryptoReplaceFfxFpeConfig.newBuilder()
            .setCryptoKey(cryptoKey)
            .setCommonAlphabet(alphabet)
            .setSurrogateInfoType(InfoType.newBuilder().setName(surrogateType).build())
            .build();

    // Create the deidentification transformation configuration
    PrimitiveTransformation primitiveTransformation =
        PrimitiveTransformation.newBuilder()
            .setCryptoReplaceFfxFpeConfig(cryptoReplaceFfxFpeConfig)
            .build();

    InfoTypeTransformation infoTypeTransformationObject =
        InfoTypeTransformation.newBuilder()
            .setPrimitiveTransformation(primitiveTransformation)
            .build();

    InfoTypeTransformations infoTypeTransformationArray =
        InfoTypeTransformations.newBuilder()
            .addTransformations(infoTypeTransformationObject)
            .build();

    InspectConfig inspectConfig =
        InspectConfig.newBuilder()
            .addAllInfoTypes(infoTypes)
            .build();

    // Create the deidentification request object
    DeidentifyConfig deidentifyConfig =
        DeidentifyConfig.newBuilder()
            .setInfoTypeTransformations(infoTypeTransformationArray)
            .build();

    DeidentifyContentRequest request =
        DeidentifyContentRequest.newBuilder()
            .setParent(ProjectName.of(projectId).toString())
            .setInspectConfig(inspectConfig)
            .setDeidentifyConfig(deidentifyConfig)
            .setItem(contentItem)
            .build();

    // Execute the deidentification request
    DeidentifyContentResponse response = dlpServiceClient.deidentifyContent(request);

    // Print the deidentified input value
    // e.g. "My SSN is 123456789" --> "My SSN is 7261298621"
    String result = response.getItem().getValue();
    System.out.println(result);
  } catch (Exception e) {
    System.out.println("Error in deidentifyWithFpe: " + e.getMessage());
  }
}

Node.js

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const callingProjectId = process.env.GCLOUD_PROJECT;

// The string to deidentify
// const string = 'My SSN is 372819127';

// The set of characters to replace sensitive ones with
// For more information, see https://cloud.google.com/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#ffxcommonnativealphabet
// const alphabet = 'ALPHA_NUMERIC';

// The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
// const keyName = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME';

// The encrypted ('wrapped') AES-256 key to use
// This key should be encrypted using the Cloud KMS key specified above
// const wrappedKey = 'YOUR_ENCRYPTED_AES_256_KEY'

// (Optional) The name of the surrogate custom info type to use
// Only necessary if you want to reverse the deidentification process
// Can be essentially any arbitrary string, as long as it doesn't appear
// in your dataset otherwise.
// const surrogateType = 'SOME_INFO_TYPE_DEID';

// Construct FPE config
const cryptoReplaceFfxFpeConfig = {
  cryptoKey: {
    kmsWrapped: {
      wrappedKey: wrappedKey,
      cryptoKeyName: keyName,
    },
  },
  commonAlphabet: alphabet,
};
if (surrogateType) {
  cryptoReplaceFfxFpeConfig.surrogateInfoType = {
    name: surrogateType,
  };
}

// Construct deidentification request
const item = {value: string};
const request = {
  parent: dlp.projectPath(callingProjectId),
  deidentifyConfig: {
    infoTypeTransformations: {
      transformations: [
        {
          primitiveTransformation: {
            cryptoReplaceFfxFpeConfig: cryptoReplaceFfxFpeConfig,
          },
        },
      ],
    },
  },
  item: item,
};

try {
  // Run deidentification request
  const [response] = await dlp.deidentifyContent(request);
  const deidentifiedItem = response.item;
  console.log(deidentifiedItem.value);
} catch (err) {
  console.log(`Error in deidentifyWithFpe: ${err.message || err}`);
}

Python

def deidentify_with_fpe(project, string, info_types, alphabet=None,
                        surrogate_type=None, key_name=None, wrapped_key=None):
    """Uses the Data Loss Prevention API to deidentify sensitive data in a
    string using Format Preserving Encryption (FPE).
    Args:
        project: The Google Cloud project id to use as a parent resource.
        item: The string to deidentify (will be treated as text).
        alphabet: The set of characters to replace sensitive ones with. For
            more information, see https://cloud.google.com/dlp/docs/reference/
            rest/v2beta2/organizations.deidentifyTemplates#ffxcommonnativealphabet
        surrogate_type: The name of the surrogate custom info type to use. Only
            necessary if you want to reverse the deidentification process. Can
            be essentially any arbitrary string, as long as it doesn't appear
            in your dataset otherwise.
        key_name: The name of the Cloud KMS key used to encrypt ('wrap') the
            AES-256 key. Example:
            key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/
            keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'
        wrapped_key: The encrypted ('wrapped') AES-256 key to use. This key
            should be encrypted using the Cloud KMS key specified by key_name.
    Returns:
        None; the response from the API is printed to the terminal.
    """
    # Import the client library
    import google.cloud.dlp

    # Instantiate a client
    dlp = google.cloud.dlp.DlpServiceClient()

    # Convert the project id into a full resource id.
    parent = dlp.project_path(project)

    # The wrapped key is base64-encoded, but the library expects a binary
    # string, so decode it here.
    import base64
    wrapped_key = base64.b64decode(wrapped_key)

    # Construct FPE configuration dictionary
    crypto_replace_ffx_fpe_config = {
        'crypto_key': {
            'kms_wrapped': {
                'wrapped_key': wrapped_key,
                'crypto_key_name': key_name
            }
        },
        'common_alphabet': alphabet
    }

    # Add surrogate type
    if surrogate_type:
        crypto_replace_ffx_fpe_config['surrogate_info_type'] = {
            'name': surrogate_type
        }

    # Construct inspect configuration dictionary
    inspect_config = {
        'info_types': [{'name': info_type} for info_type in info_types]
    }

    # Construct deidentify configuration dictionary
    deidentify_config = {
        'info_type_transformations': {
            'transformations': [
                {
                    'primitive_transformation': {
                        'crypto_replace_ffx_fpe_config':
                            crypto_replace_ffx_fpe_config
                    }
                }
            ]
        }
    }

    # Convert string to item
    item = {'value': string}

    # Call the API
    response = dlp.deidentify_content(
        parent, inspect_config=inspect_config,
        deidentify_config=deidentify_config, item=item)

    # Print results
    print(response.item.value)

Go

// deidentifyFPE deidentifies the input with FPE (Format Preserving Encryption).
// keyFileName is the file name with the KMS wrapped key and cryptoKeyName is the
// full KMS key resource name used to wrap the key. surrogateInfoType is an
// optional identifier needed for reidentification. surrogateInfoType can be any
// value not found in your input.
func deidentifyFPE(w io.Writer, client *dlp.Client, project, input string, infoTypes []string, keyFileName, cryptoKeyName, surrogateInfoType string) {
	// Convert the info type strings to a list of InfoTypes.
	var i []*dlppb.InfoType
	for _, it := range infoTypes {
		i = append(i, &dlppb.InfoType{Name: it})
	}
	// Read the key file.
	keyBytes, err := ioutil.ReadFile(keyFileName)
	if err != nil {
		log.Fatalf("error reading file: %v", err)
	}
	// Create a configured request.
	req := &dlppb.DeidentifyContentRequest{
		Parent: "projects/" + project,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: i,
		},
		DeidentifyConfig: &dlppb.DeidentifyConfig{
			Transformation: &dlppb.DeidentifyConfig_InfoTypeTransformations{
				InfoTypeTransformations: &dlppb.InfoTypeTransformations{
					Transformations: []*dlppb.InfoTypeTransformations_InfoTypeTransformation{
						{
							InfoTypes: []*dlppb.InfoType{}, // Match all info types.
							PrimitiveTransformation: &dlppb.PrimitiveTransformation{
								Transformation: &dlppb.PrimitiveTransformation_CryptoReplaceFfxFpeConfig{
									CryptoReplaceFfxFpeConfig: &dlppb.CryptoReplaceFfxFpeConfig{
										CryptoKey: &dlppb.CryptoKey{
											Source: &dlppb.CryptoKey_KmsWrapped{
												KmsWrapped: &dlppb.KmsWrappedCryptoKey{
													WrappedKey:    keyBytes,
													CryptoKeyName: cryptoKeyName,
												},
											},
										},
										// Set the alphabet used for the output.
										Alphabet: &dlppb.CryptoReplaceFfxFpeConfig_CommonAlphabet{
											CommonAlphabet: dlppb.CryptoReplaceFfxFpeConfig_ALPHA_NUMERIC,
										},
										// Set the surrogate info type, used for reidentification.
										SurrogateInfoType: &dlppb.InfoType{
											Name: surrogateInfoType,
										},
									},
								},
							},
						},
					},
				},
			},
		},
		// The item to analyze.
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_Value{
				Value: input,
			},
		},
	}
	// Send the request.
	r, err := client.DeidentifyContent(context.Background(), req)
	if err != nil {
		log.Fatal(err)
	}
	// Print the result.
	fmt.Fprint(w, r.GetItem().GetValue())
}

PHP

use Google\Cloud\Dlp\V2\CryptoReplaceFfxFpeConfig;
use Google\Cloud\Dlp\V2\CryptoReplaceFfxFpeConfig\FfxCommonNativeAlphabet;
use Google\Cloud\Dlp\V2\CryptoKey;
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\PrimitiveTransformation;
use Google\Cloud\Dlp\V2\KmsWrappedCryptoKey;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\DeidentifyConfig;
use Google\Cloud\Dlp\V2\InfoTypeTransformations\InfoTypeTransformation;
use Google\Cloud\Dlp\V2\InfoTypeTransformations;
use Google\Cloud\Dlp\V2\ContentItem;

/**
 * Deidentify a string using Format-Preserving Encryption (FPE).
 *
 * @param string $callingProjectId The GCP Project ID to run the API call under
 * @param string $string The string to deidentify
 * @param string $keyName The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
 * @param wrappedKey $wrappedKey The AES-256 key to use, encrypted ('wrapped') with the KMS key
 *        defined by $keyName.
 * @param string $surrogateTypeName Optional surrogate custom info type to enable
 *        reidentification. Can be essentially any arbitrary string that doesn't
 *        appear in your dataset'
 */
function deidentify_fpe(
    $callingProjectId,
    $string,
    $keyName,
    $wrappedKey,
    $surrogateTypeName = ''
) {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // The infoTypes of information to mask
    $ssnInfoType = (new InfoType())
        ->setName('US_SOCIAL_SECURITY_NUMBER');
    $infoTypes = [$ssnInfoType];

    // Create the wrapped crypto key configuration object
    $kmsWrappedCryptoKey = (new KmsWrappedCryptoKey())
        ->setWrappedKey(base64_decode($wrappedKey))
        ->setCryptoKeyName($keyName);

    // The set of characters to replace sensitive ones with
    // For more information, see https://cloud.google.com/dlp/docs/reference/rest/V2/organizations.deidentifyTemplates#ffxcommonnativealphabet
    $commonAlphabet = FfxCommonNativeAlphabet::NUMERIC;

    // Create the crypto key configuration object
    $cryptoKey = (new CryptoKey())
        ->setKmsWrapped($kmsWrappedCryptoKey);

    // Create the crypto FFX FPE configuration object
    $cryptoReplaceFfxFpeConfig = (new CryptoReplaceFfxFpeConfig())
        ->setCryptoKey($cryptoKey)
        ->setCommonAlphabet($commonAlphabet);
    if ($surrogateTypeName) {
        $surrogateType = (new InfoType())
            ->setName($surrogateTypeName);
        $cryptoReplaceFfxFpeConfig->setSurrogateInfoType($surrogateType);
    }

    // Create the information transform configuration objects
    $primitiveTransformation = (new PrimitiveTransformation())
        ->setCryptoReplaceFfxFpeConfig($cryptoReplaceFfxFpeConfig);

    $infoTypeTransformation = (new InfoTypeTransformation())
        ->setPrimitiveTransformation($primitiveTransformation)
        ->setInfoTypes($infoTypes);

    $infoTypeTransformations = (new InfoTypeTransformations())
        ->setTransformations([$infoTypeTransformation]);

    // Create the deidentification configuration object
    $deidentifyConfig = (new DeidentifyConfig())
        ->setInfoTypeTransformations($infoTypeTransformations);

    $content = (new ContentItem())
        ->setValue($string);

    $parent = $dlp->projectName($callingProjectId);

    // Run request
    $response = $dlp->deidentifyContent($parent, [
        'deidentifyConfig' => $deidentifyConfig,
        'item' => $content
    ]);

    // Print the results
    $deidentifiedValue = $response->getItem()->getValue();
    print($deidentifiedValue);
}

C#

public static object DeidFpe(
    string projectId,
    string dataValue,
    IEnumerable<InfoType> infoTypes,
    string keyName,
    string wrappedKey,
    string alphabet)
{
    var deidentifyConfig = new DeidentifyConfig
    {
        InfoTypeTransformations = new InfoTypeTransformations
        {
            Transformations =
            {
                new InfoTypeTransformations.Types.InfoTypeTransformation
                {
                    PrimitiveTransformation = new PrimitiveTransformation
                    {
                        CryptoReplaceFfxFpeConfig = new CryptoReplaceFfxFpeConfig
                        {
                            CommonAlphabet = (FfxCommonNativeAlphabet) Enum.Parse(typeof(FfxCommonNativeAlphabet), alphabet),
                            CryptoKey = new CryptoKey
                            {
                                KmsWrapped = new KmsWrappedCryptoKey
                                {
                                    CryptoKeyName = keyName,
                                    WrappedKey = ByteString.FromBase64(wrappedKey)
                                }
                            },
                            SurrogateInfoType = new InfoType
                            {
                                Name = "TOKEN"
                            }
                        }
                    }
                }
            }
        }
    };

    DlpServiceClient dlp = DlpServiceClient.Create();
    var response = dlp.DeidentifyContent(
        new DeidentifyContentRequest
        {
            ParentAsProjectName = new ProjectName(projectId),
            InspectConfig = new InspectConfig
            {
                InfoTypes = { infoTypes }
            },
            DeidentifyConfig = deidentifyConfig,
            Item = new ContentItem { Value = dataValue }
        });

    Console.WriteLine($"Deidentified content: {response.Item.Value}");
    return 0;
}

Deterministic encryption

The deterministic encryption transformation method CryptoDeterministicConfig in the DLP API takes an input value (a piece of sensitive data that Cloud DLP has detected), encrypts it using AES-SIV with a CryptoKey, and then replaces the original value with a Base64-encoded representation of the encrypted value.

Using the deterministic encryption transformation enables more efficient searching of encrypted data.

The input value:

  • Must be at least 1 character long.
  • Has no character set limitations.

The generated token:

  • Is a Base64-encoded representation of the encrypted value.
  • Does not preserve the character set ("alphabet") or length of the input value post-encryption.
  • Is computed using AES encryption in SIV mode (AES-SIV) with a CryptoKey.
  • Isn't necessarily unique, as each instance of the same input value de-identifies to the same token. This enables more efficient searching of encrypted data. You can change this behavior by using context "tweaks," as described in Contexts.
  • Is generated with a prefix added, in the form [SURROGATE_TYPE]([LENGTH]):, where [SURROGATE_TYPE] represents a surrogate infoType describing the input value, and [LENGTH] indicates its character length. The surrogate enables the token to be re-identified using the original encryption key used for de-identification.

Following is an example JSON configuration for de-identification using deterministic encryption. Note that we've chosen to use "PHONE_SURROGATE" as our descriptive surrogate type since we're de-identifying telephone numbers. [CRYPTO_KEY] represents an unwrapped cryptographic key obtained from Cloud KMS. (For more information about how to obtain a CryptoKey, see the previous section, Format-preserving encryption.)

{
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"PHONE_NUMBER"
            }
          ],
          "primitiveTransformation":{
            "cryptoDeterministicConfig":{
              "cryptoKey":{
                "unwrapped":{
                  "key":"[CRYPTO_KEY]"
                }
              },
              "surrogateInfoType":{
                "name":"PHONE_SURROGATE"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"PHONE_NUMBER"
      }
    ]
  },
  "item":{
    "value":"My phone number is 206-555-0574, call me"
  }
}

De-identifying the string "My phone number is 206-555-0574" using this transformation results in a de-identified string such as the following:

My phone number is PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me

To re-identify this string, you can use a JSON request like the following, where [CRYPTO_KEY] is the same cryptographic key used to de-identify the content.

{
  "reidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"PHONE_SURROGATE"
            }
          ],
          "primitiveTransformation":{
            "cryptoDeterministicConfig":{
              "cryptoKey":{
                "unwrapped":{
                  "key":"[CRYPTO_KEY]"
                }
              },
              "surrogateInfoType":{
                "name":"PHONE_SURROGATE"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "customInfoTypes":[
      {
        "infoType":{
          "name":"PHONE_SURROGATE"
        },
        "surrogateType":{

        }
      }
    ]
  },
  "item":{
    "value":"My phone number is [PHONE_SURROGATE](36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me"
  }
}

Re-identifying this string results in the original string:

My phone number is 206-555-0574, call me

Bucketing

The bucketing transformations serve to de-identify numerical data by "bucketing" it into ranges. The resulting number range is a hyphenated string consisting of a lower bound, a hyphen, and an upper bound.

Fixed-size bucketing

Cloud DLP can bucket numerical input values based on fixed size ranges (FixedSizeBucketingConfig in the DLP API). You specify the following to configure fixed-size bucketing:

  • The lower bound value of all of the buckets. Any values less than the lower bound are grouped together in a single bucket.
  • The upper bound value of all of the buckets. Any values greater than the upper bound are grouped together in a single bucket.
  • The size of each bucket other than the minimum and maximum buckets.

For example, if the lower bound is set to 10, the upper bound is set to 89, and the bucket size is set to 10, then the following buckets would be used: -10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-89, 89+.

For more information about the concept of bucketing, see Generalization and Bucketing.

Customizable bucketing

Customizable bucketing (BucketingConfig in the DLP API) offers more flexibility than fixed size bucketing. Instead of specifying upper and lower bounds and an interval value with which to create equal-sized buckets, you specify the maximum and minimum values for each bucket you want created. Each maximum and minimum value pair must have the same type.

You set up customizable bucketing by specifying individual buckets. Each bucket has the following properties:

  • The lower bound of the bucket's range. Omit this value to create a bucket that has no lower bound.
  • The upper bound of the bucket's range. Omit this value to create a bucket that has no upper bound.
  • The replacement value for this bucket range. This is the value with which to replace all detected values that fall within the lower and upper bounds. If you don't provide a replacement value, a hyphenated min-max range will be generated instead.

For example, consider the following JSON configuration for this bucketing transformation:

"bucketingConfig":{
  "buckets":[
    {
      "min":{
        "integerValue":"1"
      },
      "max":{
        "integerValue":"30"
      },
      "replacementValue":{
        "stringValue":"LOW"
      }
    },
    {
      "min":{
        "integerValue":"31"
      },
      "max":{
        "integerValue":"65"
      },
      "replacementValue":{
        "stringValue":"MEDIUM"
      }
    },
    {
      "min":{
        "integerValue":"66"
      },
      "max":{
        "integerValue":"100"
      },
      "replacementValue":{
        "stringValue":"HIGH"
      }
    }
  ]
}

This defines the following behavior:

  • Integer values falling between 1 and 30 are masked by being replaced with LOW.
  • Integer values falling between 31-65 are masked by being replaced with MEDIUM.
  • Integer values falling between 66-100 are masked by being replaced with HIGH.

For more information about the concept of bucketing, see Generalization and Bucketing.

Date shifting

When you use the date shifting transformation (DateShiftConfig in the DLP API on a date input value, Cloud DLP shifts the dates by a random number of days.

Date shifting techniques randomly shift a set of dates but preserve the sequence and duration of a period of time. Shifting dates is usually done in context to an individual or an entity. That is, you want to shift all of the dates for a specific individual using the same shift differential but use a separate shift differential for each other individual.

For more information about date shifting, see Date shifting.

Following is sample code in several languages that demonstrates how to use the Cloud DLP API to de-identify dates using date shifting.

Java

/**
 * @param inputCsvPath The path to the CSV file to deidentify
 * @param outputCsvPath (Optional) path to the output CSV file
 * @param dateFields The list of (date) fields in the CSV file to date shift
 * @param lowerBoundDays The maximum number of days to shift a date backward
 * @param upperBoundDays The maximum number of days to shift a date forward
 * @param contextFieldId (Optional) The column to determine date shift, default : a random shift
 *     amount
 * @param wrappedKey (Optional) The encrypted ('wrapped') AES-256 key to use when shifting dates
 * @param keyName (Optional) The name of the Cloud KMS key used to encrypt ('wrap') the AES-256
 *     key
 * @param projectId ID of Google Cloud project to run the API under.
 */
private static void deidentifyWithDateShift(
    Path inputCsvPath,
    Path outputCsvPath,
    String[] dateFields,
    int lowerBoundDays,
    int upperBoundDays,
    String contextFieldId,
    String wrappedKey,
    String keyName,
    String projectId)
    throws Exception {
  // instantiate a client
  try (DlpServiceClient dlpServiceClient = DlpServiceClient.create()) {

    // Set the maximum days to shift a day backward (lowerbound), forward (upperbound)
    DateShiftConfig.Builder dateShiftConfigBuilder =
        DateShiftConfig.newBuilder()
            .setLowerBoundDays(lowerBoundDays)
            .setUpperBoundDays(upperBoundDays);

    // If contextFieldId, keyName or wrappedKey is set: all three arguments must be valid
    if (contextFieldId != null && keyName != null && wrappedKey != null) {
      dateShiftConfigBuilder.setContext(FieldId.newBuilder().setName(contextFieldId).build());
      KmsWrappedCryptoKey kmsWrappedCryptoKey =
          KmsWrappedCryptoKey.newBuilder()
              .setCryptoKeyName(keyName)
              .setWrappedKey(ByteString.copyFrom(BaseEncoding.base64().decode(wrappedKey)))
              .build();
      dateShiftConfigBuilder.setCryptoKey(
          CryptoKey.newBuilder().setKmsWrapped(kmsWrappedCryptoKey).build());

    } else if (contextFieldId != null || keyName != null || wrappedKey != null) {
      throw new IllegalArgumentException(
          "You must set either ALL or NONE of {contextFieldId, keyName, wrappedKey}!");
    }

    // Read and parse the CSV file
    BufferedReader br = null;
    String line;
    List<Table.Row> rows = new ArrayList<>();
    List<FieldId> headers;

    br = new BufferedReader(new FileReader(inputCsvPath.toFile()));

    // convert csv header to FieldId
    headers =
        Arrays.stream(br.readLine().split(","))
            .map(header -> FieldId.newBuilder().setName(header).build())
            .collect(Collectors.toList());

    while ((line = br.readLine()) != null) {
      // convert csv rows to Table.Row
      rows.add(convertCsvRowToTableRow(line));
    }
    br.close();

    Table table = Table.newBuilder().addAllHeaders(headers).addAllRows(rows).build();

    List<FieldId> dateFieldIds =
        Arrays.stream(dateFields)
            .map(field -> FieldId.newBuilder().setName(field).build())
            .collect(Collectors.toList());

    DateShiftConfig dateShiftConfig = dateShiftConfigBuilder.build();

    FieldTransformation fieldTransformation =
        FieldTransformation.newBuilder()
            .addAllFields(dateFieldIds)
            .setPrimitiveTransformation(
                PrimitiveTransformation.newBuilder().setDateShiftConfig(dateShiftConfig).build())
            .build();

    DeidentifyConfig deidentifyConfig =
        DeidentifyConfig.newBuilder()
            .setRecordTransformations(
                RecordTransformations.newBuilder()
                    .addFieldTransformations(fieldTransformation)
                    .build())
            .build();

    ContentItem tableItem = ContentItem.newBuilder().setTable(table).build();

    DeidentifyContentRequest request =
        DeidentifyContentRequest.newBuilder()
            .setParent(ProjectName.of(projectId).toString())
            .setDeidentifyConfig(deidentifyConfig)
            .setItem(tableItem)
            .build();

    // Execute the deidentification request
    DeidentifyContentResponse response = dlpServiceClient.deidentifyContent(request);

    // Write out the response as a CSV file
    List<FieldId> outputHeaderFields = response.getItem().getTable().getHeadersList();
    List<Table.Row> outputRows = response.getItem().getTable().getRowsList();

    List<String> outputHeaders =
        outputHeaderFields.stream().map(FieldId::getName).collect(Collectors.toList());

    File outputFile = outputCsvPath.toFile();
    if (!outputFile.exists()) {
      outputFile.createNewFile();
    }
    BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(outputFile));

    // write out headers
    bufferedWriter.append(String.join(",", outputHeaders) + "\n");

    // write out each row
    for (Table.Row outputRow : outputRows) {
      String row =
          outputRow
              .getValuesList()
              .stream()
              .map(value -> value.getStringValue())
              .collect(Collectors.joining(","));
      bufferedWriter.append(row + "\n");
    }

    bufferedWriter.flush();
    bufferedWriter.close();

    System.out.println("Successfully saved date-shift output to: " + outputCsvPath.getFileName());
  } catch (Exception e) {
    System.out.println("Error in deidentifyWithDateShift: " + e.getMessage());
  }
}

// Parse string to valid date, return null when invalid
private static LocalDate getValidDate(String dateString) {
  try {
    return LocalDate.parse(dateString);
  } catch (DateTimeParseException e) {
    return null;
  }
}

// convert CSV row into Table.Row
private static Table.Row convertCsvRowToTableRow(String row) {
  String[] values = row.split(",");
  Table.Row.Builder tableRowBuilder = Table.Row.newBuilder();
  for (String value : values) {
    LocalDate date = getValidDate(value);
    if (date != null) {
      // convert to com.google.type.Date
      Date dateValue =
          Date.newBuilder()
              .setYear(date.getYear())
              .setMonth(date.getMonthValue())
              .setDay(date.getDayOfMonth())
              .build();
      Value tableValue = Value.newBuilder().setDateValue(dateValue).build();
      tableRowBuilder.addValues(tableValue);
    } else {
      tableRowBuilder.addValues(Value.newBuilder().setStringValue(value).build());
    }
  }
  return tableRowBuilder.build();
}

Node.js

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// Import other required libraries
const fs = require('fs');

// The project ID to run the API call under
// const callingProjectId = process.env.GCLOUD_PROJECT;

// The path to the CSV file to deidentify
// The first row of the file must specify column names, and all other rows
// must contain valid values
// const inputCsvFile = '/path/to/input/file.csv';

// The path to save the date-shifted CSV file to
// const outputCsvFile = '/path/to/output/file.csv';

// The list of (date) fields in the CSV file to date shift
// const dateFields = [{ name: 'birth_date'}, { name: 'register_date' }];

// The maximum number of days to shift a date backward
// const lowerBoundDays = 1;

// The maximum number of days to shift a date forward
// const upperBoundDays = 1;

// (Optional) The column to determine date shift amount based on
// If this is not specified, a random shift amount will be used for every row
// If this is specified, then 'wrappedKey' and 'keyName' must also be set
// const contextFieldId = [{ name: 'user_id' }];

// (Optional) The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
// If this is specified, then 'wrappedKey' and 'contextFieldId' must also be set
// const keyName = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME';

// (Optional) The encrypted ('wrapped') AES-256 key to use when shifting dates
// This key should be encrypted using the Cloud KMS key specified above
// If this is specified, then 'keyName' and 'contextFieldId' must also be set
// const wrappedKey = 'YOUR_ENCRYPTED_AES_256_KEY'

// Helper function for converting CSV rows to Protobuf types
const rowToProto = row => {
  const values = row.split(',');
  const convertedValues = values.map(value => {
    if (Date.parse(value)) {
      const date = new Date(value);
      return {
        dateValue: {
          year: date.getFullYear(),
          month: date.getMonth() + 1,
          day: date.getDate(),
        },
      };
    } else {
      // Convert all non-date values to strings
      return {stringValue: value.toString()};
    }
  });
  return {values: convertedValues};
};

// Read and parse a CSV file
const csvLines = fs
  .readFileSync(inputCsvFile)
  .toString()
  .split('\n')
  .filter(line => line.includes(','));
const csvHeaders = csvLines[0].split(',');
const csvRows = csvLines.slice(1);

// Construct the table object
const tableItem = {
  table: {
    headers: csvHeaders.map(header => {
      return {name: header};
    }),
    rows: csvRows.map(row => rowToProto(row)),
  },
};

// Construct DateShiftConfig
const dateShiftConfig = {
  lowerBoundDays: lowerBoundDays,
  upperBoundDays: upperBoundDays,
};

if (contextFieldId && keyName && wrappedKey) {
  dateShiftConfig.context = {name: contextFieldId};
  dateShiftConfig.cryptoKey = {
    kmsWrapped: {
      wrappedKey: wrappedKey,
      cryptoKeyName: keyName,
    },
  };
} else if (contextFieldId || keyName || wrappedKey) {
  throw new Error(
    'You must set either ALL or NONE of {contextFieldId, keyName, wrappedKey}!'
  );
}

// Construct deidentification request
const request = {
  parent: dlp.projectPath(callingProjectId),
  deidentifyConfig: {
    recordTransformations: {
      fieldTransformations: [
        {
          fields: dateFields,
          primitiveTransformation: {
            dateShiftConfig: dateShiftConfig,
          },
        },
      ],
    },
  },
  item: tableItem,
};

try {
  // Run deidentification request
  const [response] = await dlp.deidentifyContent(request);
  const tableRows = response.item.table.rows;

  // Write results to a CSV file
  tableRows.forEach((row, rowIndex) => {
    const rowValues = row.values.map(
      value =>
        value.stringValue ||
        `${value.dateValue.month}/${value.dateValue.day}/${
          value.dateValue.year
        }`
    );
    csvLines[rowIndex + 1] = rowValues.join(',');
  });
  csvLines.push('');
  fs.writeFileSync(outputCsvFile, csvLines.join('\n'));

  // Print status
  console.log(`Successfully saved date-shift output to ${outputCsvFile}`);
} catch (err) {
  console.log(`Error in deidentifyWithDateShift: ${err.message || err}`);
}

Python

def deidentify_with_date_shift(project, input_csv_file=None,
                               output_csv_file=None, date_fields=None,
                               lower_bound_days=None, upper_bound_days=None,
                               context_field_id=None, wrapped_key=None,
                               key_name=None):
    """Uses the Data Loss Prevention API to deidentify dates in a CSV file by
        pseudorandomly shifting them.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        input_csv_file: The path to the CSV file to deidentify. The first row
            of the file must specify column names, and all other rows must
            contain valid values.
        output_csv_file: The path to save the date-shifted CSV file.
        date_fields: The list of (date) fields in the CSV file to date shift.
            Example: ['birth_date', 'register_date']
        lower_bound_days: The maximum number of days to shift a date backward
        upper_bound_days: The maximum number of days to shift a date forward
        context_field_id: (Optional) The column to determine date shift amount
            based on. If this is not specified, a random shift amount will be
            used for every row. If this is specified, then 'wrappedKey' and
            'keyName' must also be set. Example:
            contextFieldId = [{ 'name': 'user_id' }]
        key_name: (Optional) The name of the Cloud KMS key used to encrypt
            ('wrap') the AES-256 key. Example:
            key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/
            keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'
        wrapped_key: (Optional) The encrypted ('wrapped') AES-256 key to use.
            This key should be encrypted using the Cloud KMS key specified by
            key_name.
    Returns:
        None; the response from the API is printed to the terminal.
    """
    # Import the client library
    import google.cloud.dlp

    # Instantiate a client
    dlp = google.cloud.dlp.DlpServiceClient()

    # Convert the project id into a full resource id.
    parent = dlp.project_path(project)

    # Convert date field list to Protobuf type
    def map_fields(field):
        return {'name': field}

    if date_fields:
        date_fields = map(map_fields, date_fields)
    else:
        date_fields = []

    # Read and parse the CSV file
    import csv
    from datetime import datetime
    f = []
    with open(input_csv_file, 'r') as csvfile:
        reader = csv.reader(csvfile)
        for row in reader:
            f.append(row)

    #  Helper function for converting CSV rows to Protobuf types
    def map_headers(header):
        return {'name': header}

    def map_data(value):
        try:
            date = datetime.strptime(value, '%m/%d/%Y')
            return {
                'date_value': {
                    'year': date.year,
                    'month': date.month,
                    'day': date.day
                }
            }
        except ValueError:
            return {'string_value': value}

    def map_rows(row):
        return {'values': map(map_data, row)}

    # Using the helper functions, convert CSV rows to protobuf-compatible
    # dictionaries.
    csv_headers = map(map_headers, f[0])
    csv_rows = map(map_rows, f[1:])

    # Construct the table dict
    table_item = {
        'table': {
            'headers': csv_headers,
            'rows': csv_rows
        }
    }

    # Construct date shift config
    date_shift_config = {
        'lower_bound_days': lower_bound_days,
        'upper_bound_days': upper_bound_days
    }

    # If using a Cloud KMS key, add it to the date_shift_config.
    # The wrapped key is base64-encoded, but the library expects a binary
    # string, so decode it here.
    if context_field_id and key_name and wrapped_key:
        import base64
        date_shift_config['context'] = {'name': context_field_id}
        date_shift_config['crypto_key'] = {
            'kms_wrapped': {
                'wrapped_key': base64.b64decode(wrapped_key),
                'crypto_key_name': key_name
            }
        }
    elif context_field_id or key_name or wrapped_key:
        raise ValueError("""You must set either ALL or NONE of
        [context_field_id, key_name, wrapped_key]!""")

    # Construct Deidentify Config
    deidentify_config = {
        'record_transformations': {
            'field_transformations': [
                {
                    'fields': date_fields,
                    'primitive_transformation': {
                        'date_shift_config': date_shift_config
                    }
                }
            ]
        }
    }

    # Write to CSV helper methods
    def write_header(header):
        return header.name

    def write_data(data):
        return data.string_value or '%s/%s/%s' % (data.date_value.month,
                                                  data.date_value.day,
                                                  data.date_value.year)

    # Call the API
    response = dlp.deidentify_content(
        parent, deidentify_config=deidentify_config, item=table_item)

    # Write results to CSV file
    with open(output_csv_file, 'w') as csvfile:
        write_file = csv.writer(csvfile, delimiter=',')
        write_file.writerow(map(write_header, response.item.table.headers))
        for row in response.item.table.rows:
            write_file.writerow(map(write_data, row.values))
    # Print status
    print('Successfully saved date-shift output to {}'.format(
        output_csv_file))

Go

// deidentifyDateShift shifts dates found in the input between lowerBoundDays and
// upperBoundDays.
func deidentifyDateShift(w io.Writer, client *dlp.Client, project string, lowerBoundDays, upperBoundDays int32, input string) {
	// Create a configured request.
	req := &dlppb.DeidentifyContentRequest{
		Parent: "projects/" + project,
		DeidentifyConfig: &dlppb.DeidentifyConfig{
			Transformation: &dlppb.DeidentifyConfig_InfoTypeTransformations{
				InfoTypeTransformations: &dlppb.InfoTypeTransformations{
					Transformations: []*dlppb.InfoTypeTransformations_InfoTypeTransformation{
						{
							InfoTypes: []*dlppb.InfoType{}, // Match all info types.
							PrimitiveTransformation: &dlppb.PrimitiveTransformation{
								Transformation: &dlppb.PrimitiveTransformation_DateShiftConfig{
									DateShiftConfig: &dlppb.DateShiftConfig{
										LowerBoundDays: lowerBoundDays,
										UpperBoundDays: upperBoundDays,
									},
								},
							},
						},
					},
				},
			},
		},
		// The InspectConfig is used to identify the DATE fields.
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: []*dlppb.InfoType{
				{
					Name: "DATE",
				},
			},
		},
		// The item to analyze.
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_Value{
				Value: input,
			},
		},
	}
	// Send the request.
	r, err := client.DeidentifyContent(context.Background(), req)
	if err != nil {
		log.Fatal(err)
	}
	// Print the result.
	fmt.Fprint(w, r.GetItem().GetValue())
}

PHP

use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CryptoKey;
use Google\Cloud\Dlp\V2\DateShiftConfig;
use Google\Cloud\Dlp\V2\DeidentifyConfig;
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\FieldId;
use Google\Cloud\Dlp\V2\FieldTransformation;
use Google\Cloud\Dlp\V2\KmsWrappedCryptoKey;
use Google\Cloud\Dlp\V2\PrimitiveTransformation;
use Google\Cloud\Dlp\V2\RecordTransformations;
use Google\Cloud\Dlp\V2\Table;
use Google\Cloud\Dlp\V2\Table\Row;
use Google\Cloud\Dlp\V2\Value;
use Google\Type\Date;
use DateTime;

/**
 * Deidentify dates in a CSV file by pseudorandomly shifting them.
 *
 * @param string $callingProject The GCP Project ID to run the API call under
 * @param string $inputCsvFile The path to the CSV file to deidentify
 * @param string $outputCsvFile The path to save the date-shifted CSV file to
 * @param array $dateFieldNames The list of (date) fields in the CSV file to date shift
 * @param string $lowerBoundDays The maximum number of days to shift a date backward
 * @param string $upperBoundDays The maximum number of days to shift a date forward
 * @param string contextFieldName (Optional) The column to determine date shift amount based on
 *        If this is not specified, a random shift amount will be used for every row.
 *        If this is specified, then 'wrappedKey' and 'keyName' must also be set
 * @param string keyName (Optional) The encrypted ('wrapped') AES-256 key to use when shifting dates
 *        If this is specified, then 'wrappedKey' and 'contextFieldName' must also be set
 * @param string wrappedKey (Optional) The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
 *        If this is specified, then 'keyName' and 'contextFieldName' must also be set
 */
function deidentify_dates(
    $callingProjectId,
    $inputCsvFile,
    $outputCsvFile,
    $dateFieldNames,
    $lowerBoundDays,
    $upperBoundDays,
    $contextFieldName = '',
    $keyName = '',
    $wrappedKey = ''
) {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // Read a CSV file
    $csvLines = file($inputCsvFile, FILE_IGNORE_NEW_LINES);
    $csvHeaders = explode(',', $csvLines[0]);
    $csvRows = array_slice($csvLines, 1);

    // Convert CSV file into protobuf objects
    $tableHeaders = array_map(function ($csvHeader) {
        return (new FieldId)->setName($csvHeader);
    }, $csvHeaders);

    $tableRows = array_map(function ($csvRow) {
        $rowValues = array_map(function ($csvValue) {
            if ($csvDate = DateTime::createFromFormat('m/d/Y', $csvValue)) {
                $date = (new Date())
                    ->setYear((int) $csvDate->format('Y'))
                    ->setMonth((int) $csvDate->format('m'))
                    ->setDay((int) $csvDate->format('d'));
                return (new Value())
                    ->setDateValue($date);
            } else {
                return (new Value())
                    ->setStringValue($csvValue);
            }
        }, explode(',', $csvRow));

        return (new Row())
            ->setValues($rowValues);
    }, $csvRows);

    // Convert date fields into protobuf objects
    $dateFields = array_map(function ($dateFieldName) {
        return (new FieldId())->setName($dateFieldName);
    }, $dateFieldNames);

    // Construct the table object
    $table = (new Table())
        ->setHeaders($tableHeaders)
        ->setRows($tableRows);

    $item = (new ContentItem())
        ->setTable($table);

    // Construct dateShiftConfig
    $dateShiftConfig = (new DateShiftConfig())
        ->setLowerBoundDays($lowerBoundDays)
        ->setUpperBoundDays($upperBoundDays);

    if ($contextFieldName && $keyName && $wrappedKey) {
        $contextField = (new FieldId())
            ->setName($contextFieldName);

        // Create the wrapped crypto key configuration object
        $kmsWrappedCryptoKey = (new KmsWrappedCryptoKey())
            ->setWrappedKey(base64_decode($wrappedKey))
            ->setCryptoKeyName($keyName);

        $cryptoKey = (new CryptoKey())
            ->setKmsWrapped($kmsWrappedCryptoKey);

        $dateShiftConfig
            ->setContext($contextField)
            ->setCryptoKey($cryptoKey);
    } elseif ($contextFieldName || $keyName || $wrappedKey) {
        throw new Exception('You must set either ALL or NONE of {$contextFieldName, $keyName, $wrappedKey}!');
    }

    // Create the information transform configuration objects
    $primitiveTransformation = (new PrimitiveTransformation())
        ->setDateShiftConfig($dateShiftConfig);

    $fieldTransformation = (new FieldTransformation())
        ->setPrimitiveTransformation($primitiveTransformation)
        ->setFields($dateFields);

    $recordTransformations = (new RecordTransformations())
        ->setFieldTransformations([$fieldTransformation]);

    // Create the deidentification configuration object
    $deidentifyConfig = (new DeidentifyConfig())
        ->setRecordTransformations($recordTransformations);

    $parent = $dlp->projectName($callingProjectId);

    // Run request
    $response = $dlp->deidentifyContent($parent, [
        'deidentifyConfig' => $deidentifyConfig,
        'item' => $item
    ]);

    // Check for errors
    foreach ($response->getOverview()->getTransformationSummaries() as $summary) {
        foreach ($summary->getResults() as $result) {
            if ($details = $result->getDetails()) {
                printf('Error: %s' . PHP_EOL, $details);
                return;
            }
        }
    }

    // Save the results to a file
    $csvRef = fopen($outputCsvFile, 'w');
    fputcsv($csvRef, $csvHeaders);
    foreach ($response->getItem()->getTable()->getRows() as $tableRow) {
        $values = array_map(function ($tableValue) {
            if ($tableValue->getStringValue()) {
                return $tableValue->getStringValue();
            }
            $protoDate = $tableValue->getDateValue();
            $date = mktime(0, 0, 0, $protoDate->getMonth(), $protoDate->getDay(), $protoDate->getYear());
            return strftime('%D', $date);
        }, iterator_to_array($tableRow->getValues()));
        fputcsv($csvRef, $values);
    };
    fclose($csvRef);
    printf('Deidentified dates written to %s' . PHP_EOL, $outputCsvFile);
}

C#

public static object DeidDateShift(
    string projectId,
    string inputCsvFile,
    string outputCsvFile,
    int lowerBoundDays,
    int upperBoundDays,
    string dateFields,
    string contextField = "",
    string keyName = "",
    string wrappedKey = "")
{
    var dlp = DlpServiceClient.Create();

    // Read file
    string[] csvLines = File.ReadAllLines(inputCsvFile);
    string[] csvHeaders = csvLines[0].Split(',');
    string[] csvRows = csvLines.Skip(1).ToArray();

    // Convert dates to protobuf format, and everything else to a string
    var protoHeaders = csvHeaders.Select(header => new FieldId { Name = header });
    var protoRows = csvRows.Select(CsvRow =>
    {
        var rowValues = CsvRow.Split(',');
        var protoValues = rowValues.Select(RowValue =>
        {
            System.DateTime parsedDate;
            if (System.DateTime.TryParse(RowValue, out parsedDate))
            {
                return new Value
                {
                    DateValue = new Google.Type.Date
                    {
                        Year = parsedDate.Year,
                        Month = parsedDate.Month,
                        Day = parsedDate.Day
                    }
                };
            }
            else
            {
                return new Value
                {
                    StringValue = RowValue
                };
            }
        });

        var rowObject = new Table.Types.Row();
        rowObject.Values.Add(protoValues);
        return rowObject;
    });

    var dateFieldList = dateFields
        .Split(',')
        .Select(field => new FieldId { Name = field });

    // Construct + execute the request
    var dateShiftConfig = new DateShiftConfig
    {
        LowerBoundDays = lowerBoundDays,
        UpperBoundDays = upperBoundDays
    };
    bool hasKeyName = !String.IsNullOrEmpty(keyName);
    bool hasWrappedKey = !String.IsNullOrEmpty(wrappedKey);
    bool hasContext = !String.IsNullOrEmpty(contextField);
    if (hasKeyName && hasWrappedKey && hasContext)
    {
        dateShiftConfig.Context = new FieldId { Name = contextField };
        dateShiftConfig.CryptoKey = new CryptoKey
        {
            KmsWrapped = new KmsWrappedCryptoKey
            {
                WrappedKey = ByteString.FromBase64(wrappedKey),
                CryptoKeyName = keyName
            }
        };
    }
    else if (hasKeyName || hasWrappedKey || hasContext)
    {
        throw new ArgumentException("Must specify ALL or NONE of: {contextFieldId, keyName, wrappedKey}!");
    }

    var deidConfig = new DeidentifyConfig
    {
        RecordTransformations = new RecordTransformations
        {
            FieldTransformations =
            {
                new FieldTransformation
                {
                    PrimitiveTransformation = new PrimitiveTransformation
                    {
                        DateShiftConfig = dateShiftConfig
                    },
                    Fields = { dateFieldList }
                }
            }
        }
    };

    DeidentifyContentResponse response = dlp.DeidentifyContent(
        new DeidentifyContentRequest
        {
            Parent = $"projects/{projectId}",
            DeidentifyConfig = deidConfig,
            Item = new ContentItem
            {
                Table = new Table
                {
                    Headers = { protoHeaders },
                    Rows = { protoRows }
                }
            }
        });

    // Save the results
    List<String> outputLines = new List<string>();
    outputLines.Add(csvLines[0]);

    outputLines.AddRange(response.Item.Table.Rows.Select(ProtoRow =>
    {
        var Values = ProtoRow.Values.Select(ProtoValue =>
        {
            if (ProtoValue.DateValue != null)
            {
                var ProtoDate = ProtoValue.DateValue;
                System.DateTime Date = new System.DateTime(
                    ProtoDate.Year, ProtoDate.Month, ProtoDate.Day);
                return Date.ToShortDateString();
            }
            else
            {
                return ProtoValue.StringValue;
            }
        });
        return String.Join(',', Values);
    }));

    File.WriteAllLines(outputCsvFile, outputLines);

    return 0;
}

Time extraction

Performing time extraction (TimePartConfig in the DLP API) object preserves a portion of a matched value that on a date, a time, or a timestamp preserves a portion of a matched value. You specify to Cloud DLP what kind of time value you want to extract, including year, month, day of the month, and so on (enumerated in the TimePart object).

For example, suppose you've configured a timePartConfig transformation by setting the time part to extract to YEAR. After sending the data in the first column below to Cloud DLP, you'd end up with the transformed values in the second column:

Original values Transformed values
9/21/1976 1976
6/7/1945 1945
1/20/2009 2009
7/4/1776 1776
8/1/1984 1984
4/21/1982 1982
Was this page helpful? Let us know how we did:

Send feedback about...

Data Loss Prevention Documentation