- Resource: DeidentifyTemplate
- JSON representation
- DeidentifyConfig
- InfoTypeTransformations
- InfoTypeTransformation
- PrimitiveTransformation
- ReplaceValueConfig
- RedactConfig
- CharacterMaskConfig
- CharsToIgnore
- CommonCharsToIgnore
- CryptoReplaceFfxFpeConfig
- CryptoKey
- TransientCryptoKey
- UnwrappedCryptoKey
- KmsWrappedCryptoKey
- FfxCommonNativeAlphabet
- FixedSizeBucketingConfig
- BucketingConfig
- Bucket
- ReplaceWithInfoTypeConfig
- TimePartConfig
- TimePart
- CryptoHashConfig
- DateShiftConfig
- CryptoDeterministicConfig
- ReplaceDictionaryConfig
- RecordTransformations
- FieldTransformation
- RecordCondition
- Expressions
- LogicalOperator
- Conditions
- Condition
- RelationalOperator
- RecordSuppression
- ImageTransformations
- ImageTransformation
- SelectedInfoTypes
- AllInfoTypes
- AllText
- Color
- TransformationErrorHandling
- ThrowError
- LeaveUntransformed
- Methods
Resource: DeidentifyTemplate
DeidentifyTemplates contains instructions on how to de-identify content. See https://cloud.google.com/sensitive-data-protection/docs/concepts-templates to learn more.
JSON representation |
---|
{
"name": string,
"displayName": string,
"description": string,
"createTime": string,
"updateTime": string,
"deidentifyConfig": {
object ( |
Fields | |
---|---|
name |
Output only. The template name. The template will have one of the following formats: |
displayName |
Display name (max 256 chars). |
description |
Short description (max 256 chars). |
createTime |
Output only. The creation timestamp of a DeidentifyTemplate. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
updateTime |
Output only. The last update timestamp of a DeidentifyTemplate. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
deidentifyConfig |
The core content of the template. |
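As a rough illustration of the fields above, the following Python sketch assembles a request-style body for a template. The helper name `build_template_body` and the sample config are hypothetical; only the field names and the 256-character limits come from this page.

```python
import json


def build_template_body(display_name, description, deidentify_config):
    """Assemble a dict shaped like the DeidentifyTemplate JSON above.

    Output-only fields (name, createTime, updateTime) are left out
    because the service populates them.
    """
    if len(display_name) > 256:
        raise ValueError("displayName is limited to 256 characters")
    if len(description) > 256:
        raise ValueError("description is limited to 256 characters")
    return {
        "displayName": display_name,
        "description": description,
        "deidentifyConfig": deidentify_config,
    }


# Hypothetical config: replace every finding with the name of its infoType.
config = {
    "infoTypeTransformations": {
        "transformations": [
            {"primitiveTransformation": {"replaceWithInfoTypeConfig": {}}}
        ]
    }
}
template = build_template_body(
    "Replace PII", "Replaces findings with infoType names", config
)
print(json.dumps(template, indent=2))
```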
DeidentifyConfig
The configuration that controls how the data will change.
JSON representation |
---|
{ "transformationErrorHandling": { object ( |
Fields | |
---|---|
transformationErrorHandling |
Mode for handling transformation errors. If left unspecified, the default mode is |
Union field transformation . Type of transformation transformation can be only one of the following: |
|
infoTypeTransformations |
Treat the dataset as free-form text and apply the same free text transformation everywhere. |
recordTransformations |
Treat the dataset as structured. Transformations can be applied to specific locations within structured datasets, such as transforming a column within a table. |
imageTransformations |
Treat the dataset as an image and redact. |
InfoTypeTransformations
A type of transformation that will scan unstructured text and apply various PrimitiveTransformations to each finding, where the transformation is applied to only values that were identified as a specific infoType.
JSON representation |
---|
{
"transformations": [
{
object ( |
Fields | |
---|---|
transformations[] |
Required. Transformation for each infoType. Cannot specify more than one for a given infoType. |
InfoTypeTransformation
A transformation to apply to text that is identified as a specific infoType.
JSON representation |
---|
{ "infoTypes": [ { object ( |
Fields | |
---|---|
infoTypes[] |
InfoTypes to apply the transformation to. An empty list will cause this transformation to apply to all findings that correspond to infoTypes that were requested in |
primitiveTransformation |
Required. Primitive transformation to apply to the infoType. |
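The rule that an infoType cannot appear in more than one transformation can be checked on the client before sending a request. The sketch below is a local check only; the function name is hypothetical, and it assumes each InfoType object carries a `name` key, which this section does not spell out.

```python
def check_unique_info_types(info_type_transformations):
    """Raise ValueError if an infoType is listed in more than one transformation.

    Returns the set of infoType names seen. A transformation with an
    empty or missing infoTypes list is skipped here, since it applies
    to all requested infoTypes rather than to a named one.
    """
    seen = set()
    for transformation in info_type_transformations.get("transformations", []):
        for info_type in transformation.get("infoTypes", []):
            name = info_type["name"]  # assumed shape of an InfoType object
            if name in seen:
                raise ValueError(f"infoType {name!r} has more than one transformation")
            seen.add(name)
    return seen


example = {
    "transformations": [
        {
            "infoTypes": [{"name": "PHONE_NUMBER"}],
            "primitiveTransformation": {"redactConfig": {}},
        },
        {
            "infoTypes": [{"name": "EMAIL_ADDRESS"}],
            "primitiveTransformation": {"replaceWithInfoTypeConfig": {}},
        },
    ]
}
print(sorted(check_unique_info_types(example)))
```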
PrimitiveTransformation
A rule for transforming a value.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field transformation . Type of transformation. transformation can be only one of the following: |
|
replaceConfig |
Replace with a specified value. |
redactConfig |
Redact |
characterMaskConfig |
Mask |
cryptoReplaceFfxFpeConfig |
Ffx-Fpe |
fixedSizeBucketingConfig |
Fixed size bucketing |
bucketingConfig |
Bucketing |
replaceWithInfoTypeConfig |
Replace with infotype |
timePartConfig |
Time extraction |
cryptoHashConfig |
Crypto |
dateShiftConfig |
Date Shift |
cryptoDeterministicConfig |
Deterministic Crypto |
replaceDictionaryConfig |
Replace with a value randomly drawn (with replacement) from a dictionary. |
ReplaceValueConfig
Replace each input value with a given Value.
JSON representation |
---|
{
"newValue": {
object ( |
Fields | |
---|---|
newValue |
Value to replace it with. |
RedactConfig
This type has no fields.
Redact a given value. For example, if used with an InfoTypeTransformation transforming PHONE_NUMBER, and input 'My phone number is 206-555-0123', the output would be 'My phone number is '.
CharacterMaskConfig
Partially mask a string by replacing a given number of characters with a fixed character. Masking can start from the beginning or end of the string. This can be used on data of any type (numbers, longs, and so on) and when de-identifying structured data we'll attempt to preserve the original data's type. (This allows you to take a long like 123 and modify it to a string like **3.)
JSON representation |
---|
{
"maskingCharacter": string,
"numberToMask": integer,
"reverseOrder": boolean,
"charactersToIgnore": [
{
object ( |
Fields | |
---|---|
maskingCharacter |
Character to use to mask the sensitive values—for example, |
numberToMask |
Number of characters to mask. If not set, all matching chars will be masked. Skipped characters do not count towards this tally. If
The resulting de-identified string is |
reverseOrder |
Mask characters in reverse order. For example, if |
charactersToIgnore[] |
When masking a string, items in this list will be skipped when replacing characters. For example, if the input string is |
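The masking behavior described by these fields can be approximated locally. The following is a minimal sketch, not the service's implementation: the function name and parameters are hypothetical, `None` stands in for an unset `numberToMask`, and the characters to skip are passed as a plain string rather than as CharsToIgnore objects.

```python
def mask(value, masking_character="*", number_to_mask=None,
         reverse_order=False, characters_to_skip=""):
    """Mask characters of `value`, leaving skipped characters untouched."""
    chars = list(str(value))
    # Walk from the end when reverse_order is set, otherwise from the start.
    if reverse_order:
        indices = range(len(chars) - 1, -1, -1)
    else:
        indices = range(len(chars))
    masked = 0
    for i in indices:
        if number_to_mask is not None and masked >= number_to_mask:
            break
        if chars[i] in characters_to_skip:
            continue  # skipped characters do not count towards the tally
        chars[i] = masking_character
        masked += 1
    return "".join(chars)


print(mask(123, number_to_mask=2))
print(mask("555-555-5555", number_to_mask=5, characters_to_skip="-"))
print(mask("555-555-5555", number_to_mask=4, reverse_order=True,
           characters_to_skip="-"))
```

The first call reproduces the `123` to `**3` example from the description above. The other two show that hyphens are skipped without using up the mask count.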
CharsToIgnore
Characters to skip when doing deidentification of a value. These will be left alone and skipped.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field characters . Type of characters to skip. characters can be only one of the following: |
|
charactersToSkip |
Characters to not transform when masking. |
commonCharactersToIgnore |
Common characters to not transform when masking. Useful to avoid removing punctuation. |
CommonCharsToIgnore
Convenience enum for indicating common characters to not transform.
Enums | |
---|---|
COMMON_CHARS_TO_IGNORE_UNSPECIFIED |
Unused. |
NUMERIC |
0-9 |
ALPHA_UPPER_CASE |
A-Z |
ALPHA_LOWER_CASE |
a-z |
PUNCTUATION |
US Punctuation, one of !"#$%&'()*+,-./:;<=>?@[]^_`{|}~ |
WHITESPACE |
Whitespace character, one of [ \t\n\x0B\f\r] |
CryptoReplaceFfxFpeConfig
Replaces an identifier with a surrogate using Format Preserving Encryption (FPE) with the FFX mode of operation; however when used in the content.reidentify API method, it serves the opposite function by reversing the surrogate back into the original identifier. The identifier must be encoded as ASCII. For a given crypto key and context, the same identifier will be replaced with the same surrogate. Identifiers must be at least two characters long. In the case that the identifier is the empty string, it will be skipped. See https://cloud.google.com/sensitive-data-protection/docs/pseudonymization to learn more.
Note: We recommend using CryptoDeterministicConfig for all use cases which do not require preserving the input alphabet space and size, plus warrant referential integrity.
JSON representation |
---|
{ "cryptoKey": { object ( |
Fields | |
---|---|
cryptoKey |
Required. The key used by the encryption algorithm. |
context |
The 'tweak', a context may be used for higher security since the same identifier in two different contexts won't be given the same surrogate. If the context is not set, a default tweak will be used. If the context is set but:
a default tweak will be used. Note that case (1) is expected when an The tweak is constructed as a sequence of bytes in big endian byte order such that:
|
surrogateInfoType |
The custom infoType to annotate the surrogate with. This annotation will be applied to the surrogate by prefixing it with the name of the custom infoType followed by the number of characters comprising the surrogate. The following scheme defines the format: info_type_name(surrogate_character_count):surrogate For example, if the name of custom infoType is 'MY_TOKEN_INFO_TYPE' and the surrogate is 'abc', the full replacement value will be: 'MY_TOKEN_INFO_TYPE(3):abc' This annotation identifies the surrogate when inspecting content using the custom infoType In order for inspection to work properly, the name of this infoType must not occur naturally anywhere in your data; otherwise, inspection may find a surrogate that does not correspond to an actual identifier. Therefore, choose your custom infoType name carefully after considering what your data looks like. One way to select a name that has a high chance of yielding reliable detection is to include one or more unicode characters that are highly improbable to exist in your data. For example, assuming your data is entered from a regular ASCII keyboard, the symbol with the hex code point 29DD might be used like so: ⧝MY_TOKEN_TYPE |
Union field alphabet . Choose an alphabet which the data being transformed will be made up of. alphabet can be only one of the following: |
|
commonAlphabet |
Common alphabets. |
customAlphabet |
This is supported by mapping these to the alphanumeric characters that the FFX mode natively supports. This happens before/after encryption/decryption. Each character listed must appear only once. Number of characters must be in the range [2, 95]. This must be encoded as ASCII. The order of characters does not matter. The full list of allowed characters is: |
radix |
The native way to select the alphabet. Must be in the range [2, 95]. |
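The surrogateInfoType annotation scheme above, `info_type_name(surrogate_character_count):surrogate`, is simple enough to sketch in a few lines. The helpers below are hypothetical and only illustrate the string format; they perform no encryption.

```python
import re


def annotate_surrogate(info_type_name, surrogate):
    """Prefix a surrogate with its infoType name and character count."""
    return f"{info_type_name}({len(surrogate)}):{surrogate}"


def find_surrogates(text, info_type_name):
    """Return the surrogates annotated with `info_type_name` found in `text`."""
    pattern = re.compile(re.escape(info_type_name) + r"\((\d+)\):")
    found = []
    pos = 0
    while True:
        match = pattern.search(text, pos)
        if match is None:
            break
        length = int(match.group(1))
        start = match.end()
        # The declared count says exactly how many characters to take.
        found.append(text[start:start + length])
        pos = start + length
    return found


token = annotate_surrogate("MY_TOKEN_INFO_TYPE", "abc")
print(token)
print(find_surrogates(f"id={token}def", "MY_TOKEN_INFO_TYPE"))
```

Because the character count is part of the prefix, the surrogate can be recovered even when more text follows it directly. The same property explains why the infoType name must not occur naturally in the data.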
CryptoKey
This is a data encryption key (DEK), as opposed to a key encryption key (KEK) stored by Cloud Key Management Service (Cloud KMS). When using Cloud KMS to wrap or unwrap a DEK, be sure to set an appropriate IAM policy on the KEK to ensure an attacker cannot unwrap the DEK.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field source . Sources of crypto keys. source can be only one of the following: |
|
transient |
Transient crypto key |
unwrapped |
Unwrapped crypto key |
kmsWrapped |
Key wrapped using Cloud KMS |
TransientCryptoKey
Use this to have a random data crypto key generated. It will be discarded after the request finishes.
JSON representation |
---|
{ "name": string } |
Fields | |
---|---|
name |
Required. Name of the key. This is an arbitrary string used to differentiate different keys. A unique key is generated per name: two separate |
UnwrappedCryptoKey
Using raw keys is prone to security risks due to accidentally leaking the key. Choose another type of key if possible.
JSON representation |
---|
{ "key": string } |
Fields | |
---|---|
key |
Required. A 128/192/256 bit key. A base64-encoded string. |
KmsWrappedCryptoKey
Include to use an existing data crypto key wrapped by KMS. The wrapped key must be a 128-, 192-, or 256-bit key. Authorization requires the following IAM permissions when sending a request to perform a crypto transformation using a KMS-wrapped crypto key: dlp.kms.encrypt
For more information, see Creating a wrapped key.
Note: When you use Cloud KMS for cryptographic operations, charges apply.
JSON representation |
---|
{ "wrappedKey": string, "cryptoKeyName": string } |
Fields | |
---|---|
wrappedKey |
Required. The wrapped data crypto key. A base64-encoded string. |
cryptoKeyName |
Required. The resource name of the KMS CryptoKey to use for unwrapping. |
FfxCommonNativeAlphabet
These are commonly used subsets of the alphabet that the FFX mode natively supports. In the algorithm, the alphabet is selected using the "radix". Therefore each corresponds to a particular radix.
Enums | |
---|---|
FFX_COMMON_NATIVE_ALPHABET_UNSPECIFIED |
Unused. |
NUMERIC |
[0-9] (radix of 10) |
HEXADECIMAL |
[0-9A-F] (radix of 16) |
UPPER_CASE_ALPHA_NUMERIC |
[0-9A-Z] (radix of 36) |
ALPHA_NUMERIC |
[0-9A-Za-z] (radix of 62) |
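To make the link between alphabet and radix concrete, this sketch maps each enum value to its character set and checks the customAlphabet constraints stated under CryptoReplaceFfxFpeConfig (2 to 95 unique ASCII characters). The names are hypothetical, and the check does not verify membership in the full allowed-character list, which this page elides.

```python
import string

# Character sets taken from the enum descriptions above.
COMMON_ALPHABETS = {
    "NUMERIC": string.digits,
    "HEXADECIMAL": string.digits + "ABCDEF",
    "UPPER_CASE_ALPHA_NUMERIC": string.digits + string.ascii_uppercase,
    "ALPHA_NUMERIC": string.digits + string.ascii_uppercase + string.ascii_lowercase,
}


def radix_of(common_alphabet):
    """The radix is the number of characters in the alphabet."""
    return len(COMMON_ALPHABETS[common_alphabet])


def validate_custom_alphabet(alphabet):
    """Check the documented customAlphabet constraints; raise ValueError if violated."""
    if not 2 <= len(alphabet) <= 95:
        raise ValueError("number of characters must be in the range [2, 95]")
    if len(set(alphabet)) != len(alphabet):
        raise ValueError("each character must appear only once")
    if not alphabet.isascii():
        raise ValueError("alphabet must be encoded as ASCII")


for name in COMMON_ALPHABETS:
    print(name, radix_of(name))
validate_custom_alphabet("0123456789-")
```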
FixedSizeBucketingConfig
Buckets values based on fixed size ranges. The Bucketing transformation can provide all of this functionality, but requires more configuration. This message is provided as a convenience to the user for simple bucketing strategies.
The transformed value will be a hyphenated string of {lowerBound}-{upperBound}. For example, if lowerBound = 10 and upperBound = 20, all values that are within this bucket will be replaced with "10-20".
This can be used on data of type: double, long.
If the bound Value type differs from the type of data being transformed, we will first attempt converting the type of the data to be transformed to match the type of the bound before comparing.
See https://cloud.google.com/sensitive-data-protection/docs/concepts-bucketing to learn more.
JSON representation |
---|
{ "lowerBound": { object ( |
Fields | |
---|---|
lowerBound |
Required. Lower bound value of buckets. All values less than |
upperBound |
Required. Upper bound value of buckets. All values greater than upperBound are grouped together into a single bucket; for example if |
bucketSize |
Required. Size of each bucket (except for minimum and maximum buckets). So if |
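A local sketch of fixed-size bucketing follows. It is an approximation under stated assumptions: buckets are taken to start at `lowerBound`, each bucket includes its lower edge and excludes its upper edge, and the labels for out-of-range values (`<10`, `>=100`) are placeholders invented for this example, since the exact replacement strings are not shown in this section.

```python
def fixed_size_bucket(value, lower_bound, upper_bound, bucket_size):
    """Return a '{lower}-{upper}' label for the bucket containing `value`."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    if value < lower_bound:
        return f"<{lower_bound}"    # placeholder label (assumption)
    if value >= upper_bound:
        return f">={upper_bound}"   # placeholder label (assumption)
    # Count how many whole buckets fit between the lower bound and the value.
    index = int((value - lower_bound) // bucket_size)
    low = lower_bound + index * bucket_size
    high = min(low + bucket_size, upper_bound)
    return f"{low}-{high}"


print(fixed_size_bucket(15, 10, 100, 10))
print(fixed_size_bucket(95, 10, 100, 20))
```

With `lowerBound = 10` and `bucketSize = 10`, the value 15 falls into the `10-20` bucket, matching the example in the description above.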
BucketingConfig
Generalization function that buckets values based on ranges. The ranges and replacement values are dynamically provided by the user for custom behavior, such as 1-30 -> LOW, 31-65 -> MEDIUM, 66-100 -> HIGH.
This can be used on data of type: number, long, string, timestamp.
If the bound Value type differs from the type of data being transformed, we will first attempt converting the type of the data to be transformed to match the type of the bound before comparing. See https://cloud.google.com/sensitive-data-protection/docs/concepts-bucketing to learn more.
JSON representation |
---|
{
"buckets": [
{
object ( |
Fields | |
---|---|
buckets[] |
Set of buckets. Ranges must be non-overlapping. |
Bucket
Bucket is represented as a range, along with replacement values.
JSON representation |
---|
{ "min": { object ( |
Fields | |
---|---|
min |
Lower bound of the range, inclusive. Type should be the same as max if used. |
max |
Upper bound of the range, exclusive; type must match min. |
replacementValue |
Required. Replacement value for this bucket. |
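The LOW / MEDIUM / HIGH example from the BucketingConfig description can be sketched with inclusive `min` and exclusive `max`, as the Bucket fields specify. The dicts below use plain Python values instead of the API's Value objects, and leaving unmatched values unchanged is an assumption made for this example only.

```python
def bucketize(value, buckets):
    """Return the replacement of the first bucket whose range contains `value`."""
    for bucket in buckets:
        low = bucket.get("min")
        high = bucket.get("max")
        above_min = low is None or value >= low    # min is inclusive
        below_max = high is None or value < high   # max is exclusive
        if above_min and below_max:
            return bucket["replacementValue"]
    return value  # assumption: no matching bucket leaves the value as-is


# 1-30 -> LOW, 31-65 -> MEDIUM, 66-100 -> HIGH, expressed with exclusive max.
BUCKETS = [
    {"min": 1, "max": 31, "replacementValue": "LOW"},
    {"min": 31, "max": 66, "replacementValue": "MEDIUM"},
    {"min": 66, "max": 101, "replacementValue": "HIGH"},
]

for score in (30, 31, 100):
    print(score, bucketize(score, BUCKETS))
```

Because `max` is exclusive, adjacent buckets can share a boundary value (31, 66) without overlapping.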
ReplaceWithInfoTypeConfig
This type has no fields.
Replace each matching finding with the name of the infoType.
TimePartConfig
For use with Date, Timestamp, and TimeOfDay, extract or preserve a portion of the value.
JSON representation |
---|
{
"partToExtract": enum ( |
Fields | |
---|---|
partToExtract |
The part of the time to keep. |
TimePart
Components that make up time.
Enums | |
---|---|
TIME_PART_UNSPECIFIED |
Unused |
YEAR |
[0-9999] |
MONTH |
[1-12] |
DAY_OF_MONTH |
[1-31] |
DAY_OF_WEEK |
[1-7] |
WEEK_OF_YEAR |
[1-53] |
HOUR_OF_DAY |
[0-23] |
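The TimePart values map naturally onto fields of a Python `datetime`. The sketch below is an approximation: it uses ISO 8601 numbering for `DAY_OF_WEEK` (Monday = 1) and `WEEK_OF_YEAR`, which is an assumption, because this page gives only the numeric ranges and not which day or week counts as the first.

```python
from datetime import datetime

# One extractor per TimePart enum value listed above.
EXTRACTORS = {
    "YEAR": lambda dt: dt.year,
    "MONTH": lambda dt: dt.month,
    "DAY_OF_MONTH": lambda dt: dt.day,
    "DAY_OF_WEEK": lambda dt: dt.isoweekday(),        # assumption: ISO, Monday = 1
    "WEEK_OF_YEAR": lambda dt: dt.isocalendar()[1],   # assumption: ISO week number
    "HOUR_OF_DAY": lambda dt: dt.hour,
}


def extract_time_part(value, part_to_extract):
    """Keep only the requested part of a datetime value."""
    if part_to_extract not in EXTRACTORS:
        raise ValueError(f"unsupported time part: {part_to_extract}")
    return EXTRACTORS[part_to_extract](value)


sample = datetime(2024, 3, 15, 13, 45)
for part in EXTRACTORS:
    print(part, extract_time_part(sample, part))
```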
CryptoHashConfig
Pseudonymization method that generates surrogates via cryptographic hashing. Uses SHA-256. The key size must be either 32 or 64 bytes. Outputs a base64 encoded representation of the hashed output (for example, L7k0BHmF1ha5U3NfGykjro4xWi1MPVQPjhMAZbSV9mM=). Currently, only string and integer values can be hashed. See https://cloud.google.com/sensitive-data-protection/docs/pseudonymization to learn more.
JSON representation |
---|
{
"cryptoKey": {
object ( |
Fields | |
---|---|
cryptoKey |
The key used by the hash function. |
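To show what a keyed SHA-256 surrogate with base64 output looks like, here is a sketch that uses HMAC-SHA-256 from the Python standard library. Treat the construction as an assumption: this page says only that SHA-256 and a key are used, so the output will not necessarily match what the service produces for the same key and input.

```python
import base64
import hashlib
import hmac


def crypto_hash(value, key):
    """Return a base64-encoded keyed SHA-256 digest of `value`."""
    if len(key) not in (32, 64):
        raise ValueError("key size must be either 32 or 64 bytes")
    # Strings and integers are both hashed via their text form here.
    message = str(value).encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


demo_key = bytes(range(32))  # fixed demo key; use a real secret in practice
surrogate = crypto_hash("206-555-0123", demo_key)
print(surrogate, len(surrogate))
```

A 32-byte digest always encodes to 44 base64 characters, which matches the length of the sample output quoted in the description above.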
DateShiftConfig
Shifts dates by a random number of days, with the option to be consistent for the same context. See https://cloud.google.com/sensitive-data-protection/docs/concepts-date-shifting to learn more.
JSON representation |
---|
{ "upperBoundDays": integer, "lowerBoundDays": integer, "context": { object ( |
Fields | |
---|---|
upperBoundDays |
Required. Range of shift in days. Actual shift will be selected at random within this range (inclusive ends). Negative means shift to earlier in time. Must not be more than 365250 days (1000 years) each direction. For example, 3 means shift date to at most 3 days into the future. |
lowerBoundDays |
Required. For example, -5 means shift date to at most 5 days back in the past. |
context |
Points to the field that contains the context, for example, an entity id. If set, must also set cryptoKey. If set, shift will be consistent for the given context. |
Union field method . Method for calculating shift that takes context into consideration. If set, must also set context. Can only be applied to table items. method can be only one of the following: |
|
cryptoKey |
Causes the shift to be computed based on this key and the context. This results in the same shift for the same context and cryptoKey. If set, must also set context. Can only be applied to table items. |
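The interplay of bounds, context, and key can be sketched as follows. Without a context the shift is drawn at random from the inclusive range; with a context and key the shift is derived deterministically, so the same context always gets the same shift. The HMAC-based derivation is a hypothetical stand-in, not the service's algorithm.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta

MAX_SHIFT_DAYS = 365250  # documented limit in each direction


def pick_shift(lower_bound_days, upper_bound_days, context=None, key=None):
    """Choose a shift in days within [lower_bound_days, upper_bound_days]."""
    if not -MAX_SHIFT_DAYS <= lower_bound_days <= upper_bound_days <= MAX_SHIFT_DAYS:
        raise ValueError("bounds out of range or in the wrong order")
    if context is None and key is None:
        return random.randint(lower_bound_days, upper_bound_days)
    if context is None or key is None:
        raise ValueError("context and cryptoKey must be set together")
    # Hypothetical derivation: map a keyed digest of the context into the range.
    digest = hmac.new(key, context.encode("utf-8"), hashlib.sha256).digest()
    span = upper_bound_days - lower_bound_days + 1
    return lower_bound_days + int.from_bytes(digest, "big") % span


def shift_date(value, lower_bound_days, upper_bound_days, context=None, key=None):
    """Shift a date by the chosen number of days."""
    days = pick_shift(lower_bound_days, upper_bound_days, context, key)
    return value + timedelta(days=days)


demo_key = b"k" * 32
print(shift_date(date(2024, 1, 10), -5, 3, context="patient-42", key=demo_key))
print(shift_date(date(2024, 2, 20), -5, 3, context="patient-42", key=demo_key))
```

Both calls use the same context and key, so both dates move by the same number of days and the interval between them is preserved.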
CryptoDeterministicConfig
Pseudonymization method that generates deterministic encryption for the given input. Outputs a base64 encoded representation of the encrypted output. Uses AES-SIV based on the RFC https://tools.ietf.org/html/rfc5297.
JSON representation |
---|
{ "cryptoKey": { object ( |
Fields | |
---|---|
cryptoKey |
The key used by the encryption function. For deterministic encryption using AES-SIV, the provided key is internally expanded to 64 bytes prior to use. |
surrogateInfoType |
The custom info type to annotate the surrogate with. This annotation will be applied to the surrogate by prefixing it with the name of the custom info type followed by the number of characters comprising the surrogate. The following scheme defines the format: {info type name}({surrogate character count}):{surrogate} For example, if the name of custom info type is 'MY_TOKEN_INFO_TYPE' and the surrogate is 'abc', the full replacement value will be: 'MY_TOKEN_INFO_TYPE(3):abc' This annotation identifies the surrogate when inspecting content using the custom info type 'Surrogate'. This facilitates reversal of the surrogate when it occurs in free text. Note: For record transformations where the entire cell in a table is being transformed, surrogates are not mandatory. Surrogates are used to denote the location of the token and are necessary for re-identification in free form text. In order for inspection to work properly, the name of this info type must not occur naturally anywhere in your data; otherwise, inspection may either
Therefore, choose your custom info type name carefully after considering what your data looks like. One way to select a name that has a high chance of yielding reliable detection is to include one or more unicode characters that are highly improbable to exist in your data. For example, assuming your data is entered from a regular ASCII keyboard, the symbol with the hex code point 29DD might be used like so: ⧝MY_TOKEN_TYPE. |
context |
A context may be used for higher security and maintaining referential integrity such that the same identifier in two different contexts will be given a distinct surrogate. The context is appended to plaintext value being encrypted. On decryption the provided context is validated against the value used during encryption. If a context was provided during encryption, same context must be provided during decryption as well. If the context is not set, plaintext would be used as is for encryption. If the context is set but:
plaintext would be used as is for encryption. Note that case (1) is expected when an |
ReplaceDictionaryConfig
Replace each input value with a value randomly selected from the dictionary.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field type . Type of dictionary. type can be only one of the following: |
|
wordList |
A list of words to select from for random replacement. The limits page contains details about the size limits of dictionaries. |
RecordTransformations
A type of transformation that is applied over structured data such as a table.
JSON representation |
---|
{ "fieldTransformations": [ { object ( |
Fields | |
---|---|
fieldTransformations[] |
Transform the record by applying various field transformations. |
recordSuppressions[] |
Configuration defining which records get suppressed entirely. Records that match any suppression rule are omitted from the output. |
FieldTransformation
The transformation to apply to the field.
JSON representation |
---|
{ "fields": [ { object ( |
Fields | |
---|---|
fields[] |
Required. Input field(s) to apply the transformation to. When you have columns that reference their position within a list, omit the index from the FieldId. FieldId name matching ignores the index. For example, instead of "contact.nums[0].type", use "contact.nums.type". |
condition |
Only apply the transformation if the condition evaluates to true for the given Example Use Cases:
|
Union field transformation . Transformation to apply. [required] transformation can be only one of the following: |
|
primitiveTransformation |
Apply the transformation to the entire field. |
infoTypeTransformations |
Treat the contents of the field as free text, and selectively transform content that matches an |
RecordCondition
A condition for determining whether a transformation should be applied to a field.
JSON representation |
---|
{
"expressions": {
object ( |
Fields | |
---|---|
expressions |
An expression. |
Expressions
An expression, consisting of an operator and conditions.
JSON representation |
---|
{ "logicalOperator": enum ( |
Fields | |
---|---|
logicalOperator |
The operator to apply to the result of conditions. Default and currently only supported value is |
Union field type . Expression types. type can be only one of the following: |
|
conditions |
Conditions to apply to the expression. |
LogicalOperator
Logical operators for conditional checks.
Enums | |
---|---|
LOGICAL_OPERATOR_UNSPECIFIED |
Unused |
AND |
Conditional AND |
Conditions
A collection of conditions.
JSON representation |
---|
{
"conditions": [
{
object ( |
Fields | |
---|---|
conditions[] |
A collection of conditions. |
Condition
The field type of value and field do not need to match to be considered equal, but not all comparisons are possible. EQUAL_TO and NOT_EQUAL_TO attempt to compare even with incompatible types, but all other comparisons are invalid with incompatible types. A value of type:
- string can be compared against all other types.
- boolean can only be compared against other booleans.
- integer can be compared against doubles or a string if the string value can be parsed as an integer.
- double can be compared against integers or a string if the string can be parsed as a double.
- Timestamp can be compared against strings in RFC 3339 date string format.
- TimeOfDay can be compared against timestamps and strings in the format of 'HH:mm:ss'.
If we fail to compare due to type mismatch, a warning will be given and the condition will evaluate to false.
JSON representation |
---|
{ "field": { object ( |
Fields | |
---|---|
field |
Required. Field within the record this condition is evaluated against. |
operator |
Required. Operator used to compare the field or infoType to the value. |
value |
Value to compare against. [Mandatory, except for |
RelationalOperator
Operators available for comparing the value of fields.
Enums | |
---|---|
RELATIONAL_OPERATOR_UNSPECIFIED |
Unused |
EQUAL_TO |
Equal. Attempts to match even with incompatible types. |
NOT_EQUAL_TO |
Not equal to. Attempts to match even with incompatible types. |
GREATER_THAN |
Greater than. |
LESS_THAN |
Less than. |
GREATER_THAN_OR_EQUALS |
Greater than or equals. |
LESS_THAN_OR_EQUALS |
Less than or equals. |
EXISTS |
Exists |
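A record condition combines relational operators under a logical AND. The evaluator below is a simplified local sketch: records are plain dicts, conditions use bare field names and values rather than FieldId and Value objects, `EXISTS` is read as "the field is present and not None" (an assumption), and the type-coercion rules described under Condition are not implemented. A failed comparison evaluates to false, as documented.

```python
import operator

# One comparison function per RelationalOperator, apart from EXISTS.
COMPARATORS = {
    "EQUAL_TO": operator.eq,
    "NOT_EQUAL_TO": operator.ne,
    "GREATER_THAN": operator.gt,
    "LESS_THAN": operator.lt,
    "GREATER_THAN_OR_EQUALS": operator.ge,
    "LESS_THAN_OR_EQUALS": operator.le,
}


def evaluate_condition(record, condition):
    """Evaluate one condition against a record (a plain dict)."""
    field = condition["field"]
    op = condition["operator"]
    if op == "EXISTS":
        return record.get(field) is not None
    if field not in record:
        return False
    try:
        return bool(COMPARATORS[op](record[field], condition["value"]))
    except TypeError:
        # Incompatible types: the condition evaluates to false.
        return False


def evaluate_expressions(record, conditions):
    """Combine conditions with AND, the only supported logical operator."""
    return all(evaluate_condition(record, c) for c in conditions)


record = {"age": 95, "state": "WA"}
conditions = [
    {"field": "age", "operator": "GREATER_THAN", "value": 89},
    {"field": "state", "operator": "EXISTS"},
]
print(evaluate_expressions(record, conditions))
```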
RecordSuppression
Configuration to suppress records whose suppression conditions evaluate to true.
JSON representation |
---|
{
"condition": {
object ( |
Fields | |
---|---|
condition |
A condition that, when it evaluates to true, causes the record to be suppressed from the transformed content. |
ImageTransformations
A type of transformation that is applied over images.
JSON representation |
---|
{
"transforms": [
{
object ( |
Fields | |
---|---|
transforms[] |
List of transforms to make. |
ImageTransformation
Configuration for determining how redaction of images should occur.
JSON representation |
---|
{ "redactionColor": { object ( |
Fields | |
---|---|
redactionColor |
The color to use when redacting content from an image. If not specified, the default is black. |
Union field target . Part of the image to transform. target can be only one of the following: |
|
selectedInfoTypes |
Apply transformation to the selected infoTypes. |
allInfoTypes |
Apply transformation to all findings not specified in other ImageTransformation's selectedInfoTypes. Only one instance is allowed within the ImageTransformations message. |
allText |
Apply transformation to all text that doesn't match an infoType. Only one instance is allowed within the ImageTransformations message. |
SelectedInfoTypes
Apply transformation to the selected infoTypes.
JSON representation |
---|
{
"infoTypes": [
{
object ( |
Fields | |
---|---|
infoTypes[] |
Required. InfoTypes to apply the transformation to. Each provided InfoType must be unique within the ImageTransformations message. |
AllInfoTypes
This type has no fields.
Apply transformation to all findings.
AllText
This type has no fields.
Apply to all text.
Color
Represents a color in the RGB color space.
JSON representation |
---|
{ "red": number, "green": number, "blue": number } |
Fields | |
---|---|
red |
The amount of red in the color as a value in the interval [0, 1]. |
green |
The amount of green in the color as a value in the interval [0, 1]. |
blue |
The amount of blue in the color as a value in the interval [0, 1]. |
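Colors are often specified as 8-bit channel values, while the fields above expect floats in [0, 1]. This sketch converts between the two and places the result in an ImageTransformation-shaped dict. The helper name is hypothetical.

```python
def color_from_rgb8(red, green, blue):
    """Convert 0-255 channel values to the [0, 1] floats the Color type expects."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in the range 0-255")
    return {"red": red / 255, "green": green / 255, "blue": blue / 255}


# Redact all text in an image with pure red instead of the default black.
image_transformation = {
    "redactionColor": color_from_rgb8(255, 0, 0),
    "allText": {},
}
print(image_transformation)
```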
TransformationErrorHandling
How to handle transformation errors during de-identification. A transformation error occurs when the requested transformation is incompatible with the data. For example, trying to de-identify an IP address using a DateShift transformation would result in a transformation error, since date info cannot be extracted from an IP address. Information about any incompatible transformations, and how they were handled, is returned in the response as part of the TransformationOverviews.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field mode . How transformation errors should be handled. mode can be only one of the following: |
|
throwError |
Throw an error |
leaveUntransformed |
Ignore errors |
ThrowError
This type has no fields.
Throw an error and fail the request when a transformation error occurs.
LeaveUntransformed
This type has no fields.
Skips the data without modifying it if the requested transformation would cause an error. For example, if a DateShift transformation were applied to an IP address, this mode would leave the IP address unchanged in the response.
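The difference between the two modes can be illustrated with a small local simulation. Everything here is hypothetical scaffolding: a toy date-shift function that fails on non-date input, and a wrapper that either re-raises the failure (`throwError`) or passes the original value through (`leaveUntransformed`).

```python
from datetime import date, timedelta


class TransformationError(Exception):
    """Raised when a transformation is incompatible with the data."""


def shift_by_days(value, days):
    """Toy date shift: works on ISO dates, fails on anything else."""
    try:
        parsed = date.fromisoformat(value)
    except ValueError as exc:
        raise TransformationError(f"cannot date-shift {value!r}") from exc
    return (parsed + timedelta(days=days)).isoformat()


def apply_with_handling(values, transform, mode):
    """Apply `transform` to each value, handling errors according to `mode`."""
    results = []
    for value in values:
        try:
            results.append(transform(value))
        except TransformationError:
            if mode == "throwError":
                raise  # fail the whole request
            results.append(value)  # leaveUntransformed: keep the original
    return results


data = ["2024-01-31", "192.0.2.1"]
print(apply_with_handling(data, lambda v: shift_by_days(v, 3), "leaveUntransformed"))
```

In `leaveUntransformed` mode the date is shifted and the IP address comes back unchanged. In `throwError` mode the same input raises `TransformationError`.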
Methods |
|
---|---|
|
Creates a DeidentifyTemplate for reusing frequently used configuration for de-identifying content, images, and storage. |
|
Deletes a DeidentifyTemplate. |
|
Gets a DeidentifyTemplate. |
|
Lists DeidentifyTemplates. |
|
Updates the DeidentifyTemplate. |