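Transformation reference

This topic describes the de-identification techniques, or transformations, available in Cloud DLP.

Types of de-identification techniques

Choose a de-identification transformation based on the kind of data you want to de-identify and the purpose for which you are de-identifying it. The de-identification techniques that Cloud DLP supports fall into the following general categories:

  • Redaction: Deletes all or part of a detected sensitive value.
  • Replacement: Replaces a detected sensitive value with a specified surrogate value.
  • Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*).
  • Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Cloud DLP supports several types of tokenization, including transformations that can be reversed, or "re-identified."
  • Bucketing: "Generalizes" a sensitive value by replacing it with a range of values (for example, replacing a specific age with an age range, or temperatures with ranges corresponding to "Hot," "Medium," and "Cold").
  • Date shifting: Shifts sensitive date values by a random amount of time.
  • Time extraction: Extracts or preserves specified portions of date and time values.

The remainder of this topic covers each type of de-identification transformation and provides examples of its use.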
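To make the bucketing idea concrete, the following is a minimal local sketch in Python. It does not call Cloud DLP; the function names, the label format, and the temperature thresholds are assumptions chosen only for illustration.

```python
def fixed_size_bucket(value, lower_bound, upper_bound, bucket_size):
    """Replace a number with the label of the fixed-size range it falls in."""
    if value < lower_bound:
        return f"below {lower_bound}"
    if value >= upper_bound:
        return f"{upper_bound} or above"
    # Snap the value down to the start of its bucket.
    start = lower_bound + ((value - lower_bound) // bucket_size) * bucket_size
    end = min(start + bucket_size, upper_bound)
    return f"{start}-{end}"


def temperature_bucket(celsius):
    """Map a temperature to a coarse label (thresholds are hypothetical)."""
    if celsius >= 30:
        return "Hot"
    if celsius >= 15:
        return "Medium"
    return "Cold"


print(fixed_size_bucket(37, 0, 100, 10))  # 30-40
print(temperature_bucket(33))             # Hot
```

The exact age 37 is generalized to the range 30-40, so the record stays useful for analysis without exposing the precise value.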

Transformation methods

The following table lists the transformations that you can use to de-identify sensitive data in Cloud DLP:

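| Transformation | Object | Description | Reversible [1] | Referential integrity [2] | Input type |
|---|---|---|---|---|---|
| Redaction | RedactConfig | Redacts a value by removing it. | No | No | Any |
| Replacement | ReplaceValueConfig | Replaces each input value with a given value. | No | No | Any |
| Replace with infoType | ReplaceWithInfoTypeConfig | Replaces an input value with the name of its infoType. | No | No | Any |
| Masking with character | CharacterMaskConfig | Masks a string either fully or partially by replacing a given number of characters with a specified fixed character. | No | No | Any |
| Pseudonymization by replacing the input value with a cryptographic hash | CryptoHashConfig | Replaces input values with a Base64-encoded HMAC-SHA-256 hash (32 bytes before encoding) generated using a given data encryption key. See the pseudonymization concept documentation for more information. | No | Yes | Strings or integers |
| Pseudonymization by replacing with a cryptographic format-preserving token | CryptoReplaceFfxFpeConfig | Replaces an input value with a token, or surrogate value, of the same length using format-preserving encryption (FPE) with the FFX mode of operation. This allows the output to be used in systems that validate format or that need values to appear realistic even though the real information is not revealed. See the pseudonymization concept documentation for more information. | Yes | Yes | Strings or integers with a limited number of characters. The alphabet must consist of at least 2 characters and no more than 62 characters. |
| Pseudonymization by replacing with a cryptographic token | CryptoDeterministicConfig | Replaces an input value with a token, or surrogate value, using AES in Synthetic Initialization Vector mode (AES-SIV). Unlike format-preserving tokenization, this method has no limitation on supported string character sets, generates identical tokens for each instance of an identical input value, and uses [surrogates](reference/rest/v2/InspectConfig#surrogatetype) to enable re-identification given the original encryption key. | Yes | Yes | Any |
| Bucket values based on fixed-size ranges | FixedSizeBucketingConfig | Masks input values by replacing them with buckets, or ranges within which the input value falls. | No | No | Any |
| Bucket values based on custom-size ranges | BucketingConfig | Buckets input values based on user-configurable ranges and replacement values. | No | No | Any |
| Date shifting | DateShiftConfig | Shifts dates by a random number of days, with the option to stay consistent for the same context. | No | Preserves sequence and duration | Dates/Times |
| Extract time data | TimePartConfig | Extracts or preserves a portion of Date, Timestamp, and TimeOfDay values. | No | No | Dates/Times |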

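Footnotes

[1] Reversible transformations can be reversed with the content.reidentify method to re-identify the sensitive data.
[2] Referential integrity allows records to maintain their relationship to each other while the data is de-identified. For example, given the same cryptographic key and context, a value is replaced with the same obfuscated form each time it is transformed, which preserves the connections between records.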

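Redaction

If you simply want to remove sensitive data from your input content, Cloud DLP supports a redaction transformation (RedactConfig in the DLP API).

For example, suppose you want to perform a simple redaction of all EMAIL_ADDRESS infoTypes, and the following string is sent to Cloud DLP: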

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

The returned string is the following:

My name is Alicia Abernathy, and my email address is .

The following JSON example shows how to form the API request and what the Cloud DLP API returns:

JSON input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify?key={YOUR_API_KEY}

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
  },
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "primitiveTransformation":{
            "redactConfig":{

            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"EMAIL_ADDRESS"
      }
    ]
  }
}

JSON output:

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is ."
  },
  "overview":{
    "transformedBytes":"22",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "redactConfig":{

          }
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"22"
      }
    ]
  }
}
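The request body shown above can also be assembled programmatically. The following Python sketch only builds and prints the JSON body; it does not send it. The helper name build_redact_request is hypothetical, while the field names are copied from the JSON input above.

```python
import json


def build_redact_request(text, info_type_names):
    """Return a content:deidentify request body that redacts the given infoTypes."""
    info_types = [{"name": name} for name in info_type_names]
    return {
        "item": {"value": text},
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    {
                        "infoTypes": info_types,
                        # An empty redactConfig object selects plain redaction.
                        "primitiveTransformation": {"redactConfig": {}},
                    }
                ]
            }
        },
        "inspectConfig": {"infoTypes": info_types},
    }


body = build_redact_request(
    "My name is Alicia Abernathy, and my email address is aabernathy@example.com.",
    ["EMAIL_ADDRESS"],
)
print(json.dumps(body, indent=2))
```

Listing the same infoTypes in both inspectConfig and the transformation mirrors the example request: the first controls what is detected, the second controls which findings are redacted.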

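Replacement

Replacement transformations replace each input value with either a given token value or the name of its infoType.

Basic replacement

The basic replacement transformation (ReplaceValueConfig in the DLP API) replaces detected sensitive data values with a value that you specify. For example, suppose you've told Cloud DLP to replace all detected EMAIL_ADDRESS infoTypes with "[email-address]", and the following string is sent to Cloud DLP:

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

The returned string is the following:

My name is Alicia Abernathy, and my email address is [email-address].

The following JSON example shows how to form the API request and what the Cloud DLP API returns:

JSON input: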

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify?key={YOUR_API_KEY}

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
  },
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "primitiveTransformation":{
            "replaceConfig":{
              "newValue":{
                "stringValue":"[email-address]"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"EMAIL_ADDRESS"
      }
    ]
  }
}

JSON output:

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is [email-address]."
  },
  "overview":{
    "transformedBytes":"22",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "replaceConfig":{
            "newValue":{
              "stringValue":"[email-address]"
            }
          }
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"22"
      }
    ]
  }
}
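The overview object in a response like the one above can be summarized with a few lines of Python. This is a sketch that parses a trimmed copy of the sample response; the helper name summarize_overview is hypothetical. Note that the counters arrive as JSON strings, so they are converted to integers.

```python
import json

# Trimmed copy of the sample response shown above.
RESPONSE_JSON = """
{
  "item": {"value": "My name is Alicia Abernathy, and my email address is [email-address]."},
  "overview": {
    "transformedBytes": "22",
    "transformationSummaries": [
      {
        "infoType": {"name": "EMAIL_ADDRESS"},
        "results": [{"count": "1", "code": "SUCCESS"}],
        "transformedBytes": "22"
      }
    ]
  }
}
"""


def summarize_overview(response):
    """Return (total transformed bytes, [(infoType, result code, count), ...])."""
    overview = response.get("overview", {})
    rows = []
    for summary in overview.get("transformationSummaries", []):
        info_type = summary.get("infoType", {}).get("name", "")
        for result in summary.get("results", []):
            rows.append((info_type, result.get("code", ""), int(result.get("count", "0"))))
    return int(overview.get("transformedBytes", "0")), rows


total_bytes, rows = summarize_overview(json.loads(RESPONSE_JSON))
print(total_bytes, rows)  # 22 [('EMAIL_ADDRESS', 'SUCCESS', 1)]
```

The 22 transformed bytes correspond to the 22 characters of aabernathy@example.com.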

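InfoType replacement

You can also specify an infoType replacement (ReplaceWithInfoTypeConfig in the DLP API). This transformation does the same thing as the basic replacement transformation, but it replaces each detected sensitive data value with the infoType of the detected value.

For example, suppose you've told Cloud DLP to detect both email addresses and last names, and to replace each detected value with the value's infoType. You send the following string to Cloud DLP:

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

The returned string is the following: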

My name is Alicia LAST_NAME, and my email address is EMAIL_ADDRESS.
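A request body for this example might look like the following Python sketch, which builds the JSON without sending it. The helper name is hypothetical. The sketch assumes that leaving infoTypes out of the transformation applies it to every infoType requested in inspectConfig.

```python
import json


def build_replace_with_info_type_request(text, info_type_names):
    """Return a content:deidentify body that replaces findings with their infoType name."""
    return {
        "item": {"value": text},
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    {
                        # No "infoTypes" key here: the transformation is assumed to
                        # apply to all infoTypes listed in inspectConfig below.
                        "primitiveTransformation": {"replaceWithInfoTypeConfig": {}}
                    }
                ]
            }
        },
        "inspectConfig": {"infoTypes": [{"name": name} for name in info_type_names]},
    }


body = build_replace_with_info_type_request(
    "My name is Alicia Abernathy, and my email address is aabernathy@example.com.",
    ["EMAIL_ADDRESS", "LAST_NAME"],
)
print(json.dumps(body, indent=2))
```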

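Masking

You can configure Cloud DLP to fully or partially mask a detected sensitive value by replacing each character with a single fixed masking character, such as an asterisk (*) or hash (#) (CharacterMaskConfig in the DLP API). Masking can start from the beginning or the end of the string. This transformation also works with number types such as long integers.

Cloud DLP's masking transformation has the following options that you can specify:

  • Masking character (the maskingCharacter argument in the DLP API): The character used to mask each character of a sensitive value. For example, you could specify an asterisk (*) or a dollar sign ($) to mask a series of digits, such as those in a credit card number.
  • The number of characters to mask (numberToMask): If you don't specify this value, all characters are masked.
  • Whether to reverse the order (reverseOrder): Whether to mask characters in reverse order. If you reverse the order, characters in a matched value are masked from the end of the value toward its beginning.
  • Characters to ignore (charactersToIgnore): One or more characters to skip when masking values. For example, you can tell Cloud DLP to leave hyphens in place when masking a telephone number. You can also specify a group of common characters (CharsToIgnore) to ignore when masking.

Suppose you send the following string to Cloud DLP and instruct it to use the character masking transformation on email addresses: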

My name is Alicia Abernathy, and my email address is aabernathy@example.com.

With the masking character set to "#", the characters to ignore set to "." and "@", and otherwise default settings, Cloud DLP returns the following:

My name is Alicia Abernathy, and my email address is ##########@#######.###.
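The masking options can be approximated locally with the following Python sketch. It operates on a single detected value and is not Cloud DLP's implementation. The function name is hypothetical, and the sketch assumes that skipped characters do not count toward the number of characters to mask.

```python
def mask_value(value, masking_character="*", number_to_mask=0,
               reverse_order=False, characters_to_skip=""):
    """Locally imitate character masking on one detected value."""
    chars = list(value)
    # Walk from the end when reverse_order is set, otherwise from the start.
    indices = range(len(chars) - 1, -1, -1) if reverse_order else range(len(chars))
    masked = 0
    for i in indices:
        # A number_to_mask of 0 means "mask everything".
        if number_to_mask and masked >= number_to_mask:
            break
        if chars[i] in characters_to_skip:
            continue  # Leave skipped characters in place and do not count them.
        chars[i] = masking_character
        masked += 1
    return "".join(chars)


print(mask_value("aabernathy@example.com", "#", characters_to_skip=".@"))
# ##########@#######.###
print(mask_value("123-45-6789", "*", number_to_mask=4,
                 reverse_order=True, characters_to_skip="-"))
# 123-45-****
```

The first call reproduces the email example above; the second shows numberToMask and reverseOrder working together on a made-up value.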

The following JSON and code examples demonstrate how the masking transformation works.

Protocol

JSON input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify?key={YOUR_API_KEY}

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is aabernathy@example.com."
  },
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "primitiveTransformation":{
            "characterMaskConfig":{
              "maskingCharacter":"#",
              "reverseOrder":false,
              "charactersToIgnore":[
                {
                  "charactersToSkip":".@"
                }
              ]
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"EMAIL_ADDRESS"
      }
    ]
  }
}

JSON output:

{
  "item":{
    "value":"My name is Alicia Abernathy, and my email address is ##########@#######.###."
  },
  "overview":{
    "transformedBytes":"22",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "characterMaskConfig":{
            "maskingCharacter":"#",
            "charactersToIgnore":[
              {
                "charactersToSkip":".@"
              }
            ]
          }
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"22"
      }
    ]
  }
}

Java

/**
 * Deidentify a string by masking sensitive information with a character using the DLP API.
 *
 * @param string The string to deidentify.
 * @param infoTypes The infoTypes to inspect for and mask.
 * @param maskingCharacter (Optional) The character to mask sensitive data with.
 * @param numberToMask (Optional) The number of characters' worth of sensitive data to mask.
 *     Omitting this value or setting it to 0 masks all sensitive chars.
 * @param projectId ID of Google Cloud project to run the API under.
 */
private static void deIdentifyWithMask(
    String string,
    List<InfoType> infoTypes,
    Character maskingCharacter,
    int numberToMask,
    String projectId) {

  // instantiate a client
  try (DlpServiceClient dlpServiceClient = DlpServiceClient.create()) {

    ContentItem contentItem = ContentItem.newBuilder().setValue(string).build();

    CharacterMaskConfig characterMaskConfig =
        CharacterMaskConfig.newBuilder()
            .setMaskingCharacter(maskingCharacter.toString())
            .setNumberToMask(numberToMask)
            .build();

    // Create the deidentification transformation configuration
    PrimitiveTransformation primitiveTransformation =
        PrimitiveTransformation.newBuilder().setCharacterMaskConfig(characterMaskConfig).build();

    InfoTypeTransformation infoTypeTransformationObject =
        InfoTypeTransformation.newBuilder()
            .setPrimitiveTransformation(primitiveTransformation)
            .build();

    InfoTypeTransformations infoTypeTransformationArray =
        InfoTypeTransformations.newBuilder()
            .addTransformations(infoTypeTransformationObject)
            .build();

    InspectConfig inspectConfig =
        InspectConfig.newBuilder()
            .addAllInfoTypes(infoTypes)
            .build();

    DeidentifyConfig deidentifyConfig =
        DeidentifyConfig.newBuilder()
            .setInfoTypeTransformations(infoTypeTransformationArray)
            .build();

    // Create the deidentification request object
    DeidentifyContentRequest request =
        DeidentifyContentRequest.newBuilder()
            .setParent(ProjectName.of(projectId).toString())
            .setInspectConfig(inspectConfig)
            .setDeidentifyConfig(deidentifyConfig)
            .setItem(contentItem)
            .build();

    // Execute the deidentification request
    DeidentifyContentResponse response = dlpServiceClient.deidentifyContent(request);

    // Print the character-masked input value
    // e.g. "My SSN is 123456789" --> "My SSN is *********"
    String result = response.getItem().getValue();
    System.out.println(result);
  } catch (Exception e) {
    System.out.println("Error in deidentifyWithMask: " + e.getMessage());
  }
}

Node.js

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const callingProjectId = process.env.GCLOUD_PROJECT;

// The string to deidentify
// const string = 'My SSN is 372819127';

// (Optional) The maximum number of sensitive characters to mask in a match
// If omitted from the request or set to 0, the API will mask any matching characters
// const numberToMask = 5;

// (Optional) The character to mask matching sensitive data with
// const maskingCharacter = 'x';

// Construct deidentification request
const item = {value: string};
const request = {
  parent: dlp.projectPath(callingProjectId),
  deidentifyConfig: {
    infoTypeTransformations: {
      transformations: [
        {
          primitiveTransformation: {
            characterMaskConfig: {
              maskingCharacter: maskingCharacter,
              numberToMask: numberToMask,
            },
          },
        },
      ],
    },
  },
  item: item,
};

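// Run the request inside an async function, because `await` is not allowed
// at the top level of a CommonJS script.
async function deidentifyWithMask() {
  try {
    // Run deidentification request
    const [response] = await dlp.deidentifyContent(request);
    const deidentifiedItem = response.item;
    console.log(deidentifiedItem.value);
  } catch (err) {
    console.log(`Error in deidentifyWithMask: ${err.message || err}`);
  }
}

deidentifyWithMask();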

Python

def deidentify_with_mask(project, string, info_types, masking_character=None,
                         number_to_mask=0):
    """Uses the Data Loss Prevention API to deidentify sensitive data in a
    string by masking it with a character.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        string: The string to deidentify (will be treated as text).
        info_types: A list of infoType names (strings) to inspect for and mask.
        masking_character: The character to mask matching sensitive data with.
        number_to_mask: The maximum number of sensitive characters to mask in
            a match. If omitted or set to zero, the API will default to no
            maximum.
    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Import the client library
    import google.cloud.dlp

    # Instantiate a client
    dlp = google.cloud.dlp.DlpServiceClient()

    # Convert the project id into a full resource id.
    parent = dlp.project_path(project)

    # Construct inspect configuration dictionary
    inspect_config = {
        'info_types': [{'name': info_type} for info_type in info_types]
    }

    # Construct deidentify configuration dictionary
    deidentify_config = {
        'info_type_transformations': {
            'transformations': [
                {
                    'primitive_transformation': {
                        'character_mask_config': {
                            'masking_character': masking_character,
                            'number_to_mask': number_to_mask
                        }
                    }
                }
            ]
        }
    }

    # Construct item
    item = {'value': string}

    # Call the API
    response = dlp.deidentify_content(
        parent, inspect_config=inspect_config,
        deidentify_config=deidentify_config, item=item)

    # Print out the results.
    print(response.item.value)

Go

// mask deidentifies the input by masking all provided info types with maskingCharacter
// and prints the result to w.
func mask(w io.Writer, client *dlp.Client, project, input string, infoTypes []string, maskingCharacter string, numberToMask int32) {
	// Convert the info type strings to a list of InfoTypes.
	var i []*dlppb.InfoType
	for _, it := range infoTypes {
		i = append(i, &dlppb.InfoType{Name: it})
	}
	// Create a configured request.
	req := &dlppb.DeidentifyContentRequest{
		Parent: "projects/" + project,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: i,
		},
		DeidentifyConfig: &dlppb.DeidentifyConfig{
			Transformation: &dlppb.DeidentifyConfig_InfoTypeTransformations{
				InfoTypeTransformations: &dlppb.InfoTypeTransformations{
					Transformations: []*dlppb.InfoTypeTransformations_InfoTypeTransformation{
						{
							InfoTypes: []*dlppb.InfoType{}, // Match all info types.
							PrimitiveTransformation: &dlppb.PrimitiveTransformation{
								Transformation: &dlppb.PrimitiveTransformation_CharacterMaskConfig{
									CharacterMaskConfig: &dlppb.CharacterMaskConfig{
										MaskingCharacter: maskingCharacter,
										NumberToMask:     numberToMask,
									},
								},
							},
						},
					},
				},
			},
		},
		// The item to analyze.
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_Value{
				Value: input,
			},
		},
	}
	// Send the request.
	r, err := client.DeidentifyContent(context.Background(), req)
	if err != nil {
		log.Fatal(err)
	}
	// Print the result.
	fmt.Fprint(w, r.GetItem().GetValue())
}

PHP

use Google\Cloud\Dlp\V2\CharacterMaskConfig;
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\PrimitiveTransformation;
use Google\Cloud\Dlp\V2\DeidentifyConfig;
use Google\Cloud\Dlp\V2\InfoTypeTransformations\InfoTypeTransformation;
use Google\Cloud\Dlp\V2\InfoTypeTransformations;
use Google\Cloud\Dlp\V2\ContentItem;

/**
 * Deidentify sensitive data in a string by masking it with a character.
 * @param string $callingProjectId The GCP Project ID to run the API call under
 * @param string $string The string to deidentify
 * @param int $numberToMask (Optional) The maximum number of sensitive characters to mask in a match
 * @param string $maskingCharacter (Optional) The character to mask matching sensitive data with
 */
function deidentify_mask(
  $callingProjectId,
  $string,
  $numberToMask = 0,
  $maskingCharacter = 'x'
) {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // The infoTypes of information to mask
    $ssnInfoType = (new InfoType())
        ->setName('US_SOCIAL_SECURITY_NUMBER');
    $infoTypes = [$ssnInfoType];

    // Create the masking configuration object
    $maskConfig = (new CharacterMaskConfig())
        ->setMaskingCharacter($maskingCharacter)
        ->setNumberToMask($numberToMask);

    // Create the information transform configuration objects
    $primitiveTransformation = (new PrimitiveTransformation())
        ->setCharacterMaskConfig($maskConfig);

    $infoTypeTransformation = (new InfoTypeTransformation())
        ->setPrimitiveTransformation($primitiveTransformation)
        ->setInfoTypes($infoTypes);

    $infoTypeTransformations = (new InfoTypeTransformations())
        ->setTransformations([$infoTypeTransformation]);

    // Create the deidentification configuration object
    $deidentifyConfig = (new DeidentifyConfig())
        ->setInfoTypeTransformations($infoTypeTransformations);

    $item = (new ContentItem())
        ->setValue($string);

    $parent = $dlp->projectName($callingProjectId);

    // Run request
    $response = $dlp->deidentifyContent($parent, [
        'deidentifyConfig' => $deidentifyConfig,
        'item' => $item
    ]);

    // Print the results
    $deidentifiedValue = $response->getItem()->getValue();
    print($deidentifiedValue);
}

C#

public static object DeidMask(
    string projectId,
    string dataValue,
    IEnumerable<InfoType> infoTypes,
    string maskingCharacter,
    int numberToMask,
    bool reverseOrder)
{
    var request = new DeidentifyContentRequest
    {
        ParentAsProjectName = new ProjectName(projectId),
        InspectConfig = new InspectConfig
        {
            InfoTypes = { infoTypes }
        },
        DeidentifyConfig = new DeidentifyConfig
        {
            InfoTypeTransformations = new InfoTypeTransformations
            {
                Transformations = {
                    new InfoTypeTransformations.Types.InfoTypeTransformation
                    {
                        PrimitiveTransformation = new PrimitiveTransformation
                        {
                            CharacterMaskConfig = new CharacterMaskConfig
                            {
                                MaskingCharacter = maskingCharacter,
                                NumberToMask = numberToMask,
                                ReverseOrder = reverseOrder
                            }
                        }
                    }
                }
            }
        },
        Item = new ContentItem
        {
            Value = dataValue
        }
    };

    DlpServiceClient dlp = DlpServiceClient.Create();
    var response = dlp.DeidentifyContent(request);

    Console.WriteLine($"Deidentified content: {response.Item.Value}");
    return 0;
}

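Crypto-based tokenization

Crypto-based tokenization (also referred to as "pseudonymization") transformations replace the original sensitive data values with encrypted values. Cloud DLP supports the following types of tokenization, including transformations that can be reversed, or "re-identified":

  • Cryptographic hashing: Given a CryptoKey, Cloud DLP applies a SHA-256-based message authentication code (HMAC-SHA-256) to the input value, and then replaces the input value with the Base64-encoded hash value.
  • Format-preserving encryption: Generates a token using format-preserving encryption (FPE) with the FFX mode of operation and replaces the input value with it. The token is limited to the same alphabet as the input value and is the same length as the input value. FPE also supports re-identification given the original encryption key.
  • Deterministic encryption: Generates a token using AES in Synthetic Initialization Vector mode (AES-SIV) and replaces the input value with it. This method has no limitation on supported string character sets, generates identical tokens for each instance of an identical input value, and uses surrogates to enable re-identification given the original encryption key.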

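Cryptographic hashing

The cryptographic hashing transformation (CryptoHashConfig in the DLP API) replaces an input value (a piece of sensitive data that Cloud DLP has detected) with a hashed value. The hashed value is generated by applying a SHA-256-based message authentication code (HMAC-SHA-256) to the input value with a CryptoKey.

In its output, Cloud DLP replaces the original value with a Base64-encoded representation of the hashed input value.

Before using the cryptographic hashing transformation, keep the following in mind:

  • The input value is not encrypted, but hashed.
  • This transformation cannot be reversed. That is, given the hashed output value of the transformation and the original cryptographic key, there is no way to restore the original value.
  • Currently, only string and integer values can be hashed.
  • The hashed output of the transformation is always the same length, depending on the size of the cryptographic key. For example, if you use the cryptographic hashing transformation on 10-digit phone numbers, each phone number is replaced by a fixed-length Base64-encoded hash value.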
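The properties listed above can be observed with a small local sketch that uses Python's standard hmac module. This is not Cloud DLP's implementation, and its tokens should not be expected to match tokens that Cloud DLP produces. The key and the phone numbers are made-up values.

```python
import base64
import hashlib
import hmac


def hash_token(value, key):
    """Return the Base64-encoded HMAC-SHA-256 of value under key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


key = b"0123456789abcdef0123456789abcdef"  # hypothetical 32-byte key

first = hash_token("4155550100", key)
again = hash_token("4155550100", key)
other = hash_token("4155550199", key)

# Same input and key give the same token; the token length is fixed.
print(len(first), first == again, first == other)  # 44 True False
```

Because identical inputs always map to the same token, hashed values can still be joined or counted, even though the original values cannot be recovered from them.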

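Format-preserving encryption

The format-preserving encryption transformation method (CryptoReplaceFfxFpeConfig in the DLP API) takes an input value (a piece of sensitive data that Cloud DLP has detected), encrypts it using format-preserving encryption in FFX mode and a CryptoKey, and then replaces the original value with the encrypted value, or token.

The input value:

  • Must be at least two characters long (or the empty string).
  • Must be ASCII-encoded.
  • Must consist of characters specified by an "alphabet," which is the set of between 2 and 62 characters allowed in the input value. For more information, see the alphabet field in CryptoReplaceFfxFpeConfig.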
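The input requirements above can be checked before a value is sent for format-preserving encryption. The following Python sketch is a local pre-check only; the function name and the messages are hypothetical, and Cloud DLP performs its own validation.

```python
import string


def fpe_input_problems(value, alphabet):
    """Return a list of reasons why value is not a valid FPE input (empty if none)."""
    problems = []
    if len(value) == 1:
        problems.append("value must be empty or at least two characters long")
    if not value.isascii():
        problems.append("value must be ASCII")
    # Every character of the value has to come from the configured alphabet.
    outside = sorted(set(value) - set(alphabet))
    if outside:
        problems.append("characters not in alphabet: " + "".join(outside))
    return problems


print(fpe_input_problems("372819127", string.digits))    # []
print(fpe_input_problems("372-81-9127", string.digits))  # ['characters not in alphabet: -']
```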

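The generated token:

  • Is the encrypted input value.
  • Preserves the character set ("alphabet") and length of the input value after encryption.
  • Is computed using format-preserving encryption (FPE) in FFX mode, keyed on the specified cryptographic key.
  • Isn't necessarily unique, because each instance of the same input value de-identifies to the same token. This helps maintain referential integrity and therefore makes searching de-identified data more efficient. You can change this behavior by using context "tweaks," as described in Contexts.

If there are multiple instances of an input value in the source content, each one is de-identified to the same token. FPE preserves both length and the alphabet space (the character set), which is limited to 62 characters. This is in contrast to non-deterministic tokenization methods, which de-identify each instance of an identical input value to a different token and work with any character set. You can change this behavior by using a context "tweak," which can improve security. Adding a context tweak to the transformation enables Cloud DLP to de-identify multiple instances of the same input value to different tokens. If you don't need to preserve the length and alphabet space of the original values, use deterministic encryption, described below.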

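Cloud DLP computes the replacement token using a cryptographic key. You provide this key in one of three ways:

  1. By embedding it unencrypted in the API request.
  2. By requesting that Cloud DLP generate it.
  3. By embedding it encrypted in the API request. With this option, the key is wrapped (encrypted) by a Cloud Key Management Service (Cloud KMS) key.

To create a Cloud KMS wrapped key, send a request containing a 16-, 24-, or 32-byte plaintext field value to the Cloud KMS projects.locations.keyRings.cryptoKeys.encrypt method. The wrapped key is the value of the ciphertext field in the method's response.

By default, this value is a Base64-encoded string. To set this value in Cloud DLP, you must decode it into a byte string. The following code snippets show how to do this in several languages. End-to-end examples are provided after these snippets.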

Java

KmsWrappedCryptoKey.newBuilder()
    .setWrappedKey(ByteString.copyFrom(BaseEncoding.base64().decode(wrappedKey)))

Python

# The wrapped key is Base64-encoded, but the library expects a binary
# string, so decode it here.
import base64
wrapped_key = base64.b64decode(wrapped_key)

PHP

// Create the wrapped crypto key configuration object
$kmsWrappedCryptoKey = (new KmsWrappedCryptoKey())
    ->setWrappedKey(base64_decode($wrappedKey))
    ->setCryptoKeyName($keyName);

C#

WrappedKey = ByteString.FromBase64(wrappedKey)

To learn more about encrypting and decrypting data using Cloud KMS, see Encrypting and Decrypting Data.
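The key-wrapping round trip described above can be sketched in Python as follows. The sketch only prepares the JSON body for the Cloud KMS encrypt method and decodes a ciphertext field; it makes no network call. The helper names are hypothetical, and the response used here is a stand-in, not real Cloud KMS output.

```python
import base64
import json
import secrets


def build_kms_encrypt_body(key_size=32):
    """Create a random key and the JSON body for a Cloud KMS encrypt request."""
    if key_size not in (16, 24, 32):
        raise ValueError("key size must be 16, 24, or 32 bytes")
    raw_key = secrets.token_bytes(key_size)
    # The REST API carries bytes fields as Base64 text.
    body = {"plaintext": base64.b64encode(raw_key).decode("ascii")}
    return raw_key, body


def wrapped_key_bytes(encrypt_response):
    """Decode the Base64 ciphertext field of an encrypt response into bytes."""
    return base64.b64decode(encrypt_response["ciphertext"])


raw_key, body = build_kms_encrypt_body(32)
print(json.dumps(body))

# Stand-in response for illustration only; a real one comes from Cloud KMS.
fake_response = {
    "ciphertext": base64.b64encode(b"not-a-real-wrapped-key").decode("ascii")
}
wrapped = wrapped_key_bytes(fake_response)
print(type(wrapped).__name__)  # bytes
```

The decoded bytes are what the client library snippets above pass as the wrapped key.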

The following sample code in several languages demonstrates how to use Cloud DLP to de-identify sensitive data by replacing an input value with a token.

Java

/**
 * Deidentify a string by encrypting sensitive information while preserving format.
 *
 * @param string The string to deidentify.
 * @param infoTypes The infoTypes to inspect for and encrypt.
 * @param alphabet The set of characters to use when encrypting the input. For more information,
 *     see cloud.google.com/dlp/docs/reference/rest/v2/content/deidentify
 * @param keyName The name of the Cloud KMS key to use when decrypting the wrapped key.
 * @param wrappedKey The encrypted (or "wrapped") AES-256 encryption key.
 * @param projectId ID of Google Cloud project to run the API under.
 * @param surrogateType The name of the surrogate custom infoType, used for reidentification.
 */
private static void deIdentifyWithFpe(
    String string,
    List<InfoType> infoTypes,
    FfxCommonNativeAlphabet alphabet,
    String keyName,
    String wrappedKey,
    String projectId,
    String surrogateType) {
  // instantiate a client
  try (DlpServiceClient dlpServiceClient = DlpServiceClient.create()) {
    ContentItem contentItem = ContentItem.newBuilder().setValue(string).build();

    // Create the format-preserving encryption (FPE) configuration
    KmsWrappedCryptoKey kmsWrappedCryptoKey =
        KmsWrappedCryptoKey.newBuilder()
            .setWrappedKey(ByteString.copyFrom(BaseEncoding.base64().decode(wrappedKey)))
            .setCryptoKeyName(keyName)
            .build();

    CryptoKey cryptoKey = CryptoKey.newBuilder().setKmsWrapped(kmsWrappedCryptoKey).build();

    CryptoReplaceFfxFpeConfig cryptoReplaceFfxFpeConfig =
        CryptoReplaceFfxFpeConfig.newBuilder()
            .setCryptoKey(cryptoKey)
            .setCommonAlphabet(alphabet)
            .setSurrogateInfoType(InfoType.newBuilder().setName(surrogateType).build())
            .build();

    // Create the deidentification transformation configuration
    PrimitiveTransformation primitiveTransformation =
        PrimitiveTransformation.newBuilder()
            .setCryptoReplaceFfxFpeConfig(cryptoReplaceFfxFpeConfig)
            .build();

    InfoTypeTransformation infoTypeTransformationObject =
        InfoTypeTransformation.newBuilder()
            .setPrimitiveTransformation(primitiveTransformation)
            .build();

    InfoTypeTransformations infoTypeTransformationArray =
        InfoTypeTransformations.newBuilder()
            .addTransformations(infoTypeTransformationObject)
            .build();

    InspectConfig inspectConfig =
        InspectConfig.newBuilder()
            .addAllInfoTypes(infoTypes)
            .build();

    // Create the deidentification request object
    DeidentifyConfig deidentifyConfig =
        DeidentifyConfig.newBuilder()
            .setInfoTypeTransformations(infoTypeTransformationArray)
            .build();

    DeidentifyContentRequest request =
        DeidentifyContentRequest.newBuilder()
            .setParent(ProjectName.of(projectId).toString())
            .setInspectConfig(inspectConfig)
            .setDeidentifyConfig(deidentifyConfig)
            .setItem(contentItem)
            .build();

    // Execute the deidentification request
    DeidentifyContentResponse response = dlpServiceClient.deidentifyContent(request);

    // Print the deidentified input value. FPE keeps the length of the original value,
    // e.g. "My SSN is 123456789" --> "My SSN is 726129862" (illustrative token only)
    String result = response.getItem().getValue();
    System.out.println(result);
  } catch (Exception e) {
    System.out.println("Error in deidentifyWithFpe: " + e.getMessage());
  }
}

Node.js

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const callingProjectId = process.env.GCLOUD_PROJECT;

// The string to deidentify
// const string = 'My SSN is 372819127';

// The set of characters to replace sensitive ones with
// For more information, see https://cloud.google.com/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#ffxcommonnativealphabet
// const alphabet = 'ALPHA_NUMERIC';

// The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
// const keyName = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME';

// The encrypted ('wrapped') AES-256 key to use
// This key should be encrypted using the Cloud KMS key specified above
// const wrappedKey = 'YOUR_ENCRYPTED_AES_256_KEY'

// (Optional) The name of the surrogate custom info type to use
// Only necessary if you want to reverse the deidentification process
// Can be essentially any arbitrary string, as long as it doesn't appear
// in your dataset otherwise.
// const surrogateType = 'SOME_INFO_TYPE_DEID';

// Construct FPE config
const cryptoReplaceFfxFpeConfig = {
  cryptoKey: {
    kmsWrapped: {
      wrappedKey: wrappedKey,
      cryptoKeyName: keyName,
    },
  },
  commonAlphabet: alphabet,
};
if (surrogateType) {
  cryptoReplaceFfxFpeConfig.surrogateInfoType = {
    name: surrogateType,
  };
}

// Construct deidentification request
const item = {value: string};
const request = {
  parent: dlp.projectPath(callingProjectId),
  deidentifyConfig: {
    infoTypeTransformations: {
      transformations: [
        {
          primitiveTransformation: {
            cryptoReplaceFfxFpeConfig: cryptoReplaceFfxFpeConfig,
          },
        },
      ],
    },
  },
  item: item,
};

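// Run the request inside an async function, because `await` is not allowed
// at the top level of a CommonJS script.
async function deidentifyWithFpe() {
  try {
    // Run deidentification request
    const [response] = await dlp.deidentifyContent(request);
    const deidentifiedItem = response.item;
    console.log(deidentifiedItem.value);
  } catch (err) {
    console.log(`Error in deidentifyWithFpe: ${err.message || err}`);
  }
}

deidentifyWithFpe();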

Python

def deidentify_with_fpe(project, string, info_types, alphabet=None,
                        surrogate_type=None, key_name=None, wrapped_key=None):
    """Uses the Data Loss Prevention API to deidentify sensitive data in a
    string using Format Preserving Encryption (FPE).
    Args:
        project: The Google Cloud project id to use as a parent resource.
        string: The string to deidentify (will be treated as text).
        info_types: A list of infoType names (strings) to inspect for and encrypt.
        alphabet: The set of characters to replace sensitive ones with. For
            more information, see https://cloud.google.com/dlp/docs/reference/
            rest/v2beta2/organizations.deidentifyTemplates#ffxcommonnativealphabet
        surrogate_type: The name of the surrogate custom info type to use. Only
            necessary if you want to reverse the deidentification process. Can
            be essentially any arbitrary string, as long as it doesn't appear
            in your dataset otherwise.
        key_name: The name of the Cloud KMS key used to encrypt ('wrap') the
            AES-256 key. Example:
            key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/
            keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'
        wrapped_key: The encrypted ('wrapped') AES-256 key to use. This key
            should be encrypted using the Cloud KMS key specified by key_name.
    Returns:
        None; the response from the API is printed to the terminal.
    """
    # Import the client library
    import google.cloud.dlp

    # Instantiate a client
    dlp = google.cloud.dlp.DlpServiceClient()

    # Convert the project id into a full resource id.
    parent = dlp.project_path(project)

    # The wrapped key is base64-encoded, but the library expects a binary
    # string, so decode it here.
    import base64
    wrapped_key = base64.b64decode(wrapped_key)

    # Construct FPE configuration dictionary
    crypto_replace_ffx_fpe_config = {
        'crypto_key': {
            'kms_wrapped': {
                'wrapped_key': wrapped_key,
                'crypto_key_name': key_name
            }
        },
        'common_alphabet': alphabet
    }

    # Add surrogate type
    if surrogate_type:
        crypto_replace_ffx_fpe_config['surrogate_info_type'] = {
            'name': surrogate_type
        }

    # Construct inspect configuration dictionary
    inspect_config = {
        'info_types': [{'name': info_type} for info_type in info_types]
    }

    # Construct deidentify configuration dictionary
    deidentify_config = {
        'info_type_transformations': {
            'transformations': [
                {
                    'primitive_transformation': {
                        'crypto_replace_ffx_fpe_config':
                            crypto_replace_ffx_fpe_config
                    }
                }
            ]
        }
    }

    # Convert string to item
    item = {'value': string}

    # Call the API
    response = dlp.deidentify_content(
        parent, inspect_config=inspect_config,
        deidentify_config=deidentify_config, item=item)

    # Print results
    print(response.item.value)

Go

// deidentifyFPE deidentifies the input with FPE (Format Preserving Encryption).
// keyFileName is the file name with the KMS wrapped key and cryptoKeyName is the
// full KMS key resource name used to wrap the key. surrogateInfoType is an
// optional identifier needed for reidentification. surrogateInfoType can be any
// value not found in your input.
func deidentifyFPE(w io.Writer, client *dlp.Client, project, input string, infoTypes []string, keyFileName, cryptoKeyName, surrogateInfoType string) {
	// Convert the info type strings to a list of InfoTypes.
	var i []*dlppb.InfoType
	for _, it := range infoTypes {
		i = append(i, &dlppb.InfoType{Name: it})
	}
	// Read the key file.
	keyBytes, err := ioutil.ReadFile(keyFileName)
	if err != nil {
		log.Fatalf("error reading file: %v", err)
	}
	// Create a configured request.
	req := &dlppb.DeidentifyContentRequest{
		Parent: "projects/" + project,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: i,
		},
		DeidentifyConfig: &dlppb.DeidentifyConfig{
			Transformation: &dlppb.DeidentifyConfig_InfoTypeTransformations{
				InfoTypeTransformations: &dlppb.InfoTypeTransformations{
					Transformations: []*dlppb.InfoTypeTransformations_InfoTypeTransformation{
						{
							InfoTypes: []*dlppb.InfoType{}, // Match all info types.
							PrimitiveTransformation: &dlppb.PrimitiveTransformation{
								Transformation: &dlppb.PrimitiveTransformation_CryptoReplaceFfxFpeConfig{
									CryptoReplaceFfxFpeConfig: &dlppb.CryptoReplaceFfxFpeConfig{
										CryptoKey: &dlppb.CryptoKey{
											Source: &dlppb.CryptoKey_KmsWrapped{
												KmsWrapped: &dlppb.KmsWrappedCryptoKey{
													WrappedKey:    keyBytes,
													CryptoKeyName: cryptoKeyName,
												},
											},
										},
										// Set the alphabet used for the output.
										Alphabet: &dlppb.CryptoReplaceFfxFpeConfig_CommonAlphabet{
											CommonAlphabet: dlppb.CryptoReplaceFfxFpeConfig_ALPHA_NUMERIC,
										},
										// Set the surrogate info type, used for reidentification.
										SurrogateInfoType: &dlppb.InfoType{
											Name: surrogateInfoType,
										},
									},
								},
							},
						},
					},
				},
			},
		},
		// The item to analyze.
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_Value{
				Value: input,
			},
		},
	}
	// Send the request.
	r, err := client.DeidentifyContent(context.Background(), req)
	if err != nil {
		log.Fatal(err)
	}
	// Print the result.
	fmt.Fprint(w, r.GetItem().GetValue())
}

PHP

use Google\Cloud\Dlp\V2\CryptoReplaceFfxFpeConfig;
use Google\Cloud\Dlp\V2\CryptoReplaceFfxFpeConfig\FfxCommonNativeAlphabet;
use Google\Cloud\Dlp\V2\CryptoKey;
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\PrimitiveTransformation;
use Google\Cloud\Dlp\V2\KmsWrappedCryptoKey;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\DeidentifyConfig;
use Google\Cloud\Dlp\V2\InfoTypeTransformations\InfoTypeTransformation;
use Google\Cloud\Dlp\V2\InfoTypeTransformations;
use Google\Cloud\Dlp\V2\ContentItem;

/**
 * Deidentify a string using Format-Preserving Encryption (FPE).
 *
 * @param string $callingProjectId The GCP Project ID to run the API call under
 * @param string $string The string to deidentify
 * @param string $keyName The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
 * @param string $wrappedKey The AES-256 key to use, encrypted ('wrapped') with the KMS key
 *        defined by $keyName.
 * @param string $surrogateTypeName Optional surrogate custom info type to enable
 *        reidentification. Can be essentially any arbitrary string that doesn't
 *        appear in your dataset.
 */
function deidentify_fpe(
    $callingProjectId,
    $string,
    $keyName,
    $wrappedKey,
    $surrogateTypeName = ''
) {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // The infoTypes of information to mask
    $ssnInfoType = (new InfoType())
        ->setName('US_SOCIAL_SECURITY_NUMBER');
    $infoTypes = [$ssnInfoType];

    // Create the wrapped crypto key configuration object
    $kmsWrappedCryptoKey = (new KmsWrappedCryptoKey())
        ->setWrappedKey(base64_decode($wrappedKey))
        ->setCryptoKeyName($keyName);

    // The set of characters to replace sensitive ones with
    // For more information, see https://cloud.google.com/dlp/docs/reference/rest/V2/organizations.deidentifyTemplates#ffxcommonnativealphabet
    $commonAlphabet = FfxCommonNativeAlphabet::NUMERIC;

    // Create the crypto key configuration object
    $cryptoKey = (new CryptoKey())
        ->setKmsWrapped($kmsWrappedCryptoKey);

    // Create the crypto FFX FPE configuration object
    $cryptoReplaceFfxFpeConfig = (new CryptoReplaceFfxFpeConfig())
        ->setCryptoKey($cryptoKey)
        ->setCommonAlphabet($commonAlphabet);
    if ($surrogateTypeName) {
        $surrogateType = (new InfoType())
            ->setName($surrogateTypeName);
        $cryptoReplaceFfxFpeConfig->setSurrogateInfoType($surrogateType);
    }

    // Create the information transform configuration objects
    $primitiveTransformation = (new PrimitiveTransformation())
        ->setCryptoReplaceFfxFpeConfig($cryptoReplaceFfxFpeConfig);

    $infoTypeTransformation = (new InfoTypeTransformation())
        ->setPrimitiveTransformation($primitiveTransformation)
        ->setInfoTypes($infoTypes);

    $infoTypeTransformations = (new InfoTypeTransformations())
        ->setTransformations([$infoTypeTransformation]);

    // Create the deidentification configuration object
    $deidentifyConfig = (new DeidentifyConfig())
        ->setInfoTypeTransformations($infoTypeTransformations);

    $content = (new ContentItem())
        ->setValue($string);

    $parent = $dlp->projectName($callingProjectId);

    // Run request
    $response = $dlp->deidentifyContent($parent, [
        'deidentifyConfig' => $deidentifyConfig,
        'item' => $content
    ]);

    // Print the results
    $deidentifiedValue = $response->getItem()->getValue();
    print($deidentifiedValue);
}

C#

public static object DeidFpe(
    string projectId,
    string dataValue,
    IEnumerable<InfoType> infoTypes,
    string keyName,
    string wrappedKey,
    string alphabet)
{
    var deidentifyConfig = new DeidentifyConfig
    {
        InfoTypeTransformations = new InfoTypeTransformations
        {
            Transformations =
            {
                new InfoTypeTransformations.Types.InfoTypeTransformation
                {
                    PrimitiveTransformation = new PrimitiveTransformation
                    {
                        CryptoReplaceFfxFpeConfig = new CryptoReplaceFfxFpeConfig
                        {
                            CommonAlphabet = (FfxCommonNativeAlphabet) Enum.Parse(typeof(FfxCommonNativeAlphabet), alphabet),
                            CryptoKey = new CryptoKey
                            {
                                KmsWrapped = new KmsWrappedCryptoKey
                                {
                                    CryptoKeyName = keyName,
                                    WrappedKey = ByteString.FromBase64(wrappedKey)
                                }
                            },
                            SurrogateInfoType = new InfoType
                            {
                                Name = "TOKEN"
                            }
                        }
                    }
                }
            }
        }
    };

    DlpServiceClient dlp = DlpServiceClient.Create();
    var response = dlp.DeidentifyContent(
        new DeidentifyContentRequest
        {
            ParentAsProjectName = new ProjectName(projectId),
            InspectConfig = new InspectConfig
            {
                InfoTypes = { infoTypes }
            },
            DeidentifyConfig = deidentifyConfig,
            Item = new ContentItem { Value = dataValue }
        });

    Console.WriteLine($"Deidentified content: {response.Item.Value}");
    return 0;
}

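Deterministic encryption

The deterministic encryption method (CryptoDeterministicConfig in the DLP API) uses AES-SIV with a CryptoKey to encrypt an input value (a piece of sensitive data that Cloud DLP has detected), and then replaces the original value with the Base64-encoded representation of the encrypted value.

Because identical input values always produce identical tokens, the deterministic encryption transformation makes encrypted data more efficient to search.

Input values have the following characteristics:

  • Must be at least 1 character long.
  • Have no character set limitations.

Generated tokens have the following characteristics:

  • Are a Base64-encoded representation of the encrypted value.
  • Do not preserve the character set ("alphabet") or length of the input value after encryption.
  • Are computed using AES encryption in SIV mode (AES-SIV) with a CryptoKey.
  • Are not necessarily unique, because every instance of the same input value is de-identified to the same token. This makes encrypted data more efficient to search. You can change this behavior by using a context "tweak", as described in the section on contexts.
  • Are generated with a prefix of the form [SURROGATE_TYPE]([LENGTH]):, where [SURROGATE_TYPE] is the surrogate infoType that describes the input value, and [LENGTH] is the character length of the token that follows. The surrogate allows the token to be re-identified with the original encryption key that was used for de-identification.

The following is an example JSON configuration for de-identification using deterministic encryption. Because the example de-identifies phone numbers, it uses "PHONE_SURROGATE" as the descriptive surrogate type. [CRYPTO_KEY] represents an unwrapped cryptographic key obtained from Cloud KMS. For more information about obtaining a CryptoKey, see the description of keys in the preceding section on format-preserving encryption.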

{
  "deidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"PHONE_NUMBER"
            }
          ],
          "primitiveTransformation":{
            "cryptoDeterministicConfig":{
              "cryptoKey":{
                "unwrapped":{
                  "key":"[CRYPTO_KEY]"
                }
              },
              "surrogateInfoType":{
                "name":"PHONE_SURROGATE"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"PHONE_NUMBER"
      }
    ]
  },
  "item":{
    "value":"My phone number is 206-555-0574, call me"
  }
}

De-identifying the string "My phone number is 206-555-0574, call me" with this transformation produces a de-identified string like the following:

My phone number is PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me
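
The prefix format described above can be located mechanically in de-identified text. The following Python sketch is illustrative only and is not part of the DLP API: the helper name find_surrogate_tokens and its regular expression are assumptions, and it relies on [LENGTH] counting the characters of the token that follows the colon, which matches the 36-character token in the sample output.

```python
import re

# Hypothetical helper, not part of the DLP API. Assumes the prefix has the
# form SURROGATE_TYPE(LENGTH): and that LENGTH counts the token characters.
_PREFIX = re.compile(r"([A-Za-z0-9_]+)\((\d+)\):")


def find_surrogate_tokens(text, surrogate_type):
    """Return every token in `text` that carries the given surrogate prefix."""
    tokens = []
    for match in _PREFIX.finditer(text):
        name, length = match.group(1), int(match.group(2))
        if name != surrogate_type:
            continue
        start = match.end()
        token = text[start:start + length]
        if len(token) == length:  # ignore tokens cut off by the end of the text
            tokens.append(token)
    return tokens


sample = ("My phone number is "
          "PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me")
print(find_surrogate_tokens(sample, "PHONE_SURROGATE"))
```

Reading the token by its declared length, rather than splitting on whitespace or punctuation, avoids guessing where the Base64 text ends.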

To re-identify this string, you can use a JSON request like the following, where [CRYPTO_KEY] is the same cryptographic key that was used to de-identify the content.

{
  "reidentifyConfig":{
    "infoTypeTransformations":{
      "transformations":[
        {
          "infoTypes":[
            {
              "name":"PHONE_SURROGATE"
            }
          ],
          "primitiveTransformation":{
            "cryptoDeterministicConfig":{
              "cryptoKey":{
                "unwrapped":{
                  "key":"[CRYPTO_KEY]"
                }
              },
              "surrogateInfoType":{
                "name":"PHONE_SURROGATE"
              }
            }
          }
        }
      ]
    }
  },
  "inspectConfig":{
    "customInfoTypes":[
      {
        "infoType":{
          "name":"PHONE_SURROGATE"
        },
        "surrogateType":{

        }
      }
    ]
  },
  "item":{
    "value":"My phone number is PHONE_SURROGATE(36):ATZBu5OCCSwo+e94xSYnKYljk1OQpkW7qhzx, call me"
  }
}

Re-identifying this string produces the original string:

My phone number is 206-555-0574, call me

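Bucketing

Bucketing transformations de-identify numerical data by "bucketing" it into ranges. The resulting number range is a hyphenated string consisting of a lower bound, a hyphen, and an upper bound.

Fixed-size bucketing

Cloud DLP can bucket numerical input values based on fixed-size ranges (FixedSizeBucketingConfig in the DLP API). To configure fixed-size bucketing, you specify the following:

  • The lower bound value of all the buckets. Any values less than this lower bound are grouped together into a single bucket.
  • The upper bound value of all the buckets. Any values greater than this upper bound are grouped together into a single bucket.
  • The size of each bucket other than the minimum and maximum buckets.

For example, if the lower bound is set to 10, the upper bound is set to 89, and the bucket size is set to 10, then the following buckets are used: less than 10, 10 to 20, 20 to 30, 30 to 40, 40 to 50, 50 to 60, 60 to 70, 70 to 80, 80 to 89, and greater than 89.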
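
As a rough illustration of how those buckets are laid out, the following Python sketch maps a value to its bucket. It is not the DLP implementation: the function name, the "<10" and ">89" labels for the two open-ended buckets, and the choice to place a value that sits exactly on a boundary in the bucket that starts there are all assumptions made for this example.

```python
def fixed_size_bucket(value, lower_bound, upper_bound, bucket_size):
    """Return a label for the fixed-size bucket that contains `value`.

    Hypothetical helper: the labels and boundary handling are assumptions.
    """
    if value < lower_bound:
        return "<{}".format(lower_bound)
    if value > upper_bound:
        return ">{}".format(upper_bound)
    # Index of the bucket counted from the lower bound.
    index = int((value - lower_bound) // bucket_size)
    low = lower_bound + index * bucket_size
    if low >= upper_bound:
        # value equals the upper bound on a bucket edge: keep it in the
        # last regular bucket instead of creating an empty one.
        low -= bucket_size
    # The last regular bucket is clipped at the upper bound (80-89 here).
    high = min(low + bucket_size, upper_bound)
    return "{}-{}".format(low, high)


for sample_value in (5, 10, 25, 85, 89, 95):
    print(sample_value, fixed_size_bucket(sample_value, 10, 89, 10))
```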

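For more information about the concept of bucketing, see Generalization and bucketing.

Customizable bucketing

Customizable bucketing (BucketingConfig in the DLP API) offers more flexibility than fixed-size bucketing. Instead of specifying upper and lower bounds and an interval value with which to create equal-sized buckets, you specify the maximum and minimum values for each bucket you want to create. Each pair of maximum and minimum values must be of the same type.

To set up customizable bucketing, you specify individual buckets. Each bucket has the following properties:

  • The lower bound of the bucket's range. Omit this value to create a bucket that has no lower bound.
  • The upper bound of the bucket's range. Omit this value to create a bucket that has no upper bound.
  • The replacement value for this bucket range. Any detected value that falls between the lower and upper bounds is replaced with this value. If you don't provide a replacement value, a hyphenated min-max range is generated instead.

Consider the following JSON configuration for a bucketing transformation: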

"bucketingConfig":{
  "buckets":[
    {
      "min":{
        "integerValue":"1"
      },
      "max":{
        "integerValue":"30"
      },
      "replacementValue":{
        "stringValue":"LOW"
      }
    },
    {
      "min":{
        "integerValue":"31"
      },
      "max":{
        "integerValue":"65"
      },
      "replacementValue":{
        "stringValue":"MEDIUM"
      }
    },
    {
      "min":{
        "integerValue":"66"
      },
      "max":{
        "integerValue":"100"
      },
      "replacementValue":{
        "stringValue":"HIGH"
      }
    }
  ]
}

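This defines the following behavior:

  • Integer values between 1 and 30 are masked by being replaced with LOW.
  • Integer values between 31 and 65 are masked by being replaced with MEDIUM.
  • Integer values between 66 and 100 are masked by being replaced with HIGH.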
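
The same behavior can be sketched locally in Python. This is an illustration and not the DLP implementation: the apply_buckets helper and the dictionary layout are hypothetical, both bounds are assumed to be inclusive, and values that match no bucket are assumed to pass through unchanged.

```python
# Illustrative only: mirrors the three buckets in the JSON configuration.
BUCKETS = [
    {"min": 1, "max": 30, "replacement": "LOW"},
    {"min": 31, "max": 65, "replacement": "MEDIUM"},
    {"min": 66, "max": 100, "replacement": "HIGH"},
]


def apply_buckets(value, buckets):
    """Return the replacement for the first bucket that contains `value`."""
    for bucket in buckets:
        low = bucket.get("min")    # None means no lower bound
        high = bucket.get("max")   # None means no upper bound
        if low is not None and value < low:
            continue
        if high is not None and value > high:
            continue
        replacement = bucket.get("replacement")
        if replacement is not None:
            return replacement
        # No replacement value: fall back to a hyphenated min-max range.
        return "{}-{}".format("" if low is None else low,
                              "" if high is None else high)
    return value  # assumption: unmatched values are left as they are


for sample_value in (15, 31, 100, 150):
    print(sample_value, apply_buckets(sample_value, BUCKETS))
```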

For more information about the concept of bucketing, see Generalization and bucketing.

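Date shifting

When you use the date shifting transformation (DateShiftConfig in the DLP API) on a date input value, Cloud DLP shifts the date by a random number of days.

Date shifting moves a set of dates by a random amount while preserving the order of the dates and the length of time between them. Dates are usually shifted per individual or per entity. That is, you shift all of the dates for a specific individual using the same shift differential, but use a different shift differential for every other individual.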
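
The idea of one consistent shift per individual can be sketched in Python as follows. This is not how Cloud DLP derives its shift: the helper names, the use of HMAC-SHA256 over a context value, and the convention that a negative bound means a shift to an earlier date are all assumptions made for this illustration.

```python
import hashlib
import hmac
from datetime import date, timedelta


def shift_days(context, key, lower_bound_days, upper_bound_days):
    """Derive a repeatable shift, in days, for one context value.

    Hypothetical scheme: the same key and context always give the same shift.
    """
    digest = hmac.new(key, context.encode("utf-8"), hashlib.sha256).digest()
    span = upper_bound_days - lower_bound_days + 1
    return lower_bound_days + int.from_bytes(digest[:8], "big") % span


def shift_dates(dates, context, key, lower_bound_days, upper_bound_days):
    """Shift every date for one individual by the same number of days."""
    delta = timedelta(
        days=shift_days(context, key, lower_bound_days, upper_bound_days))
    return [d + delta for d in dates]


key = b"example-key-not-a-real-secret"
visits = [date(2019, 3, 1), date(2019, 3, 15)]
shifted = shift_dates(visits, "user-42", key, -30, 30)
# The gap between the two visits is the same before and after shifting.
print(shifted[1] - shifted[0] == visits[1] - visits[0])
```

Because every date for a given context moves by the same offset, both the order of the dates and the intervals between them are preserved, while different context values generally receive different offsets.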

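For more information about date shifting, see Date shifting.

The following sample code in several languages demonstrates how to use the Cloud DLP API to de-identify dates using date shifting.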

Java

/**
 * @param inputCsvPath The path to the CSV file to deidentify
 * @param outputCsvPath (Optional) path to the output CSV file
 * @param dateFields The list of (date) fields in the CSV file to date shift
 * @param lowerBoundDays The maximum number of days to shift a date backward
 * @param upperBoundDays The maximum number of days to shift a date forward
 * @param contextFieldId (Optional) The column to determine date shift, default : a random shift
 *     amount
 * @param wrappedKey (Optional) The encrypted ('wrapped') AES-256 key to use when shifting dates
 * @param keyName (Optional) The name of the Cloud KMS key used to encrypt ('wrap') the AES-256
 *     key
 * @param projectId ID of Google Cloud project to run the API under.
 */
private static void deidentifyWithDateShift(
    Path inputCsvPath,
    Path outputCsvPath,
    String[] dateFields,
    int lowerBoundDays,
    int upperBoundDays,
    String contextFieldId,
    String wrappedKey,
    String keyName,
    String projectId)
    throws Exception {
  // instantiate a client
  try (DlpServiceClient dlpServiceClient = DlpServiceClient.create()) {

    // Set the maximum days to shift a day backward (lowerbound), forward (upperbound)
    DateShiftConfig.Builder dateShiftConfigBuilder =
        DateShiftConfig.newBuilder()
            .setLowerBoundDays(lowerBoundDays)
            .setUpperBoundDays(upperBoundDays);

    // If contextFieldId, keyName or wrappedKey is set: all three arguments must be valid
    if (contextFieldId != null && keyName != null && wrappedKey != null) {
      dateShiftConfigBuilder.setContext(FieldId.newBuilder().setName(contextFieldId).build());
      KmsWrappedCryptoKey kmsWrappedCryptoKey =
          KmsWrappedCryptoKey.newBuilder()
              .setCryptoKeyName(keyName)
              .setWrappedKey(ByteString.copyFrom(BaseEncoding.base64().decode(wrappedKey)))
              .build();
      dateShiftConfigBuilder.setCryptoKey(
          CryptoKey.newBuilder().setKmsWrapped(kmsWrappedCryptoKey).build());

    } else if (contextFieldId != null || keyName != null || wrappedKey != null) {
      throw new IllegalArgumentException(
          "You must set either ALL or NONE of {contextFieldId, keyName, wrappedKey}!");
    }

    // Read and parse the CSV file
    BufferedReader br = null;
    String line;
    List<Table.Row> rows = new ArrayList<>();
    List<FieldId> headers;

    br = new BufferedReader(new FileReader(inputCsvPath.toFile()));

    // convert csv header to FieldId
    headers =
        Arrays.stream(br.readLine().split(","))
            .map(header -> FieldId.newBuilder().setName(header).build())
            .collect(Collectors.toList());

    while ((line = br.readLine()) != null) {
      // convert csv rows to Table.Row
      rows.add(convertCsvRowToTableRow(line));
    }
    br.close();

    Table table = Table.newBuilder().addAllHeaders(headers).addAllRows(rows).build();

    List<FieldId> dateFieldIds =
        Arrays.stream(dateFields)
            .map(field -> FieldId.newBuilder().setName(field).build())
            .collect(Collectors.toList());

    DateShiftConfig dateShiftConfig = dateShiftConfigBuilder.build();

    FieldTransformation fieldTransformation =
        FieldTransformation.newBuilder()
            .addAllFields(dateFieldIds)
            .setPrimitiveTransformation(
                PrimitiveTransformation.newBuilder().setDateShiftConfig(dateShiftConfig).build())
            .build();

    DeidentifyConfig deidentifyConfig =
        DeidentifyConfig.newBuilder()
            .setRecordTransformations(
                RecordTransformations.newBuilder()
                    .addFieldTransformations(fieldTransformation)
                    .build())
            .build();

    ContentItem tableItem = ContentItem.newBuilder().setTable(table).build();

    DeidentifyContentRequest request =
        DeidentifyContentRequest.newBuilder()
            .setParent(ProjectName.of(projectId).toString())
            .setDeidentifyConfig(deidentifyConfig)
            .setItem(tableItem)
            .build();

    // Execute the deidentification request
    DeidentifyContentResponse response = dlpServiceClient.deidentifyContent(request);

    // Write out the response as a CSV file
    List<FieldId> outputHeaderFields = response.getItem().getTable().getHeadersList();
    List<Table.Row> outputRows = response.getItem().getTable().getRowsList();

    List<String> outputHeaders =
        outputHeaderFields.stream().map(FieldId::getName).collect(Collectors.toList());

    File outputFile = outputCsvPath.toFile();
    if (!outputFile.exists()) {
      outputFile.createNewFile();
    }
    BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(outputFile));

    // write out headers
    bufferedWriter.append(String.join(",", outputHeaders) + "\n");

    // write out each row
    for (Table.Row outputRow : outputRows) {
      String row =
          outputRow
              .getValuesList()
              .stream()
              .map(value -> value.getStringValue())
              .collect(Collectors.joining(","));
      bufferedWriter.append(row + "\n");
    }

    bufferedWriter.flush();
    bufferedWriter.close();

    System.out.println("Successfully saved date-shift output to: " + outputCsvPath.getFileName());
  } catch (Exception e) {
    System.out.println("Error in deidentifyWithDateShift: " + e.getMessage());
  }
}

// Parse string to valid date, return null when invalid
private static LocalDate getValidDate(String dateString) {
  try {
    return LocalDate.parse(dateString);
  } catch (DateTimeParseException e) {
    return null;
  }
}

// convert CSV row into Table.Row
private static Table.Row convertCsvRowToTableRow(String row) {
  String[] values = row.split(",");
  Table.Row.Builder tableRowBuilder = Table.Row.newBuilder();
  for (String value : values) {
    LocalDate date = getValidDate(value);
    if (date != null) {
      // convert to com.google.type.Date
      Date dateValue =
          Date.newBuilder()
              .setYear(date.getYear())
              .setMonth(date.getMonthValue())
              .setDay(date.getDayOfMonth())
              .build();
      Value tableValue = Value.newBuilder().setDateValue(dateValue).build();
      tableRowBuilder.addValues(tableValue);
    } else {
      tableRowBuilder.addValues(Value.newBuilder().setStringValue(value).build());
    }
  }
  return tableRowBuilder.build();
}

Node.js

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// Import other required libraries
const fs = require('fs');

// The project ID to run the API call under
// const callingProjectId = process.env.GCLOUD_PROJECT;

// The path to the CSV file to deidentify
// The first row of the file must specify column names, and all other rows
// must contain valid values
// const inputCsvFile = '/path/to/input/file.csv';

// The path to save the date-shifted CSV file to
// const outputCsvFile = '/path/to/output/file.csv';

// The list of (date) fields in the CSV file to date shift
// const dateFields = [{ name: 'birth_date'}, { name: 'register_date' }];

// The maximum number of days to shift a date backward
// const lowerBoundDays = 1;

// The maximum number of days to shift a date forward
// const upperBoundDays = 1;

// (Optional) The column to determine date shift amount based on
// If this is not specified, a random shift amount will be used for every row
// If this is specified, then 'wrappedKey' and 'keyName' must also be set
// const contextFieldId = 'user_id';

// (Optional) The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
// If this is specified, then 'wrappedKey' and 'contextFieldId' must also be set
// const keyName = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME';

// (Optional) The encrypted ('wrapped') AES-256 key to use when shifting dates
// This key should be encrypted using the Cloud KMS key specified above
// If this is specified, then 'keyName' and 'contextFieldId' must also be set
// const wrappedKey = 'YOUR_ENCRYPTED_AES_256_KEY'

// Helper function for converting CSV rows to Protobuf types
const rowToProto = row => {
  const values = row.split(',');
  const convertedValues = values.map(value => {
    if (Date.parse(value)) {
      const date = new Date(value);
      return {
        dateValue: {
          year: date.getFullYear(),
          month: date.getMonth() + 1,
          day: date.getDate(),
        },
      };
    } else {
      // Convert all non-date values to strings
      return {stringValue: value.toString()};
    }
  });
  return {values: convertedValues};
};

// Read and parse a CSV file
const csvLines = fs
  .readFileSync(inputCsvFile)
  .toString()
  .split('\n')
  .filter(line => line.includes(','));
const csvHeaders = csvLines[0].split(',');
const csvRows = csvLines.slice(1);

// Construct the table object
const tableItem = {
  table: {
    headers: csvHeaders.map(header => {
      return {name: header};
    }),
    rows: csvRows.map(row => rowToProto(row)),
  },
};

// Construct DateShiftConfig
const dateShiftConfig = {
  lowerBoundDays: lowerBoundDays,
  upperBoundDays: upperBoundDays,
};

if (contextFieldId && keyName && wrappedKey) {
  dateShiftConfig.context = {name: contextFieldId};
  dateShiftConfig.cryptoKey = {
    kmsWrapped: {
      wrappedKey: wrappedKey,
      cryptoKeyName: keyName,
    },
  };
} else if (contextFieldId || keyName || wrappedKey) {
  throw new Error(
    'You must set either ALL or NONE of {contextFieldId, keyName, wrappedKey}!'
  );
}

// Construct deidentification request
const request = {
  parent: dlp.projectPath(callingProjectId),
  deidentifyConfig: {
    recordTransformations: {
      fieldTransformations: [
        {
          fields: dateFields,
          primitiveTransformation: {
            dateShiftConfig: dateShiftConfig,
          },
        },
      ],
    },
  },
  item: tableItem,
};

try {
  // Run deidentification request
  const [response] = await dlp.deidentifyContent(request);
  const tableRows = response.item.table.rows;

  // Write results to a CSV file
  tableRows.forEach((row, rowIndex) => {
    const rowValues = row.values.map(
      value =>
        value.stringValue ||
        `${value.dateValue.month}/${value.dateValue.day}/${
          value.dateValue.year
        }`
    );
    csvLines[rowIndex + 1] = rowValues.join(',');
  });
  csvLines.push('');
  fs.writeFileSync(outputCsvFile, csvLines.join('\n'));

  // Print status
  console.log(`Successfully saved date-shift output to ${outputCsvFile}`);
} catch (err) {
  console.log(`Error in deidentifyWithDateShift: ${err.message || err}`);
}

Python

def deidentify_with_date_shift(project, input_csv_file=None,
                               output_csv_file=None, date_fields=None,
                               lower_bound_days=None, upper_bound_days=None,
                               context_field_id=None, wrapped_key=None,
                               key_name=None):
    """Uses the Data Loss Prevention API to deidentify dates in a CSV file by
        pseudorandomly shifting them.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        input_csv_file: The path to the CSV file to deidentify. The first row
            of the file must specify column names, and all other rows must
            contain valid values.
        output_csv_file: The path to save the date-shifted CSV file.
        date_fields: The list of (date) fields in the CSV file to date shift.
            Example: ['birth_date', 'register_date']
        lower_bound_days: The maximum number of days to shift a date backward
        upper_bound_days: The maximum number of days to shift a date forward
        context_field_id: (Optional) The column to determine date shift amount
            based on. If this is not specified, a random shift amount will be
            used for every row. If this is specified, then 'wrapped_key' and
            'key_name' must also be set. Example:
            context_field_id = 'user_id'
        key_name: (Optional) The name of the Cloud KMS key used to encrypt
            ('wrap') the AES-256 key. Example:
            key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/
            keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'
        wrapped_key: (Optional) The encrypted ('wrapped') AES-256 key to use.
            This key should be encrypted using the Cloud KMS key specified by
            key_name.
    Returns:
        None; the response from the API is printed to the terminal.
    """
    # Import the client library
    import google.cloud.dlp

    # Instantiate a client
    dlp = google.cloud.dlp.DlpServiceClient()

    # Convert the project id into a full resource id.
    parent = dlp.project_path(project)

    # Convert date field list to Protobuf type
    def map_fields(field):
        return {'name': field}

    if date_fields:
        date_fields = map(map_fields, date_fields)
    else:
        date_fields = []

    # Read and parse the CSV file
    import csv
    from datetime import datetime
    f = []
    with open(input_csv_file, 'r') as csvfile:
        reader = csv.reader(csvfile)
        for row in reader:
            f.append(row)

    #  Helper function for converting CSV rows to Protobuf types
    def map_headers(header):
        return {'name': header}

    def map_data(value):
        try:
            date = datetime.strptime(value, '%m/%d/%Y')
            return {
                'date_value': {
                    'year': date.year,
                    'month': date.month,
                    'day': date.day
                }
            }
        except ValueError:
            return {'string_value': value}

    def map_rows(row):
        return {'values': map(map_data, row)}

    # Using the helper functions, convert CSV rows to protobuf-compatible
    # dictionaries.
    csv_headers = map(map_headers, f[0])
    csv_rows = map(map_rows, f[1:])

    # Construct the table dict
    table_item = {
        'table': {
            'headers': csv_headers,
            'rows': csv_rows
        }
    }

    # Construct date shift config
    date_shift_config = {
        'lower_bound_days': lower_bound_days,
        'upper_bound_days': upper_bound_days
    }

    # If using a Cloud KMS key, add it to the date_shift_config.
    # The wrapped key is base64-encoded, but the library expects a binary
    # string, so decode it here.
    if context_field_id and key_name and wrapped_key:
        import base64
        date_shift_config['context'] = {'name': context_field_id}
        date_shift_config['crypto_key'] = {
            'kms_wrapped': {
                'wrapped_key': base64.b64decode(wrapped_key),
                'crypto_key_name': key_name
            }
        }
    elif context_field_id or key_name or wrapped_key:
        raise ValueError("""You must set either ALL or NONE of
        [context_field_id, key_name, wrapped_key]!""")

    # Construct Deidentify Config
    deidentify_config = {
        'record_transformations': {
            'field_transformations': [
                {
                    'fields': date_fields,
                    'primitive_transformation': {
                        'date_shift_config': date_shift_config
                    }
                }
            ]
        }
    }

    # Write to CSV helper methods
    def write_header(header):
        return header.name

    def write_data(data):
        return data.string_value or '%s/%s/%s' % (data.date_value.month,
                                                  data.date_value.day,
                                                  data.date_value.year)

    # Call the API
    response = dlp.deidentify_content(
        parent, deidentify_config=deidentify_config, item=table_item)

    # Write results to CSV file
    with open(output_csv_file, 'w') as csvfile:
        write_file = csv.writer(csvfile, delimiter=',')
        write_file.writerow(map(write_header, response.item.table.headers))
        for row in response.item.table.rows:
            write_file.writerow(map(write_data, row.values))
    # Print status
    print('Successfully saved date-shift output to {}'.format(
        output_csv_file))

Go

// deidentifyDateShift shifts dates found in the input between lowerBoundDays and
// upperBoundDays.
func deidentifyDateShift(w io.Writer, client *dlp.Client, project string, lowerBoundDays, upperBoundDays int32, input string) {
	// Create a configured request.
	req := &dlppb.DeidentifyContentRequest{
		Parent: "projects/" + project,
		DeidentifyConfig: &dlppb.DeidentifyConfig{
			Transformation: &dlppb.DeidentifyConfig_InfoTypeTransformations{
				InfoTypeTransformations: &dlppb.InfoTypeTransformations{
					Transformations: []*dlppb.InfoTypeTransformations_InfoTypeTransformation{
						{
							InfoTypes: []*dlppb.InfoType{}, // Match all info types.
							PrimitiveTransformation: &dlppb.PrimitiveTransformation{
								Transformation: &dlppb.PrimitiveTransformation_DateShiftConfig{
									DateShiftConfig: &dlppb.DateShiftConfig{
										LowerBoundDays: lowerBoundDays,
										UpperBoundDays: upperBoundDays,
									},
								},
							},
						},
					},
				},
			},
		},
		// The InspectConfig is used to identify the DATE fields.
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: []*dlppb.InfoType{
				{
					Name: "DATE",
				},
			},
		},
		// The item to analyze.
		Item: &dlppb.ContentItem{
			DataItem: &dlppb.ContentItem_Value{
				Value: input,
			},
		},
	}
	// Send the request.
	r, err := client.DeidentifyContent(context.Background(), req)
	if err != nil {
		log.Fatal(err)
	}
	// Print the result.
	fmt.Fprint(w, r.GetItem().GetValue())
}

PHP

use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CryptoKey;
use Google\Cloud\Dlp\V2\DateShiftConfig;
use Google\Cloud\Dlp\V2\DeidentifyConfig;
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\FieldId;
use Google\Cloud\Dlp\V2\FieldTransformation;
use Google\Cloud\Dlp\V2\KmsWrappedCryptoKey;
use Google\Cloud\Dlp\V2\PrimitiveTransformation;
use Google\Cloud\Dlp\V2\RecordTransformations;
use Google\Cloud\Dlp\V2\Table;
use Google\Cloud\Dlp\V2\Table\Row;
use Google\Cloud\Dlp\V2\Value;
use Google\Type\Date;
use DateTime;

/**
 * Deidentify dates in a CSV file by pseudorandomly shifting them.
 *
 * @param string $callingProjectId The GCP Project ID to run the API call under
 * @param string $inputCsvFile The path to the CSV file to deidentify
 * @param string $outputCsvFile The path to save the date-shifted CSV file to
 * @param array $dateFieldNames The list of (date) fields in the CSV file to date shift
 * @param string $lowerBoundDays The maximum number of days to shift a date backward
 * @param string $upperBoundDays The maximum number of days to shift a date forward
 * @param string $contextFieldName (Optional) The column to determine date shift amount based on
 *        If this is not specified, a random shift amount will be used for every row.
 *        If this is specified, then 'wrappedKey' and 'keyName' must also be set
 * @param string $keyName (Optional) The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
 *        If this is specified, then 'wrappedKey' and 'contextFieldName' must also be set
 * @param string $wrappedKey (Optional) The encrypted ('wrapped') AES-256 key to use when shifting dates
 *        If this is specified, then 'keyName' and 'contextFieldName' must also be set
 */
function deidentify_dates(
    $callingProjectId,
    $inputCsvFile,
    $outputCsvFile,
    $dateFieldNames,
    $lowerBoundDays,
    $upperBoundDays,
    $contextFieldName = '',
    $keyName = '',
    $wrappedKey = ''
) {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // Read a CSV file
    $csvLines = file($inputCsvFile, FILE_IGNORE_NEW_LINES);
    $csvHeaders = explode(',', $csvLines[0]);
    $csvRows = array_slice($csvLines, 1);

    // Convert CSV file into protobuf objects
    $tableHeaders = array_map(function ($csvHeader) {
        return (new FieldId)->setName($csvHeader);
    }, $csvHeaders);

    $tableRows = array_map(function ($csvRow) {
        $rowValues = array_map(function ($csvValue) {
            if ($csvDate = DateTime::createFromFormat('m/d/Y', $csvValue)) {
                $date = (new Date())
                    ->setYear((int) $csvDate->format('Y'))
                    ->setMonth((int) $csvDate->format('m'))
                    ->setDay((int) $csvDate->format('d'));
                return (new Value())
                    ->setDateValue($date);
            } else {
                return (new Value())
                    ->setStringValue($csvValue);
            }
        }, explode(',', $csvRow));

        return (new Row())
            ->setValues($rowValues);
    }, $csvRows);

    // Convert date fields into protobuf objects
    $dateFields = array_map(function ($dateFieldName) {
        return (new FieldId())->setName($dateFieldName);
    }, $dateFieldNames);

    // Construct the table object
    $table = (new Table())
        ->setHeaders($tableHeaders)
        ->setRows($tableRows);

    $item = (new ContentItem())
        ->setTable($table);

    // Construct dateShiftConfig
    $dateShiftConfig = (new DateShiftConfig())
        ->setLowerBoundDays($lowerBoundDays)
        ->setUpperBoundDays($upperBoundDays);

    if ($contextFieldName && $keyName && $wrappedKey) {
        $contextField = (new FieldId())
            ->setName($contextFieldName);

        // Create the wrapped crypto key configuration object
        $kmsWrappedCryptoKey = (new KmsWrappedCryptoKey())
            ->setWrappedKey(base64_decode($wrappedKey))
            ->setCryptoKeyName($keyName);

        $cryptoKey = (new CryptoKey())
            ->setKmsWrapped($kmsWrappedCryptoKey);

        $dateShiftConfig
            ->setContext($contextField)
            ->setCryptoKey($cryptoKey);
    } elseif ($contextFieldName || $keyName || $wrappedKey) {
        throw new Exception('You must set either ALL or NONE of {$contextFieldName, $keyName, $wrappedKey}!');
    }

    // Create the information transform configuration objects
    $primitiveTransformation = (new PrimitiveTransformation())
        ->setDateShiftConfig($dateShiftConfig);

    $fieldTransformation = (new FieldTransformation())
        ->setPrimitiveTransformation($primitiveTransformation)
        ->setFields($dateFields);

    $recordTransformations = (new RecordTransformations())
        ->setFieldTransformations([$fieldTransformation]);

    // Create the deidentification configuration object
    $deidentifyConfig = (new DeidentifyConfig())
        ->setRecordTransformations($recordTransformations);

    $parent = $dlp->projectName($callingProjectId);

    // Run request
    $response = $dlp->deidentifyContent($parent, [
        'deidentifyConfig' => $deidentifyConfig,
        'item' => $item
    ]);

    // Check for errors
    foreach ($response->getOverview()->getTransformationSummaries() as $summary) {
        foreach ($summary->getResults() as $result) {
            if ($details = $result->getDetails()) {
                printf('Error: %s' . PHP_EOL, $details);
                return;
            }
        }
    }

    // Save the results to a file
    $csvRef = fopen($outputCsvFile, 'w');
    fputcsv($csvRef, $csvHeaders);
    foreach ($response->getItem()->getTable()->getRows() as $tableRow) {
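        $values = array_map(function ($tableValue) {
            // Date fields come back as Date messages; everything else is a string.
            // Checking the date first keeps empty or '0' strings from being treated as dates.
            $protoDate = $tableValue->getDateValue();
            if ($protoDate === null) {
                return $tableValue->getStringValue();
            }
            // Write dates in the same m/d/Y format the input was parsed with
            // (strftime() is deprecated as of PHP 8.1)
            return sprintf('%02d/%02d/%04d', $protoDate->getMonth(), $protoDate->getDay(), $protoDate->getYear());
        }, iterator_to_array($tableRow->getValues()));
        fputcsv($csvRef, $values);
    }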
    fclose($csvRef);
    printf('Deidentified dates written to %s' . PHP_EOL, $outputCsvFile);
}

C#

public static object DeidDateShift(
    string projectId,
    string inputCsvFile,
    string outputCsvFile,
    int lowerBoundDays,
    int upperBoundDays,
    string dateFields,
    string contextField = "",
    string keyName = "",
    string wrappedKey = "")
{
    var dlp = DlpServiceClient.Create();

    // Read file
    string[] csvLines = File.ReadAllLines(inputCsvFile);
    string[] csvHeaders = csvLines[0].Split(',');
    string[] csvRows = csvLines.Skip(1).ToArray();

    // Convert dates to protobuf format, and everything else to a string
    var protoHeaders = csvHeaders.Select(header => new FieldId { Name = header });
    var protoRows = csvRows.Select(CsvRow =>
    {
        var rowValues = CsvRow.Split(',');
        var protoValues = rowValues.Select(RowValue =>
        {
            System.DateTime parsedDate;
            if (System.DateTime.TryParse(RowValue, out parsedDate))
            {
                return new Value
                {
                    DateValue = new Google.Type.Date
                    {
                        Year = parsedDate.Year,
                        Month = parsedDate.Month,
                        Day = parsedDate.Day
                    }
                };
            }
            else
            {
                return new Value
                {
                    StringValue = RowValue
                };
            }
        });

        var rowObject = new Table.Types.Row();
        rowObject.Values.Add(protoValues);
        return rowObject;
    });

    var dateFieldList = dateFields
        .Split(',')
        .Select(field => new FieldId { Name = field });

    // Construct + execute the request
    var dateShiftConfig = new DateShiftConfig
    {
        LowerBoundDays = lowerBoundDays,
        UpperBoundDays = upperBoundDays
    };
    bool hasKeyName = !String.IsNullOrEmpty(keyName);
    bool hasWrappedKey = !String.IsNullOrEmpty(wrappedKey);
    bool hasContext = !String.IsNullOrEmpty(contextField);
    if (hasKeyName && hasWrappedKey && hasContext)
    {
        dateShiftConfig.Context = new FieldId { Name = contextField };
        dateShiftConfig.CryptoKey = new CryptoKey
        {
            KmsWrapped = new KmsWrappedCryptoKey
            {
                WrappedKey = ByteString.FromBase64(wrappedKey),
                CryptoKeyName = keyName
            }
        };
    }
    else if (hasKeyName || hasWrappedKey || hasContext)
    {
        throw new ArgumentException("Must specify ALL or NONE of: {contextField, keyName, wrappedKey}!");
    }

    var deidConfig = new DeidentifyConfig
    {
        RecordTransformations = new RecordTransformations
        {
            FieldTransformations =
            {
                new FieldTransformation
                {
                    PrimitiveTransformation = new PrimitiveTransformation
                    {
                        DateShiftConfig = dateShiftConfig
                    },
                    Fields = { dateFieldList }
                }
            }
        }
    };

    DeidentifyContentResponse response = dlp.DeidentifyContent(
        new DeidentifyContentRequest
        {
            Parent = $"projects/{projectId}",
            DeidentifyConfig = deidConfig,
            Item = new ContentItem
            {
                Table = new Table
                {
                    Headers = { protoHeaders },
                    Rows = { protoRows }
                }
            }
        });

    // Save the results
    List<String> outputLines = new List<string>();
    outputLines.Add(csvLines[0]);

    outputLines.AddRange(response.Item.Table.Rows.Select(ProtoRow =>
    {
        var Values = ProtoRow.Values.Select(ProtoValue =>
        {
            if (ProtoValue.DateValue != null)
            {
                var ProtoDate = ProtoValue.DateValue;
                System.DateTime Date = new System.DateTime(
                    ProtoDate.Year, ProtoDate.Month, ProtoDate.Day);
                return Date.ToShortDateString();
            }
            else
            {
                return ProtoValue.StringValue;
            }
        });
        // Use a string separator so this also compiles on frameworks without the char overload
        return String.Join(",", Values);
    }));

    File.WriteAllLines(outputCsvFile, outputLines);

    return 0;
}
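
The samples above pass a context field and a crypto key together so that date shifts stay consistent within the same context. The following Python sketch illustrates that idea locally, without calling Cloud DLP. It is not Cloud DLP's algorithm: the HMAC-based derivation, the function names, the key, and the "user-123" context value are all assumptions made for illustration.

```python
import hashlib
import hmac
from datetime import date, timedelta


def shift_for_context(key: bytes, context: str, lower: int, upper: int) -> int:
    """Derive a repeatable day offset in [lower, upper] from a key and a context value.

    Assumption: HMAC-SHA256 is used here only to get a keyed, repeatable number.
    """
    digest = hmac.new(key, context.encode("utf-8"), hashlib.sha256).digest()
    span = upper - lower + 1
    return lower + int.from_bytes(digest[:8], "big") % span


def shift_date(value: date, key: bytes, context: str, lower: int, upper: int) -> date:
    """Shift a date by the offset derived for its context."""
    return value + timedelta(days=shift_for_context(key, context, lower, upper))


key = b"illustrative-key-only"  # hypothetical; not a real Cloud DLP or Cloud KMS key
visit_1 = shift_date(date(2009, 1, 20), key, "user-123", -30, 30)
visit_2 = shift_date(date(2009, 2, 3), key, "user-123", -30, 30)

# Both dates share a context, so they move by the same offset and the
# 14-day interval between them is preserved.
print((visit_2 - visit_1).days)
```

Because every date with the same context value receives the same offset, the order of events and the durations between them survive the shift, even though the absolute dates do not.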

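Time extraction

Time extraction (TimePartConfig in the DLP API) preserves one part of a matched value that contains a date, time, or timestamp. You tell Cloud DLP which kind of time value to extract, such as the year, month, or day of the month; the available options are enumerated in the TimePart object.

For example, suppose you configure a timePartConfig transformation and set the part to extract to YEAR. After you send the data in the first column below to Cloud DLP, you get the transformed values in the second column:

Original value    Transformed value
9/21/1976         1976
6/7/1945          1945
1/20/2009         2009
7/4/1776          1776
8/1/1984          1984
4/21/1982         1982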
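
The table above can be reproduced locally with a short Python sketch. It runs on your machine and does not call Cloud DLP. The helper name extract_time_part and the m/d/Y input format are assumptions for illustration, and only three of the TimePart values are mirrored here.

```python
from datetime import datetime

# Hypothetical local mapping that mirrors a subset of the TimePart enum names
_EXTRACTORS = {
    "YEAR": lambda d: d.year,
    "MONTH": lambda d: d.month,
    "DAY_OF_MONTH": lambda d: d.day,
}


def extract_time_part(value: str, part: str) -> str:
    """Parse an m/d/Y date string and keep only the requested part."""
    parsed = datetime.strptime(value, "%m/%d/%Y")
    return str(_EXTRACTORS[part](parsed))


originals = ["9/21/1976", "6/7/1945", "1/20/2009", "7/4/1776", "8/1/1984", "4/21/1982"]
for original in originals:
    # Print each original value next to its extracted year
    print(original, extract_time_part(original, "YEAR"))
```

Keeping only the year removes the month and day, so the output is less identifying while still supporting analysis by year.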