Redacting sensitive data from text content

Cloud Data Loss Prevention (DLP) can redact or obfuscate sensitive data in a string of text. You can send text to the API as JSON over HTTP, or use one of the client libraries, which are available for several popular programming languages.

The API takes the following as arguments:

  • A string of text
  • The placeholder text that will replace sensitive data (in this example, the name of the matching infoType)
  • A list of one or more infoTypes that you want to redact

It returns the string with any sensitive data replaced by your chosen placeholder.
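
The three arguments above map onto the fields of the JSON request body. As a minimal sketch, the body could be assembled in Python as follows. The helper name build_redact_request is hypothetical and not part of any client library; the field names are taken from the JSON example on this page.

```python
import json


def build_redact_request(text, info_type_names):
    """Build a content:deidentify request body (hypothetical helper).

    Every finding is replaced by the name of its infoType.
    """
    if not info_type_names:
        raise ValueError("at least one infoType is required")
    return {
        # The string of text to redact.
        "item": {"value": text},
        # The placeholder: replace each finding with its infoType name.
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    {"primitiveTransformation": {"replaceWithInfoTypeConfig": {}}}
                ]
            }
        },
        # The list of infoTypes to look for.
        "inspectConfig": {
            "infoTypes": [{"name": name} for name in info_type_names]
        },
    }


body = build_redact_request("My email is test@example.com", ["EMAIL_ADDRESS"])
print(json.dumps(body, indent=2))
```

Building the body from a list keeps infoTypes a JSON array even when only one infoType is requested.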

Example text redaction

See the JSON quickstart for more information about using the DLP API with JSON.

JSON Input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify?key={YOUR_API_KEY}

{
  "item": {
    "value": "My email is test@example.com"
  },
  "deidentifyConfig": {
    "infoTypeTransformations": {
      "transformations": [
        {
          "primitiveTransformation": {
            "replaceWithInfoTypeConfig": {}
          }
        }
      ]
    }
  },
  "inspectConfig": {
    "infoTypes": [
      {
        "name": "EMAIL_ADDRESS"
      }
    ]
  }
}
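
The same POST request can be prepared with only the Python standard library. This is a sketch under stated assumptions, not an official client: the function names make_deidentify_request and send are hypothetical, and "my-project" and "YOUR_API_KEY" are placeholders for your own project ID and API key.

```python
import json
import urllib.parse
import urllib.request


def make_deidentify_request(project_id, api_key, body):
    """Prepare (but do not send) a POST to the content:deidentify endpoint."""
    url = (
        "https://dlp.googleapis.com/v2/projects/"
        f"{urllib.parse.quote(project_id, safe='')}/content:deidentify"
        f"?key={urllib.parse.quote(api_key, safe='')}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values; replace them with a real project ID and API key.
request = make_deidentify_request(
    "my-project",
    "YOUR_API_KEY",
    {
        "item": {"value": "My email is test@example.com"},
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    {"primitiveTransformation": {"replaceWithInfoTypeConfig": {}}}
                ]
            }
        },
        "inspectConfig": {"infoTypes": [{"name": "EMAIL_ADDRESS"}]},
    },
)
print(request.get_method(), request.full_url)
```

Calling send(request) performs the network call; it is left out of the example because it needs valid credentials.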

JSON Output:

{
  "item":{
    "value":"My email is [EMAIL_ADDRESS]"
  },
  "overview":{
    "transformedBytes":"16",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "replaceWithInfoTypeConfig":{}
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"16"
      }
    ]
  }
}
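
A response shaped like the one above can be reduced to the redacted string plus a count of replacements per infoType. The sketch below assumes the response has already been decoded into a Python dict; summarize_redaction is a hypothetical helper name. Note that the counts arrive as JSON strings, so they are converted with int().

```python
def summarize_redaction(response):
    """Return (redacted_text, {infoType name: number of successful replacements})."""
    redacted = response["item"]["value"]
    counts = {}
    summaries = response.get("overview", {}).get("transformationSummaries", [])
    for summary in summaries:
        name = summary["infoType"]["name"]
        for result in summary.get("results", []):
            # Only count transformations that the API reports as successful.
            if result.get("code") == "SUCCESS":
                counts[name] = counts.get(name, 0) + int(result["count"])
    return redacted, counts


# The JSON output from this page, decoded into a dict.
sample_response = {
    "item": {"value": "My email is [EMAIL_ADDRESS]"},
    "overview": {
        "transformedBytes": "16",
        "transformationSummaries": [
            {
                "infoType": {"name": "EMAIL_ADDRESS"},
                "transformation": {"replaceWithInfoTypeConfig": {}},
                "results": [{"count": "1", "code": "SUCCESS"}],
                "transformedBytes": "16",
            }
        ],
    },
}

redacted_text, replacement_counts = summarize_redaction(sample_response)
print(redacted_text)
print(replacement_counts)
```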

Next steps

Redaction is one form of de-identification. To learn more about how to de-identify content, see De-identifying sensitive data in text content.
