Como editar dados confidenciais de texto

O Cloud Data Loss Prevention (DLP) pode editar ou ofuscar dados confidenciais de uma string de texto. É possível alimentar informações textuais na API usando JSON por HTTP ou usar uma das bibliotecas de cliente para fazer isso por meio de várias linguagens de programação.

A API aceita os seguintes argumentos:

  • uma string de texto
  • O texto do marcador que substituirá os dados confidenciais (neste exemplo, pelo InfoType)
  • Uma lista de um ou mais InfoTypes que você quer editar

Ela retorna a string com todos os dados confidenciais substituídos pelo marcador escolhido.

Exemplo de edição de texto

Para mais informações sobre o uso de JSON com a API Cloud DLP, consulte o início rápido de JSON.

Protocolo

Entrada JSON:

{
  "item": {
     "value":"My email is test@example.com",
   },
   "deidentifyConfig": {
     "infoTypeTransformations":{
          "transformations": [
            {
              "primitiveTransformation": {
                "replaceWithInfoTypeConfig": {}
              }
            }
          ]
        }
    },
    "inspectConfig": {
      "infoTypes": {
        "name": "EMAIL_ADDRESS"
      }
    }
}

URL:

https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:deidentify

O Cloud DLP retorna os seguintes itens depois de receber a solicitação:

Saída JSON:

{
  "item":{
    "value":"My email is [EMAIL_ADDRESS]"
  },
  "overview":{
    "transformedBytes":"16",
    "transformationSummaries":[
      {
        "infoType":{
          "name":"EMAIL_ADDRESS"
        },
        "transformation":{
          "replaceWithInfoTypeConfig":{

          }
        },
        "results":[
          {
            "count":"1",
            "code":"SUCCESS"
          }
        ],
        "transformedBytes":"16"
      }
    ]
  }
}

Java

Ver no GitHub (em inglês) Feedback

import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.DeidentifyConfig;
import com.google.privacy.dlp.v2.DeidentifyContentRequest;
import com.google.privacy.dlp.v2.DeidentifyContentResponse;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InfoTypeTransformations;
import com.google.privacy.dlp.v2.InfoTypeTransformations.InfoTypeTransformation;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.PrimitiveTransformation;
import com.google.privacy.dlp.v2.ReplaceWithInfoTypeConfig;
import java.io.IOException;

public class DeIdentifyWithInfoType {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect =
        "My email is test@example.com";
    deIdentifyWithInfoType(projectId, textToInspect);
  }

  public static void deIdentifyWithInfoType(String projectId, String textToRedact)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the content to be inspected.
      ContentItem item = ContentItem.newBuilder()
          .setValue(textToRedact).build();

      // Specify the type of info the inspection will look for.
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types
      InfoType infoType = InfoType.newBuilder().setName("EMAIL_ADDRESS").build();
      InspectConfig inspectConfig = InspectConfig.newBuilder().addInfoTypes(infoType).build();
      // Specify replacement string to be used for the finding.
      ReplaceWithInfoTypeConfig replaceWithInfoTypeConfig =
          ReplaceWithInfoTypeConfig.newBuilder().build();
      // Define type of deidentification as replacement with info type.
      PrimitiveTransformation primitiveTransformation = PrimitiveTransformation.newBuilder()
          .setReplaceWithInfoTypeConfig(replaceWithInfoTypeConfig)
          .build();
      // Associate deidentification type with info type.
      InfoTypeTransformation transformation = InfoTypeTransformation.newBuilder()
          .addInfoTypes(infoType)
          .setPrimitiveTransformation(primitiveTransformation)
          .build();
      // Construct the configuration for the Redact request and list all desired transformations.
      DeidentifyConfig redactConfig = DeidentifyConfig.newBuilder()
          .setInfoTypeTransformations(InfoTypeTransformations.newBuilder()
              .addTransformations(transformation))
          .build();

      // Construct the Redact request to be sent by the client.
      DeidentifyContentRequest request =
          DeidentifyContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setDeidentifyConfig(redactConfig)
              .setInspectConfig(inspectConfig)
              .build();

      // Use the client to send the API request.
      DeidentifyContentResponse response = dlp.deidentifyContent(request);

      // Parse the response and process results
      System.out.println("Text after redaction: " + response.getItem().getValue());
    }
  }
}

Python

def deidentify_with_replace_infotype(project, item, info_types):
    """Uses the Data Loss Prevention API to deidentify sensitive data in a
    string by replacing it with the info type.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        item: The string to deidentify (will be treated as text).
        info_types: A list of strings representing info types to look for.
            A full list of info type categories can be fetched from the API.
    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Import the client library
    import google.cloud.dlp

    # Instantiate a client
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Construct inspect configuration dictionary
    inspect_config = {"info_types": [{"name": info_type} for info_type in info_types]}

    # Construct deidentify configuration dictionary
    deidentify_config = {
        "info_type_transformations": {
            "transformations": [
                {"primitive_transformation": {"replace_with_info_type_config": {}}}
            ]
        }
    }

    # Call the API
    response = dlp.deidentify_content(
        request={
            "parent": parent,
            "deidentify_config": deidentify_config,
            "inspect_config": inspect_config,
            "item": {"value": item},
        }
    )

    # Print out the results.
    print(response.item.value)

Tente fazer isso usando o APIs Explorer incorporado aqui.

A seguir

A edição é uma forma de desidentificação. Para saber mais sobre como desidentificar conteúdo, consulte Como desidentificar dados confidenciais no conteúdo do texto.