Modificar los detectores de InfoType para acotar los resultados del análisis

Los detectores de infoType integrados de Protección de Datos Sensibles son eficaces para encontrar tipos comunes de datos sensibles. Los detectores de infoType personalizados te permiten personalizar por completo tu propio detector de datos sensibles. Las reglas de inspección ayudan a acotar los resultados del análisis que devuelve Protección de Datos Sensibles modificando el mecanismo de detección de un detector de InfoType concreto.

Si quiere excluir o incluir más valores de los resultados que devuelve un detector de infoType integrado, puede crear un nuevo infoType personalizado desde cero y definir todos los criterios que debe buscar Protección de Datos Sensibles. También puede acotar los resultados que devuelven los detectores integrados o personalizados de Protección de Datos Sensibles según los criterios que especifique. Para ello, puedes añadir reglas de inspección que te ayuden a reducir el ruido, aumentar la precisión y la cobertura, o ajustar la certeza de las detecciones de los análisis.

En este tema se explica cómo usar los dos tipos de reglas de inspección para excluir determinados resultados o añadir otros, todo ello según los criterios personalizados que especifiques. En este tema se describen varias situaciones en las que puede que quieras modificar un detector de infoType.

Hay dos tipos de reglas de inspección:

Reglas de exclusión

Las reglas de exclusión son útiles en situaciones como las siguientes:

  • Quieres excluir las coincidencias de análisis duplicadas en los resultados que se deben a detectores de infoType superpuestos. Por ejemplo, estás buscando direcciones de correo y números de teléfono, pero recibes dos resultados de direcciones de correo que incluyen números de teléfono, como "206-555-0764@example.org".
  • Hay ruido en los resultados del análisis. Por ejemplo, si se devuelve la misma dirección de correo ficticia (como example@example.com) o el mismo dominio (como example.com) un número excesivo de veces en una búsqueda de direcciones de correo legítimas.
  • Tienes una lista de términos, frases o combinaciones de caracteres que quieres excluir de los resultados.
  • Quieres excluir una columna de datos completa de los resultados.
  • Quieres excluir los resultados que estén cerca de una cadena que coincida con una expresión regular.

Las reglas de exclusión no se pueden aplicar a los infoTypes de objeto.

Descripción general de la API Exclusion Rules

Protección de datos sensibles define una regla de exclusión en el objeto ExclusionRule. En ExclusionRule, especifica una de las siguientes opciones:

  • Un objeto Dictionary que contiene una lista de cadenas que se excluirán de los resultados.
  • Un objeto Regex que define un patrón de expresión regular. Las cadenas que coincidan con el patrón se excluyen de los resultados.
  • Un objeto ExcludeInfoTypes que contiene una matriz de detectores de infoType. Si un resultado coincide con alguno de los detectores de infoType que se indican aquí, se excluye de los resultados.
  • Un objeto ExcludeByHotword que contiene lo siguiente:

    • Expresión regular que define la palabra de activación.
    • Un valor de proximidad que define lo cerca que debe estar la palabra clave activadora del resultado.

    Si el hallazgo se encuentra dentro de la proximidad establecida, se excluye de los resultados. En el caso de las tablas, este tipo de regla de exclusión le permite excluir una columna de datos completa de los resultados.

Ejemplos de reglas de exclusión

Cada uno de los siguientes fragmentos de JSON muestra cómo configurar Protección de Datos Sensibles para el caso concreto.

Omitir una dirección de correo específica del análisis del detector EMAIL_ADDRESS

El siguiente fragmento de código JSON y el código en varios idiomas muestran cómo indicar a Protección de Datos Sensibles mediante un InspectConfig que debe evitar buscar coincidencias con "example@example.com" en un análisis que utilice el detector de infoType EMAIL_ADDRESS:

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System;
using System.Collections.Generic;
using System.Linq;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;

public class InspectStringWithExclusionDict
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect, List<String> excludedMatchList)
    {
        var dlp = DlpServiceClient.Create();

        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem { ByteItem = byteContentItem };

        var infoTypes = new string[] { "PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER" }.Select(it => new InfoType { Name = it });

        var exclusionRule = new ExclusionRule
        {
            MatchingType = MatchingType.FullMatch,
            Dictionary = new CustomInfoType.Types.Dictionary
            {
                WordList = new CustomInfoType.Types.Dictionary.Types.WordList
                {
                    Words = { excludedMatchList }
                }
            }
        };

        var ruleSet = new InspectionRuleSet
        {
            InfoTypes = { new InfoType { Name = "EMAIL_ADDRESS" } },
            Rules = { new InspectionRule { ExclusionRule = exclusionRule } }
        };

        var config = new InspectConfig
        {
            InfoTypes = { infoTypes },
            IncludeQuote = true,
            RuleSet = { ruleSet }
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = config
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringWithExclusionDictionary inspects a string for sensitive data
// using exclusion dictionary, here the function omits a specific email address
// from an EMAIL_ADDRESS detector scan with an exclusion dictionary
func inspectStringWithExclusionDictionary(w io.Writer, projectID, textToInspect string, excludedMatchList []string) error {
	// projectID := "my-project-id"
	// textToInspect := "Some email addresses: gary@example.com, example@example.com"
	// excludedMatchList := []string{"example@example.com"}

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Specify the type of info the inspection will look for.
	// See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
	infoTypes := []*dlppb.InfoType{
		{Name: "PHONE_NUMBER"},
		{Name: "EMAIL_ADDRESS"},
		{Name: "CREDIT_CARD_NUMBER"},
	}

	// Exclude matches from the specified excludedMatchList.
	exclusionRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_Dictionary{
			Dictionary: &dlppb.CustomInfoType_Dictionary{
				Source: &dlppb.CustomInfoType_Dictionary_WordList_{
					WordList: &dlppb.CustomInfoType_Dictionary_WordList{
						Words: excludedMatchList,
					},
				},
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_FULL_MATCH,
	}

	// Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infoType.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: "EMAIL_ADDRESS"},
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_ExclusionRule{
					ExclusionRule: exclusionRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		// Construct the configuration for the Inspect request, including the ruleSet.
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes:    infoTypes,
			IncludeQuote: true,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil
}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary.WordList;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InspectStringWithExclusionDict {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "Some email addresses: gary@example.com, example@example.com";
    List<String> excludedMatchList = Arrays.asList("example@example.com");
    inspectStringWithExclusionDict(projectId, textToInspect, excludedMatchList);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringWithExclusionDict(
      String projectId, String textToInspect, List<String> excludedMatchList) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
      List<InfoType> infoTypes = new ArrayList<>();
      for (String typeName : new String[] {"PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER"}) {
        infoTypes.add(InfoType.newBuilder().setName(typeName).build());
      }

      // Exclude matches from the specified excludedMatchList.
      ExclusionRule exclusionRule =
          ExclusionRule.newBuilder()
              .setMatchingType(MatchingType.MATCHING_TYPE_FULL_MATCH)
              .setDictionary(
                  Dictionary.newBuilder()
                      .setWordList(WordList.newBuilder().addAllWords(excludedMatchList)))
              .build();

      // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("EMAIL_ADDRESS"))
              .addRules(InspectionRule.newBuilder().setExclusionRule(exclusionRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addAllInfoTypes(infoTypes)
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'Some email addresses: gary@example.com, example@example.com';

// The infoTypes of information to match
// const infoTypes = [{ name: 'EMAIL_ADDRESS' }];

// Words to exclude
// const excludedWords = ['gary@example.com'];

async function inspectStringWithExclusionDict() {
  // Specify the type and content to be inspected.
  const item = {
    byteItem: {
      type: DLP.protos.google.privacy.dlp.v2.ByteContentItem.BytesType
        .TEXT_UTF8,
      data: Buffer.from(string, 'utf-8'),
    },
  };

  // Exclude matches from the specified excludedWords.
  const exclusionRule = {
    dictionary: {
      wordList: {
        words: excludedWords,
      },
    },
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType.MATCHING_TYPE_FULL_MATCH,
  };

  // Construct a ruleset that applies the exclusion rule to the specified infotype.
  const ruleSet = [
    {
      infoTypes: infoTypes,
      rules: [
        {
          exclusionRule: exclusionRule,
        },
      ],
    },
  ];

  // Construct the configuration for the Inspect request, including the ruleset.
  const inspectConfig = {
    infoTypes: infoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(request);

  // Print Findings
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringWithExclusionDict();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary\WordList;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string for sensitive data, using exclusion dictionary
 * Omit a specific email address from an EMAIL_ADDRESS detector scan with an exclusion dictionary.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_with_exclusion_dict(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'Some email addresses: gary@example.com, example@example.com'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Specify the type of info the inspection will look for.
    $infotypes = [
        (new InfoType())->setName('PHONE_NUMBER'),
        (new InfoType())->setName('EMAIL_ADDRESS'),
        (new InfoType())->setName('CREDIT_CARD_NUMBER'),
    ];

    // Exclude matches from the specified excludedMatchList.
    $excludedMatchList = (new Dictionary())
        ->setWordList((new WordList())
            ->setWords(['example@example.com']));
    $matchingType = MatchingType::MATCHING_TYPE_FULL_MATCH;
    $exclusionRule = (new ExclusionRule())
        ->setMatchingType($matchingType)
        ->setDictionary($excludedMatchList);

    // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
    $emailAddress = (new InfoType())
        ->setName('EMAIL_ADDRESS');
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$emailAddress])
        ->setRules([
            (new InspectionRule())
                ->setExclusionRule($exclusionRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes($infotypes)
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

from typing import List

import google.cloud.dlp


def inspect_string_with_exclusion_dict(
    project: str,
    content_string: str,
    exclusion_list: List[str] = ["example@example.com"],
) -> None:
    """Inspects the provided text, avoiding matches specified in the exclusion list

    Uses the Data Loss Prevention API to omit matches on EMAIL_ADDRESS if they are
    in the specified exclusion list.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.
        exclusion_list: The list of strings to ignore matches on

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a list of infoTypes for DLP to locate in `content_string`. See
    # https://cloud.google.com/dlp/docs/concepts-infotypes for more information
    # about supported infoTypes.
    info_types_to_locate = [{"name": "EMAIL_ADDRESS"}]

    # Construct a rule set that will only match on EMAIL_ADDRESS
    # if the match text is not in the exclusion list.
    rule_set = [
        {
            "info_types": info_types_to_locate,
            "rules": [
                {
                    "exclusion_rule": {
                        "dictionary": {"word_list": {"words": exclusion_list}},
                        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_FULL_MATCH,
                    }
                }
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "info_types": info_types_to_locate,
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
    "inspectConfig":{
      "infoTypes":[
        {
          "name":"EMAIL_ADDRESS"
        }
      ],
      "ruleSet":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "rules":[
            {
              "exclusionRule":{
                "dictionary":{
                  "wordList":{
                    "words":[
                      "example@example.com"
                    ]
                  }
                },
                "matchingType": "MATCHING_TYPE_FULL_MATCH"
              }
            }
          ]
        }
      ]
    }
...

Omitir las direcciones de correo que terminan con un dominio específico en el análisis del detector EMAIL_ADDRESS

El siguiente fragmento de código JSON y el código en varios idiomas muestran cómo indicar a Protección de Datos Sensibles mediante un InspectConfig que debe evitar las coincidencias con cualquier dirección de correo que termine en "@example.com" en un análisis que utilice el detector de infoType EMAIL_ADDRESS:

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System;
using System.Collections.Generic;
using System.Linq;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;

public class InspectStringWithExclusionRegex
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect, string excludedRegex)
    {
        var dlp = DlpServiceClient.Create();

        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem { ByteItem = byteContentItem };

        var infoTypes = new string[] { "PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER" }.Select(it => new InfoType { Name = it });

        var exclusionRule = new ExclusionRule
        {
            MatchingType = MatchingType.FullMatch,
            Regex = new CustomInfoType.Types.Regex { Pattern = excludedRegex }
        };

        var ruleSet = new InspectionRuleSet
        {
            InfoTypes = { new InfoType { Name = "EMAIL_ADDRESS" } },
            Rules = { new InspectionRule { ExclusionRule = exclusionRule } }
        };

        var config = new InspectConfig
        {
            InfoTypes = { infoTypes },
            IncludeQuote = true,
            RuleSet = { ruleSet }
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = config
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringWithExclusionRegex inspects a string excluding REGEX matches
// in this function specifically the function omits email addresses ending with a specific
// domain from EMAIL_ADDRESS detector scan.
func inspectStringWithExclusionRegex(w io.Writer, projectID, textToInspect, excludedRegex string) error {
	// projectID := "my-project-id"
	// textToInspect := "Some email addresses: gary@example.com, bob@example.org"
	// excludedRegex := ".+@example.com"

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Specify the type of info the inspection will look for.
	// See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
	infoTypes := []*dlppb.InfoType{
		{Name: "PHONE_NUMBER"},
		{Name: "EMAIL_ADDRESS"},
		{Name: "CREDIT_CARD_NUMBER"},
	}

	// Exclude matches from the specified excludedMatchList.
	exclusionRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_Regex{
			Regex: &dlppb.CustomInfoType_Regex{
				Pattern: excludedRegex,
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_FULL_MATCH,
	}

	// Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: "EMAIL_ADDRESS"},
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_ExclusionRule{
					ExclusionRule: exclusionRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes:    infoTypes,
			IncludeQuote: true,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil

}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType.Regex;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class InspectStringWithExclusionRegex {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "Some email addresses: gary@example.com, bob@example.org";
    String excludedRegex = ".+@example.com";
    inspectStringWithExclusionRegex(projectId, textToInspect, excludedRegex);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringWithExclusionRegex(
      String projectId, String textToInspect, String excludedRegex) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
      List<InfoType> infoTypes = new ArrayList<>();
      for (String typeName : new String[] {"PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER"}) {
        infoTypes.add(InfoType.newBuilder().setName(typeName).build());
      }

      // Exclude matches from the specified excludedMatchList.
      ExclusionRule exclusionRule =
          ExclusionRule.newBuilder()
              .setMatchingType(MatchingType.MATCHING_TYPE_FULL_MATCH)
              .setRegex(Regex.newBuilder().setPattern(excludedRegex))
              .build();

      // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("EMAIL_ADDRESS"))
              .addRules(InspectionRule.newBuilder().setExclusionRule(exclusionRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addAllInfoTypes(infoTypes)
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'Some email addresses: gary@example.com, bob@example.org';

// Exclusion Regex
// const exclusionRegex = '.+@example.com';

async function inspectStringMultipleRules() {
  // Specify the type and content to be inspected.
  const item = {
    byteItem: {
      type: DLP.protos.google.privacy.dlp.v2.ByteContentItem.BytesType
        .TEXT_UTF8,
      data: Buffer.from(string, 'utf-8'),
    },
  };

  // Info Types for inspection
  const infoTypes = [{name: 'EMAIL_ADDRESS'}];

  // Construct a exclusion rule that uses a regex pattern
  const exclusionRule = {
    regex: {pattern: exclusionRegex},
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType.MATCHING_TYPE_FULL_MATCH,
  };

  // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
  const ruleSet = [
    {
      infoTypes: infoTypes,
      rules: [
        {
          exclusionRule: exclusionRule,
        },
      ],
    },
  ];

  // Construct the configuration for the Inspect request, including the ruleset.
  const inspectConfig = {
    infoTypes: infoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(request);

  // Print findings.
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringMultipleRules();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType\Regex;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string, excluding REGEX matches
 * Omit email addresses ending with a specific domain from an EMAIL_ADDRESS detector scan.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_with_exclusion_regex(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'Some email addresses: gary@example.com, bob@example.org'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Specify the type of info the inspection will look for.
    $infotypes = [
        (new InfoType())->setName('PHONE_NUMBER'),
        (new InfoType())->setName('EMAIL_ADDRESS'),
        (new InfoType())->setName('CREDIT_CARD_NUMBER'),
    ];

    // Exclude matches from the specified excludedRegex.
    $excludedRegex = '.+@example.com';
    $exclusionRule = (new ExclusionRule())
        ->setMatchingType(MatchingType::MATCHING_TYPE_FULL_MATCH)
        ->setRegex((new Regex())
            ->setPattern($excludedRegex));

    // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([
            (new InfoType())
                ->setName('EMAIL_ADDRESS')
        ])
        ->setRules([
            (new InspectionRule())
                ->setExclusionRule($exclusionRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes($infotypes)
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import google.cloud.dlp


def inspect_string_with_exclusion_regex(
    project: str, content_string: str, exclusion_regex: str = ".+@example.com"
) -> None:
    """Inspects the provided text, avoiding matches specified in the exclusion regex

    Uses the Data Loss Prevention API to omit matches on EMAIL_ADDRESS if they match
    the specified exclusion regex.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.
        exclusion_regex: The regular expression to exclude matches on

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a list of infoTypes for DLP to locate in `content_string`. See
    # https://cloud.google.com/dlp/docs/concepts-infotypes for more information
    # about supported infoTypes.
    info_types_to_locate = [{"name": "EMAIL_ADDRESS"}]

    # Construct a rule set that will only match on EMAIL_ADDRESS
    # if the specified regex doesn't also match.
    rule_set = [
        {
            "info_types": info_types_to_locate,
            "rules": [
                {
                    "exclusion_rule": {
                        "regex": {"pattern": exclusion_regex},
                        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_FULL_MATCH,
                    }
                }
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "info_types": info_types_to_locate,
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
    "inspectConfig":{
      "infoTypes":[
        {
          "name":"EMAIL_ADDRESS"
        }
      ],
      "ruleSet":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            }
          ],
          "rules":[
            {
              "exclusionRule":{
                "regex":{
                  "pattern":".+@example.com"
                },
                "matchingType": "MATCHING_TYPE_FULL_MATCH"
              }
            }
          ]
        }
      ]
    }
...

Omitir las coincidencias de análisis que incluyan la subcadena "TEST"

El siguiente fragmento de código JSON y el código en varios idiomas muestran cómo indicar a Protección de Datos Sensibles mediante un InspectConfig que debe excluir de la lista especificada de infoTypes cualquier resultado que incluya el token "TEST".

Ten en cuenta que esto coincide con "TEST" como token, no como subcadena, de modo que, aunque coincidirá con algo como "TEST@email.com", no lo hará con "TESTER@email.com". Si quieres que se busque una subcadena, usa una expresión regular en la regla de exclusión en lugar de un diccionario.

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System;
using System.Collections.Generic;
using System.Linq;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;

public class InspectStringWithExclusionDictSubstring
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect, List<String> excludedSubstringList)
    {
        var dlp = DlpServiceClient.Create();

        var byteItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem { ByteItem = byteItem };

        var infoTypes = new string[]
        {
            "EMAIL_ADDRESS",
            "DOMAIN_NAME",
            "PHONE_NUMBER",
            "PERSON_NAME"
        }.Select(it => new InfoType { Name = it });

        var exclusionRule = new ExclusionRule
        {
            MatchingType = MatchingType.PartialMatch,
            Dictionary = new CustomInfoType.Types.Dictionary
            {
                WordList = new CustomInfoType.Types.Dictionary.Types.WordList
                {
                    Words = { excludedSubstringList }
                }
            }
        };

        var ruleSet = new InspectionRuleSet
        {
            InfoTypes = { infoTypes },
            Rules = { new InspectionRule { ExclusionRule = exclusionRule } }
        };

        var config = new InspectConfig
        {
            InfoTypes = { infoTypes },
            IncludeQuote = true,
            RuleSet = { ruleSet }
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = config
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringWithExclusionDictSubstring inspects a string
// with an exclusion dictionary substring
func inspectStringWithExclusionDictSubstring(w io.Writer, projectID, textToInspect string, excludedSubString []string) error {
	// projectID := "my-project-id"
	// textToInspect := "Some email addresses: gary@example.com, TEST@example.com"
	// excludedSubString := []string{"TEST"}

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Specify the type of info the inspection will look for.
	// See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
	infoTypes := []*dlppb.InfoType{
		{Name: "EMAIL_ADDRESS"},
		{Name: "DOMAIN_NAME"},
		{Name: "PHONE_NUMBER"},
		{Name: "PERSON_NAME"},
	}

	// Exclude partial matches from the specified excludedSubstringList.
	exclusionRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_Dictionary{
			Dictionary: &dlppb.CustomInfoType_Dictionary{
				Source: &dlppb.CustomInfoType_Dictionary_WordList_{
					WordList: &dlppb.CustomInfoType_Dictionary_WordList{
						Words: excludedSubString,
					},
				},
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_PARTIAL_MATCH,
	}

	// Construct a ruleSet that applies the exclusion rule to the EMAIL_ADDRESSES infoType.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: infoTypes,
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_ExclusionRule{
					ExclusionRule: exclusionRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes:    infoTypes,
			IncludeQuote: true,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil

}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary.WordList;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InspectStringWithExclusionDictSubstring {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "Some email addresses: gary@example.com, TEST@example.com";
    List<String> excludedSubstringList = Arrays.asList("TEST");
    inspectStringWithExclusionDictSubstring(projectId, textToInspect, excludedSubstringList);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringWithExclusionDictSubstring(
      String projectId, String textToInspect, List<String> excludedSubstringList)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
      List<InfoType> infoTypes = new ArrayList<>();
      for (String typeName :
          new String[] {"EMAIL_ADDRESS", "DOMAIN_NAME", "PHONE_NUMBER", "PERSON_NAME"}) {
        infoTypes.add(InfoType.newBuilder().setName(typeName).build());
      }

      // Exclude partial matches from the specified excludedSubstringList.
      ExclusionRule exclusionRule =
          ExclusionRule.newBuilder()
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .setDictionary(
                  Dictionary.newBuilder()
                      .setWordList(WordList.newBuilder().addAllWords(excludedSubstringList)))
              .build();

      // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addAllInfoTypes(infoTypes)
              .addRules(InspectionRule.newBuilder().setExclusionRule(exclusionRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addAllInfoTypes(infoTypes)
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'Some email addresses: gary@example.com, TEST@example.com';

// List of substrings to be used for exclusion
// const excludedSubstringList = ['Test'];

async function inspectStringExclusionDictSubstr() {
  // Specify the type and content to be inspected.
  const item = {
    byteItem: {
      type: DLP.protos.google.privacy.dlp.v2.ByteContentItem.BytesType
        .TEXT_UTF8,
      data: Buffer.from(string, 'utf-8'),
    },
  };

  // Specify the type of info the inspection will look for.
  // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
  const infoTypes = [
    {name: 'EMAIL_ADDRESS'},
    {name: 'DOMAIN_NAME'},
    {name: 'PHONE_NUMBER'},
    {name: 'PERSON_NAME'},
  ];

  // Exclude partial matches from the specified excludedSubstringList.
  const exclusionRule = {
    dictionary: {wordList: {words: excludedSubstringList}},
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType
        .MATCHING_TYPE_PARTIAL_MATCH,
  };

  // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
  const ruleSet = [
    {
      infoTypes: infoTypes,
      rules: [
        {
          exclusionRule: exclusionRule,
        },
      ],
    },
  ];

  // Construct the configuration for the Inspect request, including the ruleset.
  const inspectConfig = {
    infoTypes: infoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(request);

  // Print findings.
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringExclusionDictSubstr();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary\WordList;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string with an exclusion dictionary substring
 * Omit scan matches that include the substring "TEST".
 *
 * @param string $projectId                 The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect             The string to inspect.
 * @param array  $excludedSubStringArray    The sub string to excluded.
 */
function inspect_string_with_exclusion_dict_substring(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'Some email addresses: gary@example.com, TEST@example.com',
    array  $excludedSubStringArray = ['Test']
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Specify the type of info the inspection will look for.
    $infotypes = [
        (new InfoType())->setName('PHONE_NUMBER'),
        (new InfoType())->setName('EMAIL_ADDRESS'),
        (new InfoType())->setName('DOMAIN_NAME'),
        (new InfoType())->setName('PERSON_NAME'),
    ];

    // Exclude matches from the specified excludedSubstringList.
    $excludedSubstringList = (new Dictionary())
        ->setWordList((new WordList())
            ->setWords($excludedSubStringArray));

    $exclusionRule = (new ExclusionRule())
        ->setMatchingType(MatchingType::MATCHING_TYPE_PARTIAL_MATCH)
        ->setDictionary($excludedSubstringList);

    // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes($infotypes)
        ->setRules([
            (new InspectionRule())
                ->setExclusionRule($exclusionRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes($infotypes)
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

from typing import List

import google.cloud.dlp


def inspect_string_with_exclusion_dict_substring(
    project: str, content_string: str, exclusion_list: List[str] = ["TEST"]
) -> None:
    """Inspects the provided text, avoiding matches that contain excluded tokens

    Uses the Data Loss Prevention API to omit matches if they include tokens
    in the specified exclusion list.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.
        exclusion_list: The list of strings to ignore partial matches on

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a list of infoTypes for DLP to locate in `content_string`. See
    # https://cloud.google.com/dlp/docs/concepts-infotypes for more information
    # about supported infoTypes.
    info_types_to_locate = [{"name": "EMAIL_ADDRESS"}, {"name": "DOMAIN_NAME"}]

    # Construct a rule set that will only match if the match text does not
    # contains tokens from the exclusion list.
    rule_set = [
        {
            "info_types": info_types_to_locate,
            "rules": [
                {
                    "exclusion_rule": {
                        "dictionary": {"word_list": {"words": exclusion_list}},
                        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
                    }
                }
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "info_types": info_types_to_locate,
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
    "inspectConfig":{
      "infoTypes":[
        {
          "name":"EMAIL_ADDRESS"
        },
        {
          "name":"DOMAIN_NAME"
        },
        {
          "name":"PHONE_NUMBER"
        },
        {
          "name":"PERSON_NAME"
        }
      ],
      "ruleSet":[
        {
          "infoTypes":[
            {
              "name":"EMAIL_ADDRESS"
            },
            {
              "name":"DOMAIN_NAME"
            },
            {
              "name":"PHONE_NUMBER"
            },
            {
              "name":"PERSON_NAME"
            }
          ],
          "rules":[
            {
              "exclusionRule":{
                "dictionary":{
                  "wordList":{
                    "words":[
                      "TEST"
                    ]
                  }
                },
                "matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
              }
            }
          ]
        }
      ]
    }
...

Omitir las coincidencias de análisis que incluyan la subcadena "Jimmy" de un análisis de detector de infoType personalizado

El siguiente fragmento de código JSON y el código en varios idiomas muestran cómo indicar a Protección de Datos Sensibles mediante un InspectConfig que debe evitar buscar coincidencias con el nombre "Jimmy" en un análisis que utilice el detector de expresiones regulares personalizadas especificado:

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System;
using System.Collections.Generic;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;

public class InspectStringCustomExcludingSubstring
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect, string customDetectorPattern, List<String> excludedSubstringList)
    {
        var dlp = DlpServiceClient.Create();

        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem { ByteItem = byteContentItem };

        var infoType = new InfoType
        {
            Name = "CUSTOM_NAME_DETECTOR"
        };
        var customInfoType = new CustomInfoType
        {
            InfoType = infoType,
            Regex = new CustomInfoType.Types.Regex { Pattern = customDetectorPattern }
        };

        var exclusionRule = new ExclusionRule
        {
            MatchingType = MatchingType.PartialMatch,
            Dictionary = new CustomInfoType.Types.Dictionary
            {
                WordList = new CustomInfoType.Types.Dictionary.Types.WordList
                {
                    Words = { excludedSubstringList }
                }
            }
        };

        var ruleSet = new InspectionRuleSet
        {
            InfoTypes = { infoType },
            Rules = { new InspectionRule { ExclusionRule = exclusionRule } }
        };

        var config = new InspectConfig
        {
            CustomInfoTypes = { customInfoType },
            IncludeQuote = true,
            RuleSet = { ruleSet }
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = config
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringCustomExcludingSubstring inspects a string for sensitive data, excluding a custom substring
func inspectStringCustomExcludingSubstring(w io.Writer, projectID, textToInspect, customDetectorPattern string, excludedSubstringList []string) error {
	// projectId := "your-project-id"
	// textToInspect := "Name: Doe, John. Name: Example, Jimmy"
	// customDetectorPattern := "[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}"
	// excludedSubstringList := []string{"Jimmy"}

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Specify the type of info the inspection will look for.
	infoType := &dlppb.InfoType{
		Name: "CUSTOM_NAME_DETECTOR",
	}

	customInfotype := &dlppb.CustomInfoType{
		InfoType: infoType,
		Type: &dlppb.CustomInfoType_Regex_{
			Regex: &dlppb.CustomInfoType_Regex{
				Pattern: customDetectorPattern,
			},
		},
	}

	// Exclude partial matches from the specified excludedSubstringList.
	exclusionRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_Dictionary{
			Dictionary: &dlppb.CustomInfoType_Dictionary{
				Source: &dlppb.CustomInfoType_Dictionary_WordList_{
					WordList: &dlppb.CustomInfoType_Dictionary_WordList{
						Words: excludedSubstringList,
					},
				},
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_PARTIAL_MATCH,
	}

	// Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infoType.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			infoType,
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_ExclusionRule{
					ExclusionRule: exclusionRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		InspectConfig: &dlppb.InspectConfig{
			CustomInfoTypes: []*dlppb.CustomInfoType{
				customInfotype,
			},
			IncludeQuote: true,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil

}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary.WordList;
import com.google.privacy.dlp.v2.CustomInfoType.Regex;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class InspectStringCustomExcludingSubstring {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "Name: Doe, John. Name: Example, Jimmy";
    String customDetectorPattern = "[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}";
    List<String> excludedSubstringList = Arrays.asList("Jimmy");
    inspectStringCustomExcludingSubstring(
        projectId, textToInspect, customDetectorPattern, excludedSubstringList);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringCustomExcludingSubstring(
      String projectId,
      String textToInspect,
      String customDetectorPattern,
      List<String> excludedSubstringList)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      InfoType infoType = InfoType.newBuilder().setName("CUSTOM_NAME_DETECTOR").build();
      CustomInfoType customInfoType =
          CustomInfoType.newBuilder()
              .setInfoType(infoType)
              .setRegex(Regex.newBuilder().setPattern(customDetectorPattern))
              .build();

      // Exclude partial matches from the specified excludedSubstringList.
      ExclusionRule exclusionRule =
          ExclusionRule.newBuilder()
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .setDictionary(
                  Dictionary.newBuilder()
                      .setWordList(WordList.newBuilder().addAllWords(excludedSubstringList)))
              .build();

      // Construct a ruleset that applies the exclusion rule to the EMAIL_ADDRESSES infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(infoType)
              .addRules(InspectionRule.newBuilder().setExclusionRule(exclusionRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addCustomInfoTypes(customInfoType)
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'Name: Doe, John. Name: Example, Jimmy';

// Words to exclude
// const excludedWords = ['Jimmy'];

async function inspectStringCustomExclusionDict() {
  // Specify the content to be inspected.
  const item = {value: string};

  // Specify the type of info the inspection will look for.
  const customInfoTypes = [
    {
      infoType: {name: 'CUSTOM_NAME_DETECTOR'},
      regex: {pattern: '[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}'},
    },
  ];

  // Exclude partial matches from the specified excludedSubstringList.
  const exclusionRUle = {
    dictionary: {wordList: {words: excludedWords}},
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType
        .MATCHING_TYPE_PARTIAL_MATCH,
  };

  // Construct a rule set that will only match if the match text does not
  // contains tokens from the exclusion list.
  const ruleSet = [
    {
      infoTypes: [{name: 'CUSTOM_NAME_DETECTOR'}],
      rules: [
        {
          exclusionRule: exclusionRUle,
        },
      ],
    },
  ];

  // Construct the configuration for the Inspect request, including the ruleset.
  const inspectConfig = {
    customInfoTypes: customInfoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(request);

  // Print findings.
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringCustomExclusionDict();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary\WordList;
use Google\Cloud\Dlp\V2\CustomInfoType\Regex;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string for sensitive data, excluding a custom substring
 * Illustrates how to use an InspectConfig to instruct Cloud DLP to avoid matching on the name "Jimmy" in a scan that uses the specified custom regular expression detector.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_custom_excluding_substring(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'Name: Doe, John. Name: Example, Jimmy'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();
    $customDetectorPattern = '[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}';

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Specify the type of info the inspection will look for.
    $customerNameDetector = (new InfoType())
        ->setName('CUSTOM_NAME_DETECTOR');
    $customInfoType = (new CustomInfoType())
        ->setInfoType($customerNameDetector)
        ->setRegex((new Regex())
            ->setPattern($customDetectorPattern));

    // Exclude partial matches from the specified excludedSubstringList.
    $excludedSubstringList = (new Dictionary())
        ->setWordList((new WordList())
            ->setWords(['Jimmy']));

    $exclusionRule = (new ExclusionRule())
        ->setMatchingType(MatchingType::MATCHING_TYPE_PARTIAL_MATCH)
        ->setDictionary($excludedSubstringList);

    // Construct a ruleset that applies the exclusion rule.
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$customerNameDetector])
        ->setRules([
            (new InspectionRule())
                ->setExclusionRule($exclusionRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setCustomInfoTypes([$customInfoType])
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

from typing import List

import google.cloud.dlp


def inspect_string_custom_excluding_substring(
    project: str, content_string: str, exclusion_list: List[str] = ["jimmy"]
) -> None:
    """Inspects the provided text with a custom detector, avoiding matches on specific tokens

    Uses the Data Loss Prevention API to omit matches on a custom detector
    if they include tokens in the specified exclusion list.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.
        exclusion_list: The list of strings to ignore matches on

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a custom regex detector for names
    custom_info_types = [
        {
            "info_type": {"name": "CUSTOM_NAME_DETECTOR"},
            "regex": {"pattern": "[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}"},
        }
    ]

    # Construct a rule set that will only match if the match text does not
    # contains tokens from the exclusion list.
    rule_set = [
        {
            "info_types": [{"name": "CUSTOM_NAME_DETECTOR"}],
            "rules": [
                {
                    "exclusion_rule": {
                        "dictionary": {"word_list": {"words": exclusion_list}},
                        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
                    }
                }
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "custom_info_types": custom_info_types,
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
    "inspectConfig":{
      "customInfoTypes":[
        {
          "infoType":{
            "name":"CUSTOM_NAME_DETECTOR"
          },
          "regex":{
            "pattern":"[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}"
          }
        }
      ],
      "ruleSet":[
        {
          "infoTypes":[
            {
              "name":"CUSTOM_NAME_DETECTOR"
            }
          ],
          "rules":[
            {
              "exclusionRule":{
                "dictionary":{
                  "wordList":{
                    "words":[
                      "Jimmy"
                    ]
                  }
                },
                "matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
              }
            }
          ]
        }
      ]
    }
...

Omitir las coincidencias de análisis de un detector PERSON_NAME que se solapen con un detector personalizado

En este caso, el usuario no quiere que se devuelva una coincidencia de una comprobación de Protección de Datos Sensibles con el detector integrado PERSON_NAME si la coincidencia también se encuentra en una comprobación con el detector de expresiones regulares personalizadas definido en la primera parte del fragmento.

El siguiente fragmento de JSON y el código en varios idiomas especifican tanto un detector de regex personalizado como una regla de exclusión en InspectConfig. El detector de expresiones regulares personalizadas especifica los nombres que se deben excluir de los resultados. La regla de exclusión especifica que, si algún resultado devuelto de un análisis de PERSON_NAME también coincide con el detector de regex personalizado, se omite. Ten en cuenta que, en este caso, VIP_DETECTOR está marcado como EXCLUSION_TYPE_EXCLUDE, por lo que no generará resultados por sí mismo. Solo afectará a los resultados producidos por el detector PERSON_NAME.

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using static Google.Cloud.Dlp.V2.CustomInfoType.Types;

public class InspectStringCustomOmitOverlap
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect)
    {
        var dlp = DlpServiceClient.Create();

        var byteItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem { ByteItem = byteItem };

        var customInfoType = new CustomInfoType
        {
            InfoType = new InfoType { Name = "VIP_DETECTOR" },
            Regex = new CustomInfoType.Types.Regex { Pattern = "Larry Page|Sergey Brin" },
            ExclusionType = ExclusionType.Exclude
        };

        var exclusionRule = new ExclusionRule
        {
            ExcludeInfoTypes = new ExcludeInfoTypes { InfoTypes = { customInfoType.InfoType } },
            MatchingType = MatchingType.FullMatch
        };

        var ruleSet = new InspectionRuleSet
        {
            InfoTypes = { new InfoType { Name = "PERSON_NAME" } },
            Rules = { new InspectionRule { ExclusionRule = exclusionRule } }
        };

        var config = new InspectConfig
        {
            InfoTypes = { new InfoType { Name = "PERSON_NAME" } },
            CustomInfoTypes = { customInfoType },
            IncludeQuote = true,
            RuleSet = { ruleSet }
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = config
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringCustomOmitOverlap inspects a string for sensitive data with omitting custom matches
func inspectStringCustomOmitOverlap(w io.Writer, projectID, textToInspect, customInfoTypeName, infoTypeName, regex string) error {
	// projectID := "my-project-id"
	// textToInspect := "Name: Jane Doe. Name: Larry Page."
	// customInfoTypeName := "VIP_DETECTOR"
	// infoTypeName := "PERSON_NAME"
	// regex := "Larry Page|Sergey Brin"

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}
	// Construct the custom infoType.
	customInfotype := &dlppb.CustomInfoType{
		InfoType: &dlppb.InfoType{
			Name: customInfoTypeName,
		},
		Type: &dlppb.CustomInfoType_Regex_{
			Regex: &dlppb.CustomInfoType_Regex{
				Pattern: regex,
			},
		},
		ExclusionType: dlppb.CustomInfoType_EXCLUSION_TYPE_EXCLUDE,
	}

	// Exclude matches that also match the custom infoType.
	exclusionRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_ExcludeInfoTypes{
			ExcludeInfoTypes: &dlppb.ExcludeInfoTypes{
				InfoTypes: []*dlppb.InfoType{
					customInfotype.GetInfoType(),
				},
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_FULL_MATCH,
	}

	// Construct a ruleSet that applies the exclusion rule to the PERSON_NAME infoType.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: infoTypeName},
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_ExclusionRule{
					ExclusionRule: exclusionRule,
				},
			},
		},
	}

	// Construct the configuration for the Inspect request, including the ruleSet.
	config := &dlppb.InspectConfig{
		InfoTypes: []*dlppb.InfoType{
			{Name: infoTypeName},
		},
		CustomInfoTypes: []*dlppb.CustomInfoType{
			customInfotype,
		},
		IncludeQuote: true,
		RuleSet: []*dlppb.InspectionRuleSet{
			ruleSet,
		},
	}

	// Create a configured request.
	req := &dlppb.InspectContentRequest{
		Parent:        fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:          contentItem,
		InspectConfig: config,
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Parse the response and process results
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil
}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType;
import com.google.privacy.dlp.v2.CustomInfoType.ExclusionType;
import com.google.privacy.dlp.v2.CustomInfoType.Regex;
import com.google.privacy.dlp.v2.ExcludeInfoTypes;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;

public class InspectStringCustomOmitOverlap {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "Name: Jane Doe. Name: Larry Page.";
    inspectStringCustomOmitOverlap(projectId, textToInspect);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringCustomOmitOverlap(String projectId, String textToInspect)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Construct the custom infotype.
      CustomInfoType customInfoType =
          CustomInfoType.newBuilder()
              .setInfoType(InfoType.newBuilder().setName("VIP_DETECTOR"))
              .setRegex(Regex.newBuilder().setPattern("Larry Page|Sergey Brin"))
              .setExclusionType(ExclusionType.EXCLUSION_TYPE_EXCLUDE)
              .build();

      // Exclude matches that also match the custom infotype.
      ExclusionRule exclusionRule =
          ExclusionRule.newBuilder()
              .setExcludeInfoTypes(
                  ExcludeInfoTypes.newBuilder().addInfoTypes(customInfoType.getInfoType()))
              .setMatchingType(MatchingType.MATCHING_TYPE_FULL_MATCH)
              .build();

      // Construct a ruleset that applies the exclusion rule to the PERSON_NAME infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME"))
              .addRules(InspectionRule.newBuilder().setExclusionRule(exclusionRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME"))
              .addCustomInfoTypes(customInfoType)
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'Name: Jane Doe. Name: Larry Page.';

async function inspectStringCustomOmitOverlap() {
  // Construct item to inspect
  const item = {value: string};

  // Construct a custom regex detector for names
  const customInfoTypes = [
    {
      infoType: {name: 'VIP_DETECTOR'},
      regex: {pattern: 'Larry Page|Sergey Brin'},
      exclusionType:
        DLP.protos.google.privacy.dlp.v2.CustomInfoType.ExclusionType
          .EXCLUSION_TYPE_EXCLUDE,
    },
  ];

  // Construct a exclusion rule
  const exclusionRule = {
    excludeInfoTypes: {
      infoTypes: [{name: 'VIP_DETECTOR'}],
    },
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType.MATCHING_TYPE_FULL_MATCH,
  };

  // Construct a rule set that will only match if the match text does not
  // contains Info type matches.
  const ruleSet = [
    {
      infoTypes: [{name: 'PERSON_NAME'}],
      rules: [
        {
          exclusionRule: exclusionRule,
        },
      ],
    },
  ];

  // Construct the inspect configuration
  const inspectConfig = {
    infoTypes: [{name: 'PERSON_NAME'}],
    customInfoTypes: customInfoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Combine configurations into a request for the service.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Run request
  const [response] = await dlp.inspectContent(request);

  // Print Findings
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringCustomOmitOverlap();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType;
use Google\Cloud\Dlp\V2\CustomInfoType\ExclusionType;
use Google\Cloud\Dlp\V2\CustomInfoType\Regex;
use Google\Cloud\Dlp\V2\ExcludeInfoTypes;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string for sensitive data, omitting custom matches
 * Omit scan matches from a PERSON_NAME detector scan that overlap with a custom detector.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_custom_omit_overlap(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'Name: Jane Doe. Name: Larry Page.'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Specify the type of info the inspection will look for.
    $vipDetector = (new InfoType())
        ->setName('VIP_DETECTOR');
    $pattern = 'Larry Page|Sergey Brin';
    $customInfoType = (new CustomInfoType())
        ->setInfoType($vipDetector)
        ->setRegex((new Regex())
            ->setPattern($pattern))
        ->setExclusionType(ExclusionType::EXCLUSION_TYPE_EXCLUDE);

    // Exclude matches that also match the custom infotype.
    $exclusionRule = (new ExclusionRule())
        ->setMatchingType(MatchingType::MATCHING_TYPE_FULL_MATCH)
        ->setExcludeInfoTypes((new ExcludeInfoTypes())
                ->setInfoTypes([$customInfoType->getInfoType()])
        );

    // Construct a ruleset that applies the exclusion rule to the PERSON_NAME infotype.
    $personName = (new InfoType())
        ->setName('PERSON_NAME');
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$personName])
        ->setRules([
            (new InspectionRule())
                ->setExclusionRule($exclusionRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes([$personName])
        ->setCustomInfoTypes([$customInfoType])
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import google.cloud.dlp


def inspect_string_custom_omit_overlap(project: str, content_string: str) -> None:
    """Matches PERSON_NAME and a custom detector,
    but if they overlap only matches the custom detector

    Uses the Data Loss Prevention API to omit matches on a built-in detector
    if they overlap with matches from a custom detector

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a custom regex detector for names
    custom_info_types = [
        {
            "info_type": {"name": "VIP_DETECTOR"},
            "regex": {"pattern": "Larry Page|Sergey Brin"},
            "exclusion_type": google.cloud.dlp_v2.CustomInfoType.ExclusionType.EXCLUSION_TYPE_EXCLUDE,
        }
    ]

    # Construct a rule set that will exclude PERSON_NAME matches
    # that overlap with VIP_DETECTOR matches
    rule_set = [
        {
            "info_types": [{"name": "PERSON_NAME"}],
            "rules": [
                {
                    "exclusion_rule": {
                        "exclude_info_types": {
                            "info_types": [{"name": "VIP_DETECTOR"}]
                        },
                        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_FULL_MATCH,
                    }
                }
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "info_types": [{"name": "PERSON_NAME"}],
        "custom_info_types": custom_info_types,
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
    "inspectConfig":{
      "infoTypes":[
        {
          "name":"PERSON_NAME"
        }
      ],
      "customInfoTypes":[
        {
          "infoType":{
            "name":"VIP_DETECTOR"
          },
          "regex":{
            "pattern":"Dana Williams|Quinn Garcia"
          },
          "exclusionType":"EXCLUSION_TYPE_EXCLUDE"
        }
      ],
      "ruleSet":[
        {
          "infoTypes":[
            {
              "name":"PERSON_NAME"
            }
          ],
          "rules":[
            {
              "exclusionRule":{
                "excludeInfoTypes":{
                  "infoTypes":[
                    {
                      "name":"VIP_DETECTOR"
                    }
                  ]
                },
                "matchingType": "MATCHING_TYPE_FULL_MATCH"
              }
            }
          ]
        }
      ]
    }
...

Omitir las coincidencias del detector PERSON_NAME si también coinciden con el detector EMAIL_ADDRESS

El siguiente fragmento de código JSON y el código en varios idiomas muestran cómo indicar a Protección de Datos Sensibles mediante un InspectConfig que solo debe devolver una coincidencia en el caso de que las coincidencias del detector PERSON_NAME se solapen con las del detector EMAIL_ADDRESS. De esta forma, se evita que una dirección de correo, como "james@example.com", coincida con los detectores PERSON_NAME y EMAIL_ADDRESS.

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System;
using System.Linq;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;

public class InspectStringOmitOverlap
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect)
    {
        // Initialize client that will be used to send requests. This client only needs to be created
        // once, and can be reused for multiple requests. After completing all of your requests, call
        // the "close" method on the client to safely clean up any remaining background resources.
        var dlp = DlpServiceClient.Create();

        // Specify the type and content to be inspected.
        var byteItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };
        var contentItem = new ContentItem { ByteItem = byteItem };

        // Specify the type of info the inspection will look for.
        // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
        var infoTypes = new string[] { "PERSON_NAME", "EMAIL_ADDRESS" }.Select(it => new InfoType { Name = it });

        // Exclude EMAIL_ADDRESS matches
        var exclusionRule = new ExclusionRule
        {
            ExcludeInfoTypes = new ExcludeInfoTypes { InfoTypes = { new InfoType { Name = "EMAIL_ADDRESS" } } },
            MatchingType = MatchingType.PartialMatch
        };

        // Construct a ruleset that applies the exclusion rule to the PERSON_NAME infotype.
        // If a PERSON_NAME match overlaps with an EMAIL_ADDRESS match, the PERSON_NAME match will
        // be excluded.
        var ruleSet = new InspectionRuleSet
        {
            InfoTypes = { new InfoType { Name = "PERSON_NAME" } },
            Rules = { new InspectionRule { ExclusionRule = exclusionRule } }
        };

        // Construct the configuration for the Inspect request, including the ruleset.
        var config = new InspectConfig
        {
            InfoTypes = { infoTypes },
            IncludeQuote = true,
            RuleSet = { ruleSet }
        };

        // Construct the Inspect request to be sent by the client.
        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = config
        };

        // Use the client to send the API request.
        var response = dlp.InspectContent(request);

        // Parse the response and process results
        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringOmitOverlap inspects a string for sensitive data and
// omitting overlapping matches on person and email
func inspectStringOmitOverlap(w io.Writer, projectID, textToInspect string) error {
	// projectID := "my-project-id"
	// textToInspect := "gary@example.com"

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Specify the type of info the inspection will look for.
	// See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
	infoTypes := []*dlppb.InfoType{
		{Name: "PERSON_NAME"},
		{Name: "EMAIL_ADDRESS"},
	}

	// Exclude EMAIL_ADDRESS matches
	exclusionRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_ExcludeInfoTypes{
			ExcludeInfoTypes: &dlppb.ExcludeInfoTypes{
				InfoTypes: []*dlppb.InfoType{
					{Name: "EMAIL_ADDRESS"},
				},
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_PARTIAL_MATCH,
	}

	// Construct a ruleSet that applies the exclusion rule to the PERSON_NAME infoType.
	// If a PERSON_NAME match overlaps with an EMAIL_ADDRESS match, the PERSON_NAME match will
	// be excluded.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: "PERSON_NAME"},
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_ExclusionRule{
					ExclusionRule: exclusionRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes:    infoTypes,
			IncludeQuote: true,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil

}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.ExcludeInfoTypes;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class InspectStringOmitOverlap {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "james@example.com";
    inspectStringOmitOverlap(projectId, textToInspect);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringOmitOverlap(String projectId, String textToInspect)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
      List<InfoType> infoTypes = new ArrayList<>();
      for (String typeName : new String[] {"PERSON_NAME", "EMAIL_ADDRESS"}) {
        infoTypes.add(InfoType.newBuilder().setName(typeName).build());
      }

      // Exclude EMAIL_ADDRESS matches
      ExclusionRule exclusionRule =
          ExclusionRule.newBuilder()
              .setExcludeInfoTypes(
                  ExcludeInfoTypes.newBuilder()
                      .addInfoTypes(InfoType.newBuilder().setName("EMAIL_ADDRESS")))
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .build();

      // Construct a ruleset that applies the exclusion rule to the PERSON_NAME infotype.
      // If a PERSON_NAME match overlaps with an EMAIL_ADDRESS match, the PERSON_NAME match will
      // be excluded.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME"))
              .addRules(InspectionRule.newBuilder().setExclusionRule(exclusionRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addAllInfoTypes(infoTypes)
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'james@example.com';

async function inspectStringWithoutOverlapPersonEmail() {
  // Construct item to inspect
  const item = {
    byteItem: {
      type: DLP.protos.google.privacy.dlp.v2.ByteContentItem.BytesType
        .TEXT_UTF8,
      data: Buffer.from(string, 'utf-8'),
    },
  };

  // Specify the type of info the inspection will look for.
  const infoTypes = [{name: 'PERSON_NAME'}, {name: 'EMAIL_ADDRESS'}];

  // Exclude EMAIL_ADDRESS matches
  const exclusionRule = {
    excludeInfoTypes: {
      infoTypes: [{name: 'EMAIL_ADDRESS'}],
    },
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType
        .MATCHING_TYPE_PARTIAL_MATCH,
  };

  // Construct a ruleset that applies the exclusion rule to the PERSON_NAME infotype.
  // If a PERSON_NAME match overlaps with an EMAIL_ADDRESS match, the PERSON_NAME match will
  // be excluded.
  const ruleSet = [
    {
      infoTypes: [{name: 'PERSON_NAME'}],
      rules: [
        {
          exclusionRule: exclusionRule,
        },
      ],
    },
  ];

  // Construct the configuration for the Inspect request, including the ruleset.
  const inspectConfig = {
    infoTypes: infoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Run request
  const [response] = await dlp.inspectContent(request);

  // Print Findings
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringWithoutOverlapPersonEmail();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\ExcludeInfoTypes;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string for sensitive data, omitting overlapping matches on person and email
 * Omit matches on a PERSON_NAME detector if also matched by an EMAIL_ADDRESS detector.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_omit_overlap(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'james@example.org is an email.'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Specify the type of info the inspection will look for.
    $personName = (new InfoType())
        ->setName('PERSON_NAME');
    $emailAddress = (new InfoType())
        ->setName('EMAIL_ADDRESS');
    $infoTypes = [$personName, $emailAddress];

    // Exclude EMAIL_ADDRESS matches
    $exclusionRule = (new ExclusionRule())
        ->setMatchingType(MatchingType::MATCHING_TYPE_PARTIAL_MATCH)
        ->setExcludeInfoTypes((new ExcludeInfoTypes())
                ->setInfoTypes([$emailAddress])
        );

    // Construct a ruleset that applies the exclusion rule to the PERSON_NAME infotype.
    // If a PERSON_NAME match overlaps with an EMAIL_ADDRESS match, the PERSON_NAME match will
    // be excluded.
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$personName])
        ->setRules([
            (new InspectionRule())
                ->setExclusionRule($exclusionRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes($infoTypes)
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import google.cloud.dlp


def inspect_string_omit_overlap(
    project: str,
    content_string: str,
) -> None:
    """Matches PERSON_NAME and EMAIL_ADDRESS, but not both.

    Uses the Data Loss Prevention API omit matches on PERSON_NAME if the
    EMAIL_ADDRESS detector also matches.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a list of infoTypes for DLP to locate in `content_string`. See
    # https://cloud.google.com/dlp/docs/concepts-infotypes for more information
    # about supported infoTypes.
    info_types_to_locate = [{"name": "PERSON_NAME"}, {"name": "EMAIL_ADDRESS"}]

    # Construct the configuration dictionary that will only match on PERSON_NAME
    # if the EMAIL_ADDRESS doesn't also match. This configuration helps reduce
    # the total number of findings when there is a large overlap between different
    # infoTypes.
    inspect_config = {
        "info_types": info_types_to_locate,
        "rule_set": [
            {
                "info_types": [{"name": "PERSON_NAME"}],
                "rules": [
                    {
                        "exclusion_rule": {
                            "exclude_info_types": {
                                "info_types": [{"name": "EMAIL_ADDRESS"}]
                            },
                            "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
                        }
                    }
                ],
            }
        ],
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
    "inspectConfig":{
      "infoTypes":[
        {
          "name":"PERSON_NAME"
        },
        {
          "name":"EMAIL_ADDRESS"
        }
      ],
      "ruleSet":[
        {
          "infoTypes":[
            {
              "name":"PERSON_NAME"
            }
          ],
          "rules":[
            {
              "exclusionRule":{
                "excludeInfoTypes":{
                  "infoTypes":[
                    {
                      "name":"EMAIL_ADDRESS"
                    }
                  ]
                },
                "matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
              }
            }
          ]
        }
      ]
    }
...

Omitir las coincidencias de nombres de dominio que forman parte de direcciones de correo en un análisis de detector DOMAIN_NAME

El siguiente fragmento de código JSON y el código en varios idiomas muestran cómo indicar a Protección de Datos Sensibles mediante un InspectConfig que solo debe devolver coincidencias de una comprobación del detector DOMAIN_NAME si la coincidencia no se superpone con una coincidencia de una comprobación del detector EMAIL_ADDRESS. En este caso, el análisis principal es un análisis de detector DOMAIN_NAME. El usuario no quiere que se devuelva una coincidencia de nombre de dominio en los resultados si el nombre de dominio se usa en una dirección de correo:

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System.Linq;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using static Google.Cloud.Dlp.V2.CustomInfoType.Types;

public class InspectStringWithoutOverlap
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect)
    {
        // Initialize client that will be used to send requests. This client only needs to be created
        // once, and can be reused for multiple requests. After completing all of your requests, call
        // the "close" method on the client to safely clean up any remaining background resources.
        var dlp = DlpServiceClient.Create();

        // Specify the type and content to be inspected.
        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };
        var contentItem = new ContentItem
        {
            ByteItem = byteContentItem
        };

        // Specify the type of info the inspection will look for.
        // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
        var infoTypes = new string[] { "DOMAIN_NAME", "EMAIL_ADDRESS" }.Select(it => new InfoType { Name = it });

        // Define a custom info type to exclude email addresses
        var customInfoType = new CustomInfoType
        {
            InfoType = new InfoType { Name = "EMAIL_ADDRESS" },
            ExclusionType = ExclusionType.Exclude
        };

        // Exclude EMAIL_ADDRESS matches
        var exclusionRule = new ExclusionRule
        {
            ExcludeInfoTypes = new ExcludeInfoTypes
            {
                InfoTypes = { new InfoType { Name = "EMAIL_ADDRESS" } }
            },
            MatchingType = MatchingType.PartialMatch
        };

        // Construct a ruleset that applies the exclusion rule to the DOMAIN_NAME infotype.
        // If a DOMAIN_NAME match is part of an EMAIL_ADDRESS match, the DOMAIN_NAME match will
        // be excluded.
        var ruleSet = new InspectionRuleSet
        {
            InfoTypes = { new InfoType { Name = "DOMAIN_NAME" } },
            Rules = { new InspectionRule { ExclusionRule = exclusionRule } }
        };

        // Construct the configuration for the Inspect request, including the ruleset.
        var config = new InspectConfig
        {
            InfoTypes = { infoTypes },
            CustomInfoTypes = { customInfoType },
            IncludeQuote = true,
            RuleSet = { ruleSet }
        };

        // Construct the Inspect request to be sent by the client.
        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = config
        };

        // Use the client to send the API request.
        var response = dlp.InspectContent(request);

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringWithoutOverlap inspects a string for sensitive data
// and omit overlapping matches on domain and email
func inspectStringWithoutOverlap(w io.Writer, projectID, textToInspect string) error {
	// projectID := "my-project-id"
	// textToInspect := "example.com is a domain, james@example.org is an email."

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Specify the type of info the inspection will look for.
	// See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
	infoTypes := []*dlppb.InfoType{
		{Name: "DOMAIN_NAME"},
		{Name: "EMAIL_ADDRESS"},
	}

	// Define a custom info type to exclude email addresses
	customInfotype := &dlppb.CustomInfoType{
		InfoType: &dlppb.InfoType{
			Name: "EMAIL_ADDRESS",
		},
		ExclusionType: dlppb.CustomInfoType_EXCLUSION_TYPE_EXCLUDE,
	}

	// Exclude EMAIL_ADDRESS matches
	exclusionRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_ExcludeInfoTypes{
			ExcludeInfoTypes: &dlppb.ExcludeInfoTypes{
				InfoTypes: []*dlppb.InfoType{
					{Name: "EMAIL_ADDRESS"},
				},
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_PARTIAL_MATCH,
	}

	// Construct a ruleSet that applies the exclusion rule to the DOMAIN_NAME infoType.
	// If a DOMAIN_NAME match is part of an EMAIL_ADDRESS match, the DOMAIN_NAME match will
	// be excluded.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: "DOMAIN_NAME"},
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_ExclusionRule{
					ExclusionRule: exclusionRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		// Construct the Inspect request to be sent by the client.
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: infoTypes,
			CustomInfoTypes: []*dlppb.CustomInfoType{
				customInfotype,
			},
			IncludeQuote: true,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil
}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType;
import com.google.privacy.dlp.v2.CustomInfoType.ExclusionType;
import com.google.privacy.dlp.v2.ExcludeInfoTypes;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class InspectStringWithoutOverlap {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "example.com is a domain, james@example.org is an email.";
    inspectStringWithoutOverlap(projectId, textToInspect);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringWithoutOverlap(String projectId, String textToInspect)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Specify the type of info the inspection will look for.
      // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
      List<InfoType> infoTypes = new ArrayList<>();
      for (String typeName : new String[] {"DOMAIN_NAME", "EMAIL_ADDRESS"}) {
        infoTypes.add(InfoType.newBuilder().setName(typeName).build());
      }

      // Define a custom info type to exclude email addresses
      CustomInfoType customInfoType =
          CustomInfoType.newBuilder()
              .setInfoType(InfoType.newBuilder().setName("EMAIL_ADDRESS"))
              .setExclusionType(ExclusionType.EXCLUSION_TYPE_EXCLUDE)
              .build();

      // Exclude EMAIL_ADDRESS matches
      ExclusionRule exclusionRule =
          ExclusionRule.newBuilder()
              .setExcludeInfoTypes(
                  ExcludeInfoTypes.newBuilder()
                      .addInfoTypes(InfoType.newBuilder().setName("EMAIL_ADDRESS")))
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .build();

      // Construct a ruleset that applies the exclusion rule to the DOMAIN_NAME infotype.
      // If a DOMAIN_NAME match is part of an EMAIL_ADDRESS match, the DOMAIN_NAME match will
      // be excluded.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("DOMAIN_NAME"))
              .addRules(InspectionRule.newBuilder().setExclusionRule(exclusionRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addAllInfoTypes(infoTypes)
              .addCustomInfoTypes(customInfoType)
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'example.com is a domain, james@example.org is an email.';

async function inspectStringWithoutOverlap() {
  // Construct item to inspect
  const item = {value: string};

  // The infoTypes of information to match
  const infoTypes = [{name: 'DOMAIN_NAME'}, {name: 'EMAIL_ADDRESS'}];

  // Define a custom info type to exclude email addresses
  const customInfoTypes = [
    {
      infoType: {name: 'EMAIL_ADDRESS'},
      exclusionType:
        DLP.protos.google.privacy.dlp.v2.CustomInfoType.ExclusionType
          .EXCLUSION_TYPE_EXCLUDE,
    },
  ];

  // Construct a exclusion rule
  const exclusionRule = {
    excludeInfoTypes: {
      infoTypes: [{name: 'EMAIL_ADDRESS'}],
    },
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType
        .MATCHING_TYPE_PARTIAL_MATCH,
  };

  // Construct a rule set with exclusions
  const ruleSet = [
    {
      infoTypes: [{name: 'DOMAIN_NAME'}],
      rules: [
        {
          exclusionRule: exclusionRule,
        },
      ],
    },
  ];

  // Construct the inspect configuration
  const inspectConfig = {
    infoTypes: infoTypes,
    customInfoTypes: customInfoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Combine configurations into a request for the service.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Run request
  const [response] = await dlp.inspectContent(request);

  // Print Findings
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringWithoutOverlap();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType;
use Google\Cloud\Dlp\V2\CustomInfoType\ExclusionType;
use Google\Cloud\Dlp\V2\ExcludeInfoTypes;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string for sensitive data, omitting overlapping matches on domain and email
 * Omit matches on domain names that are part of email addresses in a DOMAIN_NAME detector scan.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_without_overlap(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'example.com is a domain, james@example.org is an email.'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Specify the type of info the inspection will look for.
    $domainName = (new InfoType())
        ->setName('DOMAIN_NAME');
    $emailAddress = (new InfoType())
        ->setName('EMAIL_ADDRESS');
    $infoTypes = [$domainName, $emailAddress];

    // Define a custom info type to exclude email addresses
    $customInfoType = (new CustomInfoType())
        ->setInfoType($emailAddress)
        ->setExclusionType(ExclusionType::EXCLUSION_TYPE_EXCLUDE);

    // Exclude EMAIL_ADDRESS matches
    $matchingType = MatchingType::MATCHING_TYPE_PARTIAL_MATCH;

    $exclusionRule = (new ExclusionRule())
        ->setMatchingType($matchingType)
        ->setExcludeInfoTypes((new ExcludeInfoTypes())
            ->setInfoTypes([$customInfoType->getInfoType()])
        );

    // Construct a ruleset that applies the exclusion rule to the DOMAIN_NAME infotype.
    // If a DOMAIN_NAME match is part of an EMAIL_ADDRESS match, the DOMAIN_NAME match will
    // be excluded.
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$domainName])
        ->setRules([
            (new InspectionRule())
                ->setExclusionRule($exclusionRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes($infoTypes)
        ->setCustomInfoTypes([$customInfoType])
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf(
                '  Likelihood: %s' . PHP_EOL,
                Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import google.cloud.dlp


def inspect_string_without_overlap(project: str, content_string: str) -> None:
    """Matches EMAIL_ADDRESS and DOMAIN_NAME, but DOMAIN_NAME is omitted
    if it overlaps with EMAIL_ADDRESS

    Uses the Data Loss Prevention API to omit matches of one infotype
    that overlap with another.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a list of infoTypes for DLP to locate in `content_string`. See
    # https://cloud.google.com/dlp/docs/concepts-infotypes for more information
    # about supported infoTypes.
    info_types_to_locate = [{"name": "DOMAIN_NAME"}, {"name": "EMAIL_ADDRESS"}]

    # Define a custom info type to exclude email addresses
    custom_info_types = [
        {
            "info_type": {"name": "EMAIL_ADDRESS"},
            "exclusion_type": google.cloud.dlp_v2.CustomInfoType.ExclusionType.EXCLUSION_TYPE_EXCLUDE,
        }
    ]

    # Construct a rule set that will exclude DOMAIN_NAME matches
    # that overlap with EMAIL_ADDRESS matches
    rule_set = [
        {
            "info_types": [{"name": "DOMAIN_NAME"}],
            "rules": [
                {
                    "exclusion_rule": {
                        "exclude_info_types": {
                            "info_types": [{"name": "EMAIL_ADDRESS"}]
                        },
                        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
                    }
                }
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "info_types": info_types_to_locate,
        "custom_info_types": custom_info_types,
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
    "inspectConfig":{
      "infoTypes":[
        {
          "name":"DOMAIN_NAME"
        },
        {
          "name":"EMAIL_ADDRESS"
        }
      ],
      "customInfoTypes":[
        {
          "infoType":{
            "name":"EMAIL_ADDRESS"
          },
          "exclusionType":"EXCLUSION_TYPE_EXCLUDE"
        }
      ],
      "ruleSet":[
        {
          "infoTypes":[
            {
              "name":"DOMAIN_NAME"
            }
          ],
          "rules":[
            {
              "exclusionRule":{
                "excludeInfoTypes":{
                  "infoTypes":[
                    {
                      "name":"EMAIL_ADDRESS"
                    }
                  ]
                },
                "matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
              }
            }
          ]
        }
      ]
    }
...

Omitir las coincidencias si se encuentran cerca de una cadena

En el siguiente ejemplo se muestra cómo excluir las coincidencias del detector de infoType US_SOCIAL_SECURITY_NUMBER si la palabra "SKU" se encuentra a 10 caracteres antes o después del resultado.

Debido a la regla de exclusión, en este ejemplo no se clasifica 222-22-2222 como un posible número de la Seguridad Social de EE. UU.

{
  "item": {
    "value": "The customer sent the product SKU 222-22-2222"
  },
  "inspectConfig": {
    "infoTypes": [
      {
        "name": "US_SOCIAL_SECURITY_NUMBER"
      }
    ],
    "ruleSet": [
      {
        "infoTypes": [
          {
            "name": "US_SOCIAL_SECURITY_NUMBER"
          }
        ],
        "rules": [
          {
            "exclusionRule": {
              "excludeByHotword": {
                "hotwordRegex": {
                  "pattern": "(SKU)"
                },
                "proximity": {
                  "windowBefore": 10,
                  "windowAfter": 10
                }
              },
              "matchingType": "MATCHING_TYPE_FULL_MATCH"
            }
          }
        ]
      }
    ],
    "includeQuote": true
  }
}

Omitir resultados en una columna de datos completa

En el siguiente ejemplo se muestra cómo excluir resultados de toda una columna de datos tabulares si el nombre de esa columna coincide con una expresión regular. En este caso, cualquier resultado que coincida con el detector de infoType US_SOCIAL_SECURITY_NUMBER se excluye de los resultados si se encuentra en la columna Número de la Seguridad Social falso.

En este ejemplo, solo se devuelve 222-22-2222, porque 111-11-1111 está en la columna Número de identificación personal falso.

{
  "item": {
    "table": {
      "headers": [
        {
          "name": "Fake Social Security Number"
        },
        {
          "name": "Real Social Security Number"
        }
      ],
      "rows": [
        {
          "values": [
            {
              "stringValue": "111-11-1111"
            },
            {
              "stringValue": "222-22-2222"
            }
          ]
        }
      ]
    }
  },
  "inspectConfig": {
    "infoTypes": [
      {
        "name": "US_SOCIAL_SECURITY_NUMBER"
      }
    ],
    "includeQuote": true,
    "ruleSet": [
      {
        "infoTypes": [
          {
            "name": "US_SOCIAL_SECURITY_NUMBER"
          }
        ],
        "rules": [
          {
            "exclusionRule": {
              "excludeByHotword": {
                "hotwordRegex": {
                  "pattern": "(Fake Social Security Number)"
                },
                "proximity": {
                  "windowBefore": 1
                }
              },
              "matchingType": "MATCHING_TYPE_FULL_MATCH"
            }
          }
        ]
      }
    ],
    "minLikelihood": "POSSIBLE"
  }
}

Reglas de palabras de activación

Las reglas de palabras de activación son útiles en situaciones como las siguientes:

  • Quieres cambiar los valores de probabilidad asignados a las coincidencias de análisis en función de la proximidad de la coincidencia a una palabra de activación. Por ejemplo, quieres asignar un valor de probabilidad más alto a las coincidencias de nombres de pacientes en función de la proximidad de los nombres a la palabra "paciente".
  • Cuando inspeccione datos tabulares estructurados, querrá cambiar los valores de probabilidad asignados a las coincidencias en función del nombre del encabezado de una columna. Por ejemplo, quieres asignar un valor de probabilidad más alto a US_SOCIAL_SECURITY_NUMBER cuando se encuentre en una columna con el encabezado ACCOUNT_ID.

Descripción general de la API Hotword rules

En el objeto InspectionRule de Protección de Datos Sensibles, especifica un objeto HotwordRule que ajusta la probabilidad de los hallazgos dentro de una proximidad determinada de las palabras activadoras.

Los objetos InspectionRule se agrupan como un "conjunto de reglas" en un objeto InspectionRuleSet, junto con una lista de detectores de infoType a los que se aplica el conjunto de reglas. Las reglas de un conjunto de reglas se aplican en el orden especificado.

Ejemplos de situaciones de reglas de palabras activas

En el siguiente fragmento de código se muestra cómo configurar Protección de Datos Sensibles para el caso práctico indicado.

Aumentar las probabilidades de que se detecte un PERSON_NAME si está cerca la palabra de activación "paciente"

El siguiente fragmento de código JSON y el código en varios idiomas muestran cómo usar la propiedad InspectConfig para buscar nombres de pacientes en una base de datos médica. Puedes usar el detector de PERSON_NAME infoType integrado de Protección de Datos Sensibles, pero esto hará que Protección de Datos Sensibles coincida con todos los nombres de personas, no solo con los nombres de pacientes. Para solucionar este problema, puedes incluir una regla de palabra clave que busque la palabra "paciente" a una distancia determinada del primer carácter de las posibles coincidencias. A continuación, puedes asignar a las detecciones que coincidan con este patrón una probabilidad de "muy probable", ya que se corresponden con tus criterios especiales. Si se define el mínimo Likelihood en VERY_LIKELY dentro de InspectConfig , se asegura de que solo se devuelvan coincidencias con esta configuración en los resultados.

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using System;
using static Google.Cloud.Dlp.V2.CustomInfoType.Types;

public class InspectStringCustomHotword
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect, string customHotword)
    {
        var dlp = DlpServiceClient.Create();

        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem
        {
            ByteItem = byteContentItem
        };

        var hotwordRule = new DetectionRule.Types.HotwordRule
        {
            HotwordRegex = new Regex { Pattern = customHotword },
            Proximity = new DetectionRule.Types.Proximity { WindowBefore = 50 },
            LikelihoodAdjustment = new DetectionRule.Types.LikelihoodAdjustment { FixedLikelihood = Likelihood.VeryLikely }
        };

        var infoType = new InfoType { Name = "PERSON_NAME" };

        var inspectionRuleSet = new InspectionRuleSet
        {
            InfoTypes = { infoType },
            Rules = { new InspectionRule { HotwordRule = hotwordRule } }
        };

        var inspectConfig = new InspectConfig
        {
            InfoTypes = { infoType },
            IncludeQuote = true,
            RuleSet = { inspectionRuleSet },
            MinLikelihood = Likelihood.VeryLikely
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = inspectConfig
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringCustomHotWord inspects a string from sensitive data by using a custom hot word
func inspectStringCustomHotWord(w io.Writer, projectID, textToInspect, customHotWord, infoTypeName string) error {
	// projectID := "my-project-id"
	// textToInspect := "patient name: John Doe"
	// customHotWord := "patient"
	// infoTypeName := "PERSON_NAME"

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Increase likelihood of matches that have customHotword nearby
	hotwordRule := &dlppb.InspectionRule_HotwordRule{

		HotwordRule: &dlppb.CustomInfoType_DetectionRule_HotwordRule{
			HotwordRegex: &dlppb.CustomInfoType_Regex{
				Pattern: customHotWord,
			},
			Proximity: &dlppb.CustomInfoType_DetectionRule_Proximity{
				WindowBefore: 50,
			},
			LikelihoodAdjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment{
				Adjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment_FixedLikelihood{
					FixedLikelihood: dlppb.Likelihood_VERY_LIKELY,
				},
			},
		},
	}

	// Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: infoTypeName}, //"PERSON_NAME"
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_HotwordRule{
					HotwordRule: hotwordRule.HotwordRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: []*dlppb.InfoType{
				{Name: infoTypeName}, //"PERSON_NAME"
			},
			IncludeQuote:  true,
			MinLikelihood: dlppb.Likelihood_VERY_LIKELY,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Parse the response and process results
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil

}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.HotwordRule;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.LikelihoodAdjustment;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.Proximity;
import com.google.privacy.dlp.v2.CustomInfoType.Regex;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.Likelihood;
import com.google.privacy.dlp.v2.LocationName;
import com.google.protobuf.ByteString;
import java.io.IOException;

public class InspectStringCustomHotword {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "patient name: John Doe";
    String customHotword = "patient";
    inspectStringCustomHotword(projectId, textToInspect, customHotword);
  }

  // Inspects the provided text.
  public static void inspectStringCustomHotword(
      String projectId, String textToInspect, String customHotword) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Increase likelihood of matches that have customHotword nearby
      HotwordRule hotwordRule =
          HotwordRule.newBuilder()
              .setHotwordRegex(Regex.newBuilder().setPattern(customHotword))
              .setProximity(Proximity.newBuilder().setWindowBefore(50))
              .setLikelihoodAdjustment(
                  LikelihoodAdjustment.newBuilder().setFixedLikelihood(Likelihood.VERY_LIKELY))
              .build();

      // Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME").build())
              .addRules(InspectionRule.newBuilder().setHotwordRule(hotwordRule))
              .build();

      // Construct the configuration for the Inspect request.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME").build())
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .setMinLikelihood(Likelihood.VERY_LIKELY)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'patient name: John Doe';

// Custom hotward
// const customHotword = 'patient';

async function inspectStringCustomHotword() {
  // Specify the type and content to be inspected.
  const item = {
    byteItem: {
      type: DLP.protos.google.privacy.dlp.v2.ByteContentItem.BytesType
        .TEXT_UTF8,
      data: Buffer.from(string, 'utf-8'),
    },
  };

  // Increase likelihood of matches that have customHotword nearby.
  const hotwordRule = {
    hotwordRegex: {
      pattern: customHotword,
    },
    proximity: {
      windowBefore: 50,
    },
    likelihoodAdjustment: {
      fixedLikelihood:
        DLP.protos.google.privacy.dlp.v2.Likelihood.VERY_LIKELY,
    },
  };

  // Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
  const ruleSet = [
    {
      infoTypes: [{name: 'PERSON_NAME'}],
      rules: [
        {
          hotwordRule: hotwordRule,
        },
      ],
    },
  ];

  // Construct the configuration for the Inspect request.
  const inspectConfig = {
    infoTypes: [{name: 'PERSON_NAME'}],
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(request);

  // Print findings.
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringCustomHotword();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\HotwordRule;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\LikelihoodAdjustment;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\Proximity;
use Google\Cloud\Dlp\V2\CustomInfoType\Regex;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;

/**
 * Inspect a string from sensitive data by using a custom hotword
 * Increase the likelihood of a PERSON_NAME match if there is the hotword "patient" nearby. Illustrates using the InspectConfig property for the purpose of scanning a medical database for patient names. You can use Cloud DLP's built-in PERSON_NAME infoType detector, but that causes Cloud DLP to match on all names of people, not just names of patients. To fix this, you can include a hotword rule that looks for the word "patient" within a certain character proximity from the first character of potential matches. You can then assign findings that match this pattern a likelihood of "very likely," since they correspond to your special criteria. Setting the minimum likelihood to VERY_LIKELY within InspectConfig ensures that only matches to this configuration are returned in findings.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_custom_hotword(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'patient name: John Doe'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Construct hotword rules
    $hotwordRule = (new HotwordRule())
        ->setHotwordRegex(
            (new Regex())
                ->setPattern('patient')
        )
        ->setProximity(
            (new Proximity())
                ->setWindowBefore(50)
        )
        ->setLikelihoodAdjustment(
            (new LikelihoodAdjustment())
                ->setFixedLikelihood(Likelihood::VERY_LIKELY)
        );

    // Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
    $personName = (new InfoType())
        ->setName('PERSON_NAME');
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$personName])
        ->setRules([
            (new InspectionRule())
                ->setHotwordRule($hotwordRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes([$personName])
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet])
        ->setMinLikelihood(Likelihood::VERY_LIKELY);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import google.cloud.dlp


def inspect_string_w_custom_hotword(
    project: str, content_string: str, custom_hotword: str = "patient"
) -> None:
    """Uses the Data Loss Prevention API increase likelihood for matches on
       PERSON_NAME if the user specified custom hot-word is present. Only
       includes findings with the increased likelihood by setting a minimum
       likelihood threshold of VERY_LIKELY.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.
        custom_hotword: The custom hot-word used for likelihood boosting.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a rule set with caller provided hotword, with a likelihood
    # boost to VERY_LIKELY when the hotword are present within the 50 character-
    # window preceding the PII finding.
    hotword_rule = {
        "hotword_regex": {"pattern": custom_hotword},
        "likelihood_adjustment": {
            "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY
        },
        "proximity": {"window_before": 50},
    }

    rule_set = [
        {
            "info_types": [{"name": "PERSON_NAME"}],
            "rules": [{"hotword_rule": hotword_rule}],
        }
    ]

    # Construct the configuration dictionary with the custom regex info type.
    inspect_config = {
        "rule_set": rule_set,
        "min_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"PERSON_NAME"
      }
    ],
    "ruleSet":[
      {
        "infoTypes":[
          {
            "name":"PERSON_NAME"
          }
        ],
        "rules":[
          {
            "hotwordRule":{
              "hotwordRegex":{
                "pattern":"patient"
              },
              "proximity":{
                "windowBefore":50
              },
              "likelihoodAdjustment":{
                "fixedLikelihood":"VERY_LIKELY"
              }
            }
          }
        ]
      }
    ],
    "minLikelihood":"VERY_LIKELY"
  }
...

Para obtener información más detallada sobre las palabras de activación, consulta Personalizar la probabilidad de coincidencia.

Situación con varias reglas de inspección

El siguiente fragmento de código JSON InspectConfig y el código en varios idiomas ilustran cómo aplicar reglas de exclusión y de palabras de activación. El conjunto de reglas de este fragmento incluye reglas de palabras de activación y reglas de exclusión de diccionario y expresiones regulares. Observa que las cuatro reglas se especifican en un array dentro del elemento rules.

C#

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


using System;
using System.Text.RegularExpressions;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using static Google.Cloud.Dlp.V2.CustomInfoType.Types;

public class InspectStringMultipleRules
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect)
    {
        var dlp = DlpServiceClient.Create();

        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem
        {
            ByteItem = byteContentItem
        };

        var patientRule = new DetectionRule.Types.HotwordRule
        {
            HotwordRegex = new CustomInfoType.Types.Regex { Pattern = "patient" },
            Proximity = new DetectionRule.Types.Proximity { WindowBefore = 10 },
            LikelihoodAdjustment = new DetectionRule.Types.LikelihoodAdjustment { FixedLikelihood = Likelihood.VeryLikely }
        };

        var doctorRule = new DetectionRule.Types.HotwordRule
        {
            HotwordRegex = new CustomInfoType.Types.Regex { Pattern = "doctor" },
            Proximity = new DetectionRule.Types.Proximity { WindowBefore = 10 },
            LikelihoodAdjustment = new DetectionRule.Types.LikelihoodAdjustment { FixedLikelihood = Likelihood.Unlikely }
        };

        // Construct exclusion rules
        var quasimodoRule = new ExclusionRule
        {
            Dictionary = new Dictionary { WordList = new Dictionary.Types.WordList { Words = { "Quasimodo" } } },
            MatchingType = MatchingType.PartialMatch
        };

        var redactedRule = new ExclusionRule
        {
            Regex = new CustomInfoType.Types.Regex { Pattern = "REDACTED" },
            MatchingType = MatchingType.PartialMatch
        };

        var infoType = new InfoType { Name = "PERSON_NAME" };

        var inspectionRuleSet = new InspectionRuleSet
        {
            InfoTypes = { infoType },
            Rules =
            {
                new InspectionRule { HotwordRule = patientRule },
                new InspectionRule { HotwordRule = doctorRule},
                new InspectionRule { ExclusionRule = quasimodoRule },
                new InspectionRule { ExclusionRule = redactedRule }
            }
        };

        var inspectConfig = new InspectConfig
        {
            InfoTypes = { infoType },
            IncludeQuote = true,
            RuleSet = { inspectionRuleSet }
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = inspectConfig
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringMultipleRules inspects the provided text,
// avoiding matches specified in the exclusion list.
// this function implements the both exclusion and hot word rules.
func inspectStringMultipleRules(w io.Writer, projectID, textToInspect string) error {
	// projectID := "my-project-id"
	// textToInspect := "patient: Jane Doe"

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	ContentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Construct the rule that adjust the likelihood of findings
	// within a certain proximity of hot-word("patient").
	patientRule := &dlppb.InspectionRule_HotwordRule{
		HotwordRule: &dlppb.CustomInfoType_DetectionRule_HotwordRule{
			HotwordRegex: &dlppb.CustomInfoType_Regex{
				Pattern: "patient",
			},
			Proximity: &dlppb.CustomInfoType_DetectionRule_Proximity{
				WindowBefore: 10,
			},
			LikelihoodAdjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment{
				Adjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment_FixedLikelihood{
					FixedLikelihood: dlppb.Likelihood_VERY_LIKELY,
				},
			},
		},
	}

	// Construct the rule that adjust the likelihood of findings
	// within a certain proximity of hot-word("doctor").
	doctorRule := &dlppb.InspectionRule_HotwordRule{
		HotwordRule: &dlppb.CustomInfoType_DetectionRule_HotwordRule{
			HotwordRegex: &dlppb.CustomInfoType_Regex{
				Pattern: "doctor",
			},
			Proximity: &dlppb.CustomInfoType_DetectionRule_Proximity{
				WindowBefore: 10,
			},
			LikelihoodAdjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment{
				Adjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment_FixedLikelihood{
					FixedLikelihood: dlppb.Likelihood_UNLIKELY,
				},
			},
		},
	}

	// Construct the exclusion rule that avoids the matches for the specified word list.
	quasimodoRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_Dictionary{
			Dictionary: &dlppb.CustomInfoType_Dictionary{
				Source: &dlppb.CustomInfoType_Dictionary_WordList_{
					WordList: &dlppb.CustomInfoType_Dictionary_WordList{
						Words: []string{"Quasimodo"},
					},
				},
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_PARTIAL_MATCH,
	}

	// Construct the exclusion rule that avoids the matches for the specified regex pattern.
	redactedRule := &dlppb.ExclusionRule{
		Type: &dlppb.ExclusionRule_Regex{
			Regex: &dlppb.CustomInfoType_Regex{
				Pattern: "REDACTED",
			},
		},
		MatchingType: dlppb.MatchingType_MATCHING_TYPE_PARTIAL_MATCH,
	}

	// Construct a ruleSet that applies the rules to the PERSON_NAME infoType.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: "PERSON_NAME"},
		},
		Rules: []*dlppb.InspectionRule{
			{Type: &dlppb.InspectionRule_HotwordRule{HotwordRule: doctorRule.HotwordRule}},
			{Type: &dlppb.InspectionRule_HotwordRule{HotwordRule: patientRule.HotwordRule}},
			{Type: &dlppb.InspectionRule_ExclusionRule{ExclusionRule: quasimodoRule}},
			{Type: &dlppb.InspectionRule_ExclusionRule{ExclusionRule: redactedRule}},
		},
	}

	// Construct the configuration for the Inspect request, including the ruleSet.
	inspectConfig := &dlppb.InspectConfig{
		InfoTypes: []*dlppb.InfoType{
			{Name: "PERSON_NAME"},
		},
		IncludeQuote: true,
		RuleSet: []*dlppb.InspectionRuleSet{
			ruleSet,
		},
	}

	// Create and send the request.
	req := &dlppb.InspectContentRequest{
		Parent:        fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:          ContentItem,
		InspectConfig: inspectConfig,
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil

}

Java

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.HotwordRule;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.LikelihoodAdjustment;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.Proximity;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary.WordList;
import com.google.privacy.dlp.v2.CustomInfoType.Regex;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.Likelihood;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;

public class InspectStringMultipleRules {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "patient: Jane Doe";
    inspectStringMultipleRules(projectId, textToInspect);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringMultipleRules(String projectId, String textToInspect)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Construct hotword rules
      HotwordRule patientRule =
          HotwordRule.newBuilder()
              .setHotwordRegex(Regex.newBuilder().setPattern("patient"))
              .setProximity(Proximity.newBuilder().setWindowBefore(10))
              .setLikelihoodAdjustment(
                  LikelihoodAdjustment.newBuilder().setFixedLikelihood(Likelihood.VERY_LIKELY))
              .build();

      HotwordRule doctorRule =
          HotwordRule.newBuilder()
              .setHotwordRegex(Regex.newBuilder().setPattern("doctor"))
              .setProximity(Proximity.newBuilder().setWindowBefore(10))
              .setLikelihoodAdjustment(
                  LikelihoodAdjustment.newBuilder().setFixedLikelihood(Likelihood.UNLIKELY))
              .build();

      // Construct exclusion rules
      ExclusionRule quasimodoRule =
          ExclusionRule.newBuilder()
              .setDictionary(
                  Dictionary.newBuilder().setWordList(WordList.newBuilder().addWords("Quasimodo")))
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .build();

      ExclusionRule redactedRule =
          ExclusionRule.newBuilder()
              .setRegex(Regex.newBuilder().setPattern("REDACTED"))
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .build();

      // Construct a ruleset that applies the rules to the PERSON_NAME infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME"))
              .addRules(InspectionRule.newBuilder().setHotwordRule(patientRule))
              .addRules(InspectionRule.newBuilder().setHotwordRule(doctorRule))
              .addRules(InspectionRule.newBuilder().setExclusionRule(quasimodoRule))
              .addRules(InspectionRule.newBuilder().setExclusionRule(redactedRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME"))
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'patient: Jane Doe';

async function inspectStringMultipleRules() {
  // Construct item to inspect
  const item = {value: string};

  // Construct hotword rules
  const patientRule = {
    hotwordRegex: {pattern: 'patient'},
    proximity: {windowBefore: 10},
    likelihoodAdjustment: {
      fixedLikelihood:
        DLP.protos.google.privacy.dlp.v2.Likelihood.VERY_LIKELY,
    },
  };

  const doctorRule = {
    hotwordRegex: {pattern: 'doctor'},
    proximity: {windowBefore: 10},
    likelihoodAdjustment: {
      fixedLikelihood: DLP.protos.google.privacy.dlp.v2.Likelihood.UNLIKELY,
    },
  };

  // Construct exclusion rules
  const quasimodoRule = {
    dictionary: {wordList: {words: ['quasimodo']}},
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType
        .MATCHING_TYPE_PARTIAL_MATCH,
  };

  const redactedRule = {
    regex: {pattern: 'REDACTED'},
    matchingType:
      DLP.protos.google.privacy.dlp.v2.MatchingType
        .MATCHING_TYPE_PARTIAL_MATCH,
  };

  // The infoTypes of information to match
  const infoTypes = [{name: 'PERSON_NAME'}];

  // Construct a ruleset that applies the rules to the PERSON_NAME infotype.
  const ruleSet = [
    {
      infoTypes: infoTypes,
      rules: [
        {hotwordRule: patientRule},
        {hotwordRule: doctorRule},
        {exclusionRule: quasimodoRule},
        {exclusionRule: redactedRule},
      ],
    },
  ];

  // Construct the configuration for the Inspect request, including the ruleset.
  const inspectConfig = {
    infoTypes: infoTypes,
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(request);

  // Print findings.
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringMultipleRules();

PHP

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\HotwordRule;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\LikelihoodAdjustment;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\Proximity;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary\WordList;
use Google\Cloud\Dlp\V2\CustomInfoType\Regex;
use Google\Cloud\Dlp\V2\ExclusionRule;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\MatchingType;

/**
 * Inspect a string for sensitive data by using multiple rules
 * Illustrates applying both exclusion and hotword rules. This snippet's rule set includes both hotword rules and dictionary and regex exclusion rules. Notice that the four rules are specified in an array within the rules element.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_multiple_rules(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'patient: Jane Doe'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Construct hotword rules
    $patientRule = (new HotwordRule())
        ->setHotwordRegex((new Regex())
            ->setPattern('patient'))
        ->setProximity((new Proximity())
            ->setWindowBefore(10))
        ->setLikelihoodAdjustment((new LikelihoodAdjustment())
            ->setFixedLikelihood(Likelihood::VERY_LIKELY));

    $doctorRule = (new HotwordRule())
        ->setHotwordRegex((new Regex())
            ->setPattern('doctor'))
        ->setProximity((new Proximity())
            ->setWindowBefore(10))
        ->setLikelihoodAdjustment((new LikelihoodAdjustment())
            ->setFixedLikelihood(Likelihood::VERY_UNLIKELY));

    // Construct exclusion rules
    $wordList = (new Dictionary())
        ->setWordList((new WordList())
            ->setWords(['Quasimodo']));

    $quasimodoRule = (new ExclusionRule())
        ->setMatchingType(MatchingType::MATCHING_TYPE_PARTIAL_MATCH)
        ->setDictionary($wordList);

    $redactedRule = (new ExclusionRule())
        ->setMatchingType(MatchingType::MATCHING_TYPE_PARTIAL_MATCH)
        ->setRegex((new Regex())
            ->setPattern('REDACTED'));

    // Specify the exclusion rule and build-in info type the inspection will look for.
    $personName = (new InfoType())
        ->setName('PERSON_NAME');
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$personName])
        ->setRules([
            (new InspectionRule())
                ->setHotwordRule($patientRule),
            (new InspectionRule())
                ->setHotwordRule($doctorRule),
            (new InspectionRule())
                ->setExclusionRule($quasimodoRule),
            (new InspectionRule())
                ->setExclusionRule($redactedRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes([$personName])
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet]);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para saber cómo instalar y usar la biblioteca de cliente de Protección de Datos Sensibles, consulta el artículo sobre las bibliotecas de cliente de Protección de Datos Sensibles.

Para autenticarte en Protección de Datos Sensibles, configura las credenciales predeterminadas de la aplicación. Para obtener más información, consulta el artículo Configurar la autenticación en un entorno de desarrollo local.

import google.cloud.dlp


def inspect_string_multiple_rules(project: str, content_string: str) -> None:
    """Uses the Data Loss Prevention API to modify likelihood for matches on
       PERSON_NAME combining multiple hotword and exclusion rules.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct hotword rules
    patient_rule = {
        "hotword_regex": {"pattern": "patient"},
        "proximity": {"window_before": 10},
        "likelihood_adjustment": {
            "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY
        },
    }
    doctor_rule = {
        "hotword_regex": {"pattern": "doctor"},
        "proximity": {"window_before": 10},
        "likelihood_adjustment": {
            "fixed_likelihood": google.cloud.dlp_v2.Likelihood.UNLIKELY
        },
    }

    # Construct exclusion rules
    quasimodo_rule = {
        "dictionary": {"word_list": {"words": ["quasimodo"]}},
        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
    }
    redacted_rule = {
        "regex": {"pattern": "REDACTED"},
        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
    }

    # Construct the rule set, combining the above rules
    rule_set = [
        {
            "info_types": [{"name": "PERSON_NAME"}],
            "rules": [
                {"hotword_rule": patient_rule},
                {"hotword_rule": doctor_rule},
                {"exclusion_rule": quasimodo_rule},
                {"exclusion_rule": redacted_rule},
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "info_types": [{"name": "PERSON_NAME"}],
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

REST

Consulta la guía de inicio rápido de JSON para obtener más información sobre cómo usar la API de DLP con JSON.

...
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"PERSON_NAME"
      }
    ],
    "ruleSet":[
      {
        "infoTypes":[
          {
            "name":"PERSON_NAME"
          }
        ],
        "rules":[
          {
            "hotwordRule":{
              "hotwordRegex":{
                "pattern":"patient"
              },
              "proximity":{
                "windowBefore":10
              },
              "likelihoodAdjustment":{
                "fixedLikelihood":"VERY_LIKELY"
              }
            }
          },
          {
            "hotwordRule":{
              "hotwordRegex":{
                "pattern":"doctor"
              },
              "proximity":{
                "windowBefore":10
              },
              "likelihoodAdjustment":{
                "fixedLikelihood":"UNLIKELY"
              }
            }
          },
          {
            "exclusionRule":{
              "dictionary":{
                "wordList":{
                  "words":[
                    "Quasimodo"
                  ]
                }
              },
              "matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
            }
          },
          {
            "exclusionRule":{
              "regex":{
                "pattern":"REDACTED"
              },
              "matchingType": "MATCHING_TYPE_PARTIAL_MATCH"
            }
          }
        ]
      }
    ]
  }
...

Detectores de infoType superpuestos

Es posible definir un detector de infoType personalizado que tenga el mismo nombre que un detector de infoType integrado. Como se muestra en el ejemplo de la sección Situaciones de ejemplo de reglas de palabras activas, cuando creas un detector de infoType personalizado con el mismo nombre que un infoType integrado, los resultados que detecte el nuevo detector de infoType se añadirán a los que detecte el detector integrado. Esto solo se cumple si el infoType integrado se especifica en la lista de infoTypes del objeto InspectConfig.

Cuando cree detectores personalizados de InfoType, pruébelos a fondo con contenido de ejemplo para asegurarse de que funcionan como espera.