Mejora un detector de Infotipo incorporado

En este ejemplo, se muestra cómo agregar términos para que coincidan con un detector de Infotipo existente.

Explora más

Para obtener documentación en la que se incluye esta muestra de código, consulta lo siguiente:

Muestra de código

C#

Para obtener información sobre cómo instalar y usar la biblioteca cliente de la protección de datos sensibles, consulta Bibliotecas cliente de la protección de datos sensibles.

Para autenticarte en la protección de datos sensibles, configura las credenciales predeterminadas de la aplicación. Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.


using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using Google.Protobuf;
using System;
using System.Linq;
using static Google.Cloud.Dlp.V2.CustomInfoType.Types;

public class InspectDataUsingAugmentInfoTypes
{
    public static InspectContentResponse InspectData(
        string projectId,
        string text,
        InfoType infoType = null)
    {
        // Instantiate the dlp client.
        var dlp = DlpServiceClient.Create();

        // Specify the type of info to be inspected and construct the infotype.
        var infotype = infoType ?? new InfoType { Name = "PERSON_NAME" };

        // Construct the custom infoTypes with dictionary.
        var customInfoTypes = new CustomInfoType
        {
            InfoType = infotype,
            Dictionary = new Dictionary
            {
                WordList = new Dictionary.Types.WordList
                {
                    Words = { new string[] { "quasimodo" } }
                }
            }
        };

        // Construct the inspect config using custom infoTypes.
        var inspectConfig = new InspectConfig
        {
            CustomInfoTypes = { customInfoTypes },
            IncludeQuote = true,
            InfoTypes = { infotype }
        };

        // Construct the request.
        var request = new InspectContentRequest
        {
            ParentAsLocationName = new LocationName(projectId, "global"),
            InspectConfig = inspectConfig,
            Item = new ContentItem
            {
                ByteItem = new ByteContentItem
                {
                    Data = ByteString.CopyFromUtf8(text),
                    Type = ByteContentItem.Types.BytesType.TextUtf8
                }
            }
        };

        // Call the API.
        InspectContentResponse response = dlp.InspectContent(request);

        // Parse the response.
        var findings = response.Result.Findings;
        Console.WriteLine($"Finding: {findings.Count}");

        foreach (var f in findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Para obtener información sobre cómo instalar y usar la biblioteca cliente de la protección de datos sensibles, consulta Bibliotecas cliente de la protección de datos sensibles.

Para autenticarte en la protección de datos sensibles, configura las credenciales predeterminadas de la aplicación. Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectAugmentInfoTypes performs info type augmentation using Google Cloud DLP.
// It enhances data inspection by supplementing existing info types with custom-defined ones,
// expanding the ability to identify sensitive information in different contexts.
func inspectAugmentInfoTypes(w io.Writer, projectID, textToInspect string, wordList []string) error {
	// projectID := "your-project-id"
	// textToInspect := "The patient's name is quasimodo"
	// wordList := []string{"quasimodo"}

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the content to be inspected.
	item := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_Value{
			Value: textToInspect,
		},
	}

	// Construct the custom word list to be detected.
	dictionary := &dlppb.CustomInfoType_Dictionary{
		Source: &dlppb.CustomInfoType_Dictionary_WordList_{
			WordList: &dlppb.CustomInfoType_Dictionary_WordList{
				Words: wordList,
			},
		},
	}

	// Specify the type of info the inspection will look for.
	// See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types.
	infoType := &dlppb.InfoType{
		Name: "PERSON_NAME",
	}

	// Construct a custom infoType detector by augmenting the PERSON_NAME detector with a word list.
	customInfoType := &dlppb.CustomInfoType{
		InfoType: infoType,
		Type: &dlppb.CustomInfoType_Dictionary_{
			Dictionary: dictionary,
		},
	}

	// Specify the inspect config for data inspection settings in DLP API, enabling rule
	// specification, custom info types, and actions on sensitive data. Crucial for tailored
	// data protection and privacy regulation compliance.
	inspectConfig := &dlppb.InspectConfig{
		CustomInfoTypes: []*dlppb.CustomInfoType{
			customInfoType,
		},
		IncludeQuote: true,
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent:        fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:          item,
		InspectConfig: inspectConfig,
	}

	// Create the request for the job configured above.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Process the results.
	result := resp.Result
	fmt.Fprintf(w, "Findings: %d\n", len(result.Findings))
	for _, f := range result.Findings {
		fmt.Fprintf(w, "\tQuote: %s\n", f.Quote)
		fmt.Fprintf(w, "\tInfo type: %s\n", f.InfoType.Name)
		fmt.Fprintf(w, "\tLikelihood: %s\n", f.Likelihood)
	}
	return nil
}

Java

Para obtener información sobre cómo instalar y usar la biblioteca cliente de la protección de datos sensibles, consulta Bibliotecas cliente de la protección de datos sensibles.

Para autenticarte en la protección de datos sensibles, configura las credenciales predeterminadas de la aplicación. Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.LocationName;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class InspectStringAugmentInfoType {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    // The Google Cloud project id to use as a parent resource.
    String projectId = "your-project-id";
    // The string to de-identify.
    String textToInspect = "The patient's name is quasimodo";
    // The string to be additionally matched.
    List<String> wordList = Arrays.asList("quasimodo");
    inspectStringAugmentInfoType(projectId, textToInspect, wordList);
  }

  // Inspects the text using new custom words added to the dictionary.
  public static void inspectStringAugmentInfoType(
      String projectId, String textToInspect, List<String> wordList) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(ByteContentItem.BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Construct the custom word list to be detected.
      CustomInfoType.Dictionary dictionary =
          CustomInfoType.Dictionary.newBuilder()
              .setWordList(
                  CustomInfoType.Dictionary.WordList.newBuilder().addAllWords(wordList).build())
              .build();

      InfoType infoType = InfoType.newBuilder().setName("PERSON_NAME").build();
      // Construct a custom infotype detector by augmenting the PERSON_NAME detector with a word
      // list.
      CustomInfoType customInfoType =
          CustomInfoType.newBuilder().setInfoType(infoType).setDictionary(dictionary).build();

      InspectConfig inspectConfig =
          InspectConfig.newBuilder()
              .addCustomInfoTypes(customInfoType)
              .setIncludeQuote(true)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(inspectConfig)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Para obtener información sobre cómo instalar y usar la biblioteca cliente de la protección de datos sensibles, consulta Bibliotecas cliente de la protección de datos sensibles.

Para autenticarte en la protección de datos sensibles, configura las credenciales predeterminadas de la aplicación. Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.

// Imports the Google Cloud client library
const DLP = require('@google-cloud/dlp');
// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = "The patient's name is quasimodo";

// Word list
// const words = ['quasimodo'];

async function inspectStringAugmentInfoType() {
  // Specify the type and content to be inspected.
  const byteItem = {
    type: 'BYTES',
    data: Buffer.from(string),
  };
  const item = {byteItem: byteItem};

  // Construct the custom word list to be detected.
  const dictionary = {
    wordList: {
      words: words,
    },
  };

  // Construct a custom infotype detector by augmenting the PERSON_NAME detector with a word list.
  const customInfoType = {
    infoType: {name: 'PERSON_NAME'},
    dictionary: dictionary,
  };

  const inspectConfig = {
    customInfoTypes: [customInfoType],
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const inspectRequest = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(inspectRequest);

  // Print Findings.
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringAugmentInfoType();

PHP

Para obtener información sobre cómo instalar y usar la biblioteca cliente de la protección de datos sensibles, consulta Bibliotecas cliente de la protección de datos sensibles.

Para autenticarte en la protección de datos sensibles, configura las credenciales predeterminadas de la aplicación. Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary;
use Google\Cloud\Dlp\V2\CustomInfoType\Dictionary\WordList;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\Likelihood;

/**
 * Augment a built-in infotype detector.
 * Consider a scenario in which a built-in infoType detector isn’t returning the correct values.
 * For example, you want to return matches on person names, but Cloud DLP's built-in
 * PERSON_NAME detector is failing to return matches on some person names that are common in your dataset.
 * Cloud DLP allows you to augment built-in infoType detectors by including a built-in detector in the
 * declaration for a custom infoType detector, as shown in the following example. This snippet
 * illustrates how to configure Cloud DLP so that the PERSON_NAME built-in infoType detector will
 * additionally match the name “Quasimodo:”.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 * @param array  $matchWordList     Specify the set of words to match.
 */
function inspect_augment_infotypes(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'Smith and Quasimodo are good cricketer',
    array  $matchWordList = ['quasimodo']
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // The infoTypes of information to match.
    $personNameInfoType = (new InfoType())
        ->setName('PERSON_NAME');

    // Construct the word list to be detected.
    $wordList = (new Dictionary())
        ->setWordList((new WordList())
            ->setWords($matchWordList));

    // Construct the custom infotype detector.
    $customInfoType = (new CustomInfoType())
        ->setInfoType($personNameInfoType)
        ->setLikelihood(Likelihood::POSSIBLE)
        ->setDictionary($wordList);

    // Construct the configuration for the Inspect request.
    $inspectConfig = (new InspectConfig())
        ->setCustomInfoTypes([$customInfoType])
        ->setIncludeQuote(true);

    // Run request.
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results.
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Para obtener información sobre cómo instalar y usar la biblioteca cliente de la protección de datos sensibles, consulta Bibliotecas cliente de la protección de datos sensibles.

Para autenticarte en la protección de datos sensibles, configura las credenciales predeterminadas de la aplicación. Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.

from typing import List

import google.cloud.dlp

def inspect_string_augment_infotype(
    project: str,
    input_str: str,
    info_type: str,
    word_list: List[str],
) -> None:
    """Uses the Data Loss Prevention API to augment built-in infoType
    detector and inspect the content string with augmented infoType.
    Args:
        project: The Google Cloud project id to use as a parent resource.
        input_str: The string to inspect using augmented infoType
            (will be treated as text).
        info_type: A string representing built-in infoType to augment.
            A full list of infoType categories can be fetched from the API.
        word_list: List of words or phrases to be added to extend the behaviour
            of built-in infoType.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct the custom infoTypes dictionary with declaration of a built-in detector.
    custom_info_types = [
        {
            "info_type": {"name": info_type},
            "dictionary": {"word_list": {"words": word_list}},
        }
    ]

    # Construct inspect configuration dictionary with the custom info type.
    inspect_config = {
        "custom_info_types": custom_info_types,
        "include_quote": True,
    }

    # Construct the `item` to be inspected.
    item = {"value": input_str}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={
            "parent": parent,
            "inspect_config": inspect_config,
            "item": item,
        }
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood} \n")
    else:
        print("No findings.")

¿Qué sigue?

Para buscar y filtrar muestras de código de otros productos de Google Cloud, consulta el navegador de muestra de Google Cloud.