Inspecciona una string en busca de datos sensibles mediante varias reglas

Ilustra la aplicación de reglas de exclusión y de palabras clave. En el conjunto de reglas de este fragmento de código, se incluyen reglas de palabra clave y reglas de exclusión de regex y diccionario. Nótese que las cuatro reglas se especifican en un arreglo dentro del elemento rules.

Páginas de documentación que incluyen esta muestra de código

Para ver la muestra de código usada en contexto, consulta la siguiente documentación:

Muestra de código

C#

Si deseas obtener información para instalar y usar la biblioteca cliente de Cloud DLP, consulta las Bibliotecas cliente de Cloud DLP.


using System;
using System.Text.RegularExpressions;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using static Google.Cloud.Dlp.V2.CustomInfoType.Types;

public class InspectStringMultipleRules
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect)
    {
        var dlp = DlpServiceClient.Create();

        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem
        {
            ByteItem = byteContentItem
        };

        var patientRule = new DetectionRule.Types.HotwordRule
        {
            HotwordRegex = new CustomInfoType.Types.Regex { Pattern = "patient" },
            Proximity = new DetectionRule.Types.Proximity { WindowBefore = 10 },
            LikelihoodAdjustment = new DetectionRule.Types.LikelihoodAdjustment { FixedLikelihood = Likelihood.VeryLikely }
        };

        var doctorRule = new DetectionRule.Types.HotwordRule
        {
            HotwordRegex = new CustomInfoType.Types.Regex { Pattern = "doctor" },
            Proximity = new DetectionRule.Types.Proximity { WindowBefore = 10 },
            LikelihoodAdjustment = new DetectionRule.Types.LikelihoodAdjustment { FixedLikelihood = Likelihood.Unlikely }
        };

        // Construct exclusion rules
        var quasimodoRule = new ExclusionRule
        {
            Dictionary = new Dictionary { WordList = new Dictionary.Types.WordList { Words = { "Quasimodo" } } },
            MatchingType = MatchingType.PartialMatch
        };

        var redactedRule = new ExclusionRule
        {
            Regex = new CustomInfoType.Types.Regex { Pattern = "REDACTED" },
            MatchingType = MatchingType.PartialMatch
        };

        var infoType = new InfoType { Name = "PERSON_NAME" };

        var inspectionRuleSet = new InspectionRuleSet
        {
            InfoTypes = { infoType },
            Rules =
            {
                new InspectionRule { HotwordRule = patientRule },
                new InspectionRule { HotwordRule = doctorRule},
                new InspectionRule { ExclusionRule = quasimodoRule },
                new InspectionRule { ExclusionRule = redactedRule }
            }
        };

        var inspectConfig = new InspectConfig
        {
            InfoTypes = { infoType },
            IncludeQuote = true,
            RuleSet = { inspectionRuleSet }
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = inspectConfig
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Java

Si deseas obtener información para instalar y usar la biblioteca cliente de Cloud DLP, consulta las Bibliotecas cliente de Cloud DLP.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.HotwordRule;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.LikelihoodAdjustment;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.Proximity;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary;
import com.google.privacy.dlp.v2.CustomInfoType.Dictionary.WordList;
import com.google.privacy.dlp.v2.CustomInfoType.Regex;
import com.google.privacy.dlp.v2.ExclusionRule;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.Likelihood;
import com.google.privacy.dlp.v2.LocationName;
import com.google.privacy.dlp.v2.MatchingType;
import com.google.protobuf.ByteString;
import java.io.IOException;

public class InspectStringMultipleRules {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "patient: Jane Doe";
    inspectStringMultipleRules(projectId, textToInspect);
  }

  // Inspects the provided text, avoiding matches specified in the exclusion list.
  public static void inspectStringMultipleRules(String projectId, String textToInspect)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Construct hotword rules
      HotwordRule patientRule =
          HotwordRule.newBuilder()
              .setHotwordRegex(Regex.newBuilder().setPattern("patient"))
              .setProximity(Proximity.newBuilder().setWindowBefore(10))
              .setLikelihoodAdjustment(
                  LikelihoodAdjustment.newBuilder().setFixedLikelihood(Likelihood.VERY_LIKELY))
              .build();

      HotwordRule doctorRule =
          HotwordRule.newBuilder()
              .setHotwordRegex(Regex.newBuilder().setPattern("doctor"))
              .setProximity(Proximity.newBuilder().setWindowBefore(10))
              .setLikelihoodAdjustment(
                  LikelihoodAdjustment.newBuilder().setFixedLikelihood(Likelihood.UNLIKELY))
              .build();

      // Construct exclusion rules
      ExclusionRule quasimodoRule =
          ExclusionRule.newBuilder()
              .setDictionary(
                  Dictionary.newBuilder().setWordList(WordList.newBuilder().addWords("Quasimodo")))
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .build();

      ExclusionRule redactedRule =
          ExclusionRule.newBuilder()
              .setRegex(Regex.newBuilder().setPattern("REDACTED"))
              .setMatchingType(MatchingType.MATCHING_TYPE_PARTIAL_MATCH)
              .build();

      // Construct a ruleset that applies the rules to the PERSON_NAME infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME"))
              .addRules(InspectionRule.newBuilder().setHotwordRule(patientRule))
              .addRules(InspectionRule.newBuilder().setHotwordRule(doctorRule))
              .addRules(InspectionRule.newBuilder().setExclusionRule(quasimodoRule))
              .addRules(InspectionRule.newBuilder().setExclusionRule(redactedRule))
              .build();

      // Construct the configuration for the Inspect request, including the ruleset.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME"))
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Python

Si deseas obtener información para instalar y usar la biblioteca cliente de Cloud DLP, consulta las Bibliotecas cliente de Cloud DLP.

def inspect_string_multiple_rules(
    project, content_string
):
    """Uses the Data Loss Prevention API to modify likelihood for matches on
       PERSON_NAME combining multiple hotword and exclusion rules.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.

    Returns:
        None; the response from the API is printed to the terminal.
    """

    # Import the client library.
    import google.cloud.dlp

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct hotword rules
    patient_rule = {
        "hotword_regex": {"pattern": "patient"},
        "proximity": {"window_before": 10},
        "likelihood_adjustment": {
            "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY
        },
    }
    doctor_rule = {
        "hotword_regex": {"pattern": "doctor"},
        "proximity": {"window_before": 10},
        "likelihood_adjustment": {
            "fixed_likelihood": google.cloud.dlp_v2.Likelihood.UNLIKELY
        },
    }

    # Construct exclusion rules
    quasimodo_rule = {
        "dictionary": {
            "word_list": {
                "words": ["quasimodo"]
            },
        },
        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
    }
    redacted_rule = {
        "regex": {"pattern": "REDACTED"},
        "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH,
    }

    # Construct the rule set, combining the above rules
    rule_set = [
        {
            "info_types": [{"name": "PERSON_NAME"}],
            "rules": [
                {"hotword_rule": patient_rule},
                {"hotword_rule": doctor_rule},
                {"exclusion_rule": quasimodo_rule},
                {"exclusion_rule": redacted_rule},
            ],
        }
    ]

    # Construct the configuration dictionary
    inspect_config = {
        "info_types": [{"name": "PERSON_NAME"}],
        "rule_set": rule_set,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

¿Qué sigue?

A fin de buscar y filtrar muestras de código para otros productos de Google Cloud, consulta el navegador de muestra de Google Cloud.