String mit sensiblen Daten durch ein benutzerdefiniertes Hotword prüfen

Wahrscheinlichkeit einer PERSON_NAME-Übereinstimmung erhöhen, wenn sich das Hotword "Patient" in der Nähe befindet Illustrationen, die die InspectConfig-Eigenschaft verwenden, um eine medizinische Datenbank nach Patientennamen zu scannen Sie können zwar den in Cloud DLP integrierten infoType-Detektor PERSON_NAME verwenden, dies führt aber dazu, dass Cloud DLP Übereinstimmungen bei allen Personennamen zurückgibt, nicht nur bei Namen von Patienten. Zur Behebung dieses Problems können Sie eine Hotword-Regel einfügen, die in einem bestimmten Zeichenabstand vom ersten Zeichen möglicher Übereinstimmungen nach dem Wort "Patient" sucht. Ergebnissen, die diesem Muster entsprechen, können Sie dann eine Wahrscheinlichkeit von "sehr wahrscheinlich" zuweisen, da sie Ihren speziellen Kriterien entsprechen. Wenn Sie die Mindestwahrscheinlichkeit innerhalb von InspectConfig auf VERY_LIKELY festlegen, ist sichergestellt, dass nur Übereinstimmungen mit genau dieser Konfiguration als Ergebnisse geliefert werden.

Weitere Informationen

Eine ausführliche Dokumentation, die dieses Codebeispiel enthält, finden Sie hier:

Codebeispiel

C#

Informationen zum Installieren und Verwenden der Clientbibliothek für den Schutz sensibler Daten finden Sie unter Clientbibliotheken für den Schutz sensibler Daten.

Richten Sie Standardanmeldedaten für Anwendungen ein, um sich beim Schutz sensibler Daten zu authentifizieren. Weitere Informationen finden Sie unter Authentifizierung für eine lokale Entwicklungsumgebung einrichten.


using Google.Api.Gax.ResourceNames;
using Google.Cloud.Dlp.V2;
using System;
using static Google.Cloud.Dlp.V2.CustomInfoType.Types;

public class InspectStringCustomHotword
{
    public static InspectContentResponse Inspect(string projectId, string textToInspect, string customHotword)
    {
        var dlp = DlpServiceClient.Create();

        var byteContentItem = new ByteContentItem
        {
            Type = ByteContentItem.Types.BytesType.TextUtf8,
            Data = Google.Protobuf.ByteString.CopyFromUtf8(textToInspect)
        };

        var contentItem = new ContentItem
        {
            ByteItem = byteContentItem
        };

        var hotwordRule = new DetectionRule.Types.HotwordRule
        {
            HotwordRegex = new Regex { Pattern = customHotword },
            Proximity = new DetectionRule.Types.Proximity { WindowBefore = 50 },
            LikelihoodAdjustment = new DetectionRule.Types.LikelihoodAdjustment { FixedLikelihood = Likelihood.VeryLikely }
        };

        var infoType = new InfoType { Name = "PERSON_NAME" };

        var inspectionRuleSet = new InspectionRuleSet
        {
            InfoTypes = { infoType },
            Rules = { new InspectionRule { HotwordRule = hotwordRule } }
        };

        var inspectConfig = new InspectConfig
        {
            InfoTypes = { infoType },
            IncludeQuote = true,
            RuleSet = { inspectionRuleSet },
            MinLikelihood = Likelihood.VeryLikely
        };

        var request = new InspectContentRequest
        {
            Parent = new LocationName(projectId, "global").ToString(),
            Item = contentItem,
            InspectConfig = inspectConfig
        };

        var response = dlp.InspectContent(request);

        Console.WriteLine($"Findings: {response.Result.Findings.Count}");
        foreach (var f in response.Result.Findings)
        {
            Console.WriteLine("\tQuote: " + f.Quote);
            Console.WriteLine("\tInfo type: " + f.InfoType.Name);
            Console.WriteLine("\tLikelihood: " + f.Likelihood);
        }

        return response;
    }
}

Go

Informationen zum Installieren und Verwenden der Clientbibliothek für den Schutz sensibler Daten finden Sie unter Clientbibliotheken für den Schutz sensibler Daten.

Richten Sie Standardanmeldedaten für Anwendungen ein, um sich beim Schutz sensibler Daten zu authentifizieren. Weitere Informationen finden Sie unter Authentifizierung für eine lokale Entwicklungsumgebung einrichten.

import (
	"context"
	"fmt"
	"io"

	dlp "cloud.google.com/go/dlp/apiv2"
	"cloud.google.com/go/dlp/apiv2/dlppb"
)

// inspectStringCustomHotWord inspects a string from sensitive data by using a custom hot word
func inspectStringCustomHotWord(w io.Writer, projectID, textToInspect, customHotWord, infoTypeName string) error {
	// projectID := "my-project-id"
	// textToInspect := "patient name: John Doe"
	// customHotWord := "patient"
	// infoTypeName := "PERSON_NAME"

	ctx := context.Background()

	// Initialize a client once and reuse it to send multiple requests. Clients
	// are safe to use across goroutines. When the client is no longer needed,
	// call the Close method to cleanup its resources.
	client, err := dlp.NewClient(ctx)
	if err != nil {
		return err
	}

	// Closing the client safely cleans up background resources.
	defer client.Close()

	// Specify the type and content to be inspected.
	contentItem := &dlppb.ContentItem{
		DataItem: &dlppb.ContentItem_ByteItem{
			ByteItem: &dlppb.ByteContentItem{
				Type: dlppb.ByteContentItem_TEXT_UTF8,
				Data: []byte(textToInspect),
			},
		},
	}

	// Increase likelihood of matches that have customHotword nearby
	hotwordRule := &dlppb.InspectionRule_HotwordRule{

		HotwordRule: &dlppb.CustomInfoType_DetectionRule_HotwordRule{
			HotwordRegex: &dlppb.CustomInfoType_Regex{
				Pattern: customHotWord,
			},
			Proximity: &dlppb.CustomInfoType_DetectionRule_Proximity{
				WindowBefore: 50,
			},
			LikelihoodAdjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment{
				Adjustment: &dlppb.CustomInfoType_DetectionRule_LikelihoodAdjustment_FixedLikelihood{
					FixedLikelihood: dlppb.Likelihood_VERY_LIKELY,
				},
			},
		},
	}

	// Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
	ruleSet := &dlppb.InspectionRuleSet{
		InfoTypes: []*dlppb.InfoType{
			{Name: infoTypeName}, //"PERSON_NAME"
		},
		Rules: []*dlppb.InspectionRule{
			{
				Type: &dlppb.InspectionRule_HotwordRule{
					HotwordRule: hotwordRule.HotwordRule,
				},
			},
		},
	}

	// Construct the Inspect request to be sent by the client.
	req := &dlppb.InspectContentRequest{
		Parent: fmt.Sprintf("projects/%s/locations/global", projectID),
		Item:   contentItem,
		InspectConfig: &dlppb.InspectConfig{
			InfoTypes: []*dlppb.InfoType{
				{Name: infoTypeName}, //"PERSON_NAME"
			},
			IncludeQuote:  true,
			MinLikelihood: dlppb.Likelihood_VERY_LIKELY,
			RuleSet: []*dlppb.InspectionRuleSet{
				ruleSet,
			},
		},
	}

	// Send the request.
	resp, err := client.InspectContent(ctx, req)
	if err != nil {
		return err
	}

	// Parse the response and process results
	fmt.Fprintf(w, "Findings: %v\n", len(resp.Result.Findings))
	for _, v := range resp.GetResult().Findings {
		fmt.Fprintf(w, "Quote: %v\n", v.GetQuote())
		fmt.Fprintf(w, "Infotype Name: %v\n", v.GetInfoType().GetName())
		fmt.Fprintf(w, "Likelihood: %v\n", v.GetLikelihood())
	}
	return nil

}

Java

Informationen zum Installieren und Verwenden der Clientbibliothek für den Schutz sensibler Daten finden Sie unter Clientbibliotheken für den Schutz sensibler Daten.

Richten Sie Standardanmeldedaten für Anwendungen ein, um sich beim Schutz sensibler Daten zu authentifizieren. Weitere Informationen finden Sie unter Authentifizierung für eine lokale Entwicklungsumgebung einrichten.


import com.google.cloud.dlp.v2.DlpServiceClient;
import com.google.privacy.dlp.v2.ByteContentItem;
import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
import com.google.privacy.dlp.v2.ContentItem;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.HotwordRule;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.LikelihoodAdjustment;
import com.google.privacy.dlp.v2.CustomInfoType.DetectionRule.Proximity;
import com.google.privacy.dlp.v2.CustomInfoType.Regex;
import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InfoType;
import com.google.privacy.dlp.v2.InspectConfig;
import com.google.privacy.dlp.v2.InspectContentRequest;
import com.google.privacy.dlp.v2.InspectContentResponse;
import com.google.privacy.dlp.v2.InspectionRule;
import com.google.privacy.dlp.v2.InspectionRuleSet;
import com.google.privacy.dlp.v2.Likelihood;
import com.google.privacy.dlp.v2.LocationName;
import com.google.protobuf.ByteString;
import java.io.IOException;

public class InspectStringCustomHotword {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String textToInspect = "patient name: John Doe";
    String customHotword = "patient";
    inspectStringCustomHotword(projectId, textToInspect, customHotword);
  }

  // Inspects the provided text.
  public static void inspectStringCustomHotword(
      String projectId, String textToInspect, String customHotword) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DlpServiceClient dlp = DlpServiceClient.create()) {
      // Specify the type and content to be inspected.
      ByteContentItem byteItem =
          ByteContentItem.newBuilder()
              .setType(BytesType.TEXT_UTF8)
              .setData(ByteString.copyFromUtf8(textToInspect))
              .build();
      ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

      // Increase likelihood of matches that have customHotword nearby
      HotwordRule hotwordRule =
          HotwordRule.newBuilder()
              .setHotwordRegex(Regex.newBuilder().setPattern(customHotword))
              .setProximity(Proximity.newBuilder().setWindowBefore(50))
              .setLikelihoodAdjustment(
                  LikelihoodAdjustment.newBuilder().setFixedLikelihood(Likelihood.VERY_LIKELY))
              .build();

      // Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
      InspectionRuleSet ruleSet =
          InspectionRuleSet.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME").build())
              .addRules(InspectionRule.newBuilder().setHotwordRule(hotwordRule))
              .build();

      // Construct the configuration for the Inspect request.
      InspectConfig config =
          InspectConfig.newBuilder()
              .addInfoTypes(InfoType.newBuilder().setName("PERSON_NAME").build())
              .setIncludeQuote(true)
              .addRuleSet(ruleSet)
              .setMinLikelihood(Likelihood.VERY_LIKELY)
              .build();

      // Construct the Inspect request to be sent by the client.
      InspectContentRequest request =
          InspectContentRequest.newBuilder()
              .setParent(LocationName.of(projectId, "global").toString())
              .setItem(item)
              .setInspectConfig(config)
              .build();

      // Use the client to send the API request.
      InspectContentResponse response = dlp.inspectContent(request);

      // Parse the response and process results
      System.out.println("Findings: " + response.getResult().getFindingsCount());
      for (Finding f : response.getResult().getFindingsList()) {
        System.out.println("\tQuote: " + f.getQuote());
        System.out.println("\tInfo type: " + f.getInfoType().getName());
        System.out.println("\tLikelihood: " + f.getLikelihood());
      }
    }
  }
}

Node.js

Informationen zum Installieren und Verwenden der Clientbibliothek für den Schutz sensibler Daten finden Sie unter Clientbibliotheken für den Schutz sensibler Daten.

Richten Sie Standardanmeldedaten für Anwendungen ein, um sich beim Schutz sensibler Daten zu authentifizieren. Weitere Informationen finden Sie unter Authentifizierung für eine lokale Entwicklungsumgebung einrichten.

// Imports the Google Cloud Data Loss Prevention library
const DLP = require('@google-cloud/dlp');

// Instantiates a client
const dlp = new DLP.DlpServiceClient();

// The project ID to run the API call under
// const projectId = 'my-project';

// The string to inspect
// const string = 'patient name: John Doe';

// Custom hotward
// const customHotword = 'patient';

async function inspectStringCustomHotword() {
  // Specify the type and content to be inspected.
  const item = {
    byteItem: {
      type: DLP.protos.google.privacy.dlp.v2.ByteContentItem.BytesType
        .TEXT_UTF8,
      data: Buffer.from(string, 'utf-8'),
    },
  };

  // Increase likelihood of matches that have customHotword nearby.
  const hotwordRule = {
    hotwordRegex: {
      pattern: customHotword,
    },
    proximity: {
      windowBefore: 50,
    },
    likelihoodAdjustment: {
      fixedLikelihood:
        DLP.protos.google.privacy.dlp.v2.Likelihood.VERY_LIKELY,
    },
  };

  // Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
  const ruleSet = [
    {
      infoTypes: [{name: 'PERSON_NAME'}],
      rules: [
        {
          hotwordRule: hotwordRule,
        },
      ],
    },
  ];

  // Construct the configuration for the Inspect request.
  const inspectConfig = {
    infoTypes: [{name: 'PERSON_NAME'}],
    ruleSet: ruleSet,
    includeQuote: true,
  };

  // Construct the Inspect request to be sent by the client.
  const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: inspectConfig,
    item: item,
  };

  // Use the client to send the API request.
  const [response] = await dlp.inspectContent(request);

  // Print findings.
  const findings = response.result.findings;
  if (findings.length > 0) {
    console.log(`Findings: ${findings.length}\n`);
    findings.forEach(finding => {
      console.log(`InfoType: ${finding.infoType.name}`);
      console.log(`\tQuote: ${finding.quote}`);
      console.log(`\tLikelihood: ${finding.likelihood} \n`);
    });
  } else {
    console.log('No findings.');
  }
}
inspectStringCustomHotword();

PHP

Informationen zum Installieren und Verwenden der Clientbibliothek für den Schutz sensibler Daten finden Sie unter Clientbibliotheken für den Schutz sensibler Daten.

Richten Sie Standardanmeldedaten für Anwendungen ein, um sich beim Schutz sensibler Daten zu authentifizieren. Weitere Informationen finden Sie unter Authentifizierung für eine lokale Entwicklungsumgebung einrichten.

use Google\Cloud\Dlp\V2\Client\DlpServiceClient;
use Google\Cloud\Dlp\V2\ContentItem;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\HotwordRule;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\LikelihoodAdjustment;
use Google\Cloud\Dlp\V2\CustomInfoType\DetectionRule\Proximity;
use Google\Cloud\Dlp\V2\CustomInfoType\Regex;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectContentRequest;
use Google\Cloud\Dlp\V2\InspectionRule;
use Google\Cloud\Dlp\V2\InspectionRuleSet;
use Google\Cloud\Dlp\V2\Likelihood;

/**
 * Inspect a string from sensitive data by using a custom hotword
 * Increase the likelihood of a PERSON_NAME match if there is the hotword "patient" nearby. Illustrates using the InspectConfig property for the purpose of scanning a medical database for patient names. You can use Cloud DLP's built-in PERSON_NAME infoType detector, but that causes Cloud DLP to match on all names of people, not just names of patients. To fix this, you can include a hotword rule that looks for the word "patient" within a certain character proximity from the first character of potential matches. You can then assign findings that match this pattern a likelihood of "very likely," since they correspond to your special criteria. Setting the minimum likelihood to VERY_LIKELY within InspectConfig ensures that only matches to this configuration are returned in findings.
 *
 * @param string $projectId         The Google Cloud project id to use as a parent resource.
 * @param string $textToInspect     The string to inspect.
 */
function inspect_string_custom_hotword(
    // TODO(developer): Replace sample parameters before running the code.
    string $projectId,
    string $textToInspect = 'patient name: John Doe'
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    $parent = "projects/$projectId/locations/global";

    // Specify what content you want the service to Inspect.
    $item = (new ContentItem())
        ->setValue($textToInspect);

    // Construct hotword rules
    $hotwordRule = (new HotwordRule())
        ->setHotwordRegex(
            (new Regex())
                ->setPattern('patient')
        )
        ->setProximity(
            (new Proximity())
                ->setWindowBefore(50)
        )
        ->setLikelihoodAdjustment(
            (new LikelihoodAdjustment())
                ->setFixedLikelihood(Likelihood::VERY_LIKELY)
        );

    // Construct a ruleset that applies the hotword rule to the PERSON_NAME infotype.
    $personName = (new InfoType())
        ->setName('PERSON_NAME');
    $inspectionRuleSet = (new InspectionRuleSet())
        ->setInfoTypes([$personName])
        ->setRules([
            (new InspectionRule())
                ->setHotwordRule($hotwordRule),
        ]);

    // Construct the configuration for the Inspect request, including the ruleset.
    $inspectConfig = (new InspectConfig())
        ->setInfoTypes([$personName])
        ->setIncludeQuote(true)
        ->setRuleSet([$inspectionRuleSet])
        ->setMinLikelihood(Likelihood::VERY_LIKELY);

    // Run request
    $inspectContentRequest = (new InspectContentRequest())
        ->setParent($parent)
        ->setInspectConfig($inspectConfig)
        ->setItem($item);
    $response = $dlp->inspectContent($inspectContentRequest);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        printf('No findings.' . PHP_EOL);
    } else {
        printf('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            printf('  Quote: %s' . PHP_EOL, $finding->getQuote());
            printf('  Info type: %s' . PHP_EOL, $finding->getInfoType()->getName());
            printf('  Likelihood: %s' . PHP_EOL, Likelihood::name($finding->getLikelihood()));
        }
    }
}

Python

Informationen zum Installieren und Verwenden der Clientbibliothek für den Schutz sensibler Daten finden Sie unter Clientbibliotheken für den Schutz sensibler Daten.

Richten Sie Standardanmeldedaten für Anwendungen ein, um sich beim Schutz sensibler Daten zu authentifizieren. Weitere Informationen finden Sie unter Authentifizierung für eine lokale Entwicklungsumgebung einrichten.

import google.cloud.dlp

def inspect_string_w_custom_hotword(
    project: str, content_string: str, custom_hotword: str = "patient"
) -> None:
    """Uses the Data Loss Prevention API increase likelihood for matches on
       PERSON_NAME if the user specified custom hot-word is present. Only
       includes findings with the increased likelihood by setting a minimum
       likelihood threshold of VERY_LIKELY.

    Args:
        project: The Google Cloud project id to use as a parent resource.
        content_string: The string to inspect.
        custom_hotword: The custom hot-word used for likelihood boosting.
    """

    # Instantiate a client.
    dlp = google.cloud.dlp_v2.DlpServiceClient()

    # Construct a rule set with caller provided hotword, with a likelihood
    # boost to VERY_LIKELY when the hotword are present within the 50 character-
    # window preceding the PII finding.
    hotword_rule = {
        "hotword_regex": {"pattern": custom_hotword},
        "likelihood_adjustment": {
            "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY
        },
        "proximity": {"window_before": 50},
    }

    rule_set = [
        {
            "info_types": [{"name": "PERSON_NAME"}],
            "rules": [{"hotword_rule": hotword_rule}],
        }
    ]

    # Construct the configuration dictionary with the custom regex info type.
    inspect_config = {
        "rule_set": rule_set,
        "min_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY,
        "include_quote": True,
    }

    # Construct the `item`.
    item = {"value": content_string}

    # Convert the project id into a full resource id.
    parent = f"projects/{project}"

    # Call the API.
    response = dlp.inspect_content(
        request={"parent": parent, "inspect_config": inspect_config, "item": item}
    )

    # Print out the results.
    if response.result.findings:
        for finding in response.result.findings:
            print(f"Quote: {finding.quote}")
            print(f"Info type: {finding.info_type.name}")
            print(f"Likelihood: {finding.likelihood}")
    else:
        print("No findings.")

Nächste Schritte

Informationen zum Suchen und Filtern von Codebeispielen für andere Google Cloud-Produkte finden Sie im Google Cloud-Beispielbrowser.