Inspecting text for sensitive data

Cloud Data Loss Prevention (DLP) can detect and classify sensitive data in text content. Given an input text, the Cloud DLP API returns details about each infoType it finds, together with a likelihood value and offset information.

Best practices

Identify and prioritize scanning

It's important to identify your resources and specify which have the highest scanning priority. When you are starting out, you may have a large backlog of data that needs classification, and it will be impossible to scan all of it immediately. Initially, choose the data that poses the highest risk: for example, data that is frequently accessed, widely accessible, or unknown.

Reduce latency

Latency is affected by several factors: the amount of data to scan, the storage repository being scanned, and the type and number of infoTypes that are enabled.

To help reduce job latency, try the following:

  • Enable sampling.
  • Avoid enabling infoTypes you don't need. Although they are useful in certain scenarios, some infoTypes, including PERSON_NAME, FEMALE_NAME, MALE_NAME, FIRST_NAME, LAST_NAME, DATE, DATE_OF_BIRTH, TIME, LOCATION, STREET_ADDRESS, MEDICAL_TERM, ORGANIZATION_NAME, and ALL_BASIC, can make requests run much more slowly than requests that don't include them.
  • If possible, organize the data to be inspected into a table with rows and columns to reduce network round trips.
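
As a rough illustration of the last point, the sketch below packs several strings into one table item so they can travel in a single content:inspect request body instead of one request per string. The helper name `build_table_item`, the column name, and the sample strings are hypothetical; the `headers`/`rows`/`string_value` layout mirrors the table request shown later on this page. Nothing is sent to the API here.

```python
import json


def build_table_item(column_name, values):
    """Pack a list of strings into a single-column table item (hypothetical helper)."""
    return {
        "table": {
            "headers": [{"name": column_name}],
            "rows": [{"values": [{"string_value": v}]} for v in values],
        }
    }


texts = [
    "My phone number is (206) 555-0123",
    "Call the office at (415) 555-0890",
]
item = build_table_item("column1", texts)

# One request body now carries every string; the infoType choice is illustrative.
request_body = {
    "item": item,
    "inspectConfig": {"infoTypes": [{"name": "PHONE_NUMBER"}], "includeQuote": True},
}
print(json.dumps(request_body, indent=2))
```

Each input string becomes one row, so a finding can still be traced back to the string it came from.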

Limit the scope of your first scans

For best results, limit the scope of your first scans instead of scanning all of your data. Start with a few requests. Your findings will be more meaningful once you have tuned which detectors to enable and which exclusion rules are needed to reduce false positives. Avoid enabling all infoTypes if you don't need them all, because false positives or unusable findings can make it harder to assess your risk. Although they are useful in certain scenarios, some infoTypes, such as DATE, TIME, DOMAIN_NAME, and URL, match a broad range of findings and may not be worth enabling.
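
One way to keep a first scan narrow is to build the inspection configuration from a short list of detectors and drop the broad ones up front. The following is a minimal sketch, not an official helper: `build_narrow_inspect_config` and `BROAD_INFO_TYPES` are hypothetical names, and the output is only a dictionary in the shape of the `inspectConfig` JSON used in the requests on this page.

```python
# Broad detectors named in the guidance above; they match a wide range of content.
BROAD_INFO_TYPES = {"DATE", "TIME", "DOMAIN_NAME", "URL"}


def build_narrow_inspect_config(wanted, min_likelihood="POSSIBLE"):
    """Build an inspectConfig dict that skips broad detectors (hypothetical helper)."""
    kept = [name for name in wanted if name not in BROAD_INFO_TYPES]
    return {
        "infoTypes": [{"name": name} for name in kept],
        "minLikelihood": min_likelihood,
        "includeQuote": True,
    }


config = build_narrow_inspect_config(["PHONE_NUMBER", "DATE", "EMAIL_ADDRESS", "URL"])
print([t["name"] for t in config["infoTypes"]])  # ['PHONE_NUMBER', 'EMAIL_ADDRESS']
```

If a broad detector turns out to be needed, it can be added back deliberately once the first results have been reviewed.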

On-premises, hybrid, and multi-cloud scans

If the data to be scanned resides on-premises or outside of Google Cloud, use the content.inspect and content.deidentify API methods to scan the content, classify the findings, and pseudonymize the content without persisting it outside of your local storage.
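
When local content is large, it may need to be split so that each piece fits into its own content.inspect request. The sketch below shows one possible approach, assuming a caller-chosen byte budget: `chunk_text` is a hypothetical helper, and the value 40 is an arbitrary illustration rather than a documented API limit (check the service's current limits before choosing a real value). The chunks stay in memory and are never written elsewhere.

```python
def chunk_text(text, max_bytes):
    """Split text on line boundaries into chunks of at most max_bytes UTF-8 bytes.

    A single line longer than max_bytes is emitted as its own oversized chunk.
    """
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode("utf-8"))
        # Flush the current chunk before it would exceed the budget.
        if current and size + line_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        chunks.append("".join(current))
    return chunks


local_text = "alice@example.com\nbob@example.com\ncarol@example.com\n"
# max_bytes=40 is an illustrative budget only.
request_bodies = [{"item": {"value": chunk}} for chunk in chunk_text(local_text, 40)]
print(len(request_bodies))  # 2
```

Splitting on line boundaries reduces the chance of cutting a sensitive value in half, although a value that spans lines could still be split.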

Inspecting a text string

The following JSON and code samples in several languages demonstrate how to use the Cloud DLP API to inspect text strings for sensitive data.

Protocol

See the JSON quickstart for more information about using the Cloud DLP API with JSON.

JSON input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:inspect?key={YOUR_API_KEY}

    {
      "item":{
        "value":"My phone number is (415) 555-0890"
      },
      "inspectConfig":{
        "includeQuote":true,
        "minLikelihood":"POSSIBLE",
        "infoTypes":[
          {
            "name":"PHONE_NUMBER"
          }
        ]
      }
    }
    

JSON output:

    {
      "result":{
        "findings":[
          {
            "quote":"(415) 555-0890",
            "infoType":{
              "name":"PHONE_NUMBER"
            },
            "likelihood":"VERY_LIKELY",
            "location":{
              "byteRange":{
                "start":"19",
                "end":"33"
              },
              "codepointRange":{
                "start":"19",
                "end":"33"
              }
            },
            "createTime":"2018-11-13T19:29:15.412Z"
          }
        ]
      }
    }
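
The offsets in `location` can be used to recover the matched text from the original input. The sketch below works on a trimmed copy of the response above and makes two assumptions that are labeled in the comments: the `end` offset is exclusive (consistent with the 14-character quote between 19 and 33), and `byteRange` counts bytes of the UTF-8 encoded input.

```python
import json

text = "My phone number is (415) 555-0890"

# Trimmed copy of the response shown above.
response = json.loads("""
{
  "result": {
    "findings": [
      {
        "quote": "(415) 555-0890",
        "infoType": {"name": "PHONE_NUMBER"},
        "likelihood": "VERY_LIKELY",
        "location": {
          "byteRange": {"start": "19", "end": "33"},
          "codepointRange": {"start": "19", "end": "33"}
        }
      }
    ]
  }
}
""")

for finding in response["result"]["findings"]:
    location = finding["location"]
    # Offsets arrive as JSON strings, so convert them before slicing.
    cp = location["codepointRange"]
    by_codepoint = text[int(cp["start"]):int(cp["end"])]
    # Assumption: byteRange indexes the UTF-8 bytes of the input.
    br = location["byteRange"]
    by_byte = text.encode("utf-8")[int(br["start"]):int(br["end"])].decode("utf-8")
    print(finding["infoType"]["name"], by_codepoint, by_byte)
```

For ASCII-only input the two ranges are identical, as in this response; with non-ASCII text they can differ, and slicing a Python string should use the code point range.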
    

Java

    import com.google.cloud.dlp.v2.DlpServiceClient;
    import com.google.privacy.dlp.v2.ByteContentItem;
    import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
    import com.google.privacy.dlp.v2.ContentItem;
    import com.google.privacy.dlp.v2.Finding;
    import com.google.privacy.dlp.v2.InfoType;
    import com.google.privacy.dlp.v2.InspectConfig;
    import com.google.privacy.dlp.v2.InspectContentRequest;
    import com.google.privacy.dlp.v2.InspectContentResponse;
    import com.google.privacy.dlp.v2.ProjectName;
    import com.google.protobuf.ByteString;
    import java.util.ArrayList;
    import java.util.List;

    public class InspectString {

      public static void inspectString() {
        // TODO(developer): Replace these variables before running the sample.
        String projectId = "your-project-id";
        String textToInspect = "My name is Gary and my email is gary@example.com";
        inspectString(projectId, textToInspect);
      }

      // Inspects the provided text.
      public static void inspectString(String projectId, String textToInspect) {
        // Initialize client that will be used to send requests. This client only needs to be created
        // once, and can be reused for multiple requests. After completing all of your requests, call
        // the "close" method on the client to safely clean up any remaining background resources.
        try (DlpServiceClient dlp = DlpServiceClient.create()) {
          // Specify the project used for request.
          ProjectName project = ProjectName.of(projectId);

          // Specify the type and content to be inspected.
          ByteContentItem byteItem =
              ByteContentItem.newBuilder()
                  .setType(BytesType.TEXT_UTF8)
                  .setData(ByteString.copyFromUtf8(textToInspect))
                  .build();
          ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

          // Specify the type of info the inspection will look for.
          List<InfoType> infoTypes = new ArrayList<>();
          // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types
          for (String typeName : new String[] {"PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER"}) {
            infoTypes.add(InfoType.newBuilder().setName(typeName).build());
          }

          // Construct the configuration for the Inspect request.
          InspectConfig config =
              InspectConfig.newBuilder().addAllInfoTypes(infoTypes).setIncludeQuote(true).build();

          // Construct the Inspect request to be sent by the client.
          InspectContentRequest request =
              InspectContentRequest.newBuilder()
                  .setParent(project.toString())
                  .setItem(item)
                  .setInspectConfig(config)
                  .build();

          // Use the client to send the API request.
          InspectContentResponse response = dlp.inspectContent(request);

          // Parse the response and process results
          System.out.println("Findings: " + response.getResult().getFindingsCount());
          for (Finding f : response.getResult().getFindingsList()) {
            System.out.println("\tQuote: " + f.getQuote());
            System.out.println("\tInfo type: " + f.getInfoType().getName());
            System.out.println("\tLikelihood: " + f.getLikelihood());
          }
        } catch (Exception e) {
          System.out.println("Error during inspectString: \n" + e.toString());
        }
      }
    }

Node.js

    // Imports the Google Cloud Data Loss Prevention library
    const DLP = require('@google-cloud/dlp');

    // Instantiates a client
    const dlp = new DLP.DlpServiceClient();

    // The project ID to run the API call under
    // const callingProjectId = process.env.GCLOUD_PROJECT;

    // The string to inspect
    // const string = 'My name is Gary and my email is gary@example.com';

    // The minimum likelihood required before returning a match
    // const minLikelihood = 'LIKELIHOOD_UNSPECIFIED';

    // The maximum number of findings to report per request (0 = server maximum)
    // const maxFindings = 0;

    // The infoTypes of information to match
    // const infoTypes = [{ name: 'PHONE_NUMBER' }, { name: 'EMAIL_ADDRESS' }, { name: 'CREDIT_CARD_NUMBER' }];

    // The customInfoTypes of information to match
    // const customInfoTypes = [{ infoType: { name: 'DICT_TYPE' }, dictionary: { wordList: { words: ['foo', 'bar', 'baz']}}},
    //   { infoType: { name: 'REGEX_TYPE' }, regex: { pattern: '\\(\\d{3}\\) \\d{3}-\\d{4}' }}];

    // Whether to include the matching string
    // const includeQuote = true;

    // Construct item to inspect
    const item = {value: string};

    // Construct request
    const request = {
      parent: dlp.projectPath(callingProjectId),
      inspectConfig: {
        infoTypes: infoTypes,
        customInfoTypes: customInfoTypes,
        minLikelihood: minLikelihood,
        includeQuote: includeQuote,
        limits: {
          maxFindingsPerRequest: maxFindings,
        },
      },
      item: item,
    };

    // Run request
    try {
      const [response] = await dlp.inspectContent(request);
      const findings = response.result.findings;
      if (findings.length > 0) {
        console.log('Findings:');
        findings.forEach(finding => {
          if (includeQuote) {
            console.log(`\tQuote: ${finding.quote}`);
          }
          console.log(`\tInfo type: ${finding.infoType.name}`);
          console.log(`\tLikelihood: ${finding.likelihood}`);
        });
      } else {
        console.log('No findings.');
      }
    } catch (err) {
      console.log(`Error in inspectString: ${err.message || err}`);
    }
    

Python

    def inspect_string(
        project,
        content_string,
        info_types,
        custom_dictionaries=None,
        custom_regexes=None,
        min_likelihood=None,
        max_findings=None,
        include_quote=True,
    ):
        """Uses the Data Loss Prevention API to analyze strings for protected data.
        Args:
            project: The Google Cloud project id to use as a parent resource.
            content_string: The string to inspect.
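            info_types: A list of strings representing info types to look for.
                A full list of info type categories can be fetched from the API.
            custom_dictionaries: Optional list of strings, each a comma-separated
                word list that becomes one custom dictionary info type.
            custom_regexes: Optional list of regex pattern strings, each of which
                becomes one custom regex info type.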
            min_likelihood: A string representing the minimum likelihood threshold
                that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED',
                'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'.
            max_findings: The maximum number of findings to report; 0 = no maximum.
            include_quote: Boolean for whether to display a quote of the detected
                information in the results.
        Returns:
            None; the response from the API is printed to the terminal.
        """

        # Import the client library.
        import google.cloud.dlp

        # Instantiate a client.
        dlp = google.cloud.dlp_v2.DlpServiceClient()

        # Prepare info_types by converting the list of strings into a list of
        # dictionaries (protos are also accepted).
        info_types = [{"name": info_type} for info_type in info_types]

        # Prepare custom_info_types by parsing the dictionary word lists and
        # regex patterns.
        if custom_dictionaries is None:
            custom_dictionaries = []
        dictionaries = [
            {
                "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)},
                "dictionary": {"word_list": {"words": custom_dict.split(",")}},
            }
            for i, custom_dict in enumerate(custom_dictionaries)
        ]
        if custom_regexes is None:
            custom_regexes = []
        regexes = [
            {
                "info_type": {"name": "CUSTOM_REGEX_{}".format(i)},
                "regex": {"pattern": custom_regex},
            }
            for i, custom_regex in enumerate(custom_regexes)
        ]
        custom_info_types = dictionaries + regexes

        # Construct the configuration dictionary. Keys which are None may
        # optionally be omitted entirely.
        inspect_config = {
            "info_types": info_types,
            "custom_info_types": custom_info_types,
            "min_likelihood": min_likelihood,
            "include_quote": include_quote,
            "limits": {"max_findings_per_request": max_findings},
        }

        # Construct the `item`.
        item = {"value": content_string}

        # Convert the project id into a full resource id.
        parent = dlp.project_path(project)

        # Call the API.
        response = dlp.inspect_content(parent, inspect_config, item)

        # Print out the results.
        if response.result.findings:
            for finding in response.result.findings:
                try:
                    if finding.quote:
                        print("Quote: {}".format(finding.quote))
                except AttributeError:
                    pass
                print("Info type: {}".format(finding.info_type.name))
                print("Likelihood: {}".format(finding.likelihood))
        else:
            print("No findings.")
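
Findings can also be narrowed after the response arrives, for example to review only the strongest matches first. The sketch below is a minimal illustration using hand-written dictionaries in place of API findings; it assumes each likelihood has already been converted to its name, and `at_least` is a hypothetical helper. The ordering follows the list of values in the docstring above.

```python
# Likelihood names in ascending order, as listed in the docstring above.
LIKELIHOOD_ORDER = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def at_least(findings, threshold):
    """Keep findings whose likelihood name ranks at or above threshold."""
    floor = LIKELIHOOD_ORDER.index(threshold)
    return [f for f in findings if LIKELIHOOD_ORDER.index(f["likelihood"]) >= floor]


# Hand-written stand-ins for API findings, reduced to plain dicts.
sample_findings = [
    {"info_type": "EMAIL_ADDRESS", "likelihood": "VERY_LIKELY"},
    {"info_type": "PHONE_NUMBER", "likelihood": "POSSIBLE"},
    {"info_type": "PERSON_NAME", "likelihood": "UNLIKELY"},
]
strong = at_least(sample_findings, "LIKELY")
print([f["info_type"] for f in strong])  # ['EMAIL_ADDRESS']
```

Setting `min_likelihood` in the request remains the simpler option when the threshold is known in advance; filtering afterwards is useful when comparing several thresholds against one response.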

    

Go

    import (
    	"context"
    	"fmt"
    	"io"

    	dlp "cloud.google.com/go/dlp/apiv2"
    	dlppb "google.golang.org/genproto/googleapis/privacy/dlp/v2"
    )

    // inspectString inspects the given string and prints the results.
    func inspectString(w io.Writer, projectID, textToInspect string) error {
    	// projectID := "my-project-id"
    	// textToInspect := "My name is Gary and my email is gary@example.com"
    	ctx := context.Background()

    	// Initialize client.
    	client, err := dlp.NewClient(ctx)
    	if err != nil {
    		return err
    	}
    	defer client.Close() // Closing the client safely cleans up background resources.

    	// Create and send the request.
    	req := &dlppb.InspectContentRequest{
    		Parent: "projects/" + projectID,
    		Item: &dlppb.ContentItem{
    			DataItem: &dlppb.ContentItem_Value{
    				Value: textToInspect,
    			},
    		},
    		InspectConfig: &dlppb.InspectConfig{
    			InfoTypes: []*dlppb.InfoType{
    				{Name: "PHONE_NUMBER"},
    				{Name: "EMAIL_ADDRESS"},
    				{Name: "CREDIT_CARD_NUMBER"},
    			},
    			IncludeQuote: true,
    		},
    	}
    	resp, err := client.InspectContent(ctx, req)
    	if err != nil {
    		return err
    	}

    	// Process the results.
    	result := resp.Result
    	fmt.Fprintf(w, "Findings: %d\n", len(result.Findings))
    	for _, f := range result.Findings {
    		fmt.Fprintf(w, "\tQuote: %s\n", f.Quote)
    		fmt.Fprintf(w, "\tInfo type: %s\n", f.InfoType.Name)
    		fmt.Fprintf(w, "\tLikelihood: %s\n", f.Likelihood)
    	}
    	return nil
    }
    

PHP

    use Google\Cloud\Dlp\V2\DlpServiceClient;
    use Google\Cloud\Dlp\V2\ContentItem;
    use Google\Cloud\Dlp\V2\InfoType;
    use Google\Cloud\Dlp\V2\InspectConfig;
    use Google\Cloud\Dlp\V2\Likelihood;

    /** Uncomment and populate these variables in your code */
    // $projectId = 'YOUR_PROJECT_ID';
    // $textToInspect = 'My name is Gary and my email is gary@example.com';

    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // Construct request
    $parent = $dlp->projectName($projectId);
    $item = (new ContentItem())
        ->setValue($textToInspect);
    $inspectConfig = (new InspectConfig())
        // The infoTypes of information to match
        ->setInfoTypes([
            (new InfoType())->setName('PHONE_NUMBER'),
            (new InfoType())->setName('EMAIL_ADDRESS'),
            (new InfoType())->setName('CREDIT_CARD_NUMBER')
        ])
        // Whether to include the matching string
        ->setIncludeQuote(true);

    // Run request
    $response = $dlp->inspectContent($parent, [
        'inspectConfig' => $inspectConfig,
        'item' => $item
    ]);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        print('No findings.' . PHP_EOL);
    } else {
        print('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            print('  Quote: ' . $finding->getQuote() . PHP_EOL);
            print('  Info type: ' . $finding->getInfoType()->getName() . PHP_EOL);
            $likelihoodString = Likelihood::name($finding->getLikelihood());
            print('  Likelihood: ' . $likelihoodString . PHP_EOL);
        }
    }

Ruby

    # project_id   = "Your Google Cloud project ID"
    # content      = "The text to inspect"
    # max_findings = "Maximum number of findings to report per request (0 = server maximum)"

    require "google/cloud/dlp"

    dlp = Google::Cloud::Dlp.new
    inspect_config = {
      # The types of information to match
      info_types:     [{ name: "PERSON_NAME" }, { name: "US_STATE" }],

      # Only return results above a likelihood threshold (0 for all)
      min_likelihood: :POSSIBLE,

      # Limit the number of findings (0 for no limit)
      limits:         { max_findings_per_request: max_findings },

      # Whether to include the matching string in the response
      include_quote:  true
    }

    # The item to inspect
    item_to_inspect = { value: content }

    # Run request
    parent = "projects/#{project_id}"
    response = dlp.inspect_content parent,
                                   inspect_config: inspect_config,
                                   item:           item_to_inspect

    # Print the results
    if response.result.findings.empty?
      puts "No findings"
    else
      response.result.findings.each do |finding|
        puts "Quote:      #{finding.quote}"
        puts "Info type:  #{finding.info_type.name}"
        puts "Likelihood: #{finding.likelihood}"
      end
    end

C#

    public static object InspectString(
        string projectId,
        string dataValue,
        string minLikelihood,
        int maxFindings,
        bool includeQuote,
        IEnumerable<InfoType> infoTypes,
        IEnumerable<CustomInfoType> customInfoTypes)
    {
        var inspectConfig = new InspectConfig
        {
            MinLikelihood = (Likelihood)System.Enum.Parse(typeof(Likelihood), minLikelihood),
            Limits = new InspectConfig.Types.FindingLimits
            {
                MaxFindingsPerRequest = maxFindings
            },
            IncludeQuote = includeQuote,
            InfoTypes = { infoTypes },
            CustomInfoTypes = { customInfoTypes }
        };
        var request = new InspectContentRequest
        {
            ParentAsProjectName = new ProjectName(projectId),
            Item = new ContentItem
            {
                Value = dataValue
            },
            InspectConfig = inspectConfig
        };

        DlpServiceClient dlp = DlpServiceClient.Create();
        InspectContentResponse response = dlp.InspectContent(request);

        var findings = response.Result.Findings;
        if (findings.Count > 0)
        {
            Console.WriteLine("Findings:");
            foreach (var finding in findings)
            {
                if (includeQuote)
                {
                    Console.WriteLine($"  Quote: {finding.Quote}");
                }
                Console.WriteLine($"  InfoType: {finding.InfoType}");
                Console.WriteLine($"  Likelihood: {finding.Likelihood}");
            }
        }
        else
        {
            Console.WriteLine("No findings.");
        }

        return 0;
    }
    

Inspecting a table

See the code samples below to learn how to check a table of data for sensitive content. Tables support a variety of value types.

See the JSON quickstart for more information about using the Cloud DLP API with JSON.

JSON input:

POST https://dlp.googleapis.com/v2/projects/[PROJECT_ID]/content:inspect?key={YOUR_API_KEY}

    {
      "item":{
        "table":{
          "headers": [{"name":"column1"}],
          "rows": [{
            "values":[
              {"string_value": "My phone number is (206) 555-0123"}
            ]}
          ]
        }
      },
      "inspectConfig":{
        "infoTypes":[
          {
            "name":"PHONE_NUMBER"
          },
          {
            "name":"US_TOLLFREE_PHONE_NUMBER"
          }
        ],
        "minLikelihood":"POSSIBLE",
        "limits":{
          "maxFindingsPerItem":0
        },
        "includeQuote":true
      }
    }
    

JSON output:

    {
      "result":{
        "findings":[
          {
            "quote":"(206) 555-0123",
            "infoType":{
              "name":"PHONE_NUMBER"
            },
            "likelihood":"LIKELY",
            "location":{
              "byteRange":{
                "start":"19",
                "end":"33"
              },
              "codepointRange":{
                "start":"19",
                "end":"33"
              },
              "contentLocations":[
                {
                  "recordLocation":{
                    "fieldId":{
                      "name":"column1"
                    },
                    "tableLocation":{

                    }
                  }
                }
              ]
            },
            "createTime":"2018-10-30T00:09:04.569Z"
          }
        ]
      }
    }
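
For table input, each finding reports the column it came from under `contentLocations[].recordLocation.fieldId.name`. The sketch below groups infoType names by column using a trimmed copy of the response above; `findings_by_column` is a hypothetical helper, and fields that are absent from a response are simply skipped.

```python
from collections import defaultdict

# Trimmed copy of the table response shown above.
response = {
    "result": {
        "findings": [
            {
                "quote": "(206) 555-0123",
                "infoType": {"name": "PHONE_NUMBER"},
                "likelihood": "LIKELY",
                "location": {
                    "contentLocations": [
                        {"recordLocation": {"fieldId": {"name": "column1"}, "tableLocation": {}}}
                    ]
                },
            }
        ]
    }
}


def findings_by_column(resp):
    """Map each column name to the infoType names found in it."""
    columns = defaultdict(list)
    for finding in resp.get("result", {}).get("findings", []):
        for loc in finding["location"].get("contentLocations", []):
            field = loc.get("recordLocation", {}).get("fieldId", {}).get("name")
            if field is not None:
                columns[field].append(finding["infoType"]["name"])
    return dict(columns)


print(findings_by_column(response))  # {'column1': ['PHONE_NUMBER']}
```

A per-column summary like this can help decide which columns deserve a closer look in later scans.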
    

Inspecting a text file

See the code samples below to learn how to check a text file for sensitive content.

Java

    import com.google.cloud.dlp.v2.DlpServiceClient;
    import com.google.privacy.dlp.v2.ByteContentItem;
    import com.google.privacy.dlp.v2.ByteContentItem.BytesType;
    import com.google.privacy.dlp.v2.ContentItem;
    import com.google.privacy.dlp.v2.Finding;
    import com.google.privacy.dlp.v2.InfoType;
    import com.google.privacy.dlp.v2.InspectConfig;
    import com.google.privacy.dlp.v2.InspectContentRequest;
    import com.google.privacy.dlp.v2.InspectContentResponse;
    import com.google.privacy.dlp.v2.ProjectName;
    import com.google.protobuf.ByteString;
    import java.io.FileInputStream;
    import java.util.ArrayList;
    import java.util.List;

    public class InspectTextFile {

      public static void inspectTextFile() {
        // TODO(developer): Replace these variables before running the sample.
        String projectId = "your-project-id";
        String filePath = "path/to/file.txt";
        inspectTextFile(projectId, filePath);
      }

      // Inspects the specified text file.
      public static void inspectTextFile(String projectId, String filePath) {
        // Initialize client that will be used to send requests. This client only needs to be created
        // once, and can be reused for multiple requests. After completing all of your requests, call
        // the "close" method on the client to safely clean up any remaining background resources.
        try (DlpServiceClient dlp = DlpServiceClient.create()) {
          // Specify the project used for request.
          ProjectName project = ProjectName.of(projectId);

          // Specify the type and content to be inspected.
          ByteString fileBytes = ByteString.readFrom(new FileInputStream(filePath));
          ByteContentItem byteItem =
              ByteContentItem.newBuilder().setType(BytesType.TEXT_UTF8).setData(fileBytes).build();
          ContentItem item = ContentItem.newBuilder().setByteItem(byteItem).build();

          // Specify the type of info the inspection will look for.
          List<InfoType> infoTypes = new ArrayList<>();
          // See https://cloud.google.com/dlp/docs/infotypes-reference for complete list of info types
          for (String typeName : new String[] {"PHONE_NUMBER", "EMAIL_ADDRESS", "CREDIT_CARD_NUMBER"}) {
            infoTypes.add(InfoType.newBuilder().setName(typeName).build());
          }

          // Construct the configuration for the Inspect request.
          InspectConfig config =
              InspectConfig.newBuilder().addAllInfoTypes(infoTypes).setIncludeQuote(true).build();

          // Construct the Inspect request to be sent by the client.
          InspectContentRequest request =
              InspectContentRequest.newBuilder()
                  .setParent(project.toString())
                  .setItem(item)
                  .setInspectConfig(config)
                  .build();

          // Use the client to send the API request.
          InspectContentResponse response = dlp.inspectContent(request);

          // Parse the response and process results
          System.out.println("Findings: " + response.getResult().getFindingsCount());
          for (Finding f : response.getResult().getFindingsList()) {
            System.out.println("\tQuote: " + f.getQuote());
            System.out.println("\tInfo type: " + f.getInfoType().getName());
            System.out.println("\tLikelihood: " + f.getLikelihood());
          }
        } catch (Exception e) {
          System.out.println("Error during inspectFile: \n" + e.toString());
        }
      }
    }

Node.js

    // Imports the Google Cloud Data Loss Prevention library
    const DLP = require('@google-cloud/dlp');

    // Import other required libraries
    const fs = require('fs');
    const mime = require('mime');

    // Instantiates a client
    const dlp = new DLP.DlpServiceClient();

    // The project ID to run the API call under
    // const callingProjectId = process.env.GCLOUD_PROJECT;

    // The path to a local file to inspect. Can be a text, JPG, or PNG file.
    // const filepath = 'path/to/image.png';

    // The minimum likelihood required before returning a match
    // const minLikelihood = 'LIKELIHOOD_UNSPECIFIED';

    // The maximum number of findings to report per request (0 = server maximum)
    // const maxFindings = 0;

    // The infoTypes of information to match
    // const infoTypes = [{ name: 'PHONE_NUMBER' }, { name: 'EMAIL_ADDRESS' }, { name: 'CREDIT_CARD_NUMBER' }];

    // The customInfoTypes of information to match
    // const customInfoTypes = [{ infoType: { name: 'DICT_TYPE' }, dictionary: { wordList: { words: ['foo', 'bar', 'baz']}}},
    //   { infoType: { name: 'REGEX_TYPE' }, regex: { pattern: '\\(\\d{3}\\) \\d{3}-\\d{4}' }}];

    // Whether to include the matching string
    // const includeQuote = true;

    // Construct file data to inspect
    const fileTypeConstant =
      ['image/jpeg', 'image/bmp', 'image/png', 'image/svg'].indexOf(
        mime.getType(filepath)
      ) + 1;
    const fileBytes = Buffer.from(fs.readFileSync(filepath)).toString('base64');
    const item = {
      byteItem: {
        type: fileTypeConstant,
        data: fileBytes,
      },
    };

    // Construct request
    const request = {
      parent: dlp.projectPath(callingProjectId),
      inspectConfig: {
        infoTypes: infoTypes,
        customInfoTypes: customInfoTypes,
        minLikelihood: minLikelihood,
        includeQuote: includeQuote,
        limits: {
          maxFindingsPerRequest: maxFindings,
        },
      },
      item: item,
    };

    // Run request
    try {
      const [response] = await dlp.inspectContent(request);
      const findings = response.result.findings;
      if (findings.length > 0) {
        console.log('Findings:');
        findings.forEach(finding => {
          if (includeQuote) {
            console.log(`\tQuote: ${finding.quote}`);
          }
          console.log(`\tInfo type: ${finding.infoType.name}`);
          console.log(`\tLikelihood: ${finding.likelihood}`);
        });
      } else {
        console.log('No findings.');
      }
    } catch (err) {
      console.log(`Error in inspectFile: ${err.message || err}`);
    }

Python



    def inspect_file(
        project,
        filename,
        info_types,
        min_likelihood=None,
        custom_dictionaries=None,
        custom_regexes=None,
        max_findings=None,
        include_quote=True,
        mime_type=None,
    ):
        """Uses the Data Loss Prevention API to analyze a file for protected data.
        Args:
            project: The Google Cloud project id to use as a parent resource.
            filename: The path to the file to inspect.
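            info_types: A list of strings representing info types to look for.
                A full list of info type categories can be fetched from the API.
            custom_dictionaries: Optional list of strings, each a comma-separated
                word list that becomes one custom dictionary info type.
            custom_regexes: Optional list of regex pattern strings, each of which
                becomes one custom regex info type.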
            min_likelihood: A string representing the minimum likelihood threshold
                that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED',
                'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'.
            max_findings: The maximum number of findings to report; 0 = no maximum.
            include_quote: Boolean for whether to display a quote of the detected
                information in the results.
            mime_type: The MIME type of the file. If not specified, the type is
                inferred via the Python standard library's mimetypes module.
        Returns:
            None; the response from the API is printed to the terminal.
        """

        import mimetypes

        # Import the client library.
        import google.cloud.dlp

        # Instantiate a client.
        dlp = google.cloud.dlp_v2.DlpServiceClient()

        # Prepare info_types by converting the list of strings into a list of
        # dictionaries (protos are also accepted).
        if not info_types:
            info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"]
        info_types = [{"name": info_type} for info_type in info_types]

        # Prepare custom_info_types by parsing the dictionary word lists and
        # regex patterns.
        if custom_dictionaries is None:
            custom_dictionaries = []
        dictionaries = [
            {
                "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)},
                "dictionary": {"word_list": {"words": custom_dict.split(",")}},
            }
            for i, custom_dict in enumerate(custom_dictionaries)
        ]
        if custom_regexes is None:
            custom_regexes = []
        regexes = [
            {
                "info_type": {"name": "CUSTOM_REGEX_{}".format(i)},
                "regex": {"pattern": custom_regex},
            }
            for i, custom_regex in enumerate(custom_regexes)
        ]
        custom_info_types = dictionaries + regexes

        # Construct the configuration dictionary. Keys which are None may
        # optionally be omitted entirely.
        inspect_config = {
            "info_types": info_types,
            "custom_info_types": custom_info_types,
            "min_likelihood": min_likelihood,
            "include_quote": include_quote,
            "limits": {"max_findings_per_request": max_findings},
        }

        # If mime_type is not specified, guess it from the filename.
        if mime_type is None:
            mime_guess = mimetypes.MimeTypes().guess_type(filename)
            mime_type = mime_guess[0]

        # Select the content type index from the list of supported types.
        supported_content_types = {
            None: 0,  # "Unspecified"
            "image/jpeg": 1,
            "image/bmp": 2,
            "image/png": 3,
            "image/svg": 4,
            "text/plain": 5,
        }
        content_type_index = supported_content_types.get(mime_type, 0)

        # Construct the item, containing the file's byte data.
        with open(filename, mode="rb") as f:
            item = {"byte_item": {"type": content_type_index, "data": f.read()}}
        # Convert the project id into a full resource id.
        parent = dlp.project_path(project)

        # Call the API.
        response = dlp.inspect_content(parent, inspect_config, item)

        # Print out the results.
        if response.result.findings:
            for finding in response.result.findings:
                try:
                    if finding.quote:
                        print("Quote: {}".format(finding.quote))
                except AttributeError:
                    pass
                print("Info type: {}".format(finding.info_type.name))
                print("Likelihood: {}".format(finding.likelihood))
        else:
            print("No findings.")
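
The MIME-type lookup used in the Python sample can be tried on its own, without calling the API. The following is a minimal sketch, not part of the official sample: the helper name `content_type_index` and the table name `BYTES_TYPE_BY_MIME` are hypothetical, and the numeric values are the `BytesType` indexes that the sample itself uses. Note that the standard `mimetypes` module reports SVG files as `image/svg+xml`, so that is the key used here.

```python
import mimetypes

# Numeric BytesType indexes, as used by the sample above.
# (Table name is an assumption made for this sketch.)
BYTES_TYPE_BY_MIME = {
    "image/jpeg": 1,     # IMAGE_JPEG
    "image/bmp": 2,      # IMAGE_BMP
    "image/png": 3,      # IMAGE_PNG
    "image/svg+xml": 4,  # IMAGE_SVG
    "text/plain": 5,     # TEXT_UTF8
}


def content_type_index(filename, mime_type=None):
    """Return the BytesType index for a file, or 0 (unspecified) if unknown.

    Hypothetical helper: it mirrors the lookup in the sample but is not
    part of the client library.
    """
    if mime_type is None:
        # guess_type returns a (type, encoding) tuple; the type is None
        # when the file extension is not recognized.
        mime_type = mimetypes.MimeTypes().guess_type(filename)[0]
    return BYTES_TYPE_BY_MIME.get(mime_type, 0)


if __name__ == "__main__":
    for name in ("notes.txt", "scan.png", "diagram.svg", "archive.unknownext"):
        print(name, content_type_index(name))
```

Falling back to 0 leaves the type unspecified rather than rejecting the file, which matches the behavior of the sample.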

    

Go

    import (
    	"context"
    	"fmt"
    	"io"
    	"io/ioutil"

    	dlp "cloud.google.com/go/dlp/apiv2"
    	dlppb "google.golang.org/genproto/googleapis/privacy/dlp/v2"
    )

    // inspectTextFile inspects a text file at a given filePath, and prints results.
    func inspectTextFile(w io.Writer, projectID, filePath string) error {
    	// projectID := "my-project-id"
    	// filePath := "path/to/file.txt"
    	ctx := context.Background()

    	// Initialize client.
    	client, err := dlp.NewClient(ctx)
    	if err != nil {
    		return err
    	}
    	defer client.Close() // Closing the client safely cleans up background resources.

    	// Gather the resources for the request.
    	data, err := ioutil.ReadFile(filePath)
    	if err != nil {
    		return err
    	}

    	// Create and send the request.
    	req := &dlppb.InspectContentRequest{
    		Parent: "projects/" + projectID,
    		Item: &dlppb.ContentItem{
    			DataItem: &dlppb.ContentItem_ByteItem{
    				ByteItem: &dlppb.ByteContentItem{
    					Type: dlppb.ByteContentItem_TEXT_UTF8,
    					Data: data,
    				},
    			},
    		},
    		InspectConfig: &dlppb.InspectConfig{
    			InfoTypes: []*dlppb.InfoType{
    				{Name: "PHONE_NUMBER"},
    				{Name: "EMAIL_ADDRESS"},
    				{Name: "CREDIT_CARD_NUMBER"},
    			},
    			IncludeQuote: true,
    		},
    	}
    	resp, err := client.InspectContent(ctx, req)
    	if err != nil {
    		return fmt.Errorf("InspectContent: %w", err)
    	}

    	// Process the results.
    	fmt.Fprintf(w, "Findings: %d\n", len(resp.Result.Findings))
    	for _, f := range resp.Result.Findings {
    		fmt.Fprintf(w, "\tQuote: %s\n", f.Quote)
    		fmt.Fprintf(w, "\tInfo type: %s\n", f.InfoType.Name)
    		fmt.Fprintf(w, "\tLikelihood: %s\n", f.Likelihood)
    	}
    	return nil
    }
    

PHP

    use Google\Cloud\Dlp\V2\DlpServiceClient;
    use Google\Cloud\Dlp\V2\ContentItem;
    use Google\Cloud\Dlp\V2\InfoType;
    use Google\Cloud\Dlp\V2\InspectConfig;
    use Google\Cloud\Dlp\V2\ByteContentItem;
    use Google\Cloud\Dlp\V2\ByteContentItem\BytesType;
    use Google\Cloud\Dlp\V2\Likelihood;

    /** Uncomment and populate these variables in your code */
    // $projectId = 'YOUR_PROJECT_ID';
    // $filepath = 'path/to/file.txt';

    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // Get the bytes of the file
    $fileBytes = (new ByteContentItem())
        ->setType(BytesType::TEXT_UTF8)
        ->setData(file_get_contents($filepath));

    // Construct request
    $parent = $dlp->projectName($projectId);
    $item = (new ContentItem())
        ->setByteItem($fileBytes);
    $inspectConfig = (new InspectConfig())
        // The infoTypes of information to match
        ->setInfoTypes([
            (new InfoType())->setName('PHONE_NUMBER'),
            (new InfoType())->setName('EMAIL_ADDRESS'),
            (new InfoType())->setName('CREDIT_CARD_NUMBER')
        ])
        // Whether to include the matching string
        ->setIncludeQuote(true);

    // Run request
    $response = $dlp->inspectContent($parent, [
        'inspectConfig' => $inspectConfig,
        'item' => $item
    ]);

    // Print the results
    $findings = $response->getResult()->getFindings();
    if (count($findings) == 0) {
        print('No findings.' . PHP_EOL);
    } else {
        print('Findings:' . PHP_EOL);
        foreach ($findings as $finding) {
            print('  Quote: ' . $finding->getQuote() . PHP_EOL);
            print('  Info type: ' . $finding->getInfoType()->getName() . PHP_EOL);
            $likelihoodString = Likelihood::name($finding->getLikelihood());
            print('  Likelihood: ' . $likelihoodString . PHP_EOL);
        }
    }

Ruby

    # project_id   = "Your Google Cloud project ID"
    # filename     = "The file path to the file to inspect"
    # max_findings = "Maximum number of findings to report per request (0 = server maximum)"

    require "google/cloud/dlp"

    dlp = Google::Cloud::Dlp.new
    inspect_config = {
      # The types of information to match
      info_types:     [{ name: "PERSON_NAME" }, { name: "PHONE_NUMBER" }],

      # Only return findings at or above this likelihood threshold
      min_likelihood: :POSSIBLE,

      # Limit the number of findings (0 for the server maximum)
      limits:         { max_findings_per_request: max_findings },

      # Whether to include the matching string in the response
      include_quote:  true
    }

    # The item to inspect. File.binread reads the raw bytes and closes the file.
    data = File.binread filename
    item_to_inspect = { byte_item: { type: :BYTES_TYPE_UNSPECIFIED, data: data } }

    # Run request
    parent = "projects/#{project_id}"
    response = dlp.inspect_content parent,
                                   inspect_config: inspect_config,
                                   item:           item_to_inspect

    # Print the results
    if response.result.findings.empty?
      puts "No findings"
    else
      response.result.findings.each do |finding|
        puts "Quote:      #{finding.quote}"
        puts "Info type:  #{finding.info_type.name}"
        puts "Likelihood: #{finding.likelihood}"
      end
    end

C#

    private static readonly Dictionary<string, ByteContentItem.Types.BytesType> s_fileTypes =
        new Dictionary<string, ByteContentItem.Types.BytesType>()
    {
        { ".bmp", ByteContentItem.Types.BytesType.ImageBmp },
        { ".jpg", ByteContentItem.Types.BytesType.ImageJpeg },
        { ".jpeg", ByteContentItem.Types.BytesType.ImageJpeg },
        { ".png", ByteContentItem.Types.BytesType.ImagePng },
        { ".svg", ByteContentItem.Types.BytesType.ImageSvg },
        { ".txt", ByteContentItem.Types.BytesType.TextUtf8 }
    };

    public static object InspectFile(
        string projectId,
        string file,
        string minLikelihood,
        int maxFindings,
        bool includeQuote,
        IEnumerable<InfoType> infoTypes,
        IEnumerable<CustomInfoType> customInfoTypes)
    {
        // Open read-only so that files without write permission can be inspected.
        var fileStream = new FileStream(file, FileMode.Open, FileAccess.Read);
        try
        {
            var inspectConfig = new InspectConfig
            {
                MinLikelihood = (Likelihood)System.Enum.Parse(typeof(Likelihood), minLikelihood),
                Limits = new FindingLimits
                {
                    MaxFindingsPerRequest = maxFindings
                },
                IncludeQuote = includeQuote,
                InfoTypes = { infoTypes },
                CustomInfoTypes = { customInfoTypes }
            };
            DlpServiceClient dlp = DlpServiceClient.Create();
            InspectContentResponse response = dlp.InspectContent(new InspectContentRequest
            {
                ParentAsProjectName = new ProjectName(projectId),
                Item = new ContentItem
                {
                    ByteItem = new ByteContentItem
                    {
                        Data = ByteString.FromStream(fileStream),
                        Type = s_fileTypes.GetValueOrDefault(
                                new FileInfo(file).Extension.ToLower(),
                                ByteContentItem.Types.BytesType.Unspecified
                        )
                    }
                },
                InspectConfig = inspectConfig
            });

            var findings = response.Result.Findings;
            if (findings.Count > 0)
            {
                Console.WriteLine("Findings:");
                foreach (var finding in findings)
                {
                    if (includeQuote)
                    {
                        Console.WriteLine($"  Quote: {finding.Quote}");
                    }
                    Console.WriteLine($"  InfoType: {finding.InfoType.Name}");
                    Console.WriteLine($"  Likelihood: {finding.Likelihood}");
                }
            }
            else
            {
                Console.WriteLine("No findings.");
            }

            return 0;
        }
        finally
        {
            fileStream.Close();
        }
    }