Cloud Data Loss Prevention (Cloud DLP) ahora forma parte de la protección de datos sensibles. El nombre de la API sigue siendo el mismo: API de Cloud Data Loss Prevention (API de DLP). Para obtener información sobre los servicios que conforman la protección de datos sensibles, consulta la
Descripción general de la protección de datos sensibles.
Inspecciona un archivo local
Organiza tus páginas con colecciones
Guarda y categoriza el contenido según tus preferencias.
Se demuestra cómo encontrar datos sensibles en un archivo de texto o imagen local.
Explora más
Para obtener documentación en la que se incluye esta muestra de código, consulta lo siguiente:
Muestra de código
Python
Para obtener información sobre cómo instalar y usar la biblioteca cliente de la protección de datos sensibles, consulta Bibliotecas cliente de la protección de datos sensibles.
Para autenticarte en la protección de datos sensibles, configura las credenciales predeterminadas de la aplicación.
Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.
import mimetypes
from typing import List
from typing import Optional
import google.cloud.dlp
def inspect_file(
project: str,
filename: str,
info_types: List[str],
min_likelihood: str = None,
custom_dictionaries: List[str] = None,
custom_regexes: List[str] = None,
max_findings: Optional[int] = None,
include_quote: bool = True,
mime_type: str = None,
) -> None:
"""Uses the Data Loss Prevention API to analyze a file for protected data.
Args:
project: The Google Cloud project id to use as a parent resource.
filename: The path to the file to inspect.
info_types: A list of strings representing info types to look for.
A full list of info type categories can be fetched from the API.
min_likelihood: A string representing the minimum likelihood threshold
that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED',
'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'.
max_findings: The maximum number of findings to report; 0 = no maximum.
include_quote: Boolean for whether to display a quote of the detected
information in the results.
mime_type: The MIME type of the file. If not specified, the type is
inferred via the Python standard library's mimetypes module.
Returns:
None; the response from the API is printed to the terminal.
"""
# Instantiate a client.
dlp = google.cloud.dlp_v2.DlpServiceClient()
# Prepare info_types by converting the list of strings into a list of
# dictionaries (protos are also accepted).
if not info_types:
info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"]
info_types = [{"name": info_type} for info_type in info_types]
# Prepare custom_info_types by parsing the dictionary word lists and
# regex patterns.
if custom_dictionaries is None:
custom_dictionaries = []
dictionaries = [
{
"info_type": {"name": f"CUSTOM_DICTIONARY_{i}"},
"dictionary": {"word_list": {"words": custom_dict.split(",")}},
}
for i, custom_dict in enumerate(custom_dictionaries)
]
if custom_regexes is None:
custom_regexes = []
regexes = [
{
"info_type": {"name": f"CUSTOM_REGEX_{i}"},
"regex": {"pattern": custom_regex},
}
for i, custom_regex in enumerate(custom_regexes)
]
custom_info_types = dictionaries + regexes
# Construct the configuration dictionary. Keys which are None may
# optionally be omitted entirely.
inspect_config = {
"info_types": info_types,
"custom_info_types": custom_info_types,
"min_likelihood": min_likelihood,
"include_quote": include_quote,
"limits": {"max_findings_per_request": max_findings},
}
# If mime_type is not specified, guess it from the filename.
if mime_type is None:
mime_guess = mimetypes.MimeTypes().guess_type(filename)
mime_type = mime_guess[0]
# Select the content type index from the list of supported types.
supported_content_types = {
None: 0, # "Unspecified"
"image/jpeg": 1,
"image/bmp": 2,
"image/png": 3,
"image/svg": 4,
"text/plain": 5,
}
content_type_index = supported_content_types.get(mime_type, 0)
# Construct the item, containing the file's byte data.
with open(filename, mode="rb") as f:
item = {"byte_item": {"type_": content_type_index, "data": f.read()}}
# Convert the project id into a full resource id.
parent = f"projects/{project}"
# Call the API.
response = dlp.inspect_content(
request={"parent": parent, "inspect_config": inspect_config, "item": item}
)
# Print out the results.
if response.result.findings:
for finding in response.result.findings:
try:
print(f"Quote: {finding.quote}")
except AttributeError:
pass
print(f"Info type: {finding.info_type.name}")
print(f"Likelihood: {finding.likelihood}")
else:
print("No findings.")
Salvo que se indique lo contrario, el contenido de esta página está sujeto a la licencia Atribución 4.0 de Creative Commons, y los ejemplos de código están sujetos a la licencia Apache 2.0. Para obtener más información, consulta las políticas del sitio de Google Developers. Java es una marca registrada de Oracle o sus afiliados.
[{
"type": "thumb-down",
"id": "hardToUnderstand",
"label":"Hard to understand"
},{
"type": "thumb-down",
"id": "incorrectInformationOrSampleCode",
"label":"Incorrect information or sample code"
},{
"type": "thumb-down",
"id": "missingTheInformationSamplesINeed",
"label":"Missing the information/samples I need"
},{
"type": "thumb-down",
"id": "translationIssue",
"label":"Problema de traducción"
},{
"type": "thumb-down",
"id": "otherDown",
"label":"Otro"
}]
[{
"type": "thumb-up",
"id": "easyToUnderstand",
"label":"Fácil de comprender"
},{
"type": "thumb-up",
"id": "solvedMyProblem",
"label":"Resolvió mi problema"
},{
"type": "thumb-up",
"id": "otherUp",
"label":"Otro"
}]