Lokale Datei prüfen
Mit Sammlungen den Überblick behalten
Sie können Inhalte basierend auf Ihren Einstellungen speichern und kategorisieren.
Veranschaulicht die Suche nach sensiblen Daten in einer lokalen Text- oder Bilddatei.
Weitere Informationen
Eine ausführliche Dokumentation, die dieses Codebeispiel enthält, finden Sie hier:
Codebeispiel
Python
Informationen zum Installieren und Verwenden der Clientbibliothek für Cloud DLP finden Sie unter Cloud DLP-Clientbibliotheken.
Für die Authentifizierung bei Cloud DLP richten Sie Standardanmeldedaten für Anwendungen ein.
Weitere Informationen finden Sie unter Authentifizierung für eine lokale Entwicklungsumgebung einrichten.
import mimetypes # noqa: I100, E402
from typing import Optional # noqa: I100, E402
import google.cloud.dlp # noqa: F811, E402
def inspect_file(
project: str,
filename: str,
info_types: List[str],
min_likelihood: str = None,
custom_dictionaries: List[str] = None,
custom_regexes: List[str] = None,
max_findings: Optional[int] = None,
include_quote: bool = True,
mime_type: str = None,
) -> None:
"""Uses the Data Loss Prevention API to analyze a file for protected data.
Args:
project: The Google Cloud project id to use as a parent resource.
filename: The path to the file to inspect.
info_types: A list of strings representing info types to look for.
A full list of info type categories can be fetched from the API.
min_likelihood: A string representing the minimum likelihood threshold
that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED',
'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'.
max_findings: The maximum number of findings to report; 0 = no maximum.
include_quote: Boolean for whether to display a quote of the detected
information in the results.
mime_type: The MIME type of the file. If not specified, the type is
inferred via the Python standard library's mimetypes module.
Returns:
None; the response from the API is printed to the terminal.
"""
# Instantiate a client.
dlp = google.cloud.dlp_v2.DlpServiceClient()
# Prepare info_types by converting the list of strings into a list of
# dictionaries (protos are also accepted).
if not info_types:
info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"]
info_types = [{"name": info_type} for info_type in info_types]
# Prepare custom_info_types by parsing the dictionary word lists and
# regex patterns.
if custom_dictionaries is None:
custom_dictionaries = []
dictionaries = [
{
"info_type": {"name": f"CUSTOM_DICTIONARY_{i}"},
"dictionary": {"word_list": {"words": custom_dict.split(",")}},
}
for i, custom_dict in enumerate(custom_dictionaries)
]
if custom_regexes is None:
custom_regexes = []
regexes = [
{
"info_type": {"name": f"CUSTOM_REGEX_{i}"},
"regex": {"pattern": custom_regex},
}
for i, custom_regex in enumerate(custom_regexes)
]
custom_info_types = dictionaries + regexes
# Construct the configuration dictionary. Keys which are None may
# optionally be omitted entirely.
inspect_config = {
"info_types": info_types,
"custom_info_types": custom_info_types,
"min_likelihood": min_likelihood,
"include_quote": include_quote,
"limits": {"max_findings_per_request": max_findings},
}
# If mime_type is not specified, guess it from the filename.
if mime_type is None:
mime_guess = mimetypes.MimeTypes().guess_type(filename)
mime_type = mime_guess[0]
# Select the content type index from the list of supported types.
supported_content_types = {
None: 0, # "Unspecified"
"image/jpeg": 1,
"image/bmp": 2,
"image/png": 3,
"image/svg": 4,
"text/plain": 5,
}
content_type_index = supported_content_types.get(mime_type, 0)
# Construct the item, containing the file's byte data.
with open(filename, mode="rb") as f:
item = {"byte_item": {"type_": content_type_index, "data": f.read()}}
# Convert the project id into a full resource id.
parent = f"projects/{project}"
# Call the API.
response = dlp.inspect_content(
request={"parent": parent, "inspect_config": inspect_config, "item": item}
)
# Print out the results.
if response.result.findings:
for finding in response.result.findings:
try:
print(f"Quote: {finding.quote}")
except AttributeError:
pass
print(f"Info type: {finding.info_type.name}")
print(f"Likelihood: {finding.likelihood}")
else:
print("No findings.")
Nächste Schritte
Informationen zum Suchen und Filtern von Codebeispielen für andere Google Cloud-Produkte finden Sie unter Google Cloud-Beispielbrowser.
Sofern nicht anders angegeben, sind die Inhalte dieser Seite unter der Creative Commons Attribution 4.0 License und Codebeispiele unter der Apache 2.0 License lizenziert. Weitere Informationen finden Sie in den Websiterichtlinien von Google Developers. Java ist eine eingetragene Marke von Oracle und/oder seinen Partnern.
[{
"type": "thumb-down",
"id": "hardToUnderstand",
"label":"Schwer verständlich"
},{
"type": "thumb-down",
"id": "incorrectInformationOrSampleCode",
"label":"Informationen oder Beispielcode falsch"
},{
"type": "thumb-down",
"id": "missingTheInformationSamplesINeed",
"label":"Benötigte Informationen/Beispiele nicht gefunden"
},{
"type": "thumb-down",
"id": "translationIssue",
"label":"Problem mit der Übersetzung"
},{
"type": "thumb-down",
"id": "otherDown",
"label":"Sonstiges"
}]
[{
"type": "thumb-up",
"id": "easyToUnderstand",
"label":"Leicht verständlich"
},{
"type": "thumb-up",
"id": "solvedMyProblem",
"label":"Mein Problem wurde gelöst"
},{
"type": "thumb-up",
"id": "otherUp",
"label":"Sonstiges"
}]