Dictionary(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Custom information type based on a dictionary of words or phrases. This can be used to match sensitive information specific to the data, such as a list of employee IDs or job titles.
Dictionary words are case-insensitive, and all characters other than
letters and digits in the Unicode Basic Multilingual Plane
<https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane>
will be replaced with whitespace when scanning for matches, so the
dictionary phrase "Sam Johnson" will match all three phrases "sam
johnson", "Sam, Johnson", and "Sam (Johnson)". Additionally, the
characters surrounding any match must be of a different type than
the adjacent characters within the word, so letters must be next to
non-letters and digits next to non-digits. For example, the
dictionary word "jen" will match the first three letters of the text
"jen123" but will return no matches for "jennifer".
Dictionary words containing a large number of characters that are
not letters or digits may result in unexpected findings because such
characters are treated as whitespace. The limits
<https://cloud.google.com/dlp/limits> page contains
details about the size limits of dictionaries. For dictionaries that
do not fit within these constraints, consider using
LargeCustomDictionaryConfig in the StoredInfoType API.
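The matching rules described above can be approximated locally to reason about which texts a dictionary entry would hit. The following is a minimal sketch in plain Python, not the service's implementation: the helper names (`normalize`, `dictionary_matches`) are hypothetical, and `str.isalnum()` is assumed to be a close enough stand-in for "letters and digits".

```python
def normalize(text: str) -> str:
    """Lowercase the text and treat every character that is not a BMP
    letter or digit as whitespace, collapsing runs of whitespace.

    Assumption: str.isalnum() approximates the service's notion of
    "letters and digits"; the real classification may differ.
    """
    cleaned = "".join(
        ch if ch.isalnum() and ord(ch) <= 0xFFFF else " " for ch in text
    )
    return " ".join(cleaned.lower().split())


def _same_type(a: str, b: str) -> bool:
    """True when both characters are letters or both are digits."""
    return (a.isalpha() and b.isalpha()) or (a.isdigit() and b.isdigit())


def dictionary_matches(phrase: str, text: str) -> bool:
    """Return True if `phrase` would plausibly match somewhere in `text`."""
    p, t = normalize(phrase), normalize(text)
    if not p:
        return False
    start = t.find(p)
    while start != -1:
        end = start + len(p)
        # The character on each side of the match must differ in type from
        # the adjacent character inside the match (letter vs. non-letter,
        # digit vs. non-digit).
        before_ok = start == 0 or not _same_type(t[start - 1], p[0])
        after_ok = end == len(t) or not _same_type(p[-1], t[end])
        if before_ok and after_ok:
            return True
        start = t.find(p, start + 1)
    return False
```

Under these assumptions, `dictionary_matches("Sam Johnson", "Sam (Johnson)")` and `dictionary_matches("jen", "jen123")` are true, while `dictionary_matches("jen", "jennifer")` is false, which mirrors the examples in the description.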
Attributes
Name | Description |
---|---|
word_list | Type: google.cloud.dlp_v2.types.CustomInfoType.Dictionary.WordList. List of words or phrases to search for. |
cloud_storage_path | Type: google.cloud.dlp_v2.types.CloudStoragePath. Newline-delimited file of words in Cloud Storage. Only a single file is accepted. |
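As an illustration of the two attributes, the sketch below builds plain mappings of the kind that can be passed as the `mapping` argument of the constructor. The helper names, the sample words, and the bucket path are hypothetical; the nested field names `words` (on WordList) and `path` (on CloudStoragePath) are taken from the corresponding message types and should be checked against the installed library version. Converting a mapping into a message requires the `google-cloud-dlp` package, so that step is isolated in `to_message`.

```python
from typing import Any, Dict, List


def word_list_dictionary(words: List[str]) -> Dict[str, Any]:
    """Mapping for a dictionary backed by an inline list of words or phrases."""
    return {"word_list": {"words": list(words)}}


def cloud_storage_dictionary(gcs_uri: str) -> Dict[str, Any]:
    """Mapping for a dictionary backed by one newline-delimited file in
    Cloud Storage."""
    return {"cloud_storage_path": {"path": gcs_uri}}


def to_message(mapping: Dict[str, Any]):
    """Convert a mapping into a Dictionary message.

    Imported lazily so the helpers above work without the client library.
    """
    from google.cloud import dlp_v2  # requires the google-cloud-dlp package

    return dlp_v2.types.CustomInfoType.Dictionary(mapping)


# Hypothetical sample values, for illustration only.
job_titles = word_list_dictionary(["Software Engineer", "Product Manager"])
employee_ids = cloud_storage_dictionary("gs://example-bucket/employee_ids.txt")
```

Keeping the configuration as plain mappings makes it easy to inspect or serialize before handing it to the client library.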
Classes
WordList
WordList(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Message defining a list of words or phrases to search for in the data.