An infoType in the Cloud Data Loss Prevention (DLP) API is a type of sensitive data. For example, the EMAIL_ADDRESS infoType corresponds to an email address.
Every infoType has a corresponding detector. The DLP API uses infoType detectors in configuration to determine what to inspect for and how to transform findings. InfoType names are also used when displaying or reporting scan results.
Built-in infoType detectors
Built-in infoType detectors ship with the DLP API. They include detectors for country- or region-specific sensitive data, such as the French Numéro d'Inscription au Répertoire (NIR), UK driver's license numbers, and US Social Security numbers, as well as detectors for globally applicable sensitive data such as credit card numbers and email addresses. To detect content that corresponds to infoTypes, the DLP API uses several techniques, including pattern matching, checksums, machine learning, and context analysis.
The list of built-in infoType detectors is always being updated. For a complete list of currently supported built-in infoType detectors, see InfoType detector reference.
You can also view a complete list of all built-in infoType detectors by calling the DLP API's infoTypes.list method.
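As a rough sketch of what listing the built-in detectors programmatically might look like, the following Python builds the request URL and reads detector names out of a response body. It assumes the REST endpoint https://dlp.googleapis.com/v2/infoTypes with the languageCode and filter query parameters. The helper names and the sample response are made up for illustration, and authentication and the HTTP call itself are left out.

```python
import json
from urllib.parse import urlencode

# Assumed REST endpoint for listing built-in infoType detectors.
DLP_INFO_TYPES_URL = "https://dlp.googleapis.com/v2/infoTypes"


def build_list_url(language_code="en-US", supported_by="INSPECT"):
    """Return the request URL, including the optional query parameters."""
    query = urlencode({
        "languageCode": language_code,
        "filter": f"supported_by={supported_by}",
    })
    return f"{DLP_INFO_TYPES_URL}?{query}"


def info_type_names(response_text):
    """Extract the detector names from a JSON response body."""
    payload = json.loads(response_text)
    return [item["name"] for item in payload.get("infoTypes", [])]


# Made-up response body, trimmed to the fields used here.
sample_response = json.dumps({
    "infoTypes": [
        {"name": "EMAIL_ADDRESS", "displayName": "Email address"},
        {"name": "CREDIT_CARD_NUMBER", "displayName": "Credit card number"},
    ]
})

print(build_list_url())
print(info_type_names(sample_response))
```

In practice you would send an authenticated GET request to the URL and pass the response text to info_type_names.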
Built-in infoType detectors are not a 100% accurate detection method. For example, they can't guarantee compliance with regulatory requirements. You must decide what data is sensitive and how to best protect it. Google recommends that you test your settings to make sure your configuration meets your requirements.
Custom infoType detectors
There are two kinds of custom infoType detectors:
- Dictionaries
- Regular expressions (regex)
In addition, the DLP API includes the following detector extension, which allows you to fine-tune results by adjusting the likelihood based on other content in the vicinity of a potential finding:
- Hotword rules
In the DLP API, custom infoType detectors are defined in the CustomInfoType object, within the InspectConfig object. The CustomInfoType object allows you to create a custom infoType detector for new content or to fine-tune the results returned by existing detectors.
Use custom dictionaries to match a list of words or phrases. A dictionary can act as its own unique detector.
For more details about how dictionary custom infoType detectors work, as well as examples in action, see Creating a dictionary custom infoType detector.
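To make the shape of a dictionary detector concrete, here is a minimal sketch that builds the JSON for an inspection configuration with one dictionary custom infoType. It assumes the REST field names customInfoTypes, infoType, dictionary, and wordList. The detector name MEDICAL_TERM, the word list, and the helper functions are hypothetical.

```python
import json


def dictionary_detector(name, words):
    """Build a CustomInfoType-shaped dict backed by an inline word list."""
    return {
        "infoType": {"name": name},
        "dictionary": {"wordList": {"words": list(words)}},
    }


def inspect_config(custom_info_types):
    """Wrap custom detectors in an InspectConfig-shaped dict."""
    return {"customInfoTypes": list(custom_info_types)}


# Hypothetical detector name and word list, for illustration only.
config = inspect_config([
    dictionary_detector("MEDICAL_TERM", ["arthritis", "asthma", "diabetes"]),
])

print(json.dumps({"inspectConfig": config}, indent=2))
```

The printed JSON is the kind of fragment that would sit inside the body of an inspection request.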
A regular expression (regex) custom infoType detector lets you define your own detectors so that the DLP API finds matches based on a regex pattern. For example, suppose that you had medical record numbers in the form ###-#-#####. You could define a regex pattern for that form, and the DLP API would then match items such as 123-4-56789.
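As an illustration of the ###-#-##### form, the sketch below tries one possible pattern locally with Python's re module. The pattern is an assumption, not necessarily the one the DLP API documentation uses. The DLP API evaluates regexes with RE2 syntax, which this simple pattern should also satisfy.

```python
import re

# Assumed pattern for the ###-#-##### form: three digits, one digit, five digits.
MRN_REGEX = re.compile(r"\b\d{3}-\d-\d{5}\b")


def find_record_numbers(text):
    """Return every substring of text that has the ###-#-##### shape."""
    return MRN_REGEX.findall(text)


sample = "Chart 123-4-56789 was updated; call 555-1234 with questions."
print(find_record_numbers(sample))
```

Testing a pattern locally against sample text like this is a quick way to see how broadly it matches before using it in a custom infoType detector.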
You can also specify a likelihood to assign to each custom infoType match. That is, when the DLP API matches the sequence you specify, it assigns the likelihood that you have indicated. This is useful when your custom regex defines a sequence common enough that it could easily match unrelated text: you would not want the DLP API to label every such match as VERY_LIKELY. Doing so would erode confidence in scan results and potentially cause the wrong information to be flagged.
For more information about regular expression custom infoType detectors, and to see them in action, see Creating a regex custom infoType detector.
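The following sketch shows how a regex detector and its assigned likelihood might be expressed together in an inspection configuration. It assumes the REST field names regex, pattern, and likelihood on a custom infoType. The detector name MEDICAL_RECORD_NUMBER, the pattern, and the helper function are hypothetical.

```python
import json

# Likelihood values a finding can carry, from least to most certain.
LIKELIHOODS = ("VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY")


def regex_detector(name, pattern, likelihood="POSSIBLE"):
    """Build a CustomInfoType-shaped dict for a regex with a fixed likelihood."""
    if likelihood not in LIKELIHOODS:
        raise ValueError(f"unknown likelihood: {likelihood}")
    return {
        "infoType": {"name": name},
        "regex": {"pattern": pattern},
        "likelihood": likelihood,
    }


# Hypothetical detector: a loose pattern, so it gets a moderate likelihood.
detector = regex_detector("MEDICAL_RECORD_NUMBER", r"\d{3}-\d-\d{5}")

print(json.dumps({"inspectConfig": {"customInfoTypes": [detector]}}, indent=2))
```

Defaulting to a moderate value such as POSSIBLE reflects the point above: a pattern that could match unrelated digit sequences should not report every match as VERY_LIKELY.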
Hotword rules allow you to further extend dictionary and regex custom infoType detectors with context rules. Suppose you wanted to detect a custom infoType like a medical record number in the form ###-#-#####, and you wanted to boost the likelihood of the DLP API finding when the hotword "MRN" appears before, but not after, this number. Therefore:
- 123-4-56789 would match at the detector's base likelihood
- MRN 123-4-56789 would match at the boosted likelihood
Hotword rules enable the DLP API to do this. To learn how to configure them, see Customizing match likelihood.
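The before-but-not-after behavior can be approximated locally to see the idea. This is a simplified simulation, not the DLP API's implementation. The pattern, the 10-character window, and the POSSIBLE and VERY_LIKELY values are all assumptions chosen for illustration.

```python
import re

# Assumed patterns for the record number and the hotword.
MRN_PATTERN = re.compile(r"\b\d{3}-\d-\d{5}\b")
HOTWORD_PATTERN = re.compile(r"\bMRN\b")


def find_with_hotword(text, window_before=10,
                      base="POSSIBLE", boosted="VERY_LIKELY"):
    """Return (match, likelihood) pairs, boosting matches preceded by the hotword."""
    findings = []
    for match in MRN_PATTERN.finditer(text):
        # Inspect only the characters just before the match, never after it.
        start = max(0, match.start() - window_before)
        preceding = text[start:match.start()]
        likelihood = boosted if HOTWORD_PATTERN.search(preceding) else base
        findings.append((match.group(), likelihood))
    return findings


for sample in ("123-4-56789", "MRN 123-4-56789", "123-4-56789 MRN"):
    print(sample, "->", find_with_hotword(sample))
```

In the API itself, the same idea is expressed declaratively: a hotword rule combines a hotword regex, a proximity window around the finding, and a likelihood adjustment.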