The DLP API has many powerful capabilities, but depending on how much data you instruct it to scan, costs can become prohibitively high. This topic describes several methods that you can use to keep costs down while ensuring that the DLP API scans exactly the data that you intend it to scan.
Use sampling to restrict the number of bytes inspected
If you are scanning BigQuery tables or Cloud Storage buckets, the DLP API can scan a small subset of the dataset. This can provide a sampling of scan results without incurring the potential costs of scanning an entire dataset.
Once you find a sample with sensitive data, you can schedule a second, more exhaustive scan of that dataset to discover the entire list of findings.
For more information, see the "Limiting the amount of content inspected" section of Inspecting Storage and Databases for Sensitive Data.
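As a rough illustration of the sampling controls described above, the following sketch builds two storage configurations as plain Python dictionaries in the JSON shape of the REST API. The field names (`bytesLimitPerFile`, `filesLimitPercent`, `rowsLimit`, `sampleMethod`) reflect the v2 API reference as understood here and should be checked against the API version you use. The helper functions and the bucket, project, dataset, and table names are hypothetical.

```python
import json


def sampled_gcs_config(bucket_url, bytes_per_file, files_percent):
    """Build a storageConfig that inspects only part of a Cloud Storage bucket."""
    if not 0 <= files_percent <= 100:
        raise ValueError("files_percent must be between 0 and 100")
    return {
        "cloudStorageOptions": {
            "fileSet": {"url": bucket_url},
            # Stop reading each file after this many bytes (int64 is a string in JSON).
            "bytesLimitPerFile": str(bytes_per_file),
            # Inspect only this percentage of the matching files.
            "filesLimitPercent": files_percent,
            # Begin at a random offset rather than at the top of each file.
            "sampleMethod": "RANDOM_START",
        }
    }


def sampled_bigquery_config(project_id, dataset_id, table_id, max_rows):
    """Build a storageConfig that inspects at most max_rows rows of a table."""
    return {
        "bigQueryOptions": {
            "tableReference": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            # Cap the number of rows that are scanned.
            "rowsLimit": str(max_rows),
            "sampleMethod": "RANDOM_START",
        }
    }


if __name__ == "__main__":
    # Placeholder resource names, for illustration only.
    print(json.dumps(sampled_gcs_config("gs://example-bucket/*", 1000, 10), indent=2))
    print(json.dumps(sampled_bigquery_config("example-project", "example_dataset", "example_table", 5000), indent=2))
```

If a sampled scan like this reports findings, the same configuration without the limit fields could serve as the starting point for the more exhaustive follow-up scan.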
Scan only data that has changed
You can instruct the DLP API to avoid scanning data that hasn't been modified since the last inspection. Using StorageConfig lets you control what data to scan based on when the data was last modified.
If you're using job triggers, you can configure the trigger to automatically skip content that was scanned during the last scheduled job.
For more information, see the "Limit scans to only new content" section on the Job triggers conceptual page.
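The two approaches above can be sketched as follows, again using plain dictionaries in the REST JSON shape. This is a minimal sketch under assumptions: the text above does not name the job trigger setting, so the `enableAutoPopulationOfTimespanConfig` field used here, along with `timespanConfig` and `startTime`, is taken from the v2 API reference as understood here and should be verified. The helper names and the bucket URL are hypothetical.

```python
from datetime import datetime, timezone


def changed_since_config(storage_options, last_scan):
    """Return a storageConfig limited to data modified after last_scan."""
    if last_scan.tzinfo is None:
        raise ValueError("last_scan must be timezone-aware")
    config = dict(storage_options)
    config["timespanConfig"] = {
        # Timestamps are RFC 3339 strings in the JSON representation.
        "startTime": last_scan.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return config


def auto_skip_config(storage_options):
    """Return a storageConfig for a job trigger that skips already-scanned content."""
    config = dict(storage_options)
    # Assumed field name: asks the service to fill in the timespan from the
    # previous run of the trigger.
    config["timespanConfig"] = {"enableAutoPopulationOfTimespanConfig": True}
    return config


if __name__ == "__main__":
    # Placeholder bucket, for illustration only.
    options = {"cloudStorageOptions": {"fileSet": {"url": "gs://example-bucket/*"}}}
    print(changed_since_config(options, datetime(2018, 5, 1, 12, 0, tzinfo=timezone.utc)))
    print(auto_skip_config(options))
```

Both helpers copy the input dictionary rather than mutating it, so one base configuration can be reused for a one-off job and for a trigger.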
Limit scans of files in Cloud Storage to only relevant files
When you configure a Cloud Storage scan, you can use regular expression filters for finer control over which files or folders in buckets to include or exclude.
This is useful in situations where you want to skip scanning files that you know have no sensitive data, such as backups, TMP files, static Web content, and so on.
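To get a feel for how include and exclude filters interact, the following sketch builds a regular-expression file set and previews locally which object names it would select. Several things here are assumptions: the `regexFileSet`, `bucketName`, `includeRegex`, and `excludeRegex` field names come from the v2 API reference as understood here; the preview uses Python's `re` module with full-match semantics, whereas the service evaluates patterns itself and may differ in syntax and anchoring; and the helper names and object names are hypothetical.

```python
import re


def regex_file_set(bucket_name, include_regex=(), exclude_regex=()):
    """Build a fileSet that selects objects in a bucket by regular expression."""
    return {
        "regexFileSet": {
            "bucketName": bucket_name,
            "includeRegex": list(include_regex),
            "excludeRegex": list(exclude_regex),
        }
    }


def preview_selection(object_names, include_regex=(), exclude_regex=()):
    """Approximate locally which object names the filters would select."""
    includes = [re.compile(p) for p in include_regex]
    excludes = [re.compile(p) for p in exclude_regex]
    selected = []
    for name in object_names:
        # With no include patterns, every object is a candidate.
        included = not includes or any(p.fullmatch(name) for p in includes)
        # Any matching exclude pattern removes the object.
        excluded = any(p.fullmatch(name) for p in excludes)
        if included and not excluded:
            selected.append(name)
    return selected


if __name__ == "__main__":
    objects = [
        "data/users.csv",
        "data/tmp/cache.tmp",
        "backups/2018-01.bak",
        "static/index.html",
    ]
    print(regex_file_set("example-bucket", [r"data/.*"], [r".*\.tmp"]))
    print(preview_selection(objects, [r"data/.*"], [r".*\.tmp"]))
```

Previewing the selection against a listing of real object names before starting a job is a cheap way to confirm that backups, temporary files, and static content are actually excluded.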
Use the pricing calculator
- Open the Google Cloud Platform Pricing Calculator.
- Select the DLP API icon.
- Enter the number of bytes you estimate needing to scan.
- Enter the number of infoTypes that you plan to use.
If your scan processes less than 10 giga-units (GU), the estimate is $0, because the DLP API provides 10 GU of on-demand processing free per month.
For more information about giga-units and other pricing concepts, see Cloud Data Loss Prevention (DLP) API pricing.
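The effect of the free monthly allowance on an estimate can be sketched with a few lines of arithmetic. This is only an illustration: it assumes a single flat price per giga-unit beyond the free 10 GU, which may not match the actual pricing tiers, and `price_per_gu` is a hypothetical parameter that you would take from the pricing page.

```python
def billable_giga_units(estimated_gu, free_gu=10.0):
    """Return the giga-units that remain after the free monthly allowance."""
    if estimated_gu < 0:
        raise ValueError("estimated_gu must not be negative")
    return max(0.0, estimated_gu - free_gu)


def estimated_cost(estimated_gu, price_per_gu, free_gu=10.0):
    """Estimate monthly cost, assuming one flat price per billable giga-unit."""
    return billable_giga_units(estimated_gu, free_gu) * price_per_gu


if __name__ == "__main__":
    # The price below is a made-up value, for illustration only.
    for gu in (4, 10, 25):
        print(gu, billable_giga_units(gu), estimated_cost(gu, price_per_gu=2.0))
```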
View costs using a dashboard and query your audit logs
Create a dashboard to view your billing data so you can make adjustments to your DLP API usage. Also consider streaming your audit logs to BigQuery so you can analyze usage patterns.
You can export your billing data to BigQuery and visualize it in a tool such as Google Data Studio. For a tutorial on creating a billing dashboard, see Visualize GCP Billing using BigQuery and Data Studio.
You can also stream your audit logs to BigQuery and analyze the logs for usage patterns such as query costs by user.
Set budget alerts
Set a budget alert to track how your spend is growing toward a particular amount. Setting a budget does not cap API usage; it only alerts you when your spend approaches the specified amount.
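The difference between an alert and a cap can be illustrated with a small sketch of threshold logic. It is not how the billing service is implemented; the function name and the default threshold fractions are hypothetical and chosen only for the example.

```python
def crossed_thresholds(spend, budget, thresholds=(0.5, 0.9, 1.0)):
    """Return the alert thresholds (as fractions of budget) that spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    return [t for t in sorted(thresholds) if ratio >= t]


if __name__ == "__main__":
    # Spend at 92% of the budget has reached the 50% and 90% thresholds.
    print(crossed_thresholds(460, 500))
    # Spend can exceed the budget: alerts fire, but nothing stops usage.
    print(crossed_thresholds(650, 500))
```

Because nothing in this logic blocks further usage, an alert is useful only if someone acts on it, for example by tightening sampling limits or file filters.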