Index
DlpService (interface)
Action (message)
Action.JobNotificationEmails (message)
Action.PublishFindingsToCloudDataCatalog (message)
Action.PublishSummaryToCscc (message)
Action.PublishToPubSub (message)
Action.PublishToStackdriver (message)
Action.SaveFindings (message)
ActivateJobTriggerRequest (message)
AnalyzeDataSourceRiskDetails (message)
AnalyzeDataSourceRiskDetails.CategoricalStatsResult (message)
AnalyzeDataSourceRiskDetails.CategoricalStatsResult.CategoricalStatsHistogramBucket (message)
AnalyzeDataSourceRiskDetails.DeltaPresenceEstimationResult (message)
AnalyzeDataSourceRiskDetails.DeltaPresenceEstimationResult.DeltaPresenceEstimationHistogramBucket (message)
AnalyzeDataSourceRiskDetails.DeltaPresenceEstimationResult.DeltaPresenceEstimationQuasiIdValues (message)
AnalyzeDataSourceRiskDetails.KAnonymityResult (message)
AnalyzeDataSourceRiskDetails.KAnonymityResult.KAnonymityEquivalenceClass (message)
AnalyzeDataSourceRiskDetails.KAnonymityResult.KAnonymityHistogramBucket (message)
AnalyzeDataSourceRiskDetails.KMapEstimationResult (message)
AnalyzeDataSourceRiskDetails.KMapEstimationResult.KMapEstimationHistogramBucket (message)
AnalyzeDataSourceRiskDetails.KMapEstimationResult.KMapEstimationQuasiIdValues (message)
AnalyzeDataSourceRiskDetails.LDiversityResult (message)
AnalyzeDataSourceRiskDetails.LDiversityResult.LDiversityEquivalenceClass (message)
AnalyzeDataSourceRiskDetails.LDiversityResult.LDiversityHistogramBucket (message)
AnalyzeDataSourceRiskDetails.NumericalStatsResult (message)
AnalyzeDataSourceRiskDetails.RequestedRiskAnalysisOptions (message)
BigQueryField (message)
BigQueryKey (message)
BigQueryOptions (message)
BigQueryOptions.SampleMethod (enum)
BigQueryTable (message)
BoundingBox (message)
BucketingConfig (message)
BucketingConfig.Bucket (message)
ByteContentItem (message)
ByteContentItem.BytesType (enum)
CancelDlpJobRequest (message)
CharacterMaskConfig (message)
CharsToIgnore (message)
CharsToIgnore.CommonCharsToIgnore (enum)
CloudStorageFileSet (message)
CloudStorageOptions (message)
CloudStorageOptions.FileSet (message)
CloudStorageOptions.SampleMethod (enum)
CloudStoragePath (message)
CloudStorageRegexFileSet (message)
Color (message)
Container (message)
ContentItem (message)
ContentLocation (message)
ContentOption (enum)
CreateDeidentifyTemplateRequest (message)
CreateDlpJobRequest (message)
CreateInspectTemplateRequest (message)
CreateJobTriggerRequest (message)
CreateStoredInfoTypeRequest (message)
CryptoDeterministicConfig (message)
CryptoHashConfig (message)
CryptoKey (message)
CryptoReplaceFfxFpeConfig (message)
CryptoReplaceFfxFpeConfig.FfxCommonNativeAlphabet (enum)
CustomInfoType (message)
CustomInfoType.DetectionRule (message)
CustomInfoType.DetectionRule.HotwordRule (message)
CustomInfoType.DetectionRule.LikelihoodAdjustment (message)
CustomInfoType.DetectionRule.Proximity (message)
CustomInfoType.Dictionary (message)
CustomInfoType.Dictionary.WordList (message)
CustomInfoType.ExclusionType (enum)
CustomInfoType.Regex (message)
CustomInfoType.SurrogateType (message)
DatastoreKey (message)
DatastoreOptions (message)
DateShiftConfig (message)
DateTime (message)
DateTime.TimeZone (message)
DeidentifyConfig (message)
DeidentifyContentRequest (message)
DeidentifyContentResponse (message)
DeidentifyTemplate (message)
DeleteDeidentifyTemplateRequest (message)
DeleteDlpJobRequest (message)
DeleteInspectTemplateRequest (message)
DeleteJobTriggerRequest (message)
DeleteStoredInfoTypeRequest (message)
DlpJob (message)
DlpJob.JobState (enum)
DlpJobType (enum)
DocumentLocation (message)
EntityId (message)
Error (message)
ExcludeInfoTypes (message)
ExclusionRule (message)
FieldId (message)
FieldTransformation (message)
FileType (enum)
Finding (message)
FinishDlpJobRequest (message)
FixedSizeBucketingConfig (message)
GetDeidentifyTemplateRequest (message)
GetDlpJobRequest (message)
GetInspectTemplateRequest (message)
GetJobTriggerRequest (message)
GetStoredInfoTypeRequest (message)
HybridContentItem (message)
HybridFindingDetails (message)
HybridInspectDlpJobRequest (message)
HybridInspectJobTriggerRequest (message)
HybridInspectResponse (message)
HybridInspectStatistics (message)
HybridOptions (message)
ImageLocation (message)
InfoType (message)
InfoTypeDescription (message)
InfoTypeStats (message)
InfoTypeSupportedBy (enum)
InfoTypeTransformations (message)
InfoTypeTransformations.InfoTypeTransformation (message)
InspectConfig (message)
InspectConfig.FindingLimits (message)
InspectConfig.FindingLimits.InfoTypeLimit (message)
InspectContentRequest (message)
InspectContentResponse (message)
InspectDataSourceDetails (message)
InspectDataSourceDetails.RequestedOptions (message)
InspectDataSourceDetails.Result (message)
InspectJobConfig (message)
InspectResult (message)
InspectTemplate (message)
InspectionRule (message)
InspectionRuleSet (message)
JobTrigger (message)
JobTrigger.Status (enum)
JobTrigger.Trigger (message)
Key (message)
Key.PathElement (message)
KindExpression (message)
KmsWrappedCryptoKey (message)
LargeCustomDictionaryConfig (message)
LargeCustomDictionaryStats (message)
Likelihood (enum)
ListDeidentifyTemplatesRequest (message)
ListDeidentifyTemplatesResponse (message)
ListDlpJobsRequest (message)
ListDlpJobsResponse (message)
ListInfoTypesRequest (message)
ListInfoTypesResponse (message)
ListInspectTemplatesRequest (message)
ListInspectTemplatesResponse (message)
ListJobTriggersRequest (message)
ListJobTriggersResponse (message)
ListStoredInfoTypesRequest (message)
ListStoredInfoTypesResponse (message)
Location (message)
Manual (message)
MatchingType (enum)
MetadataLocation (message)
MetadataType (enum)
OutputStorageConfig (message)
OutputStorageConfig.OutputSchema (enum)
PartitionId (message)
PrimitiveTransformation (message)
PrivacyMetric (message)
PrivacyMetric.CategoricalStatsConfig (message)
PrivacyMetric.DeltaPresenceEstimationConfig (message)
PrivacyMetric.KAnonymityConfig (message)
PrivacyMetric.KMapEstimationConfig (message)
PrivacyMetric.KMapEstimationConfig.AuxiliaryTable (message)
PrivacyMetric.KMapEstimationConfig.AuxiliaryTable.QuasiIdField (message)
PrivacyMetric.KMapEstimationConfig.TaggedField (message)
PrivacyMetric.LDiversityConfig (message)
PrivacyMetric.NumericalStatsConfig (message)
QuasiId (message)
QuoteInfo (message)
Range (message)
RecordCondition (message)
RecordCondition.Condition (message)
RecordCondition.Conditions (message)
RecordCondition.Expressions (message)
RecordCondition.Expressions.LogicalOperator (enum)
RecordKey (message)
RecordLocation (message)
RecordSuppression (message)
RecordTransformations (message)
RedactConfig (message)
RedactImageRequest (message)
RedactImageRequest.ImageRedactionConfig (message)
RedactImageResponse (message)
ReidentifyContentRequest (message)
ReidentifyContentResponse (message)
RelationalOperator (enum)
ReplaceValueConfig (message)
ReplaceWithInfoTypeConfig (message)
RiskAnalysisJobConfig (message)
Schedule (message)
StatisticalTable (message)
StatisticalTable.QuasiIdentifierField (message)
StorageConfig (message)
StorageConfig.TimespanConfig (message)
StorageMetadataLabel (message)
StoredInfoType (message)
StoredInfoTypeConfig (message)
StoredInfoTypeState (enum)
StoredInfoTypeStats (message)
StoredInfoTypeVersion (message)
StoredType (message)
Table (message)
Table.Row (message)
TableLocation (message)
TableOptions (message)
TimePartConfig (message)
TimePartConfig.TimePart (enum)
TransformationErrorHandling (message)
TransformationErrorHandling.LeaveUntransformed (message)
TransformationErrorHandling.ThrowError (message)
TransformationOverview (message)
TransformationSummary (message)
TransformationSummary.SummaryResult (message)
TransformationSummary.TransformationResultCode (enum)
TransientCryptoKey (message)
UnwrappedCryptoKey (message)
UpdateDeidentifyTemplateRequest (message)
UpdateInspectTemplateRequest (message)
UpdateJobTriggerRequest (message)
UpdateStoredInfoTypeRequest (message)
Value (message)
ValueFrequency (message)
DlpService
The Cloud Data Loss Prevention (DLP) API is a service that allows clients to detect the presence of Personally Identifiable Information (PII) and other privacy-sensitive data in user-supplied, unstructured data streams, like text blocks or images. The service also includes methods for sensitive data redaction and for scheduling data scans on Google Cloud Platform-based datasets.
To learn more about concepts and find how-to guides, see https://cloud.google.com/dlp/docs/.
ActivateJobTrigger | |
---|---|
Activate a job trigger. Causes the immediate execution of a trigger instead of waiting for the trigger event to occur.
|
CancelDlpJob | |
---|---|
Starts asynchronous cancellation on a long-running DlpJob. The server makes a best effort to cancel the DlpJob, but success is not guaranteed. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
|
CreateDeidentifyTemplate | |
---|---|
Creates a DeidentifyTemplate for re-using frequently used configuration for de-identifying content, images, and storage. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
|
CreateDlpJob | |
---|---|
Creates a new job to inspect storage or calculate risk metrics. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. When no InfoTypes or CustomInfoTypes are specified in inspect jobs, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated.
|
CreateInspectTemplate | |
---|---|
Creates an InspectTemplate for re-using frequently used configuration for inspecting content, images, and storage. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
|
CreateJobTrigger | |
---|---|
Creates a job trigger to run DLP actions such as scanning storage for sensitive information on a set schedule. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
|
CreateStoredInfoType | |
---|---|
Creates a pre-built stored infoType to be used for inspection. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
|
DeidentifyContent | |
---|---|
De-identifies potentially sensitive info from a ContentItem. This method has limits on input size and output size. See https://cloud.google.com/dlp/docs/deidentify-sensitive-data to learn more. When no InfoTypes or CustomInfoTypes are specified in this request, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated.
|
DeleteDeidentifyTemplate | |
---|---|
Deletes a DeidentifyTemplate. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
|
DeleteDlpJob | |
---|---|
Deletes a long-running DlpJob. This method indicates that the client is no longer interested in the DlpJob result. The job will be cancelled if possible. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
|
DeleteInspectTemplate | |
---|---|
Deletes an InspectTemplate. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
|
DeleteJobTrigger | |
---|---|
Deletes a job trigger. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
|
DeleteStoredInfoType | |
---|---|
Deletes a stored infoType. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
|
FinishDlpJob | |
---|---|
Finish a running hybrid DlpJob. Triggers the finalization steps and running of any enabled actions that have not yet run.
|
GetDeidentifyTemplate | |
---|---|
Gets a DeidentifyTemplate. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
|
GetDlpJob | |
---|---|
Gets the latest state of a long-running DlpJob. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
|
GetInspectTemplate | |
---|---|
Gets an InspectTemplate. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
|
GetJobTrigger | |
---|---|
Gets a job trigger. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
|
GetStoredInfoType | |
---|---|
Gets a stored infoType. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
|
HybridInspectDlpJob | |
---|---|
Inspect hybrid content and store findings to a job. To review the findings, inspect the job. Inspection will occur asynchronously.
|
HybridInspectJobTrigger | |
---|---|
Inspect hybrid content and store findings to a trigger. The inspection will be processed asynchronously. To review the findings, monitor the jobs within the trigger.
|
InspectContent | |
---|---|
Finds potentially sensitive info in content. This method has limits on input size, processing time, and output size. When no InfoTypes or CustomInfoTypes are specified in this request, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated. For how-to guides, see https://cloud.google.com/dlp/docs/inspecting-images and https://cloud.google.com/dlp/docs/inspecting-text.
|
ListDeidentifyTemplates | |
---|---|
Lists DeidentifyTemplates. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
|
ListDlpJobs | |
---|---|
Lists DlpJobs that match the specified filter in the request. See https://cloud.google.com/dlp/docs/inspecting-storage and https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
|
ListInfoTypes | |
---|---|
Returns a list of the sensitive information types that the DLP API supports. See https://cloud.google.com/dlp/docs/infotypes-reference to learn more.
|
ListInspectTemplates | |
---|---|
Lists InspectTemplates. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
|
ListJobTriggers | |
---|---|
Lists job triggers. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
|
ListStoredInfoTypes | |
---|---|
Lists stored infoTypes. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
|
RedactImage | |
---|---|
Redacts potentially sensitive info from an image. This method has limits on input size, processing time, and output size. See https://cloud.google.com/dlp/docs/redacting-sensitive-data-images to learn more. When no InfoTypes or CustomInfoTypes are specified in this request, the system will automatically choose what detectors to run. By default this may be all types, but may change over time as detectors are updated.
|
ReidentifyContent | |
---|---|
Re-identifies content that has been de-identified. See https://cloud.google.com/dlp/docs/pseudonymization#re-identification_in_free_text_code_example to learn more.
|
UpdateDeidentifyTemplate | |
---|---|
Updates the DeidentifyTemplate. See https://cloud.google.com/dlp/docs/creating-templates-deid to learn more.
|
UpdateInspectTemplate | |
---|---|
Updates the InspectTemplate. See https://cloud.google.com/dlp/docs/creating-templates to learn more.
|
UpdateJobTrigger | |
---|---|
Updates a job trigger. See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
|
UpdateStoredInfoType | |
---|---|
Updates the stored infoType by creating a new version. The existing version will continue to be used until the new version is ready. See https://cloud.google.com/dlp/docs/creating-stored-infotypes to learn more.
|
Action
A task to execute on the completion of a job. See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
Fields | ||
---|---|---|
Union field
|
||
save_findings |
Save resulting findings in a provided location. |
|
pub_sub |
Publish a notification to a pubsub topic. |
|
publish_summary_to_cscc |
Publish summary to Cloud Security Command Center (Alpha). |
|
publish_findings_to_cloud_data_catalog |
Publish findings to Cloud Data Catalog. |
|
job_notification_emails |
Enable email notification for project owners and editors on job's completion/failure. |
|
publish_to_stackdriver |
Enable Stackdriver metric dlp.googleapis.com/finding_count. |
JobNotificationEmails
Enable email notification to project owners and editors on job's completion/failure.
PublishFindingsToCloudDataCatalog
Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the results of the DlpJob will be applied to the entry for the resource scanned in Cloud Data Catalog. Any labels previously written by another DlpJob will be deleted. InfoType naming patterns are strictly enforced when using this feature. Note that the findings will be persisted in Cloud Data Catalog storage and are governed by Data Catalog service-specific policy; see https://cloud.google.com/terms/service-terms. Only a single instance of this action can be specified, and it is allowed only if all resources being scanned are BigQuery tables. Compatible with: Inspect
PublishSummaryToCscc
Publish the result summary of a DlpJob to the Cloud Security Command Center (CSCC Alpha). This action is only available for projects that are part of an organization and whitelisted for the alpha Cloud Security Command Center. The action will publish the count of finding instances and their info types. The summary of findings will be persisted in CSCC and is governed by CSCC service-specific policy; see https://cloud.google.com/terms/service-terms. Only a single instance of this action can be specified. Compatible with: Inspect
PublishToPubSub
Publish a message into the given Pub/Sub topic when the DlpJob has completed. The message contains a single field, DlpJobName, which is equal to the finished job's DlpJob.name. Compatible with: Inspect, Risk
Fields | |
---|---|
topic |
Cloud Pub/Sub topic to send notifications to. The topic must have given publishing access rights to the DLP API service account executing the long running DlpJob sending the notifications. Format is projects/{project}/topics/{topic}. |
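A minimal sketch of how a request body might reference a topic, assuming a plain Python dict mirrors the message shape. The helper name `pubsub_action` and the project and topic IDs are hypothetical; only the `pub_sub` and `topic` field names and the topic format come from this reference.

```python
def pubsub_action(project: str, topic: str) -> dict:
    """Build a dict shaped like an Action with its pub_sub field set.

    Hypothetical helper for illustration; not part of any client library.
    """
    if not project or not topic:
        raise ValueError("project and topic must be non-empty")
    # Documented format: projects/{project}/topics/{topic}
    return {"pub_sub": {"topic": f"projects/{project}/topics/{topic}"}}


# Placeholder IDs, chosen only for the example.
action = pubsub_action("my-project", "dlp-job-done")
print(action["pub_sub"]["topic"])
```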
PublishToStackdriver
Enable Stackdriver metric dlp.googleapis.com/finding_count. This publishes a metric to Stackdriver for each requested infoType, recording how many findings were found for it. CustomDetectors will be bucketed as 'Custom' under the Stackdriver label 'info_type'.
SaveFindings
If set, the detailed findings will be persisted to the specified OutputStorageConfig. Only a single instance of this action can be specified. Compatible with: Inspect, Risk
Fields | |
---|---|
output_config |
Location to store findings outside of DLP. |
ActivateJobTriggerRequest
Request message for ActivateJobTrigger.
Fields | |
---|---|
name |
Required. Resource name of the trigger to activate, for example Authorization requires one or more of the following IAM permissions on the specified resource
|
AnalyzeDataSourceRiskDetails
Result of a risk analysis operation request.
Fields | ||
---|---|---|
requested_privacy_metric |
Privacy metric to compute. |
|
requested_source_table |
Input dataset to compute metrics over. |
|
requested_options |
The configuration used for this job. |
|
Union field result. Values associated with this metric. result can be only one of the following: |
||
numerical_stats_result |
Numerical stats result |
|
categorical_stats_result |
Categorical stats result |
|
k_anonymity_result |
K-anonymity result |
|
l_diversity_result |
L-diversity result |
|
k_map_estimation_result |
K-map result |
|
delta_presence_estimation_result |
Delta-presence result |
CategoricalStatsResult
Result of the categorical stats computation.
Fields | |
---|---|
value_frequency_histogram_buckets[] |
Histogram of value frequencies in the column. |
CategoricalStatsHistogramBucket
Histogram of value frequencies in the column.
Fields | |
---|---|
value_frequency_lower_bound |
Lower bound on the value frequency of the values in this bucket. |
value_frequency_upper_bound |
Upper bound on the value frequency of the values in this bucket. |
bucket_size |
Total number of values in this bucket. |
bucket_values[] |
Sample of value frequencies in this bucket. The total number of values returned per bucket is capped at 20. |
bucket_value_count |
Total number of distinct values in this bucket. |
DeltaPresenceEstimationResult
Result of the δ-presence computation. Note that these results are an estimation, not exact values.
Fields | |
---|---|
delta_presence_estimation_histogram[] |
The intervals [min_probability, max_probability) do not overlap. If a value doesn't correspond to any such interval, the associated frequency is zero. For example, the following records: {min_probability: 0, max_probability: 0.1, frequency: 17}, {min_probability: 0.2, max_probability: 0.3, frequency: 42}, {min_probability: 0.3, max_probability: 0.4, frequency: 99} mean that there are no records with an estimated probability in [0.1, 0.2) or greater than or equal to 0.4. |
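The half-open interval rule can be sketched as a lookup over the three example records. This is an illustrative reading of the histogram on the client side, not service code; `BUCKETS` and `frequency_for` are hypothetical names.

```python
from typing import List, Tuple

# (min_probability, max_probability, frequency), copied from the example records.
BUCKETS: List[Tuple[float, float, int]] = [
    (0.0, 0.1, 17),
    (0.2, 0.3, 42),
    (0.3, 0.4, 99),
]


def frequency_for(probability: float, buckets=BUCKETS) -> int:
    """Return the frequency of the bucket [min, max) containing probability.

    A probability that falls in no interval has frequency zero.
    """
    for lo, hi, freq in buckets:
        if lo <= probability < hi:  # lower bound inclusive, upper bound exclusive
            return freq
    return 0


print(frequency_for(0.05))  # inside [0, 0.1)
print(frequency_for(0.15))  # in the gap [0.1, 0.2)
print(frequency_for(0.4))   # at or above 0.4
```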
DeltaPresenceEstimationHistogramBucket
A DeltaPresenceEstimationHistogramBucket message with the following values: min_probability: 0.1, max_probability: 0.2, frequency: 42 means that there are 42 records for which δ is in [0.1, 0.2). An important particular case is when min_probability = max_probability = 1: then, every individual who shares this quasi-identifier combination is in the dataset.
Fields | |
---|---|
min_probability |
Between 0 and 1. |
max_probability |
Always greater than or equal to min_probability. |
bucket_size |
Number of records within these probability bounds. |
bucket_values[] |
Sample of quasi-identifier tuple values in this bucket. The total number of classes returned per bucket is capped at 20. |
bucket_value_count |
Total number of distinct quasi-identifier tuple values in this bucket. |
DeltaPresenceEstimationQuasiIdValues
A tuple of values for the quasi-identifier columns.
Fields | |
---|---|
quasi_ids_values[] |
The quasi-identifier values. |
estimated_probability |
The estimated probability that a given individual sharing these quasi-identifier values is in the dataset. This value, typically called δ, is the ratio between the number of records in the dataset with these quasi-identifier values, and the total number of individuals (inside and outside the dataset) with these quasi-identifier values. For example, if there are 15 individuals in the dataset who share the same quasi-identifier values, and an estimated 100 people in the entire population with these values, then δ is 0.15. |
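The ratio in the example above can be written out directly. This is a sketch of the arithmetic only; the function and parameter names are hypothetical, and in practice the population count is itself an estimate produced by the service.

```python
def estimated_delta(records_in_dataset: int, estimated_population: int) -> float:
    """Ratio of dataset records to the estimated population sharing the same quasi-identifier values."""
    if estimated_population <= 0:
        raise ValueError("estimated_population must be positive")
    return records_in_dataset / estimated_population


# 15 individuals in the dataset, an estimated 100 in the whole population.
delta = estimated_delta(15, 100)
print(delta)
```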
KAnonymityResult
Result of the k-anonymity computation.
Fields | |
---|---|
equivalence_class_histogram_buckets[] |
Histogram of k-anonymity equivalence classes. |
KAnonymityEquivalenceClass
The set of quasi-identifier column values that defines a k-anonymity equivalence class.
Fields | |
---|---|
quasi_ids_values[] |
Set of values defining the equivalence class. One value per quasi-identifier column in the original KAnonymity metric message. The order is always the same as the original request. |
equivalence_class_size |
Size of the equivalence class, that is, the number of rows with the above set of values. |
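To make the two fields concrete, the following sketch groups a few made-up rows by their quasi-identifier values and counts each group. The column names and data are hypothetical, and the grouping is a local illustration rather than how the service computes results.

```python
from collections import Counter


def equivalence_classes(rows, quasi_id_columns):
    """Map each tuple of quasi-identifier values to the number of rows sharing it."""
    return Counter(tuple(row[col] for col in quasi_id_columns) for row in rows)


# Made-up rows; "zip" and "age" play the role of quasi-identifier columns.
rows = [
    {"zip": "94103", "age": 34, "name": "A"},
    {"zip": "94103", "age": 34, "name": "B"},
    {"zip": "10001", "age": 51, "name": "C"},
]

classes = equivalence_classes(rows, ["zip", "age"])
for quasi_ids_values, equivalence_class_size in classes.items():
    print(quasi_ids_values, equivalence_class_size)

# The smallest class size is the k for which this toy table is k-anonymous.
k = min(classes.values())
```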
KAnonymityHistogramBucket
Histogram of k-anonymity equivalence classes.
Fields | |
---|---|
equivalence_class_size_lower_bound |
Lower bound on the size of the equivalence classes in this bucket. |
equivalence_class_size_upper_bound |
Upper bound on the size of the equivalence classes in this bucket. |
bucket_size |
Total number of equivalence classes in this bucket. |
bucket_values[] |
Sample of equivalence classes in this bucket. The total number of classes returned per bucket is capped at 20. |
bucket_value_count |
Total number of distinct equivalence classes in this bucket. |
KMapEstimationResult
Result of the reidentifiability analysis. Note that these results are an estimation, not exact values.
Fields | |
---|---|
k_map_estimation_histogram[] |
The intervals [min_anonymity, max_anonymity] do not overlap. If a value doesn't correspond to any such interval, the associated frequency is zero. For example, the following records: {min_anonymity: 1, max_anonymity: 1, frequency: 17}, {min_anonymity: 2, max_anonymity: 3, frequency: 42}, {min_anonymity: 5, max_anonymity: 10, frequency: 99} mean that there are no records with an estimated anonymity of 4 or larger than 10. |
KMapEstimationHistogramBucket
A KMapEstimationHistogramBucket message with the following values: min_anonymity: 3, max_anonymity: 5, frequency: 42 means that there are 42 records whose quasi-identifier values correspond to 3, 4 or 5 people in the overlying population. An important particular case is when min_anonymity = max_anonymity = 1: the frequency field then corresponds to the number of uniquely identifiable records.
Fields | |
---|---|
min_anonymity |
Always positive. |
max_anonymity |
Always greater than or equal to min_anonymity. |
bucket_size |
Number of records within these anonymity bounds. |
bucket_values[] |
Sample of quasi-identifier tuple values in this bucket. The total number of classes returned per bucket is capped at 20. |
bucket_value_count |
Total number of distinct quasi-identifier tuple values in this bucket. |
KMapEstimationQuasiIdValues
A tuple of values for the quasi-identifier columns.
Fields | |
---|---|
quasi_ids_values[] |
The quasi-identifier values. |
estimated_anonymity |
The estimated anonymity for these quasi-identifier values. |
LDiversityResult
Result of the l-diversity computation.
Fields | |
---|---|
sensitive_value_frequency_histogram_buckets[] |
Histogram of l-diversity equivalence class sensitive value frequencies. |
LDiversityEquivalenceClass
The set of columns' values that share the same ldiversity value.
Fields | |
---|---|
quasi_ids_values[] |
Quasi-identifier values defining the k-anonymity equivalence class. The order is always the same as the original request. |
equivalence_class_size |
Size of the k-anonymity equivalence class. |
num_distinct_sensitive_values |
Number of distinct sensitive values in this equivalence class. |
top_sensitive_values[] |
Estimated frequencies of top sensitive values. |
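As an illustration of num_distinct_sensitive_values, this sketch counts distinct sensitive values within each k-anonymity equivalence class of a toy table. The column names and rows are hypothetical; the service's own computation is not described here.

```python
from collections import defaultdict


def distinct_sensitive_values(rows, quasi_id_columns, sensitive_column):
    """For each equivalence class, count the distinct values of the sensitive column."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[col] for col in quasi_id_columns)
        groups[key].add(row[sensitive_column])
    return {key: len(values) for key, values in groups.items()}


# Made-up rows; "diagnosis" plays the role of the sensitive attribute.
rows = [
    {"zip": "94103", "age": 34, "diagnosis": "flu"},
    {"zip": "94103", "age": 34, "diagnosis": "asthma"},
    {"zip": "94103", "age": 34, "diagnosis": "flu"},
    {"zip": "10001", "age": 51, "diagnosis": "flu"},
]

counts = distinct_sensitive_values(rows, ["zip", "age"], "diagnosis")
print(counts)
```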
LDiversityHistogramBucket
Histogram of l-diversity equivalence class sensitive value frequencies.
Fields | |
---|---|
sensitive_value_frequency_lower_bound |
Lower bound on the sensitive value frequencies of the equivalence classes in this bucket. |
sensitive_value_frequency_upper_bound |
Upper bound on the sensitive value frequencies of the equivalence classes in this bucket. |
bucket_size |
Total number of equivalence classes in this bucket. |
bucket_values[] |
Sample of equivalence classes in this bucket. The total number of classes returned per bucket is capped at 20. |
bucket_value_count |
Total number of distinct equivalence classes in this bucket. |
NumericalStatsResult
Result of the numerical stats computation.
Fields | |
---|---|
min_value |
Minimum value appearing in the column. |
max_value |
Maximum value appearing in the column. |
quantile_values[] |
List of 99 values that partition the set of field values into 100 equal-sized buckets. |
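The relationship between 99 cut points and 100 buckets can be seen with Python's standard library. This only illustrates the shape of quantile_values; the interpolation method the service uses is not stated in this reference, and the sample data is made up.

```python
import statistics

# Made-up column values.
values = list(range(1, 1001))

# n=100 asks for the cut points that split the data into 100 equal-sized
# groups; the function returns n - 1 of them.
cut_points = statistics.quantiles(values, n=100)

print(len(cut_points))
```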
RequestedRiskAnalysisOptions
Risk analysis options.
Fields | |
---|---|
job_config |
The job config for the risk job. |
BigQueryField
Message defining a field of a BigQuery table.
Fields | |
---|---|
table |
Source table of the field. |
field |
Designated field in the BigQuery table. |
BigQueryKey
Row key for identifying a record in a BigQuery table.
Fields | |
---|---|
table_reference |
Complete BigQuery table reference. |
row_number |
Row number inferred at the time the table was scanned. This value is nondeterministic, cannot be queried, and may be null for inspection jobs. To locate findings within a table, specify |
BigQueryOptions
Options defining BigQuery table and row identifiers.
Fields | |
---|---|
table_reference |
Complete BigQuery table reference. |
identifying_fields[] |
Table fields that may uniquely identify a row within the table. When |
rows_limit |
Max number of rows to scan. If the table has more rows than this value, the rest of the rows are omitted. If not set, or if set to 0, all rows will be scanned. Only one of rows_limit and rows_limit_percent can be specified. Cannot be used in conjunction with TimespanConfig. |
rows_limit_percent |
Max percentage of rows to scan. The rest are omitted. The number of rows scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one of rows_limit and rows_limit_percent can be specified. Cannot be used in conjunction with TimespanConfig. |
sample_method |
|
excluded_fields[] |
References to fields excluded from scanning. This allows you to skip inspection of entire columns which you know have no findings. |
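The interaction between rows_limit and rows_limit_percent can be sketched as follows. This is a client-side reading of the documented rules, not service code; `rows_to_scan` is a hypothetical helper, and treating 0 as "not set" follows the field descriptions.

```python
def rows_to_scan(total_rows: int, rows_limit: int = 0, rows_limit_percent: int = 0) -> int:
    """Estimate how many rows would be scanned under the documented limits."""
    if rows_limit < 0:
        raise ValueError("rows_limit must not be negative")
    if not 0 <= rows_limit_percent <= 100:
        raise ValueError("rows_limit_percent must be between 0 and 100, inclusive")
    if rows_limit and rows_limit_percent:
        raise ValueError("only one of rows_limit and rows_limit_percent can be specified")
    if rows_limit:
        # Rows beyond the limit are omitted.
        return min(total_rows, rows_limit)
    if rows_limit_percent in (0, 100):
        # Both 0 and 100 mean no limit.
        return total_rows
    # The number of rows scanned is rounded down.
    return total_rows * rows_limit_percent // 100


print(rows_to_scan(999, rows_limit_percent=10))
```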
SampleMethod
How to sample rows if not all rows are scanned. Meaningful only when used in conjunction with either rows_limit or rows_limit_percent. If not specified, rows are scanned in the order BigQuery reads them.
Enums | |
---|---|
SAMPLE_METHOD_UNSPECIFIED |
|
TOP |
Scan groups of rows in the order BigQuery provides (default). Multiple groups of rows may be scanned in parallel, so results may not appear in the same order the rows are read. |
RANDOM_START |
Randomly pick groups of rows to scan. |
BigQueryTable
Message defining the location of a BigQuery table. A table is uniquely identified by its project_id, dataset_id, and table_id. Within a query a table is often referenced with a string in the format of: <project_id>:<dataset_id>.<table_id> or <project_id>.<dataset_id>.<table_id>.
Fields | |
---|---|
project_id |
The Google Cloud Platform project ID of the project containing the table. If omitted, project ID is inferred from the API call. |
dataset_id |
Dataset ID of the table. |
table_id |
Name of the table. |
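The two reference string formats mentioned above can be produced from the three fields like this. `table_reference` and its `colon_separator` flag are hypothetical names used only for illustration, and the IDs are placeholders.

```python
def table_reference(project_id: str, dataset_id: str, table_id: str,
                    colon_separator: bool = False) -> str:
    """Format a table as <project_id>:<dataset_id>.<table_id> or <project_id>.<dataset_id>.<table_id>."""
    separator = ":" if colon_separator else "."
    return f"{project_id}{separator}{dataset_id}.{table_id}"


print(table_reference("my-project", "my_dataset", "my_table"))
print(table_reference("my-project", "my_dataset", "my_table", colon_separator=True))
```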
BoundingBox
Bounding box encompassing detected text within an image.
Fields | |
---|---|
top |
Top coordinate of the bounding box. (0,0) is upper left. |
left |
Left coordinate of the bounding box. (0,0) is upper left. |
width |
Width of the bounding box in pixels. |
height |
Height of the bounding box in pixels. |
BucketingConfig
Generalization function that buckets values based on ranges. The ranges and replacement values are dynamically provided by the user for custom behavior, such as 1-30 -> LOW, 31-65 -> MEDIUM, 66-100 -> HIGH. This can be used on data of type: number, long, string, timestamp. If the bound Value type differs from the type of data being transformed, we will first attempt converting the type of the data to be transformed to match the type of the bound before comparing. See https://cloud.google.com/dlp/docs/concepts-bucketing to learn more.
Fields | |
---|---|
buckets[] |
Set of buckets. Ranges must be non-overlapping. |
Bucket
Bucket is represented as a range, along with replacement values.
Fields | |
---|---|
min |
Lower bound of the range, inclusive. Type should be the same as max if used. |
max |
Upper bound of the range, exclusive; type must match min. |
replacement_value |
Required. Replacement value for this bucket. |
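A minimal sketch of range bucketing with an inclusive lower bound and an exclusive upper bound, as described in the field table. The `bucketize` helper is hypothetical, and returning `None` when no bucket matches is an assumption, since this reference does not say what happens to unmatched values. With integer data, the 1-30 / 31-65 / 66-100 example becomes the bounds [1, 31), [31, 66), [66, 101).

```python
from typing import List, Optional, Tuple

# (min, max, replacement_value); None stands for an omitted bound.
Bucket = Tuple[Optional[float], Optional[float], str]

BUCKETS: List[Bucket] = [
    (1, 31, "LOW"),
    (31, 66, "MEDIUM"),
    (66, 101, "HIGH"),
]


def bucketize(value: float, buckets: List[Bucket] = BUCKETS) -> Optional[str]:
    """Return the replacement value of the first bucket whose range contains value."""
    for lower, upper, replacement in buckets:
        above_min = lower is None or value >= lower  # min is inclusive
        below_max = upper is None or value < upper   # max is exclusive
        if above_min and below_max:
            return replacement
    return None  # assumption: unmatched values are reported as None here


print(bucketize(30), bucketize(31), bucketize(100))
```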
ByteContentItem
Container for bytes to inspect or redact.
Fields | |
---|---|
type |
The type of data stored in the bytes string. Default will be TEXT_UTF8. |
data |
Content data to inspect or redact. |
BytesType
The type of data being sent for inspection.
Enums | |
---|---|
BYTES_TYPE_UNSPECIFIED |
Unused |
IMAGE |
Any image type. |
IMAGE_JPEG |
jpeg |
IMAGE_BMP |
bmp |
IMAGE_PNG |
png |
IMAGE_SVG |
svg |
TEXT_UTF8 |
plain text |
WORD_DOCUMENT |
docx, docm, dotx, dotm |
PDF |
|
AVRO |
avro |
CSV |
csv |
TSV |
tsv |
CancelDlpJobRequest
The request message for canceling a DLP job.
Fields | |
---|---|
name |
Required. The name of the DlpJob resource to be cancelled. Authorization requires the following IAM permission on the specified resource
|
CharacterMaskConfig
Partially mask a string by replacing a given number of characters with a fixed character. Masking can start from the beginning or end of the string. This can be used on data of any type (numbers, longs, and so on), and when de-identifying structured data we'll attempt to preserve the original data's type. (This allows you to take a long like 123 and modify it to a string like **3.)
Fields | |
---|---|
masking_character |
Character to use to mask the sensitive values—for example, |
number_to_mask |
Number of characters to mask. If not set, all matching chars will be masked. Skipped characters do not count towards this tally. |
reverse_order |
Mask characters in reverse order. For example, if |
characters_to_ignore[] |
When masking a string, items in this list will be skipped when replacing characters. For example, if the input string is |
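The interaction of these four fields can be sketched locally as below. This is an illustration of the described behavior, not the service's implementation: the `mask` function is a hypothetical helper, `characters_to_ignore` is simplified to a plain string of characters, and treating `number_to_mask=0` as "not set" is an assumption.

```python
def mask(value, masking_character="*", number_to_mask=0,
         reverse_order=False, characters_to_ignore=""):
    """Mask characters of value, skipping ignored characters.

    number_to_mask=0 is treated as "not set", so every non-ignored
    character is masked (assumption of this sketch).
    """
    chars = list(str(value))
    if reverse_order:
        indexes = range(len(chars) - 1, -1, -1)
    else:
        indexes = range(len(chars))
    masked = 0
    for i in indexes:
        if chars[i] in characters_to_ignore:
            continue  # skipped characters do not count towards the tally
        if number_to_mask and masked >= number_to_mask:
            break
        chars[i] = masking_character
        masked += 1
    return "".join(chars)
```

With these definitions, `mask(123, "*", 2)` reproduces the `**3` result mentioned in the message description.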
CharsToIgnore
Characters to skip when doing deidentification of a value. These will be left alone and skipped.
Fields | ||
---|---|---|
Union field
|
||
characters_to_skip |
Characters to not transform when masking. |
|
common_characters_to_ignore |
Common characters to not transform when masking. Useful to avoid removing punctuation. |
CommonCharsToIgnore
Convenience enum for indicating common characters to not transform.
Enums | |
---|---|
COMMON_CHARS_TO_IGNORE_UNSPECIFIED |
Unused. |
NUMERIC |
0-9 |
ALPHA_UPPER_CASE |
A-Z |
ALPHA_LOWER_CASE |
a-z |
PUNCTUATION |
US Punctuation, one of !"#$%&'()*+,-./:;<=>?@[]^_`{|}~ |
WHITESPACE |
Whitespace character, one of [ \t\n\x0B\f\r] |
CloudStorageFileSet
Message representing a set of files in Cloud Storage.
Fields | |
---|---|
url |
The url, in the format |
CloudStorageOptions
Options defining a file or a set of files within a Google Cloud Storage bucket.
Fields | |
---|---|
file_set |
The set of one or more files to scan. |
bytes_limit_per_file |
Max number of bytes to scan from a file. If a scanned file's size is bigger than this value then the rest of the bytes are omitted. Only one of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. Cannot be set if de-identification is requested. |
bytes_limit_per_file_percent |
Max percentage of bytes to scan from a file. The rest are omitted. The number of bytes scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. Cannot be set if de-identification is requested. |
file_types[] |
List of file type groups to include in the scan. If empty, all files are scanned and available data format processors are applied. In addition, the binary content of the selected files is always scanned as well. Images are scanned only as binary if the specified region does not support image inspection and no file_types were specified. Image inspection is restricted to 'global', 'us', 'asia', and 'europe'. |
sample_method |
|
files_limit_percent |
Limits the number of files to scan to this percentage of the input FileSet. Number of files scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and 100 mean no limit. Defaults to 0. |
FileSet
Set of files to scan.
Fields | |
---|---|
url |
The Cloud Storage url of the file(s) to scan, in the format If the url ends in a trailing slash, the bucket or directory represented by the url will be scanned non-recursively (content in sub-directories will not be scanned). This means that Exactly one of |
regex_file_set |
The regex-filtered set of files to scan. Exactly one of |
SampleMethod
How to sample bytes if not all bytes are scanned. Meaningful only when used in conjunction with bytes_limit_per_file. If not specified, scanning would start from the top.
Enums | |
---|---|
SAMPLE_METHOD_UNSPECIFIED |
|
TOP |
Scan from the top (default). |
RANDOM_START |
For each file larger than bytes_limit_per_file, randomly pick the offset to start scanning. The scanned bytes are contiguous. |
CloudStoragePath
Message representing a single file or path in Cloud Storage.
Fields | |
---|---|
path |
A url representing a file or path (no wildcards) in Cloud Storage. Example: gs://[BUCKET_NAME]/dictionary.txt |
CloudStorageRegexFileSet
Message representing a set of files in a Cloud Storage bucket. Regular expressions are used to allow fine-grained control over which files in the bucket to include.
Included files are those that match at least one item in include_regex and do not match any items in exclude_regex. Note that a file that matches items from both lists will not be included. For a match to occur, the entire file path (i.e., everything in the url after the bucket name) must match the regular expression.
For example, given the input {bucket_name: "mybucket", include_regex: ["directory1/.*"], exclude_regex: ["directory1/excluded.*"]}:
- gs://mybucket/directory1/myfile will be included
- gs://mybucket/directory1/directory2/myfile will be included (.* matches across /)
- gs://mybucket/directory0/directory1/myfile will not be included (the full path doesn't match any items in include_regex)
- gs://mybucket/directory1/excludedfile will not be included (the path matches an item in exclude_regex)
If include_regex is left empty, it will match all files by default (this is equivalent to setting include_regex: [".*"]).
Some other common use cases:
- {bucket_name: "mybucket", exclude_regex: [".*\.pdf"]} will include all files in mybucket except for .pdf files
- {bucket_name: "mybucket", include_regex: ["directory/[^/]+"]} will include all files directly under gs://mybucket/directory/, without matching across /
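The include/exclude rules above can be checked locally with a short sketch. It uses Python's `re.fullmatch` as a stand-in for RE2 full-path matching, which is an assumption: the two engines agree on simple patterns like the ones shown here but differ on advanced syntax. The `is_included` helper is hypothetical and operates on the path after the bucket name.

```python
import re
from typing import Sequence


def is_included(path: str,
                include_regex: Sequence[str] = (),
                exclude_regex: Sequence[str] = ()) -> bool:
    """Return True if path (everything after the bucket name) would be scanned."""
    # An empty include list matches all files, like include_regex: [".*"].
    includes = include_regex or [".*"]
    matches_include = any(re.fullmatch(p, path) for p in includes)
    matches_exclude = any(re.fullmatch(p, path) for p in exclude_regex)
    # A file matching items from both lists is not included.
    return matches_include and not matches_exclude
```

For instance, `is_included("directory1/excludedfile", ["directory1/.*"], ["directory1/excluded.*"])` is false because the exclude list takes precedence.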
Fields | |
---|---|
bucket_name |
The name of a Cloud Storage bucket. Required. |
include_regex[] |
A list of regular expressions matching file paths to include. All files in the bucket that match at least one of these regular expressions will be included in the set of files, except for those that also match an item in Regular expressions use RE2 syntax; a guide can be found under the google/re2 repository on GitHub. |
exclude_regex[] |
A list of regular expressions matching file paths to exclude. All files in the bucket that match at least one of these regular expressions will be excluded from the scan. Regular expressions use RE2 syntax; a guide can be found under the google/re2 repository on GitHub. |
Color
Represents a color in the RGB color space.
Fields | |
---|---|
red |
The amount of red in the color as a value in the interval [0, 1]. |
green |
The amount of green in the color as a value in the interval [0, 1]. |
blue |
The amount of blue in the color as a value in the interval [0, 1]. |
Container
Represents a container that may contain DLP findings. Examples of a container include a file, table, or database record.
Fields | |
---|---|
type |
Container type, for example BigQuery or Google Cloud Storage. |
project_id |
Project where the finding was found. Can be different from the project that owns the finding. |
full_path |
A string representation of the full container name. Examples: - BigQuery: 'Project:DataSetId.TableId' - Google Cloud Storage: 'gs://Bucket/folders/filename.txt' |
root_path |
The root of the container. Examples: - For BigQuery table |
relative_path |
The rest of the path after the root. Examples: - For BigQuery table |
update_time |
Findings container modification timestamp, if applicable. For Google Cloud Storage contains last file modification timestamp. For BigQuery table contains last_modified_time property. For Datastore - not populated. |
version |
Findings container version, if available ("generation" for Google Cloud Storage). |
ContentItem
Container structure for the content to inspect.
Fields | ||
---|---|---|
Union field data_item . Data of the item either in the byte array or UTF-8 string form, or table. data_item can be only one of the following: |
||
value |
String data to inspect or redact. |
|
table |
Structured content for inspection. See https://cloud.google.com/dlp/docs/inspecting-text#inspecting_a_table to learn more. |
|
byte_item |
Content data to inspect or redact. Replaces |
ContentLocation
Precise location of the finding within a document, record, image, or metadata container.
Fields | ||
---|---|---|
container_name |
Name of the container where the finding is located. The top level name is the source file name or table name. Names of some common storage containers are formatted as follows:
Nested names could be absent if the embedded object has no string identifier (for example, an image contained within a document). |
|
container_timestamp |
Findings container modification timestamp, if applicable. For Google Cloud Storage contains last file modification timestamp. For BigQuery table contains last_modified_time property. For Datastore - not populated. |
|
container_version |
Findings container version, if available ("generation" for Google Cloud Storage). |
|
Union field location . Type of the container within the file with location of the finding. location can be only one of the following: |
||
record_location |
Location within a row or record of a database table. |
|
image_location |
Location within an image's pixels. |
|
document_location |
Location data for document files. |
|
metadata_location |
Location within the metadata for inspected content. |
ContentOption
Options describing which parts of the provided content should be scanned.
Enums | |
---|---|
CONTENT_UNSPECIFIED |
Includes entire content of a file or a data stream. |
CONTENT_TEXT |
Text content within the data, excluding any metadata. |
CONTENT_IMAGE |
Images found in the data. |
CreateDeidentifyTemplateRequest
Request message for CreateDeidentifyTemplate.
Fields | |
---|---|
parent |
Required. Parent resource name. The format of this value varies depending on the scope of the request (project or organization) and whether you have specified a processing location:
The following example
Authorization requires the following IAM permission on the specified resource
|
deidentify_template |
Required. The DeidentifyTemplate to create. |
template_id |
The template id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: |
location_id |
Deprecated. This field has no effect. |
CreateDlpJobRequest
Request message for CreateDlpJobRequest. Used to initiate long running jobs such as calculating risk metrics or inspecting Google Cloud Storage.
Fields | ||
---|---|---|
parent |
Required. Parent resource name. The format of this value varies depending on whether you have specified a processing location:
The following example
Authorization requires the following IAM permission on the specified resource
|
|
job_id |
The job id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: |
|
location_id |
Deprecated. This field has no effect. |
|
Union field job . The configuration details for the specific type of job to run. job can be only one of the following: |
||
inspect_job |
An inspection job scans a storage repository for InfoTypes. |
|
risk_job |
A risk analysis job calculates re-identification risk metrics for a BigQuery table. |
CreateInspectTemplateRequest
Request message for CreateInspectTemplate.
Fields | |
---|---|
parent |
Required. Parent resource name. The format of this value varies depending on the scope of the request (project or organization) and whether you have specified a processing location:
The following example
Authorization requires the following IAM permission on the specified resource
|
inspect_template |
Required. The InspectTemplate to create. |
template_id |
The template id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: |
location_id |
Deprecated. This field has no effect. |
CreateJobTriggerRequest
Request message for CreateJobTrigger.
Fields | |
---|---|
parent |
Required. Parent resource name. The format of this value varies depending on whether you have specified a processing location:
The following example
Authorization requires one or more of the following IAM permissions on the specified resource
|
job_trigger |
Required. The JobTrigger to create. |
trigger_id |
The trigger id can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: |
location_id |
Deprecated. This field has no effect. |
CreateStoredInfoTypeRequest
Request message for CreateStoredInfoType.
Fields | |
---|---|
parent |
Required. Parent resource name. The format of this value varies depending on the scope of the request (project) and whether you have specified a processing location:
The following example
Authorization requires the following IAM permission on the specified resource
|
config |
Required. Configuration of the storedInfoType to create. |
stored_info_type_id |
The storedInfoType ID can contain uppercase and lowercase letters, numbers, and hyphens; that is, it must match the regular expression: |
location_id |
Deprecated. This field has no effect. |
CryptoDeterministicConfig
Pseudonymization method that generates deterministic encryption for the given input. Outputs a base64 encoded representation of the encrypted output. Uses AES-SIV based on the RFC https://tools.ietf.org/html/rfc5297.
Fields | |
---|---|
crypto_key |
The key used by the encryption function. |
surrogate_info_type |
The custom info type to annotate the surrogate with. This annotation will be applied to the surrogate by prefixing it with the name of the custom info type followed by the number of characters comprising the surrogate. The following scheme defines the format: {info type name}({surrogate character count}):{surrogate} For example, if the name of custom info type is 'MY_TOKEN_INFO_TYPE' and the surrogate is 'abc', the full replacement value will be: 'MY_TOKEN_INFO_TYPE(3):abc' This annotation identifies the surrogate when inspecting content using the custom info type 'Surrogate'. This facilitates reversal of the surrogate when it occurs in free text. Note: For record transformations where the entire cell in a table is being transformed, surrogates are not mandatory. Surrogates are used to denote the location of the token and are necessary for re-identification in free form text. In order for inspection to work properly, the name of this info type must not occur naturally anywhere in your data; otherwise, inspection may either
Therefore, choose your custom info type name carefully after considering what your data looks like. One way to select a name that has a high chance of yielding reliable detection is to include one or more unicode characters that are highly improbable to exist in your data. For example, assuming your data is entered from a regular ASCII keyboard, the symbol with the hex code point 29DD might be used like so: ⧝MY_TOKEN_TYPE. |
context |
A context may be used for higher security and maintaining referential integrity such that the same identifier in two different contexts will be given a distinct surrogate. The context is appended to the plaintext value being encrypted. On decryption, the provided context is validated against the value used during encryption. If a context was provided during encryption, the same context must be provided during decryption as well. If the context is not set, plaintext would be used as is for encryption. If the context is set but:
plaintext would be used as is for encryption. Note that case (1) is expected when an |
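The annotation scheme described for surrogate_info_type, {info type name}({surrogate character count}):{surrogate}, can be sketched as follows. Both helpers are hypothetical and only illustrate the string format; they do not perform any encryption and are not part of the DLP API.

```python
import re
from typing import Optional


def annotate_surrogate(info_type_name: str, surrogate: str) -> str:
    """Prefix a surrogate with its info type name and character count."""
    return f"{info_type_name}({len(surrogate)}):{surrogate}"


def find_surrogate(info_type_name: str, text: str) -> Optional[str]:
    """Locate the first annotated surrogate in free text, or return None."""
    match = re.search(re.escape(info_type_name) + r"\((\d+)\):", text)
    if match is None:
        return None
    length = int(match.group(1))
    candidate = text[match.end():match.end() + length]
    # Reject a truncated token whose declared length exceeds the remaining text.
    return candidate if len(candidate) == length else None
```

The character count is what lets a scanner recover the exact surrogate from surrounding free text, which is also why the info type name should not occur naturally in the data.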
CryptoHashConfig
Pseudonymization method that generates surrogates via cryptographic hashing. Uses SHA-256. The key size must be either 32 or 64 bytes. Outputs a base64 encoded representation of the hashed output (for example, L7k0BHmF1ha5U3NfGykjro4xWi1MPVQPjhMAZbSV9mM=). Currently, only string and integer values can be hashed. See https://cloud.google.com/dlp/docs/pseudonymization to learn more.
Fields | |
---|---|
crypto_key |
The key used by the hash function. |
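A rough local illustration of keyed SHA-256 hashing with base64 output. The use of HMAC as the keyed construction is an assumption of this sketch, since the text above states only that SHA-256 is used with a 32- or 64-byte key. The `crypto_hash` helper is hypothetical and its output is not guaranteed to match what the service returns.

```python
import base64
import hashlib
import hmac


def crypto_hash(value: str, key: bytes) -> str:
    """Return a base64-encoded keyed SHA-256 digest of value (HMAC is assumed)."""
    if len(key) not in (32, 64):
        raise ValueError("key must be either 32 or 64 bytes")
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")
```

A 32-byte digest always encodes to 44 base64 characters, which matches the length of the example value shown in the description.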
CryptoKey
This is a data encryption key (DEK) (as opposed to a key encryption key (KEK) stored by KMS). When using KMS to wrap/unwrap DEKs, be sure to set an appropriate IAM policy on the KMS CryptoKey (KEK) to ensure an attacker cannot unwrap the data crypto key.
Fields | ||
---|---|---|
Union field source . Sources of crypto keys. source can be only one of the following: |
||
transient |
Transient crypto key |
|
unwrapped |
Unwrapped crypto key |
|
kms_wrapped |
KMS-wrapped crypto key |
CryptoReplaceFfxFpeConfig
Replaces an identifier with a surrogate using Format Preserving Encryption (FPE) with the FFX mode of operation; however when used in the ReidentifyContent API method, it serves the opposite function by reversing the surrogate back into the original identifier. The identifier must be encoded as ASCII. For a given crypto key and context, the same identifier will be replaced with the same surrogate. Identifiers must be at least two characters long. In the case that the identifier is the empty string, it will be skipped. See https://cloud.google.com/dlp/docs/pseudonymization to learn more.
Note: We recommend using CryptoDeterministicConfig for all use cases which do not require preserving the input alphabet space and size, plus warrant referential integrity.
Fields | ||
---|---|---|
crypto_key |
Required. The key used by the encryption algorithm. |
|
context |
The 'tweak', a context may be used for higher security since the same identifier in two different contexts won't be given the same surrogate. If the context is not set, a default tweak will be used. If the context is set but:
a default tweak will be used. Note that case (1) is expected when an The tweak is constructed as a sequence of bytes in big endian byte order such that:
|
|
surrogate_info_type |
The custom infoType to annotate the surrogate with. This annotation will be applied to the surrogate by prefixing it with the name of the custom infoType followed by the number of characters comprising the surrogate. The following scheme defines the format: info_type_name(surrogate_character_count):surrogate For example, if the name of custom infoType is 'MY_TOKEN_INFO_TYPE' and the surrogate is 'abc', the full replacement value will be: 'MY_TOKEN_INFO_TYPE(3):abc' This annotation identifies the surrogate when inspecting content using the custom infoType In order for inspection to work properly, the name of this infoType must not occur naturally anywhere in your data; otherwise, inspection may find a surrogate that does not correspond to an actual identifier. Therefore, choose your custom infoType name carefully after considering what your data looks like. One way to select a name that has a high chance of yielding reliable detection is to include one or more unicode characters that are highly improbable to exist in your data. For example, assuming your data is entered from a regular ASCII keyboard, the symbol with the hex code point 29DD might be used like so: ⧝MY_TOKEN_TYPE |
|
Union field alphabet . Choose an alphabet which the data being transformed will be made up of. alphabet can be only one of the following: |
||
common_alphabet |
Common alphabets. |
|
custom_alphabet |
This is supported by mapping these to the alphanumeric characters that the FFX mode natively supports. This happens before/after encryption/decryption. Each character listed must appear only once. Number of characters must be in the range [2, 95]. This must be encoded as ASCII. The order of characters does not matter. The full list of allowed characters is:
|
|
radix |
The native way to select the alphabet. Must be in the range [2, 95]. |
FfxCommonNativeAlphabet
These are commonly used subsets of the alphabet that the FFX mode natively supports. In the algorithm, the alphabet is selected using the "radix". Therefore, each corresponds to a particular radix.
Enums | |
---|---|
FFX_COMMON_NATIVE_ALPHABET_UNSPECIFIED |
Unused. |
NUMERIC |
[0-9] (radix of 10) |
HEXADECIMAL |
[0-9A-F] (radix of 16) |
UPPER_CASE_ALPHA_NUMERIC |
[0-9A-Z] (radix of 36) |
ALPHA_NUMERIC |
[0-9A-Za-z] (radix of 62) |
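The relationship between each common alphabet and its radix can be made concrete with a small lookup. The dictionary and helper names below are hypothetical; the sketch only checks whether an identifier fits an alphabet (and the two-character minimum stated for CryptoReplaceFfxFpeConfig) and does not perform any encryption.

```python
import string

# Character sets taken from the enum descriptions above.
FFX_COMMON_NATIVE_ALPHABETS = {
    "NUMERIC": string.digits,
    "HEXADECIMAL": string.digits + "ABCDEF",
    "UPPER_CASE_ALPHA_NUMERIC": string.digits + string.ascii_uppercase,
    "ALPHA_NUMERIC": string.digits + string.ascii_uppercase + string.ascii_lowercase,
}


def radix_of(alphabet_name: str) -> int:
    """The radix is simply the number of characters in the alphabet."""
    return len(FFX_COMMON_NATIVE_ALPHABETS[alphabet_name])


def fits_alphabet(identifier: str, alphabet_name: str) -> bool:
    """Check the length minimum and that every character is in the alphabet."""
    allowed = set(FFX_COMMON_NATIVE_ALPHABETS[alphabet_name])
    return len(identifier) >= 2 and all(c in allowed for c in identifier)
```

A check like this can be useful before submitting data, since an identifier containing characters outside the chosen alphabet cannot be represented in that radix.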
CustomInfoType
Custom information type provided by the user. Used to find domain-specific sensitive information configurable to the data in question.
Fields | ||
---|---|---|
info_type |
CustomInfoType can either be a new infoType, or an extension of built-in infoType, when the name matches one of existing infoTypes and that infoType is specified in |
|
likelihood |
Likelihood to return for this CustomInfoType. This base value can be altered by a detection rule if the finding meets the criteria specified by the rule. Defaults to |
|
detection_rules[] |
Set of detection rules to apply to all findings of this CustomInfoType. Rules are applied in order that they are specified. Not supported for the |
|
exclusion_type |
If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding to be returned. It still can be used for rules matching. |
|
Union field
|
||
dictionary |
A list of phrases to detect as a CustomInfoType. |
|
regex |
Regular expression based CustomInfoType. |
|
surrogate_type |
Message for detecting output from deidentification transformations that support reversing. |
|
stored_type |
Load an existing |
DetectionRule
Deprecated; use InspectionRuleSet instead. Rule for modifying a CustomInfoType to alter behavior under certain circumstances, depending on the specific details of the rule. Not supported for the surrogate_type custom infoType.
Fields | |
---|---|
hotword_rule |
Hotword-based detection rule. |
HotwordRule
The rule that adjusts the likelihood of findings within a certain proximity of hotwords.
Fields | |
---|---|
hotword_regex |
Regular expression pattern defining what qualifies as a hotword. |
proximity |
Proximity of the finding within which the entire hotword must reside. The total length of the window cannot exceed 1000 characters. Note that the finding itself will be included in the window, so that hotwords may be used to match substrings of the finding itself. For example, the certainty of a phone number regex "(\d{3}) \d{3}-\d{4}" could be adjusted upwards if the area code is known to be the local area code of a company office using the hotword regex "(xxx)", where "xxx" is the area code in question. |
likelihood_adjustment |
Likelihood adjustment to apply to all matching findings. |
LikelihoodAdjustment
Message for specifying an adjustment to the likelihood of a finding as part of a detection rule.
Fields | ||
---|---|---|
Union field
|
||
fixed_likelihood |
Set the likelihood of a finding to a fixed value. |
|
relative_likelihood |
Increase or decrease the likelihood by the specified number of levels. For example, if a finding would be |
Proximity
Message for specifying a window around a finding to apply a detection rule.
Fields | |
---|---|
window_before |
Number of characters before the finding to consider. |
window_after |
Number of characters after the finding to consider. |
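One way to picture the proximity window is as a slice of the text that extends window_before characters before the finding and window_after characters after it, with the finding itself included. The sketch below follows that reading; `hotword_in_window` is a hypothetical helper, Python's `re` stands in for the service's regex engine, and the sample text is invented.

```python
import re


def hotword_in_window(text: str, finding_start: int, finding_end: int,
                      hotword_regex: str,
                      window_before: int = 0, window_after: int = 0) -> bool:
    """Return True if the entire hotword lies inside the proximity window."""
    lo = max(0, finding_start - window_before)
    hi = min(len(text), finding_end + window_after)
    # The window includes the finding, so a hotword may match part of it.
    return re.search(hotword_regex, text[lo:hi]) is not None


# Hypothetical text with a phone-number finding.
TEXT = "Call the office at (415) 555-0100 today"
FINDING = re.search(r"\(\d{3}\) \d{3}-\d{4}", TEXT)
```

Searching only the sliced window means a hotword that is cut off by the window edge does not count, which matches the requirement that the entire hotword reside within the proximity.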
Dictionary
Custom information type based on a dictionary of words or phrases. This can be used to match sensitive information specific to the data, such as a list of employee IDs or job titles.
Dictionary words are case-insensitive and all characters other than letters and digits in the unicode Basic Multilingual Plane will be replaced with whitespace when scanning for matches, so the dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters surrounding any match must be of a different type than the adjacent characters within the word, so letters must be next to non-letters and digits next to non-digits. For example, the dictionary word "jen" will match the first three letters of the text "jen123" but will return no matches for "jennifer".
Dictionary words containing a large number of characters that are not letters or digits may result in unexpected findings because such characters are treated as whitespace. The limits page contains details about the size limits of dictionaries. For dictionaries that do not fit within these constraints, consider using LargeCustomDictionaryConfig in the StoredInfoType API.
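The matching rules above (case-insensitivity, non-alphanumeric characters treated as whitespace, and the boundary rule for surrounding characters) can be approximated with a short sketch. This is an illustration only: `dictionary_matches` is a hypothetical helper, Python's `str.isalnum` is used as a rough stand-in for "letters and digits in the Basic Multilingual Plane", and collapsing runs of whitespace is an assumption.

```python
def _normalize(text: str) -> str:
    """Lowercase, replace non-alphanumerics with spaces, collapse whitespace."""
    spaced = "".join(c if c.isalnum() else " " for c in text)
    return " ".join(spaced.lower().split())


def _char_class(c: str) -> str:
    if c.isalpha():
        return "letter"
    if c.isdigit():
        return "digit"
    return "other"


def dictionary_matches(phrase: str, text: str) -> bool:
    """Return True if phrase occurs in text under the rules sketched above."""
    p, t = _normalize(phrase), _normalize(text)
    if not p:
        return False
    start = t.find(p)
    while start != -1:
        end = start + len(p)
        # Neighbors must differ in type from the adjacent character in the match.
        left_ok = start == 0 or _char_class(t[start - 1]) != _char_class(t[start])
        right_ok = end == len(t) or _char_class(t[end]) != _char_class(t[end - 1])
        if left_ok and right_ok:
            return True
        start = t.find(p, start + 1)
    return False
```

Under this sketch, "jen" matches inside "jen123" because a letter is followed by a digit, but not inside "jennifer", where the match would be followed by another letter.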
Fields | ||
---|---|---|
Union field
|
||
word_list |
List of words or phrases to search for. |
|
cloud_storage_path |
Newline-delimited file of words in Cloud Storage. Only a single file is accepted. |
WordList
Message defining a list of words or phrases to search for in the data.
Fields | |
---|---|
words[] |
Words or phrases defining the dictionary. The dictionary must contain at least one phrase and every phrase must contain at least 2 characters that are letters or digits. [required] |
ExclusionType
Enums | |
---|---|
EXCLUSION_TYPE_UNSPECIFIED |
A finding of this custom info type will not be excluded from results. |
EXCLUSION_TYPE_EXCLUDE |
A finding of this custom info type will be excluded from final results, but can still affect rule execution. |
Regex
Message defining a custom regular expression.
Fields | |
---|---|
pattern |
Pattern defining the regular expression. Its syntax (https://github.com/google/re2/wiki/Syntax) can be found under the google/re2 repository on GitHub. |
group_indexes[] |
The index of the submatch to extract as findings. When not specified, the entire match is returned. No more than 3 may be included. |
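The effect of group_indexes can be illustrated with Python's `re` module, used here as a stand-in for RE2. The `extract_findings` helper is hypothetical, and returning each selected group as a separate finding is an assumption of this sketch rather than documented service behavior.

```python
import re
from typing import List, Sequence


def extract_findings(pattern: str, text: str,
                     group_indexes: Sequence[int] = ()) -> List[str]:
    """Return whole matches, or only the selected submatches if indexes are given."""
    if len(group_indexes) > 3:
        raise ValueError("no more than 3 group indexes may be included")
    findings: List[str] = []
    for match in re.finditer(pattern, text):
        if not group_indexes:
            findings.append(match.group(0))  # entire match
        else:
            for index in group_indexes:
                if match.group(index) is not None:
                    findings.append(match.group(index))
    return findings
```

For the pattern `ID-(\d+)`, omitting group_indexes yields the full `ID-42` style matches, while `group_indexes=[1]` yields only the digits.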
SurrogateType
Message for detecting output from deidentification transformations such as CryptoReplaceFfxFpeConfig. These types of transformations are those that perform pseudonymization, thereby producing a "surrogate" as output. This should be used in conjunction with a field on the transformation such as surrogate_info_type. This CustomInfoType does not support the use of detection_rules.
DatastoreKey
Record key for a finding in Cloud Datastore.
Fields | |
---|---|
entity_key |
Datastore entity key. |
DatastoreOptions
Options defining a data set within Google Cloud Datastore.
Fields | |
---|---|
partition_id |
A partition ID identifies a grouping of entities. The grouping is always by project and namespace, however the namespace ID may be empty. |
kind |
The kind to process. |
DateShiftConfig
Shifts dates by a random number of days, with the option to be consistent for the same context. See https://cloud.google.com/dlp/docs/concepts-date-shifting to learn more.
Fields | |
---|---|
upper_bound_days |
Required. Range of shift in days. Actual shift will be selected at random within this range (inclusive ends). Negative means shift to earlier in time. Must not be more than 365250 days (1000 years) each direction. For example, 3 means shift date to at most 3 days into the future. |
lower_bound_days |
Required. For example, -5 means shift date to at most 5 days back in the past. |
context |
Points to the field that contains the context, for example, an entity id. If set, must also set cryptoKey. If set, shift will be consistent for the given context. |
crypto_key |
Causes the shift to be computed based on this key and the context. This results in the same shift for the same context and crypto_key. If set, must also set context. Can only be applied to table items. |
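To illustrate what "the same shift for the same context and crypto_key" means, the sketch below derives a shift deterministically from a key and a context value. The derivation (HMAC-SHA-256 reduced modulo the range) is purely an assumption for illustration and is not the service's algorithm; `shift_days` and `shift_date` are hypothetical helpers.

```python
import datetime
import hashlib
import hmac

MAX_SHIFT_DAYS = 365250  # documented limit in each direction (1000 years)


def shift_days(context: str, crypto_key: bytes,
               lower_bound_days: int, upper_bound_days: int) -> int:
    """Derive a repeatable shift within [lower_bound_days, upper_bound_days]."""
    if not -MAX_SHIFT_DAYS <= lower_bound_days <= upper_bound_days <= MAX_SHIFT_DAYS:
        raise ValueError("bounds out of range")
    span = upper_bound_days - lower_bound_days + 1  # both ends are inclusive
    digest = hmac.new(crypto_key, context.encode("utf-8"), hashlib.sha256).digest()
    return lower_bound_days + int.from_bytes(digest, "big") % span


def shift_date(date: datetime.date, context: str, crypto_key: bytes,
               lower_bound_days: int, upper_bound_days: int) -> datetime.date:
    """Apply the context-consistent shift to a date."""
    days = shift_days(context, crypto_key, lower_bound_days, upper_bound_days)
    return date + datetime.timedelta(days=days)
```

Because every date belonging to one context (for example, one entity id) moves by the same number of days, the intervals between that entity's events are preserved.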
DateTime
Message for a date time object, for example 2018-01-01 or 5th August.
Fields | |
---|---|
date |
One or more of the following must be set. Must be a valid date or time value. |
day_of_week |
Day of week |
time |
Time of day |
time_zone |
Time zone |
TimeZone
Time zone of the date time object.
Fields | |
---|---|
offset_minutes |
Set only if the offset can be determined. Positive for time ahead of UTC. For example, for "UTC-9", this value is -540. |
DeidentifyConfig
The configuration that controls how the data will change.
Fields | ||
---|---|---|
transformation_error_handling |
Mode for handling transformation errors. If left unspecified, the default mode is |
|
Union field
|
||
info_type_transformations |
Treat the dataset as free-form text and apply the same free text transformation everywhere. |
|
record_transformations |
Treat the dataset as structured. Transformations can be applied to specific locations within structured datasets, such as transforming a column within a table. |
DeidentifyContentRequest
Request to de-identify a list of items.
Fields | |
---|---|
parent |
Parent resource name. The format of this value varies depending on whether you have specified a processing location:
The following example
Authorization requires the following IAM permission on the specified resource
|
deidentify_config |
Configuration for the de-identification of the content item. Items specified here will override the template referenced by the deidentify_template_name argument. |
inspect_config |
Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument. |
item |
The item to de-identify. Will be treated as text. |
inspect_template_name |
Template to use. Any configuration directly specified in inspect_config will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged. |
deidentify_template_name |
Template to use. Any configuration directly specified in deidentify_config will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged. |
location_id |
Deprecated. This field has no effect. |
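The merge rules described for inspect_template_name and deidentify_template_name (singular fields replace, repeated fields append, sub-messages merge recursively) can be sketched on plain dictionaries. The `merge_config` helper is hypothetical, and the field names in the usage note are illustrative placeholders rather than a complete configuration.

```python
from typing import Any, Dict


def merge_config(template: Dict[str, Any], request: Dict[str, Any]) -> Dict[str, Any]:
    """Merge request-level configuration over a template's configuration."""
    merged = dict(template)
    for key, value in request.items():
        current = merged.get(key)
        if isinstance(value, list) and isinstance(current, list):
            merged[key] = current + value               # repeated field: append
        elif isinstance(value, dict) and isinstance(current, dict):
            merged[key] = merge_config(current, value)  # sub-message: merge recursively
        else:
            merged[key] = value                         # singular field: replace
    return merged
```

For example, merging `{"min_likelihood": "LIKELY", "info_types": [{"name": "PHONE_NUMBER"}]}` over a template that already lists one info type keeps both info types and replaces the template's likelihood threshold.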
DeidentifyContentResponse
Results of de-identifying a ContentItem.
Fields | |
---|---|
item |
The de-identified item. |
overview |
An overview of the changes that were made on the |
DeidentifyTemplate
A DeidentifyTemplate contains instructions on how to de-identify content. See https://cloud.google.com/dlp/docs/concepts-templates to learn more.
Fields | |
---|---|
name |
Output only. The template name. The template will have one of the following formats: |
display_name |
Display name (max 256 chars). |
description |
Short description (max 256 chars). |
create_time |
Output only. The creation timestamp of a deidentifyTemplate. |
update_time |
Output only. The last update timestamp of a deidentifyTemplate. |
deidentify_config |
The core content of the template. |
DeleteDeidentifyTemplateRequest
Request message for DeleteDeidentifyTemplate.
Fields | |
---|---|
name |
Required. Resource name of the deidentify template to be deleted, for example projects/project-id/deidentifyTemplates/432452342. Authorization requires the following IAM permission on the specified resource
|
DeleteDlpJobRequest
The request message for deleting a DLP job.
Fields | |
---|---|
name |
Required. The name of the DlpJob resource to be deleted. Authorization requires the following IAM permission on the specified resource
|
DeleteInspectTemplateRequest
Request message for DeleteInspectTemplate.
Fields | |
---|---|
name |
Required. Resource name of the inspectTemplate to be deleted, for example projects/project-id/inspectTemplates/432452342. Authorization requires the following IAM permission on the specified resource
|
DeleteJobTriggerRequest
Request message for DeleteJobTrigger.
Fields | |
---|---|
name |
Required. Resource name of the project and the triggeredJob, for example Authorization requires the following IAM permission on the specified resource
|
DeleteStoredInfoTypeRequest
Request message for DeleteStoredInfoType.
Fields | |
---|---|
name |
Required. Resource name of the storedInfoType to be deleted, for example projects/project-id/storedInfoTypes/432452342. Authorization requires the following IAM permission on the specified resource
|
DlpJob
Combines all of the information about a DLP job.
Fields | ||
---|---|---|
name |
The server-assigned name. |
|
type |
The type of job. |
|
state |
State of a job. |
|
create_time |
Time when the job was created. |
|
start_time |
Time when the job started. |
|
end_time |
Time when the job finished. |
|
job_trigger_name |
If created by a job trigger, the resource name of the trigger that instantiated the job. |
|
errors[] |
A stream of errors encountered running the job. |
|
Union field
|
||
risk_details |
Results from analyzing risk of a data source. |
|
inspect_details |
Results from inspecting a data source. |
JobState
Possible states of a job. New items may be added.
Enums | |
---|---|
JOB_STATE_UNSPECIFIED |
Unused. |
PENDING |
The job has not yet started. |
RUNNING |
The job is currently running. Once a job has finished it will transition to FAILED or DONE. |
DONE |
The job is no longer running. |
CANCELED |
The job was canceled before it could complete. |
FAILED |
The job had an error and did not complete. |
ACTIVE |
The job is currently accepting findings via hybridInspect. A hybrid job in the ACTIVE state may continue to have findings added to it through calls to hybridInspect. After the job has finished, no more calls to hybridInspect may be made. ACTIVE jobs can transition to DONE. |
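Client code that polls a job usually needs to know when to stop. The sketch below mirrors the state names from the table in a local Python enum (it does not use the client library) and treats DONE, CANCELED, and FAILED as finished. Treating CANCELED as finished is an assumption drawn from its description; the helper names are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    """Local mirror of the state names listed above (not the library type)."""
    JOB_STATE_UNSPECIFIED = "JOB_STATE_UNSPECIFIED"
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    DONE = "DONE"
    CANCELED = "CANCELED"
    FAILED = "FAILED"
    ACTIVE = "ACTIVE"


# Assumption: these three states are the ones a job never leaves.
FINISHED_STATES = frozenset({JobState.DONE, JobState.CANCELED, JobState.FAILED})


def is_finished(state: JobState) -> bool:
    """Return True when polling can stop."""
    return state in FINISHED_STATES


def accepts_hybrid_inspect(state: JobState) -> bool:
    """Only an ACTIVE hybrid job accepts further hybridInspect calls."""
    return state is JobState.ACTIVE
```

Because new states may be added, a caller should keep polling on any state it does not recognize rather than assume the job has finished.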
DlpJobType
An enum to represent the various types of DLP jobs.
Enums | |
---|---|
DLP_JOB_TYPE_UNSPECIFIED |
Unused. |
INSPECT_JOB |
The job inspected Google Cloud for sensitive data. |
RISK_ANALYSIS_JOB |
The job executed a Risk Analysis computation. |
DocumentLocation
Location of a finding within a document.
Fields | |
---|---|
file_offset |
Offset of the line, from the beginning of the file, where the finding is located. |
EntityId
An entity in a dataset is a field or set of fields that correspond to a single person. For example, in medical records the EntityId might be a patient identifier, or for financial records it might be an account identifier. This message is used when generalizations or analysis must take into account that multiple rows correspond to the same entity.
Fields | |
---|---|
field |
Composite key indicating which field contains the entity identifier. |
Error
Detailed information about an error encountered during job execution, or about the results of an unsuccessful activation of the JobTrigger.
Fields | |
---|---|
details |
Detailed error codes and messages. |
timestamps[] |
The times the error occurred. |
ExcludeInfoTypes
List of exclude infoTypes.
Fields | |
---|---|
info_types[] |
InfoType list. An ExclusionRule drops a finding when it overlaps with, or is contained within, a finding of an infoType from this list. For example, for |
ExclusionRule
The rule that specifies conditions when findings of infoTypes specified in InspectionRuleSet are removed from results.
Fields | ||
---|---|---|
matching_type |
How the rule is applied, see MatchingType documentation for details. |
|
Union field type. Exclusion rule types. type can be only one of the following: |
||
dictionary |
Dictionary which defines the rule. |
|
regex |
Regular expression which defines the rule. |
|
exclude_info_types |
Set of infoTypes for which findings would affect this rule. |
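The effect of an exclude_info_types rule can be pictured as an interval check. The sketch below assumes each finding carries a half-open character range and drops a target finding when its range overlaps a finding of an excluded infoType. The Span type, the function names, and the offsets are hypothetical, and the service's actual behavior also depends on matching_type, which is not modeled here.

```python
from typing import List, NamedTuple, Set


class Span(NamedTuple):
    info_type: str
    start: int  # inclusive offset
    end: int    # exclusive offset


def overlaps(a: Span, b: Span) -> bool:
    """True when two half-open ranges share at least one position."""
    return a.start < b.end and b.start < a.end


def apply_exclude_info_types(
    findings: List[Span], target_types: Set[str], excluded_types: Set[str]
) -> List[Span]:
    """Drop target findings that overlap a finding of an excluded infoType."""
    blockers = [f for f in findings if f.info_type in excluded_types]
    kept = []
    for finding in findings:
        is_target = finding.info_type in target_types
        if is_target and any(overlaps(finding, b) for b in blockers if b is not finding):
            continue  # suppressed by the exclusion rule
        kept.append(finding)
    return kept
```

A finding that is fully contained within an excluded finding is a special case of overlap, so the same check covers both situations named in the rule description.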
FieldId
General identifier of a data field in a storage service.
Fields | |
---|---|
name |
Name describing the field. |
FieldTransformation
The transformation to apply to the field.
Fields | ||
---|---|---|
fields[] |
Required. Input field(s) to apply the transformation to. |
|
condition |
Only apply the transformation if the condition evaluates to true for the given Example Use Cases:
|
|
Union field transformation. Transformation to apply. [required] transformation can be only one of the following: |
||
primitive_transformation |
Apply the transformation to the entire field. |
|
info_type_transformations |
Treat the contents of the field as free text, and selectively transform content that matches an |
FileType
Definitions of file type groups to scan. New types will be added to this list.
Enums | |
---|---|
FILE_TYPE_UNSPECIFIED |
Includes all files. |
BINARY_FILE |
Includes all file extensions not covered by another entry. Binary scanning attempts to convert the content of the file to UTF-8 to scan the file. If you wish to avoid this fallback, specify one or more of the other FileTypes in your storage scan. |
TEXT_FILE |
Included file extensions: asc, asp, aspx, brf, c, cc, cfm, cgi, cpp, csv, cxx, c++, cs, css, dart, dat, dot, eml, epbub, ged, go, h, hh, hpp, hxx, h++, hs, html, htm, mkd, markdown, m, ml, mli, perl, pl, plist, pm, php, phtml, pht, properties, py, pyw, rb, rbw, rs, rss, rc, scala, sh, sql, swift, tex, shtml, shtm, xhtml, lhs, ics, ini, java, js, json, kix, kml, ocaml, md, txt, text, tsv, vb, vcard, vcs, wml, xcodeproj, xml, xsl, xsd, yml, yaml. |
IMAGE |
Included file extensions: bmp, gif, jpg, jpeg, jpe, png. bytes_limit_per_file has no effect on image files. Image inspection is restricted to 'global', 'us', 'asia', and 'europe'. |
WORD |
Word files >30 MB will be scanned as binary files. Included file extensions: docx, dotx, docm, dotm |
PDF |
PDF files >30 MB will be scanned as binary files. Included file extensions: pdf |
AVRO |
Included file extensions: avro |
CSV |
Included file extensions: csv |
TSV |
Included file extensions: tsv |
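The size rule for WORD and PDF files can be expressed as a small helper. This is only a sketch of the rule as stated in the table: the function name is hypothetical, and reading "30 MB" as 30,000,000 bytes is an assumption, since the table does not say whether decimal or binary megabytes are meant.

```python
# Assumption: "30 MB" is interpreted as decimal megabytes.
SIZE_THRESHOLD_BYTES = 30 * 1000 * 1000

# File type groups that fall back to binary scanning above the threshold.
SIZE_LIMITED_TYPES = ("WORD", "PDF")


def effective_scan_type(file_type: str, size_bytes: int) -> str:
    """Return the group a file would be scanned as, given its size."""
    if file_type in SIZE_LIMITED_TYPES and size_bytes > SIZE_THRESHOLD_BYTES:
        return "BINARY_FILE"
    return file_type
```

A caller that wants to avoid binary scanning of oversized documents could use a check like this to skip or split such files before starting a storage scan.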
Finding
Represents a piece of potentially sensitive content.
Fields | |
---|---|
name |
Resource name in format projects/{project}/locations/{location}/findings/{finding} Populated only when viewing persisted findings. |
quote |
The content that was found. Even if the content is not textual, it may be converted to a textual representation here. Provided if |
info_type |
The type of content that might have been found. Provided if |
likelihood |
Confidence of how likely it is that the |
location |
Where the content was found. |
create_time |
Timestamp when finding was detected. |
quote_info |
Contains data parsed from quotes. Only populated if include_quote was set to true and a supported infoType was requested. Currently supported infoTypes: DATE, DATE_OF_BIRTH and TIME. |
resource_name |
The job that stored the finding. |
trigger_name |
Job trigger name, if applicable, for this finding. |
labels |
The labels associated with this Label keys must be between 1 and 63 characters long and must conform to the following regular expression: Label values must be between 0 and 63 characters long and must conform to the regular expression No more than 10 labels can be associated with a given finding. Examples: * |
job_create_time |
Time the job started that produced this finding. |
job_name |
The job that stored the finding. |
finding_id |
The unique finding id. |
FinishDlpJobRequest
The request message for finishing a DLP hybrid job.
Fields | |
---|---|
name |
Required. The name of the DlpJob resource to be finished. Authorization requires the following IAM permission on the specified resource
|
FixedSizeBucketingConfig
Buckets values based on fixed size ranges. The Bucketing transformation can provide all of this functionality, but requires more configuration. This message is provided as a convenience to the user for simple bucketing strategies.
The transformed value will be a hyphenated string of {lower_bound}-{upper_bound}. For example, if lower_bound = 10 and upper_bound = 20, all values that are within this bucket will be replaced with "10-20".
This can be used on data of type: double, long.
If the bound Value type differs from the type of data being transformed, we will first attempt converting the type of the data to be transformed to match the type of the bound before comparing.
See https://cloud.google.com/dlp/docs/concepts-bucketing to learn more.
Fields | |
---|---|
lower_bound |
Required. Lower bound value of buckets. All values less than |
upper_bound |
Required. Upper bound value of buckets. All values greater than upper_bound are grouped together into a single bucket; for example if |
bucket_size |
Required. Size of each bucket (except for minimum and maximum buckets). So if |
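The bucketing described above can be sketched in a few lines. This is an illustration, not the service's implementation: the function name is hypothetical, buckets are treated as half-open ranges, a value equal to upper_bound is placed in the top bucket, and the labels used for the two out-of-range buckets ("-{lower_bound}" and "{upper_bound}+") are assumptions.

```python
import math


def fixed_size_bucket(value, lower_bound, upper_bound, bucket_size):
    """Map a number to a "{start}-{end}" bucket label."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    if value < lower_bound:
        return f"-{lower_bound}"   # assumed label for the minimum bucket
    if value >= upper_bound:
        return f"{upper_bound}+"   # assumed label for the maximum bucket
    # Index of the fixed-size bucket that contains the value.
    index = math.floor((value - lower_bound) / bucket_size)
    start = lower_bound + index * bucket_size
    # The last regular bucket is clipped at upper_bound.
    end = min(start + bucket_size, upper_bound)
    return f"{start}-{end}"
```

With lower_bound = 10, upper_bound = 20, and bucket_size = 10, the value 15 maps to "10-20", matching the example in the description.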
GetDeidentifyTemplateRequest
Request message for GetDeidentifyTemplate.
Fields | |
---|---|
name |
Required. Resource name of the deidentify template to be read, for example projects/project-id/deidentifyTemplates/432452342. Authorization requires the following IAM permission on the specified resource
|
GetDlpJobRequest
The request message for DlpJobs.GetDlpJob.
Fields | |
---|---|
name |
Required. The name of the DlpJob resource. Authorization requires the following IAM permission on the specified resource
|
GetInspectTemplateRequest
Request message for GetInspectTemplate.
Fields | |
---|---|
name |
Required. Resource name of the inspectTemplate to be read, for example projects/project-id/inspectTemplates/432452342. Authorization requires the following IAM permission on the specified resource
|
GetJobTriggerRequest
Request message for GetJobTrigger.
Fields | |
---|---|
name |
Required. Resource name of the project and the triggeredJob, for example Authorization requires the following IAM permission on the specified resource
|
GetStoredInfoTypeRequest
Request message for GetStoredInfoType.
Fields | |
---|---|
name |
Required. Resource name of the storedInfoType to be read, for example projects/project-id/storedInfoTypes/432452342. Authorization requires the following IAM permission on the specified resource
|
HybridContentItem
An individual hybrid item to inspect. Will be stored temporarily during processing.
Fields | |
---|---|
item |
The item to inspect. |
finding_details |
Supplementary information that will be added to each finding. |
HybridFindingDetails
Populate to associate additional data with each finding.
Fields | |
---|---|
container_details |
Details about the container where the content being inspected is from. |
file_offset |
Offset in bytes of the line, from the beginning of the file, where the finding is located. Populate if the item being scanned is only part of a bigger item, such as a shard of a file and you want to track the absolute position of the finding. |
row_offset |
Offset of the row for tables. Populate if the row(s) being scanned are part of a bigger dataset and you want to keep track of their absolute position. |
table_options |
If the container is a table, additional information to make findings meaningful such as the columns that are primary keys. If not known ahead of time, can also be set within each inspect hybrid call and the two will be merged. Note that identifying_fields will only be stored to BigQuery, and only if the BigQuery action has been included. |
labels |
Labels to represent user provided metadata about the data being inspected. If configured by the job, some key values may be required. The labels associated with Label keys must be between 1 and 63 characters long and must conform to the following regular expression: Label values must be between 0 and 63 characters long and must conform to the regular expression No more than 10 labels can be associated with a given finding. Examples: * |
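The length and count limits on labels can be checked before a request is sent. The sketch below enforces only the limits stated in the description (key length 1 to 63, value length 0 to 63, at most 10 labels). The regular-expression checks are deliberately left out because the patterns are not reproduced here, and the function name is hypothetical.

```python
from typing import Dict, List

MAX_LABELS = 10
MAX_KEY_LENGTH = 63
MAX_VALUE_LENGTH = 63


def validate_labels(labels: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the limits are met."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        if not 1 <= len(key) <= MAX_KEY_LENGTH:
            problems.append(f"label key {key!r} must be 1 to {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(f"label value for {key!r} exceeds {MAX_VALUE_LENGTH} characters")
    # Note: the character-pattern rules from the description are not checked.
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation in a single pass.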
HybridInspectDlpJobRequest
Request to search for potentially sensitive info in a custom location.
Fields | |
---|---|
name |
Required. Resource name of the job to execute a hybrid inspect on, for example Authorization requires the following IAM permission on the specified resource
|
hybrid_item |
The item to inspect. |
HybridInspectJobTriggerRequest
Request to search for potentially sensitive info in a custom location.
Fields | |
---|---|
name |
Required. Resource name of the trigger to execute a hybrid inspect on, for example Authorization requires the following IAM permission on the specified resource
|
hybrid_item |
The item to inspect. |
HybridInspectResponse
Quota exceeded errors will be thrown once quota has been met.
HybridInspectStatistics
Statistics related to processing hybrid inspect requests.
Fields | |
---|---|
processed_count |
The number of hybrid inspection requests processed within this job. |
aborted_count |
The number of hybrid inspection requests aborted because the job ran out of quota or was ended before they could be processed. |
pending_count |
The number of hybrid requests currently being processed. Only populated when called via method |
HybridOptions
Configuration to control jobs where the content being inspected is outside of Google Cloud Platform.
Fields | |
---|---|
description |
A short description of where the data is coming from. Will be stored once in the job. Maximum length is 256 characters. |
required_finding_label_keys[] |
These are labels that each inspection request must include within its 'finding_labels' map. A request may contain other labels, but any request missing one of these will be rejected. Label keys must be between 1 and 63 characters long and must conform to the following regular expression: No more than 10 keys can be required. |
labels |
To organize findings, these labels will be added to each finding. Label keys must be between 1 and 63 characters long and must conform to the following regular expression: Label values must be between 0 and 63 characters long and must conform to the regular expression No more than 10 labels can be associated with a given finding. Examples: * |
table_options |
If the container is a table, additional information to make findings meaningful such as the columns that are primary keys. |
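The constraints on HybridOptions lend themselves to a client-side pre-check. The sketch below validates the stated limits on description and required_finding_label_keys, and reports which required keys a request's label map is missing. Both function names are hypothetical, plain Python values stand in for the messages, and the key pattern check is omitted because the pattern is not reproduced here.

```python
from typing import Dict, List

MAX_DESCRIPTION_LENGTH = 256
MAX_REQUIRED_KEYS = 10
MAX_KEY_LENGTH = 63


def validate_hybrid_options(description: str, required_keys: List[str]) -> List[str]:
    """Check the stated limits on a hybrid job configuration."""
    problems = []
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description exceeds 256 characters")
    if len(required_keys) > MAX_REQUIRED_KEYS:
        problems.append("more than 10 required label keys")
    for key in required_keys:
        if not 1 <= len(key) <= MAX_KEY_LENGTH:
            problems.append(f"required key {key!r} must be 1 to 63 characters")
    return problems


def missing_required_keys(required_keys: List[str], finding_labels: Dict[str, str]) -> List[str]:
    """Return required keys absent from a request's label map, in order."""
    return [key for key in required_keys if key not in finding_labels]
```

A request whose label map leaves missing_required_keys non-empty would be rejected by the job, so checking locally avoids a wasted call.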
ImageLocation
Location of the finding within an image.
Fields | |
---|---|
bounding_boxes[] |
Bounding boxes locating the pixels within the image containing the finding. |
InfoType
Type of information detected by the API.
Fields | |
---|---|
name |
Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. When sending Cloud DLP results to Data Catalog, infoType names should conform to the pattern |
InfoTypeDescription
InfoType description.
Fields | |
---|---|
name |
Internal name of the infoType. |
display_name |
Human readable form of the infoType name. |
supported_by[] |
Which parts of the API support this InfoType. |
description |
Description of the infoType. Translated when language is provided in the request. |
InfoTypeStats
Statistics regarding a specific InfoType.
Fields | |
---|---|
info_type |
The type of finding this stat is for. |
count |
Number of findings for this infoType. |
InfoTypeSupportedBy
Parts of the APIs which use certain infoTypes.
Enums | |
---|---|
ENUM_TYPE_UNSPECIFIED |
Unused. |
INSPECT |
Supported by the inspect operations. |
RISK_ANALYSIS |
Supported by the risk analysis operations. |
InfoTypeTransformations
A type of transformation that will scan unstructured text and apply various PrimitiveTransformations to each finding, where the transformation is applied to only values that were identified as a specific info_type.
Fields | |
---|---|
transformations[] |
Required. Transformation for each infoType. Cannot specify more than one for a given infoType. |
InfoTypeTransformation
A transformation to apply to text that is identified as a specific info_type.
Fields | |
---|---|
info_types[] |
InfoTypes to apply the transformation to. An empty list will cause this transformation to apply to all findings that correspond to infoTypes that were requested in |
primitive_transformation |
Required. Primitive transformation to apply to the infoType. |
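The rule that no infoType may have more than one transformation can be checked before a request is built. The sketch below models each InfoTypeTransformation as a plain dictionary with an "info_types" list of names; that shape and the function name are assumptions made for illustration, not the client library's types.

```python
from typing import Dict, List


def find_duplicate_info_types(transformations: List[Dict]) -> List[str]:
    """Return infoType names that appear in more than one transformation."""
    seen = set()
    duplicates = set()
    for transformation in transformations:
        for name in transformation.get("info_types", []):
            if name in seen:
                duplicates.add(name)
            seen.add(name)
    return sorted(duplicates)
```

An empty result means each listed infoType is covered by at most one transformation. Entries with an empty info_types list are ignored by this check because they name no specific infoType.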
InspectConfig
Configuration description of the scanning process. When used with redactContent, only info_types and min_likelihood are currently used.
Fields | |
---|---|
info_types[] |
Restricts what info_types to look for. The values must correspond to InfoType values returned by ListInfoTypes or listed at https://cloud.google.com/dlp/docs/infotypes-reference. When no InfoTypes or CustomInfoTypes are specified in a request, the system may automatically choose which detectors to run. By default this may be all types, but the default list may change over time as detectors are updated. If you need precise control and predictability over which detectors are run, specify the InfoTypes listed in the reference explicitly. |
min_likelihood |
Only returns findings equal to or above this threshold. The default is POSSIBLE. See https://cloud.google.com/dlp/docs/likelihood to learn more. |
limits |
Configuration to control the number of findings returned. |
include_quote |
When true, a contextual quote from the data that triggered a finding is included in the response; see Finding.quote. |
exclude_info_types |
When true, excludes type information of the findings. |
custom_info_types[] |
CustomInfoTypes provided by the user. See https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
content_options[] |
List of options defining data content to scan. If empty, text, images, and other content will be included. |
rule_set[] |
Set of rules to apply to the findings for this InspectConfig. Exclusion rules contained in the set are executed last; other rules are executed in the order in which they are specified for each infoType. |
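The min_likelihood threshold behaves like a filter over an ordered scale. The sketch below assumes the ordering VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY from the Likelihood enum documented elsewhere in this reference, and models findings as plain dictionaries. The function name and the dictionary shape are hypothetical.

```python
from typing import Dict, List

# Assumed ordering, lowest to highest confidence.
LIKELIHOOD_ORDER = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def filter_by_min_likelihood(
    findings: List[Dict], min_likelihood: str = "POSSIBLE"
) -> List[Dict]:
    """Keep findings whose likelihood is equal to or above the threshold."""
    threshold = LIKELIHOOD_ORDER.index(min_likelihood)
    return [
        finding
        for finding in findings
        if LIKELIHOOD_ORDER.index(finding["likelihood"]) >= threshold
    ]
```

The default argument mirrors the documented default of POSSIBLE, so calling the function without a threshold drops only the two lowest levels.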
FindingLimits
Configuration to control the number of findings returned. Cannot be set if de-identification is requested.
Fields | |
---|---|
max_findings_per_item |
Max number of findings that will be returned for each item scanned. When set within |
max_findings_per_request |
Max number of findings that will be returned per request/job. When set within |
ma |