This page describes the contents of a transformation details table and provides example queries that you can run on it.
When you de-identify data in storage, you can configure the inspection job to provide details about each transformation that it makes. Sensitive Data Protection writes these details in a BigQuery table that you specify. In this document, that table is called a transformation details table.
Contents of a transformation details table
This section lists and describes the contents of the transformation details table.
resource_name
The name of the inspection job that completed the transformation.
container_name
The file that contains the data that was transformed.
transformation
Details about the transformation. This field contains the following properties:
- type
The transformation method that Sensitive Data Protection applied to the finding. The following are some of the possible values:
- description
A string representation of the transformation. For every transformation type except RecordSuppression, the value is the output of a toString() call on the PrimitiveTransformation protocol buffer message. If the transformation method is a record suppression, this field is empty.
- condition
A string representation of the RecordCondition for the transformation. This field is set only if a record condition was used to determine whether Sensitive Data Protection must apply the transformation. Examples:
(age_field <= 18)
(zip_field exists)
(zip_field == 01234) && (age_field <= 18) && (city_field exists)
- infoType
Details about the type of information detected in the finding. This field contains the following properties:
status_details
Details about the status of the transformation. If the transformation was unsuccessful, this field specifies what caused the failure. This field contains the following properties:
- result_status_type
A code representing the status of the transformation attempt. The following are the possible values:
STATE_TYPE_UNSPECIFIED: Sensitive Data Protection couldn't determine the status of the transformation.
INVALID_TRANSFORM: Sensitive Data Protection couldn't transform the finding.
METADATA_UNRETRIEVABLE: there's a finding in the custom metadata of a file. While writing the transformed file, Sensitive Data Protection couldn't retrieve the metadata.
SUCCESS: the transformation was successful.
- details
Additional status details. This field follows the specifications defined in Status and contains the following properties:
- code
- The error code.
- message
- The error message.
- details
- A list of messages that contain the error details.
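As a rough illustration of how these status fields might be consumed outside BigQuery, the following Python sketch tallies rows by result_status_type and collects the error message of each row that did not succeed. It assumes that each row has already been exported as a dictionary whose status_details value mirrors the structure described above. The summarize_statuses helper, the sample rows, and the choice to count a missing status as STATE_TYPE_UNSPECIFIED are all hypothetical and are not part of any Sensitive Data Protection API.

```python
from collections import Counter


def summarize_statuses(rows):
    """Tally rows by result_status_type and collect messages for failures.

    Assumption: each row is a dict shaped like a transformation details row.
    A row with no result_status_type is counted as STATE_TYPE_UNSPECIFIED.
    """
    counts = Counter()
    failures = []
    for row in rows:
        status = row.get("status_details") or {}
        status_type = status.get("result_status_type") or "STATE_TYPE_UNSPECIFIED"
        counts[status_type] += 1
        if status_type != "SUCCESS":
            details = status.get("details") or {}
            failures.append((row.get("container_name"), details.get("message", "")))
    return counts, failures


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {
        "container_name": "gs://example-bucket/a.csv",
        "status_details": {"result_status_type": "SUCCESS", "details": None},
    },
    {
        "container_name": "gs://example-bucket/b.csv",
        "status_details": {
            "result_status_type": "INVALID_TRANSFORM",
            "details": {"code": 3, "message": "example error", "details": []},
        },
    },
]

counts, failures = summarize_statuses(sample_rows)
for container, message in failures:
    print(f"{container}: {message}")
```

Anything other than SUCCESS is treated as a failure here, which matches the condition used by the example queries later on this page.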
transformed_bytes
The number of bytes that Sensitive Data Protection transformed. If the transformation was unsuccessful or if there was no content to transform, the value is 0.
transformation_location
Details about the location of the transformation.
The following is a JSON example of a transformation location, where Sensitive Data Protection performed an infoType transformation:
{
  "finding_id": "2022-05-23T23:51:29.775337Z831678185946560283",
  "record_transformation": null,
  "container_type": "TRANSFORM_BODY"
}
The following is a JSON example of a transformation location, where Sensitive Data Protection performed a record transformation:
{
  "finding_id": null,
  "record_transformation": {
    "field_id": {
      "name": " \"Name\""
    },
    "container_timestamp": {
      "timestamp": null,
      "seconds": "1654796423",
      "nanos": "763000000"
    },
    "container_version": "1654796423733485"
  },
  "container_type": "TRANSFORM_TABLE"
}
As the examples show, Sensitive Data Protection populates either finding_id or record_transformation, depending on the type of transformation it performed.
The two fields are mutually exclusive.
- finding_id
- This field is set if Sensitive Data Protection performed an infoType transformation. Each finding ID correlates to an entry in the findings output table. The findings output table contains all of the findings that Sensitive Data Protection detected during inspection. This table is created only if you configured your inspection job to save findings to BigQuery.
- record_transformation
This field is set if Sensitive Data Protection performed a record transformation on tabular data. This field contains the following properties:
- field_id
- The table column that contains the finding.
- container_timestamp
- Modification timestamp of the file.
- container_version
- Generation number of the file that contains the finding.
- container_type
Information about the functionality of the data that contains the finding. The following are the possible values:
TRANSFORM_UNKNOWN_CONTAINER: Sensitive Data Protection couldn't determine the type of the data that contains the finding.
TRANSFORM_BODY: Sensitive Data Protection detected the finding in the body of a file.
TRANSFORM_METADATA: Sensitive Data Protection detected the finding in the metadata of a file.
TRANSFORM_TABLE: Sensitive Data Protection detected the finding in a table.
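The mutually exclusive finding_id and record_transformation fields can be used to tell the two kinds of transformation apart. The following Python sketch shows one way to do that for a transformation_location value that has been loaded as a dictionary. The describe_location helper is hypothetical. Reading seconds and nanos as a Unix epoch timestamp in UTC is an assumption based on the JSON example earlier in this section, not a documented guarantee.

```python
from datetime import datetime, timedelta, timezone


def describe_location(location):
    """Classify a transformation_location dict by which field is populated."""
    finding_id = location.get("finding_id")
    record = location.get("record_transformation")
    # Exactly one of the two fields is expected to be set.
    if (finding_id is None) == (record is None):
        raise ValueError("expected exactly one of finding_id or record_transformation")

    summary = {"container_type": location.get("container_type")}
    if finding_id is not None:
        summary.update(kind="infoType", finding_id=finding_id)
        return summary

    # Assumption: seconds and nanos are strings holding a Unix epoch time.
    ts = record.get("container_timestamp") or {}
    modified = None
    if ts.get("seconds") is not None:
        modified = datetime.fromtimestamp(int(ts["seconds"]), tz=timezone.utc)
        modified += timedelta(microseconds=int(ts.get("nanos") or 0) // 1000)
    summary.update(
        kind="record",
        field=(record.get("field_id") or {}).get("name"),
        modified=modified,
        container_version=record.get("container_version"),
    )
    return summary


# The two JSON examples from this section, written as Python dictionaries.
info_type_location = {
    "finding_id": "2022-05-23T23:51:29.775337Z831678185946560283",
    "record_transformation": None,
    "container_type": "TRANSFORM_BODY",
}
record_location = {
    "finding_id": None,
    "record_transformation": {
        "field_id": {"name": ' "Name"'},
        "container_timestamp": {
            "timestamp": None,
            "seconds": "1654796423",
            "nanos": "763000000",
        },
        "container_version": "1654796423733485",
    },
    "container_type": "TRANSFORM_TABLE",
}

print(describe_location(info_type_location)["kind"])
print(describe_location(record_location)["kind"])
```

Raising an error when both fields or neither field is set is a design choice for this sketch. A more forgiving caller might prefer to log and skip such rows.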
Example queries
The following are example queries that you can run on the transformation details table. For information about how to query a BigQuery table, see Running interactive queries.
Select all failed transformations
SELECT *
FROM `PROJECT_ID.DATASET_ID.TABLE_ID`
WHERE status_details.result_status_type != "SUCCESS";
Replace the following:
PROJECT_ID: the ID of the project that contains the transformation details table.
DATASET_ID: the ID of the BigQuery dataset that contains the transformation details table.
TABLE_ID: the ID of the transformation details table.
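If you generate this query from a script rather than editing it by hand, the placeholder substitution might look like the following Python sketch. The failed_transformations_query helper and the sample IDs are hypothetical. The ID check is deliberately stricter than BigQuery's actual naming rules and exists only to keep backticks and other unexpected characters out of the query text.

```python
import re

# Assumption: a conservative allow-list, narrower than BigQuery's real rules.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def failed_transformations_query(project_id, dataset_id, table_id):
    """Return the failed-transformations query with the placeholders filled in."""
    for label, value in (
        ("project_id", project_id),
        ("dataset_id", dataset_id),
        ("table_id", table_id),
    ):
        if not _SAFE_ID.fullmatch(value):
            raise ValueError(f"unexpected characters in {label}: {value!r}")
    return (
        "SELECT *\n"
        f"FROM `{project_id}.{dataset_id}.{table_id}`\n"
        'WHERE status_details.result_status_type != "SUCCESS";'
    )


# Hypothetical IDs, for illustration only.
print(failed_transformations_query("my-project", "my_dataset", "transformation_details"))
```

The returned string can then be passed to whichever BigQuery client or console you already use to run queries.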
Count the number of files that have transformation failures
SELECT COUNT(DISTINCT(container_name))
FROM `PROJECT_ID.DATASET_ID.TABLE_ID`
WHERE status_details.result_status_type != "SUCCESS";
Select all transformations that used character masking
SELECT resource_name, container_name, info_type.name
FROM `PROJECT_ID.DATASET_ID.TABLE_ID`,
UNNEST(transformation) AS tr
WHERE tr.type LIKE "CHARACTER_MASK";
What's next
- Learn more about the process of de-identifying data in storage.
- Learn how to de-identify data in storage using the Google Cloud console.
- Learn how to de-identify sensitive data stored in Cloud Storage using the DLP API.
- Work through the Creating a De-identified Copy of Data in Cloud Storage codelab.
- Learn more about de-identification transformations.