This page explains how to use the v1beta1 DicomTagConfig
configuration in the Cloud Healthcare API to de-identify sensitive
data in DICOM instances at the following levels:
- At the dataset level using
datasets.deidentify
- At the DICOM store level using
dicomStores.deidentify
This page also explains how to apply filters when de-identifying data at the DICOM store level.
You can configure DICOM de-identify operations using either the
legacy v1 DicomConfig
object
or the v1beta1 DicomTagConfig
object. We strongly recommend that you use DicomTagConfig
.
If you already use DicomConfig
for your de-identify operations, we encourage
you to migrate to using DicomTagConfig
. For a summary of new features, see New configuration options in DicomTagConfig
.
For instructions on how to migrate, see Migrate requests and responses to use DicomTagConfig
.
New configuration options in DicomTagConfig
De-identify text with contextual de-identification
You can configure the DicomTagConfig.Options.CleanDescriptorsOption
object to enable contextual de-identification of unstructured metadata text.
This option is based on the
Clean Descriptors Option.
When you specify DicomTagConfig.Options.CleanDescriptorsOption
, an extra
infoType is used during inspection which can affect billing costs.
Using the DicomTagConfig.Options.CleanDescriptorsOption
option transforms
any unstructured metadata text that matches removed tags, and in doing so improves
the quality of the de-identification. For example, suppose that you're
de-identifying an X-ray, and the X-ray patient has a last name that's also
a noun, such as Wall
. If any metadata in the instance, such as the text in
StudyDescription
, contains the word Wall
, the text will be transformed.
The CleanDescriptorsOption
option redacts contextual phrases that match any tags
marked for removal in the DICOM Base Profile
as long as the tags match one of the following action codes:
D
Z
X
U
Matching contextual phrases are replaced with the token [CTX]
.
You can configure which tags are redacted by specifying the following:
- An enum in the
ProfileType
object. Specifying an enum isn't required. - The
CleanTextTag
filter to specific tags.
However, the tags used in the DICOM Base Profile cannot be changed.
Redact burned-in text with contextual de-identification
You can specify the TextRedactionMode.REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS
enum to enable contextual de-identification of burned-in text in an image.
This option is based on the
Clean Descriptors Option.
When you specify the TextRedactionMode.REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS
enum, an extra infoType is used during inspection which can affect
billing costs.
You can specify the TextRedactionMode.REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS
enum in the following ways:
In an
Options.ImageConfig
object withDicomTagConfig.Options.ImageConfig.TextRedactionMode
. When you specify this option, thePixelData (7FE0,0010)
tag is automatically added to the list of tags to be redacted.In an
Action.CleanImageTag
object withDicomTagConfig.Action.CleanImageTag.TextRedactionMode
. When you specify this option, you must add thePixelData
tag manually to theAction.queries[]
array. ThePixelData
tag is the only supported tag when usingAction.CleanImageTag
.
The TextRedactionMode.REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS
option redacts
burned-in text that matches any tags
marked for removal in the DICOM Base Profile
as long as the tags match one of the following action codes:
D
Z
X
U
There is no additional configuration for contextual de-identification of
burned-in text other than enabling or disabling it using an enum in the ProfileType
object. Specifying an enum isn't required.
Additional infoTypes in image de-identification
You can use information types (infoTypes) specify which data to scan for when performing de-identification on tags. An infoType is a type of sensitive data, such as a patient name, email address, telephone number, identification number, or credit card number.
You can configure the following fields in the DicomTagConfig.Options.ImageConfig
object to determine which infoTypes to use during DICOM image de-identification:
These fields only take effect if DicomTagConfig.Options.ImageConfig.TextRedactionMode
is set to one of the following values:
Migrate requests and responses to use DicomTagConfig
You can configure DICOM de-identification using DicomTagConfig
, which is available in Cloud Healthcare API v1beta1 and is an alternative to using the
legacy DicomConfig
.
When sending a request, you cannot include both DicomConfig
and DicomTagConfig
.
The following sections describe configurations in DicomConfig
and how to
migrate them to DicomTagConfig
.
TagFilterProfile
to ProfileType
Replace the DicomConfig
TagFilterProfile
object with the DicomTagConfig
ProfileType
object. The same four profiles in TagFilterProfileType
are available in
ProfileType
.
The following example shows how to migrate a request from using TagFilterProfile
to using ProfileType
:
DicomConfig | DicomTagConfig |
---|---|
|
|
keepList
and removeList
The DicomConfig
keepList
and removeList
fields are no longer available in DicomTagConfig
. If you used keepList
and removeList
to specify tags to keep or remove instead of using a profile,
you must migrate to the new Action
object where you specify tag behavior. The Action
object provides additional
options to transform tags.
The following example shows how to migrate a request from using
keepList
to using Action.keepTag
. The request specifies that the value of
the PatientID
tag is kept during the de-identify operation.
DicomConfig | DicomTagConfig |
---|---|
|
|
Combine keeplists, removelists, and profiles
In the DicomConfig
object, you can determine whether to keep or remove data
based on keeplists, removelists, and profiles. These options are mutually exclusive.
When using the DicomTagConfig
object, you can combine these options by specifying
which tags to keep and remove in an Action
object while also specifying a profile in ProfileType
.
Options configured in the Action
object override those configured in the
ProfileType
profile. The options in the Action
object apply in the order
in which they are provided in the request.
skipIdRedaction
to Objects.primaryIds
Replace the skipIdRedaction
field in the DicomConfig
object with the primaryIds
field in the DicomTagConfig
object. The primaryIds
field, which
is in the Options
object, contains a PrimaryIdsOption
object where you specify one of the following enums:
PRIMARY_IDS_OPTION_UNSPECIFIED
: Default behavior when no value is provided toPrimaryIdsOption
. Defaults to the option specified inProfileType
.KEEP
: Leave the primary IDs unchanged.REGEN
: Regenerate the primary IDs.
The following example shows how to migrate a request from using
skipIdRedaction
to using Options.primaryIds
. The request specifies that
the values of the primary IDs are kept during the de-identify operation:
DicomConfig | DicomTagConfig |
---|---|
|
|
DeidentifyConfig.ImageConfig
to DicomTagConfig.Options.ImageConfig
Replace the DeidentifyConfig.ImageConfig
object with the
DicomTagConfig.Options.ImageConfig
object.
The options in the ImageConfig
object are the same in both versions.
The following example shows how to migrate a request from using an ImageConfig
in DeidentifyConfig.image
to using an ImageConfig
in DeidentifyConfig.DicomTagConfig.Options.cleanImage
.
The request specifies that all text in an image is to be redacted during
the de-identify operation:
DeidentifyConfig.image | DeidentifyConfig.DicomTagConfig.Options.cleanImage |
---|---|
|
|
De-identification overview
Dataset level de-identification
To de-identify DICOM data at the dataset level, call the
datasets.deidentify
method. The datasets.deidentify
method has the following components:
- The source dataset: A dataset containing DICOM stores with one or more
instances that have sensitive data. When you call the
datasets.deidentify
method, all instances in all DICOM stores in the dataset are de-identified. - The destination dataset: De-identification doesn't affect the original dataset or its data. Instead, de-identified copies of the original data are written to a new dataset, called the destination dataset.
- What to de-identify: Configuration parameters that specify how to process
the DICOM data in the dataset. You can configure DICOM de-identification to de-identify DICOM
instance metadata (using tag keywords) or burned-in text in DICOM images by
specifying these parameters in a
DeidentifyConfig
object.
Most of the samples in this guide show how to de-identify DICOM data at the dataset level.
DICOM store level de-identification
De-identifying DICOM data at the DICOM store level lets you have more control over which data is de-identified. For example, if you have a dataset with multiple DICOM stores, you can de-identify each DICOM store according to what type of data exists in the store.
To de-identify DICOM data in a DICOM store, call the
dicomStores.deidentify
method. The dicomStores.deidentify
method has the following components:
- The source DICOM store: A DICOM store containing one or more
instances that have sensitive data. When you call the
dicomStores.deidentify
operation, all instances in the DICOM store are de-identified. - The destination DICOM store: De-identification doesn't affect the original DICOM store or its data. Instead, de-identified copies of the original data are written to the destination DICOM store. The destination DICOM store must already exist.
- What to de-identify: Configuration parameters that specify how to process
the DICOM store. You can configure DICOM de-identification to de-identify
DICOM instance metadata (using tag keywords) or burned-in text in DICOM images
by specifying these parameters in a
DeidentifyConfig
object.
For an example of how to de-identify DICOM data at the DICOM store level, see De-identify data at the DICOM store level.
Filters
When de-identifying DICOM data at the DICOM store level, you can de-identify
a subset of the data in the DICOM store by configuring a filter
file and specifying the file in the dicomStores.deidentify
request. For an
example, see
De-identify a subset of a DICOM store.
Samples overview
The samples in this guide use a single DICOM instance named
dicom_deid_instance_sample.dcm
,
but you can also de-identify multiple instances. To use the sample DICOM
instance in the examples in this page, download the file to your local machine
and follow the instructions in Store DICOM data
to store it in a DICOM store.
The following sections show what the image in the DICOM instance looks like and the metadata in the instance.
Sample image
Some samples in this page contain an output of the de-identified image. Each sample uses the following original image as its input. You can compare the output image from each de-identification operation to this original image to see the effects of the operation:
Sample metadata
Most samples in this page contain an output of the changed metadata in the DICOM instance. Each sample uses the following original metadata as its input. You can compare the output metadata from each de-identification operation to this original metadata to see the effects of de-identification:
[
{
"00020002": {
"vr": "UI",
"Value": [
"1.2.840.10008.5.1.4.1.1.7"
]
},
"00020003": {
"vr": "UI",
"Value": [
"1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695"
]
},
"00020010": {
"vr": "UI",
"Value": [
"1.2.840.10008.1.2.4.50"
]
},
"00020012": {
"vr": "UI",
"Value": [
"1.2.276.0.7230010.3.0.3.6.1"
]
},
"00020013": {
"vr": "SH",
"Value": [
"OFFIS_DCMTK_361"
]
},
"00080005": {
"vr": "CS",
"Value": [
"ISO_IR 100"
]
},
"00080016": {
"vr": "UI",
"Value": [
"1.2.840.10008.5.1.4.1.1.7"
]
},
"00080018": {
"vr": "UI",
"Value": [
"1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695"
]
},
"00080020": {
"vr": "DA",
"Value": [
"20110909"
]
},
"00080030": {
"vr": "TM",
"Value": [
"110032"
]
},
"00080050": {
"vr": "SH"
},
"00080064": {
"vr": "CS",
"Value": [
"WSD"
]
},
"00080070": {
"vr": "LO",
"Value": [
"Manufacturer"
]
},
"00080090": {
"vr": "PN",
"Value": [
{
"Alphabetic": "John Doe"
}
]
},
"00081090": {
"vr": "LO",
"Value": [
"ABC1"
]
},
"00100010": {
"vr": "PN",
"Value": [
{
"Alphabetic": "Ann Johnson"
}
]
},
"00100020": {
"vr": "LO",
"Value": [
"S1214223-1"
]
},
"00100030": {
"vr": "DA",
"Value": [
"19880812"
]
},
"00100040": {
"vr": "CS",
"Value": [
"F"
]
},
"0020000D": {
"vr": "UI",
"Value": [
"2.25.70541616638819138568043293671559322355"
]
},
"0020000E": {
"vr": "UI",
"Value": [
"1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694"
]
},
"00200010": {
"vr": "SH"
},
"00200011": {
"vr": "IS"
},
"00200013": {
"vr": "IS"
},
"00200020": {
"vr": "CS"
},
"00280002": {
"vr": "US",
"Value": [
3
]
},
"00280004": {
"vr": "CS",
"Value": [
"YBR_FULL_422"
]
},
"00280006": {
"vr": "US",
"Value": [
0
]
},
"00280010": {
"vr": "US",
"Value": [
1024
]
},
"00280011": {
"vr": "US",
"Value": [
1024
]
},
"00280100": {
"vr": "US",
"Value": [
8
]
},
"00280101": {
"vr": "US",
"Value": [
8
]
},
"00280102": {
"vr": "US",
"Value": [
7
]
},
"00280103": {
"vr": "US",
"Value": [
0
]
},
"00282110": {
"vr": "CS",
"Value": [
"01"
]
},
"00282114": {
"vr": "CS",
"Value": [
"ISO_10918_1"
]
}
}
]
Redact burned-in text from images
You can de-identify burned-in text in DICOM images using the
ImageConfig
object inside an Action
object. Inside ImageConfig
, you can specify which infoTypes to include or
exclude, and how to redact text using the TextRedactionMode
object.
Redact all text
The following samples show how to de-identify a DICOM instance by setting
TextRedactionMode
to REDACT_ALL_TEXT
.
This configuration redacts all burned-in text in the image.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "cleanImage": { "textRedactionMode": "REDACT_ALL_TEXT" } } } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "cleanImage": { "textRedactionMode": "REDACT_ALL_TEXT" } } } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "cleanImage": { "textRedactionMode": "REDACT_ALL_TEXT" } } } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
After de-identifying the image using REDACT_ALL_TEXT
, the image
looks like this. Notice that all the burned-in text at the bottom of the
image has been redacted.
Redact sensitive text with the Clean Descriptors Option
The following samples show how to de-identify a DICOM instance by setting
TextRedactionMode
to REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS
.
For more
information on the CleanDescriptorsOption
option, see
De-identify text with contextual de-identification.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "cleanImage": { "textRedactionMode": "REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS" } } } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "cleanImage": { "textRedactionMode": "REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS" } } } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "cleanImage": { "textRedactionMode": "REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS" } } } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
After de-identifying the image using REDACT_SENSITIVE_TEXT_CLEAN_DESCRIPTORS
, the image
looks like this. Notice that not all the burned-in text at the bottom of the
image has been redacted. The text Female
is still shown, because
PatientSex (0010,0040)
isn't one of the default DICOM infoTypes.
De-identify DICOM tags
You can de-identify DICOM instances based on tag keywords in the DICOM metadata.
The following tag filtering methods are available in the
DicomTagConfig
Action
object:
You specify each Action
option as a list of DICOM tag IDs, names, or
Value Representations (VRs), and then the option performs an action on
the tags in the list. You cannot specify more than one Action
option
on a list of tags.
Each Action
object provides a queries[]
list where you specify a list
of tags. The following tag formats are supported:
- Tag IDs, such as
"00100010"
- Tag names, such as
"PatientName"
- Value Representations (VRs), such as
"PN"
There is no limit to the number of tags that can be provided in the queries[]
list.
However, each tag can only have a single Action
option performed on it.
To specify different tags that have different Action
options performed on them,
you must specify multiple Action
objects.
Keep tags
You can prevent the values of tags from being redacted by specifying the
tags in a KeepTag
object in the DicomTagConfig
object.
To produce a valid DICOM object while using a KeepTag
object, specify the MINIMAL_KEEP_LIST_PROFILE
or DEIDENTIFY_TAG_CONTENTS
values in the ProfileType
object.
By specifying either of these profiles, the following tags are automatically kept, ensuring that the de-identified DICOM instance is valid DICOM:
StudyInstanceUID
SeriesInstanceUID
SOPInstanceUID
TransferSyntaxUID
MediaStorageSOPInstanceUID
MediaStorageSOPClassUID
PixelData
Rows
Columns
SamplesPerPixel
BitsAllocated
BitsStored
Highbit
PhotometricInterpretation
PixelRepresentation
NumberOfFrames
PlanarConfiguration
PixelAspectRatio
SmallestImagePixelValue
LargestImagePixelValue
RedPaletteColorLookupTableDescriptor
GreenPaletteColorLookupTableDescriptor
BluePaletteColorLookupTableDescriptor
RedPaletteColorLookupTableData
GreenPaletteColorLookupTableData
BluePaletteColorLookupTableData
ICCProfile
ColorSpace
WindowCenter
WindowWidth
VOILUTFunction
The values for some of the preceding tags are regenerated, meaning that the values are replaced with a different value through a deterministic transformation. For more information, see Retain UIDs Option in the DICOM standard.
The values of StudyInstanceUID
, SeriesInstanceUID
, SOPInstanceUID
,
and MediaStorageSOPInstanceUID
are called "primary IDs." To determine
how primary IDs are transformed, specify a value in
PrimaryIdsOption
.
The following samples show how to use the KeepTag
object to keep the values of
specific tags unchanged during de-identification.
The PatientName
tag is added in the queries[]
list, so the PatientName
value isn't redacted during de-identification.
Because PrimaryIdsOption
isn't specified in the sample,
the primaryIds
field defaults to PRIMARY_IDS_OPTION_UNSPECIFIED
, which defaults
to the value in ProfileType
. Because ProfileType
is also not specified,
the profileType
field defaults to PROFILE_TYPE_UNSPECIFIED
, which removes
tags based on the Attribute Confidentiality Basic Profile (DICOM Standard Edition 2018e).
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "keepTag": {} } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "keepTag": {} } ] } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "keepTag": {} } ] } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
Remove tags
The following samples show how to use the RemoveTag
object to remove
the values of specific tags during de-identification. A removed tag is
replaced with an empty value.
In the following samples, the PatientName
tag is added in the
queries[]
list, so its value is replaced with an empty
value during de-identification.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "removeTag": {} } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "removeTag": {} } ] } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "removeTag": {} } ] } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
Delete tags
The following samples show how to use the DeleteTag
object to delete
specific tags during de-identification.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "deleteTag": {} } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "deleteTag": {} } ] } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "deleteTag": {} } ] } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
Reset tags to a placeholder value
The following samples show how to use the ResetTag
object to set
the value of tags to the string PLACEHOLDER
during de-identification.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "resetTag": {} } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "resetTag": {} } ] } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "resetTag": {} } ] } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
In particular, notice that the value of the
PatientName
tag is set toPLACEHOLDER
:Original metadata
Metadata after running ResetTag
"00100010": { "vr": "PN", "Value": [ { "Alphabetic": "Ann Johnson" } ] }
"00100010": { "vr": "PN", "Value": [ { "Alphabetic": "PLACEHOLDER" } ] }
Inspect and transform sensitive text in tags
The following samples show how to use the CleanTextTag
object to inspect tags
and transform the values of the tags according to the configuration in
TextConfig
object.
In these samples, the following options are set in the Actions
object:
- A
CleanTextTag
object. - A
queries[]
list containing thePatientName
DICOM tag.
The following options are set in the TextConfig
object:
- An
InfoTypeTransformation
object which transforms text that matches a particular infoType. - A
ReplaceWithInfoTypeConfig
object which replaces any matching text with the name of the infoType. - An
infoTypes[]
list containing thePERSON_NAME
infoType.
With these configurations set, the de-identification operation inspects the
PatientName
tag, matches the tag to the PERSON_NAME
infoType, and replaces
the tag's value with the PERSON_NAME
infoType.
The PatientName
tag has a Value Representation (VR) of PN
, which is one of
the supported VRs in the CleanTextTag
object.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "cleanTextTag": {} } ] }, "text": { "additionalTransformations": [ { "infoTypes": [ "PERSON_NAME" ], "replaceWithInfoTypeConfig": {} } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "cleanTextTag": {} } ] }, "text": { "additionalTransformations": [ { "infoTypes": [ "PERSON_NAME" ], "replaceWithInfoTypeConfig": {} } ] } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "cleanTextTag": {} } ] }, "text": { "additionalTransformations": [ { "infoTypes": [ "PERSON_NAME" ], "replaceWithInfoTypeConfig": {} } ] } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
In particular, notice that the value of the
PatientName
tag is set to[PERSON_NAME]
:Original metadata
Metadata after running CleanTextTag
"00100010": { "vr": "PN", "Value": [ { "Alphabetic": "Ann Johnson" } ] }
"00100010": { "vr": "PN", "Value": [ { "Alphabetic": "[PERSON_NAME]" } ] }
Replace a UID with a generated UID
The following samples show how to use the RegenUidTag
object to replace a
UID with a newly generated UID. The only VR that the RegenUidTag
object
supports is UI
.
By default, every tag in the sample metadata with a VR
of UI
has its UID generated during de-identification. To show how to generate
a UID for a specific tag, the following options are set
in the sample:
ProfileType
is set to theKEEP_ALL
enum, which prevents any DICOM metadata from being de-identified.PrimaryIdsOption
is set to theKEEP
enum, which leaves the primary IDs (StudyInstanceUID
,SeriesInstanceUID
,SOPInstanceUID
, andMediaStorageSOPInstanceUID
) unchanged.
When these options are set, none of the primary ID UIDs in the sample data are replaced
with newly generated values. However, by adding SOPInstanceUID
to the
Action.queries[]
array, you can generate a new UID specifically for the SOPInstanceUID
tag.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "primaryIds": "KEEP" }, "actions": [ { "queries": [ "00080018" ], "regenUidTag": {} } ], "profileType": "KEEP_ALL_PROFILE" } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "primaryIds": "KEEP" }, "actions": [ { "queries": [ "00080018" ], "regenUidTag": {} } ], "profileType": "KEEP_ALL_PROFILE" } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "options": { "primaryIds": "KEEP" }, "actions": [ { "queries": [ "00080018" ], "regenUidTag": {} } ], "profileType": "KEEP_ALL_PROFILE" } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new instances UID because you specified the
SOPInstanceUID
tag in theAction.queries[]
array, so you first need to search the new dataset for the de-identified instance.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the instances UID changed:
Original instance metadata De-identified instance metadata Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new value, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/2.25.70541616638819138568043293671559322355/series/1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/2.25.70541616638819138568043293671559322355/series/1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
Notice that, out of the primary IDs, only the
SOPInstanceUID
has a newly generated UID.
Recursively de-identify tags in a nested Sequence
The following samples show how to use the RecurseTag
object to recursively
de-identify nested DICOM tags in a Sequence
.
The RecurseTag
object only supports the SQ
VR, which is the VR for
a Sequence
.
For information on the SQ
VR, see 7.5 Nesting of Data Sets.
The DICOM sample instance
provided for this page doesn't contain any DICOM tags that have an SQ
VR.
You can create and store a DICOM instance with fake data that contains the SQ
VR
by completing the following steps, which are based on the instructions in
Create DICOM instances from JSON metadata and JPEG files.
The DICOM instance that you create in the following steps uses fake data,
and is only intended to illustrate the behavior ofRecurseTag
.
Save the following DICOM metadata to a JSON file named
instance.json
. The metadata contains aPhysiciansReadingStudyIdentificationSequence
(00081062
) tag. The tag has anSQ
VR, and contains two nestedPersonIdentificationCodeSequence
(00401101
) tags. The nested tags also have anSQ
VR, and each contains the following nested tags:CodeValue
(00080100
)CodingSchemeDesignator
(00080102
)
[{ "00020010": { "vr": "UI", "Value": [ "1.2.840.10008.1.2.4.50" ] }, "00080005": { "vr": "CS", "Value": [ "ISO_IR 192" ] }, "00080016": { "vr": "UI", "Value": [ "1111111" ] }, "00080018": { "vr": "UI", "Value": [ "2222222" ] }, "0020000D": { "vr": "UI", "Value": [ "3333333" ] }, "0020000E": { "vr": "UI", "Value": [ "4444444" ] }, "00280002": { "vr": "US", "Value": [ 3 ] }, "00280004": { "vr": "CS", "Value": [ "YBR_FULL_422" ] }, "00280006": { "vr": "US", "Value": [ 0 ] }, "00280008": { "vr": "IS", "Value": [ 1 ] }, "00280010": { "vr": "US", "Value": [ 1024 ] }, "00280011": { "vr": "US", "Value": [ 1024 ] }, "00280100": { "vr": "US", "Value": [ 8 ] }, "00280101": { "vr": "US", "Value": [ 8 ] }, "00280102": { "vr": "US", "Value": [ 7 ] }, "00280103": { "vr": "US", "Value": [ 0 ] }, "7FE00010": { "vr": "OB", "BulkDataURI": "jpeg-image" }, "00081062": { "vr": "SQ", "Value": [ { "00401101": { "vr": "SQ", "Value": [ { "00080100": { "vr": "SH", "Value": [ "CodeValue1" ] }, "00080102": { "vr": "SH", "Value": [ "CodingSchemeDesignator1" ] } } ] } }, { "00401101": { "vr": "SQ", "Value": [ { "00080100": { "vr": "SH", "Value": [ "CodeValue2" ] }, "00080102": { "vr": "SH", "Value": [ "CodingSchemeDesignator2" ] } } ] } } ] } }]
Download the
google.jpg
file to your local machine. The Cloud Healthcare API DICOMweb API accepts any JPEG image paired with JSON metadata as long as the metadata is valid.Run the following commands to create an opening (for the JSON metadata), middle (for the JPEG) image, and closing boundaries in the image:
echo -ne "--DICOMwebBoundary\r\nContent-Type: application/dicom+json\r\n\r\n" > opening.file echo -ne "\r\n--DICOMwebBoundary\r\nContent-Location: jpeg-image\r\nContent-Type: image/jpeg; transfer-syntax=1.2.840.10008.1.2.4.50\r\n\r\n" > middle.file echo -ne "\r\n--DICOMwebBoundary--" > closing.file
Wrap the
google.jpg
image within the middle and closing boundaries. The output file, which you send to the Cloud Healthcare API, is calledmultipart-request.file
:cat opening.file instance.json middle.file google.jpg closing.file > multipart-request.file
Store the
multipart-request.file
file:REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DICOM_STORE_ID: the ID of the DICOM store inside the source dataset
To send your request, choose one of these options:
curl
Execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: multipart/related; type=\"application/dicom+json\"; boundary=DICOMwebBoundary" \
--data-binary @multipart-request.file \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID/dicomWeb/studies"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-InFile multipart-request.file `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID/dicomWeb/studies" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
Complete the following steps to de-identify the DICOM instance you stored.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
Request JSON body:
{ "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PhysiciansReadingStudyIdentificationSequence" ], "recurseTag": {} } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PhysiciansReadingStudyIdentificationSequence" ], "recurseTag": {} } ] } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationDataset": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PhysiciansReadingStudyIdentificationSequence" ], "recurseTag": {} } ] } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:
Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)3333333
1.3.6.1.4.1.11129.5.1.222168347996079463826250565085126257314
Series UID ( 0020000E
)4444444
1.3.6.1.4.1.11129.5.1.25205702030237830896398173746777399347
Instances UID ( 00080018
)2222222
1.3.6.1.4.1.11129.5.1.286710307126045768765142714621897494633
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.222168347996079463826250565085126257314/series/1.3.6.1.4.1.11129.5.1.25205702030237830896398173746777399347/instances/1.3.6.1.4.1.11129.5.1.286710307126045768765142714621897494633/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.222168347996079463826250565085126257314/series/1.3.6.1.4.1.11129.5.1.25205702030237830896398173746777399347/instances/1.3.6.1.4.1.11129.5.1.286710307126045768765142714621897494633/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
In particular, notice that the values of the nested
CodeValue
andCodingSchemaDesignator
tags are set toPLACEHOLDER
:Original metadata
Metadata after running RecurseTag
{ "00081062": { "vr": "SQ", "Value": [ { "00401101": { "vr": "SQ", "Value": [ { "00080100": { "vr": "SH", "Value": [ "CodeValue1" ] }, "00080102": { "vr": "SH", "Value": [ "CodingSchemeDesignator1" ] } } ] } }, { "00401101": { "vr": "SQ", "Value": [ { "00080100": { "vr": "SH", "Value": [
"CodeValue2"
] }, "00080102": { "vr": "SH", "Value": ["CodingSchemeDesignator2"
] } } ] } } ] } }{ "00081062": { "vr": "SQ", "Value": [ { "00401101": { "vr": "SQ", "Value": [ { "00080100": { "vr": "SH", "Value": [ "PLACEHOLDER" ] }, "00080102": { "vr": "SH", "Value": [ "PLACEHOLDER" ] } } ] } }, { "00401101": { "vr": "SQ", "Value": [ { "00080100": { "vr": "SH", "Value": [ "PLACEHOLDER" ] }, "00080102": { "vr": "SH", "Value": [ "PLACEHOLDER" ] } } ] } } ] } }
De-identify data at the DICOM store level
The preceding samples show how to de-identify DICOM data at the dataset level. This section describes how to de-identify data at the DICOM store level.
To change a dataset de-identification request to a DICOM store de-identification request, make the following changes:
- Replace the
destinationDataset
in the request body withdestinationStore
- Add
dicomStores/DESTINATION_DICOM_STORE_ID
at the end of the value indestinationStore
when specifying the destination - Add
dicomStores/SOURCE_DICOM_STORE_ID
when specifying the location of the source data
The following examples show a dataset level de-identification request and how to modify the request for a DICOM store level de-identification:
Dataset level de-identification:
"destinationDataset": "projects/PROJECT_ID/locations/LOCATION/datasets/DESTINATION_DATASET_ID" ... "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/SOURCE_DATASET_ID:deidentify"
DICOM store level de-identification:
"destinationStore": "projects/PROJECT_ID/locations/LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID" ... "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify"
The following samples show how to de-identify a DICOM instance in a DICOM store and write the de-identified data to a new DICOM store. Before running the samples, the destination DICOM store ID must already exist.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written. Must already exist before running the de-identify operation.
- DESTINATION_DICOM_STORE_ID: the DICOM store in the destination dataset. Must already exist before running the de-identify operation.
Request JSON body:
{ "destinationStore": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "keepTag": {} } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationStore": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "keepTag": {} } ] } } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationStore": "projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID", "config": { "dicomTagConfig": { "actions": [ { "queries": [ "PatientName" ], "keepTag": {} } ] } } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID:deidentify" | Select-Object -Expand ContentOPERATION_ID
. You need this value in the next step.Use the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
De-identify a subset of a DICOM store
You can de-identify a subset of the data in a DICOM store by specifying a filter.
The filter takes the form of a filter file that you specify as a value for the
resourcePathsGcsUri
field in the DicomFilterConfig
object. The filter file must
exist in a Cloud Storage bucket; you cannot specify a filter file
that exists on your local machine or any other source. The location of the
file must be in the format
gs://BUCKET/PATH/TO/FILE
.
Create a filter file
A filter file defines which DICOM files to de-identify. You can filter files at the following levels:
- At the study level
- At the series level
- At the instance level
The filter file is made up of one line per study, series, or instance you want
to de-identify. Each line uses the format /studies/STUDY_UID[/series/SERIES_UID[/instances/INSTANCE_UID]]
.
At the end of each line is a newline character: either \n
or \r\n
.
If a study, series, or instance isn't specified in the filter file you passed in when calling the de-identify operation, that study, series, or instance will not be de-identified and will not be present in the destination DICOM store.
Only the /studies/STUDY_UID
portion of the path is
required. This means that you can de-identify a study by specifying
/studies/STUDY_UID
, or you can de-identify a
series by specifying /studies/STUDY_UID/series/SERIES_UID
.
Consider the following filter file. The filter file causes one study, two series, and three individual instances to be de-identified:
/studies/1.123.456.789
/studies/1.666.333.111/series/123.456\n
/studies/1.666.333.111/series/567.890\n
/studies/1.888.999.222/series/123.456/instances/111\n
/studies/1.888.999.222/series/123.456/instances/222\n
/studies/1.888.999.222/series/123.456/instances/333\n
Create a filter file using BigQuery
You typically create a filter file by first exporting the metadata from a DICOM store to BigQuery. This lets you use BigQuery to view the study, series, and instance UIDs of the DICOM data in your DICOM store. You can then do the following:
Query for the study, series, and instance UIDs you're interested in. For example, after exporting the metadata to BigQuery, you could run the following query to concatenate the study, series, and instance UIDs to a format that's compatible with the filter file requirements:
SELECT CONCAT ('/studies/', StudyInstanceUID, '/series/', SeriesInstanceUID, '/instances/', SOPInstanceUID) FROM [PROJECT_ID:BIGQUERY_DATASET.BIGQUERY_TABLE]
If the query returns a large result set, you can materialize a new table by saving the query results to a destination table in BigQuery.
After saving the query results to the destination table, you can save the contents of the destination table to a file and export it to Cloud Storage. For steps on how to do so, see Exporting table data. The exported file is your filter file. You will use the location of the filter file in Cloud Storage when specifying the filter in the export operation.
Create a filter file manually
You can create a filter file with custom content and
upload it to a Cloud Storage bucket.
You will use the location of the filter file in Cloud Storage when
specifying the filter in the de-identify operation. The following sample shows how
to upload a filter file to a Cloud Storage bucket using the
gcloud storage cp
command:
gcloud storage cp PATH/TO/FILTER_FILE gs://BUCKET/DIRECTORY
For example:
gcloud storage cp /home/user/Desktop/filters.txt gs://my-bucket/my-directory
Use a filter
After you have your filter file configured, you can pass it in as a value to
the resourcePathsGcsUri
field in the filterConfig
object.
The following sample expands on De-identifying data at the DICOM store level, but a filter file in Cloud Storage is provided to determine which DICOM resources are de-identified.
REST
De-identify the dataset.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- LOCATION: the dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DICOM_STORE_ID: the ID of the DICOM store containing the data to de-identify
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset
- BUCKET/PATH/TO/FILE: the location of the filter file in a Cloud Storage bucket
Request JSON body:
{ "destinationStore": "projects/PROJECT_ID/locations/LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID", "config": { "dicom": { "filterProfile": "DEIDENTIFY_TAG_CONTENTS" }, "image": { "textRedactionMode": "REDACT_ALL_TEXT" } }, "filterConfig": { "resourcePathGcsUri": "gs://BUCKET/PATH/TO/FILE" } }
To send your request, choose one of these options:
curl
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:cat > request.json << 'EOF' { "destinationStore": "projects/PROJECT_ID/locations/LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID", "config": { "dicom": { "filterProfile": "DEIDENTIFY_TAG_CONTENTS" }, "image": { "textRedactionMode": "REDACT_ALL_TEXT" } }, "filterConfig": { "resourcePathGcsUri": "gs://BUCKET/PATH/TO/FILE" } } EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify"PowerShell
Save the request body in a file named
request.json
. Run the following command in the terminal to create or overwrite this file in the current directory:@' { "destinationStore": "projects/PROJECT_ID/locations/LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID", "config": { "dicom": { "filterProfile": "DEIDENTIFY_TAG_CONTENTS" }, "image": { "textRedactionMode": "REDACT_ALL_TEXT" } }, "filterConfig": { "resourcePathGcsUri": "gs://BUCKET/PATH/TO/FILE" } } '@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify" | Select-Object -Expand ContentUse the
projects.locations.datasets.operations.get
method to get the status of the long-running operation.Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- SOURCE_DATASET_LOCATION: the source dataset location
- SOURCE_DATASET_ID: the ID of the dataset containing the data to de-identify
- OPERATION_ID: the ID returned from the long-running operation
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
"done": true
, the long-running operation has finished.After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new studies UID, series UID, and instances UID, so you first need to search the new dataset for the de-identified instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand ContentYou should receive a JSON response similar to the following:
The following table shows how the studies UID, series UID, and instances UID changed:Original instance metadata De-identified instance metadata Studies UID ( 0020000D
)2.25.70541616638819138568043293671559322355
1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID ( 0020000E
)1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694
1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID ( 00080018
)1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695
1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID of your Google Cloud project
- DESTINATION_DATASET_ID: the ID of the destination dataset where de-identified data is written
- SOURCE_DATASET_LOCATION: the source dataset location
- DESTINATION_DICOM_STORE_ID: the ID of the DICOM store in the destination dataset. This is the same as the ID of the DICOM store in the source dataset.
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/SOURCE_DATASET_LOCATION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand ContentAPIs Explorer
Open the method reference page. The APIs Explorer panel opens on the right side of the page. You can interact with this tool to send requests. Complete any required fields and click Execute.
Troubleshoot DICOM de-identification operations
If errors occur during a DICOM de-identification operation, the errors are logged to Cloud Logging. For more information, see Viewing error logs in Cloud Logging.
If the entire operation returns an error, see Troubleshooting long-running operations.