De-identifying DICOM data

This page explains how to de-identify sensitive data in DICOM instances using the Cloud Healthcare API at the dataset level and at the DICOM store level.

This page also explains how to apply filters when de-identifying data at the DICOM store level.

De-identification overview

Dataset level de-identification

To de-identify DICOM data at the dataset level, call the datasets.deidentify operation. The de-identification API call has the following components:

  • The source dataset: A dataset containing DICOM stores with one or more instances that have sensitive data. When you call the deidentify operation, all instances in all DICOM stores in the dataset are de-identified.
  • What to de-identify: Configuration parameters that specify how to process the dataset. You can configure DICOM de-identification to de-identify DICOM instance metadata (using tag keywords) or burnt-in text in DICOM images.
  • The destination dataset: De-identification does not impact the original dataset or its data. Instead, de-identified copies of the original data are written to a new dataset, called the destination dataset.
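As a rough illustration of how these three components fit together, the following Python sketch assembles the URL and JSON body for a dataset-level call. The helper name build_dataset_deidentify_request and the example IDs are hypothetical; the URL pattern and field names mirror the curl samples later on this page. The sketch only builds the request and sends nothing over the network.

```python
import json


def build_dataset_deidentify_request(project_id, region, source_dataset_id,
                                     destination_dataset_id, dicom_config):
    """Return (url, body) for a dataset-level de-identify call.

    Hypothetical helper for illustration; it does not send the request.
    """
    parent = "projects/{}/locations/{}".format(project_id, region)
    # The source dataset is addressed in the URL path.
    url = ("https://healthcare.googleapis.com/v1beta1/"
           "{}/datasets/{}:deidentify".format(parent, source_dataset_id))
    body = {
        # The destination dataset receives the de-identified copies.
        "destinationDataset": "{}/datasets/{}".format(
            parent, destination_dataset_id),
        # "What to de-identify" is expressed under config.dicom.
        "config": {"dicom": dicom_config},
    }
    return url, body


url, body = build_dataset_deidentify_request(
    "my-project", "us-central1", "source-dataset", "dest-dataset",
    {"keepList": {"tags": ["PatientID"]}})
print(url)
print(json.dumps(body, indent=2))
```

Keeping the request construction separate from the HTTP call makes it easy to inspect the body before anything is submitted.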

The majority of the samples in this guide show how to de-identify DICOM data at the dataset level.

DICOM store level de-identification

De-identifying DICOM data at the DICOM store level gives you more control over which data is de-identified. For example, if a dataset has multiple DICOM stores, you can de-identify each DICOM store according to the type of data it contains.

To de-identify DICOM data in a DICOM store, call the dicomStores.deidentify method. The de-identification API call has the following components:

  • The source DICOM store: A DICOM store containing one or more instances that have sensitive data. When you call the deidentify operation, all instances in the DICOM store are de-identified.
  • What to de-identify: Configuration parameters that specify how to process the DICOM store. You can configure DICOM de-identification to de-identify DICOM instance metadata (using tag keywords) or burnt-in text in DICOM images.
  • The destination DICOM store: De-identification does not impact the original DICOM store or its data. Instead, de-identified copies of the original data are written to a new or existing DICOM store. The dataset that the DICOM store is created in must already exist.
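The store-level request can be sketched the same way. This page does not show the request body for dicomStores.deidentify, so the destinationStore field name used below is an assumption to verify against the API reference; the helper name and example IDs are hypothetical. The sketch only builds the request.

```python
def build_store_deidentify_request(project_id, region, source_dataset_id,
                                   source_store_id, destination_dataset_id,
                                   destination_store_id, dicom_config):
    """Return (url, body) for a DICOM store-level de-identify call.

    Hypothetical helper; "destinationStore" is an assumed field name.
    """
    parent = "projects/{}/locations/{}".format(project_id, region)
    # The source DICOM store is addressed in the URL path.
    url = ("https://healthcare.googleapis.com/v1beta1/"
           "{}/datasets/{}/dicomStores/{}:deidentify".format(
               parent, source_dataset_id, source_store_id))
    body = {
        # The destination store can live in a different, existing dataset.
        "destinationStore": "{}/datasets/{}/dicomStores/{}".format(
            parent, destination_dataset_id, destination_store_id),
        "config": {"dicom": dicom_config},
    }
    return url, body


store_url, store_body = build_store_deidentify_request(
    "my-project", "us-central1", "source-dataset", "source-store",
    "dest-dataset", "dest-store", {"removeList": {"tags": ["PatientAge"]}})
print(store_url)
print(store_body["destinationStore"])
```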

For an example of how to de-identify DICOM data at the DICOM store level, see De-identifying data at the DICOM store level.

Filters

You can de-identify a subset of data in a DICOM store by configuring a filter file and specifying the file in the dicomStores.deidentify request. For an example, see De-identifying a subset of a DICOM store.
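The following sketch shows one way to generate the contents of such a filter file. It assumes that each line of the file is a DICOMweb-style path of the form /studies/STUDY_UID[/series/SERIES_UID[/instances/INSTANCE_UID]]; this page does not state the format, so treat it as an assumption and confirm it against the filter documentation. The function names are hypothetical.

```python
def filter_line(study_uid, series_uid=None, instance_uid=None):
    """Build one filter path (assumed format) for a study, series, or instance."""
    if instance_uid is not None and series_uid is None:
        raise ValueError("an instance path also needs a series UID")
    path = "/studies/{}".format(study_uid)
    if series_uid is not None:
        path += "/series/{}".format(series_uid)
    if instance_uid is not None:
        path += "/instances/{}".format(instance_uid)
    return path


def build_filter_file(entries):
    """Join filter paths, one per line, ready to be written to a file."""
    return "\n".join(filter_line(*entry) for entry in entries) + "\n"


# Example UIDs are placeholders, not values from a real store.
contents = build_filter_file([
    ("1.2.3",),              # a whole study
    ("1.2.4", "5.6"),        # one series in a study
    ("1.2.4", "5.7", "8.9"), # one instance in a series
])
print(contents)
```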

Samples overview

The samples in this guide use a single DICOM instance, but you can also de-identify multiple instances.

Each of the following sections provides samples that de-identify DICOM data using a different method. The de-identified output image is shown with each sample. Each sample uses the following original image as its input:

[Image: xray_original]

You can compare the output image from each de-identification operation to this original image to see the effects of the operation.

De-identification using tags

You can de-identify DICOM instances based on tag keywords in the DICOM metadata. The following tag filtering methods are available in the DicomConfig object:

  • keepList: List of tags to keep. Remove all other tags.
  • removeList: List of tags to remove. Keep all other tags.
  • filterProfile: A tag filtering profile used to determine which tags to keep or remove.
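As a minimal sketch, the DicomConfig fragment for each method could be built as follows. The sketch assumes that only one of the three methods is set in a single DicomConfig, which this page does not state explicitly; the function name is hypothetical, and valid filterProfile values are not listed here, so none is shown.

```python
def build_dicom_config(keep_list=None, remove_list=None, filter_profile=None):
    """Return a DicomConfig dict using exactly one tag filtering method.

    Hypothetical helper; mutual exclusivity of the methods is an assumption.
    """
    provided = [name for name, value in (("keepList", keep_list),
                                         ("removeList", remove_list),
                                         ("filterProfile", filter_profile))
                if value is not None]
    if len(provided) != 1:
        raise ValueError(
            "set exactly one tag filtering method, got: {}".format(provided))
    if keep_list is not None:
        return {"keepList": {"tags": list(keep_list)}}
    if remove_list is not None:
        return {"removeList": {"tags": list(remove_list)}}
    return {"filterProfile": filter_profile}


print(build_dicom_config(keep_list=["PatientID"]))
print(build_dicom_config(remove_list=["PatientBirthDate", "PatientAge"]))
```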

For each sample in this section, the output of the DICOM instance's changed metadata is provided. The following is the instance's original metadata used as the input for each sample:

[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
     "00020003":{"vr":"UI","Value":["1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695"]},
     "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
     "00020012":{"vr":"UI","Value":["1.2.276.0.7230010.3.0.3.6.1"]},
     "00020013":{"vr":"SH","Value":["OFFIS_DCMTK_361"]},
     "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
     "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
     "00080018":{"vr":"UI","Value":["1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695"]},
     "00080020":{"vr":"DA","Value":["20110909"]},
     "00080030":{"vr":"TM","Value":["110032"]},
     "00080050":{"vr":"SH"},
     "00080064":{"vr":"CS","Value":["WSD"]},
     "00080070":{"vr":"LO","Value":["Manufacturer"]},
     "00080090":{"vr":"PN","Value":[{"Alphabetic":"John Doe"}]},
     "00081090":{"vr":"LO","Value":["ABC1"]},
     "00100010":{"vr":"PN","Value":[{"Alphabetic":"Ann Johnson"}]},
     "00100020":{"vr":"LO","Value":["S1214223-1"]},
     "00100030":{"vr":"DA","Value":["19880812"]},
     "00100040":{"vr":"CS","Value":["F"]},
     "0020000D":{"vr":"UI","Value":["2.25.70541616638819138568043293671559322355"]},
     "0020000E":{"vr":"UI","Value":["1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694"]},
     "00200010":{"vr":"SH"},
     "00200011":{"vr":"IS"},
     "00200013":{"vr":"IS"},
     "00200020":{"vr":"CS"},
     "00280002":{"vr":"US","Value":[3]},
     "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
     "00280006":{"vr":"US","Value":[0]},
     "00280010":{"vr":"US","Value":[1024]},
     "00280011":{"vr":"US","Value":[1024]},
     "00280100":{"vr":"US","Value":[8]},
     "00280101":{"vr":"US","Value":[8]},
     "00280102":{"vr":"US","Value":[7]},
     "00280103":{"vr":"US","Value":[0]},
     "00282110":{"vr":"CS","Value":["01"]},
     "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]
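The metadata above is keyed by eight-character hexadecimal tags rather than by keyword, which makes it hard to read at a glance. The following sketch shows one way to read individual values from it. The tag_value helper is hypothetical, and the sample is a small excerpt of the metadata above.

```python
import json

# Excerpt of the original instance metadata shown above.
instances = json.loads("""
[
  {
    "00080050": {"vr": "SH"},
    "00100010": {"vr": "PN", "Value": [{"Alphabetic": "Ann Johnson"}]},
    "00100020": {"vr": "LO", "Value": ["S1214223-1"]},
    "00100030": {"vr": "DA", "Value": ["19880812"]}
  }
]
""")


def tag_value(instance, tag):
    """Return the first value stored under a tag, or None if it has no value."""
    element = instance.get(tag)
    if element is None or "Value" not in element:
        return None
    value = element["Value"][0]
    # Person names (vr "PN") are objects; use the alphabetic representation.
    if element["vr"] == "PN" and isinstance(value, dict):
        return value.get("Alphabetic")
    return value


instance = instances[0]
print(tag_value(instance, "00100010"))  # PatientName
print(tag_value(instance, "00100020"))  # PatientID
print(tag_value(instance, "00080050"))  # AccessionNumber (no value present)
```

An element that has a "vr" but no "Value" key, such as 00080050 here, is present in the instance but empty. Redacted tags in the de-identified output later on this page look the same way.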

De-identification using keeplist tags

When you specify a keeplist tag in the DicomConfig object, the following tags are added by default:

  • StudyInstanceUID
  • SeriesInstanceUID
  • SOPInstanceUID
  • TransferSyntaxUID
  • MediaStorageSOPInstanceUID
  • MediaStorageSOPClassUID
  • PixelData
  • Rows
  • Columns
  • SamplesPerPixel
  • BitsAllocated
  • BitsStored
  • HighBit
  • PhotometricInterpretation
  • PixelRepresentation
  • NumberOfFrames
  • PlanarConfiguration
  • PixelAspectRatio
  • SmallestImagePixelValue
  • LargestImagePixelValue
  • RedPaletteColorLookupTableDescriptor
  • GreenPaletteColorLookupTableDescriptor
  • BluePaletteColorLookupTableDescriptor
  • RedPaletteColorLookupTableData
  • GreenPaletteColorLookupTableData
  • BluePaletteColorLookupTableData
  • ICCProfile
  • ColorSpace
  • WindowCenter
  • WindowWidth
  • VOILUTFunction

The deidentify operation does not redact the tags listed above (unless SkipIdRedaction is used to override this behavior). If no keeplist tags are provided, then no DICOM tags in the dataset are redacted.
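To make the effect of the defaults concrete, the following sketch merges a user-supplied keeplist with the default tags listed above. The service applies the defaults itself, so you do not need to send them; the merge here is a client-side illustration only, and the function name is hypothetical.

```python
# Tags this page lists as added to every keeplist by default.
DEFAULT_KEEPLIST_TAGS = [
    "StudyInstanceUID", "SeriesInstanceUID", "SOPInstanceUID",
    "TransferSyntaxUID", "MediaStorageSOPInstanceUID",
    "MediaStorageSOPClassUID", "PixelData", "Rows", "Columns",
    "SamplesPerPixel", "BitsAllocated", "BitsStored", "HighBit",
    "PhotometricInterpretation", "PixelRepresentation", "NumberOfFrames",
    "PlanarConfiguration", "PixelAspectRatio", "SmallestImagePixelValue",
    "LargestImagePixelValue", "RedPaletteColorLookupTableDescriptor",
    "GreenPaletteColorLookupTableDescriptor",
    "BluePaletteColorLookupTableDescriptor",
    "RedPaletteColorLookupTableData", "GreenPaletteColorLookupTableData",
    "BluePaletteColorLookupTableData", "ICCProfile", "ColorSpace",
    "WindowCenter", "WindowWidth", "VOILUTFunction",
]


def effective_keeplist(user_tags):
    """Return the user's tags followed by any defaults not already listed."""
    merged = list(user_tags)
    for tag in DEFAULT_KEEPLIST_TAGS:
        if tag not in merged:
            merged.append(tag)
    return merged


tags = effective_keeplist(["PatientID", "Rows"])
print(tags[:3])
print(len(tags))
```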

The following samples show how to de-identify a dataset containing DICOM stores and DICOM data while leaving some tags unchanged.

After submitting the image to the Cloud Healthcare API, the image appears as follows. While the metadata displayed in the top corners of the image has been redacted, the burnt-in PHI at the bottom of the image remains. To also remove the burnt-in text, see Redacting burnt-in text from images.

[Image: dicom_keeplist]

curl command

To de-identify a dataset containing DICOM data using keeplist tags, make a POST request and provide the name of the destination dataset, a set of keeplist tags for the data you want to retain, and an access token. The following sample shows how to make a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'keepList': {
            'tags': [
              'PatientID'
            ]
          }
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
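The polling step can be sketched as a small loop that repeats the Operation get request until the response contains "done": true. In the sketch below, get_operation is a hypothetical callable standing in for the GET request shown above, and the "error" field is assumed to follow the usual long-running operation shape (the Go sample on this page checks the same field). The demonstration uses canned replies instead of real HTTP calls.

```python
import time


def wait_for_operation(get_operation, name, poll_seconds=1.0, max_polls=60,
                       sleep=time.sleep):
    """Poll an operation until it is done and return its response.

    get_operation is a hypothetical callable that takes the operation name
    and returns the parsed JSON of the Operation get request.
    """
    for _ in range(max_polls):
        operation = get_operation(name)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(
                    "operation failed: {}".format(operation["error"]))
            return operation.get("response", {})
        sleep(poll_seconds)
    raise TimeoutError(
        "operation {} not done after {} polls".format(name, max_polls))


# Canned replies: the first is still running, the second has finished.
_replies = iter([
    {"name": "operations/OPERATION_ID"},
    {"name": "operations/OPERATION_ID", "done": True,
     "response": {"successStoreCount": "1"}},
])
result = wait_for_operation(lambda name: next(_replies),
                            "operations/OPERATION_ID",
                            sleep=lambda seconds: None)
print(result)
```

Passing the fetch and sleep functions in as arguments keeps the loop testable without network access; in real use, get_operation would issue the authenticated GET request.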
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so you first need to search the destination dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
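Rather than copying the new UIDs by hand, you can read them from the search response and assemble the DICOMweb path for the metadata request. The following is a minimal sketch; the metadata_path helper is hypothetical, and the sample is an excerpt of the search response above.

```python
# Excerpt of the search response shown above.
search_results = [
    {
        "00080018": {"vr": "UI", "Value": [
            "1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
        "0020000D": {"vr": "UI", "Value": [
            "1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
        "0020000E": {"vr": "UI", "Value": [
            "1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    }
]


def metadata_path(instance):
    """Build the studies/.../series/.../instances/.../metadata path."""
    def uid(tag):
        return instance[tag]["Value"][0]

    return "studies/{}/series/{}/instances/{}/metadata".format(
        uid("0020000D"),  # study UID
        uid("0020000E"),  # series UID
        uid("00080018"),  # instance UID
    )


path = metadata_path(search_results[0])
print(path)
```

The resulting path is appended to the destination DICOM store's dicomWeb URL to form the metadata request.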
The following table shows how the study UID, series UID, and instance UID changed:

UID (tag) | Original instance metadata | De-identified instance metadata
Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029

Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS"},
    "00080070":{"vr":"LO"},
    "00080090":{"vr":"PN"},
    "00081090":{"vr":"LO"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS"},
    "00282114":{"vr":"CS"}
  }
]
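Comparing the two metadata documents by eye is tedious. The following sketch lists which tags lost their values and which tags received new values. The function names are hypothetical, and the sample data is an excerpt of the original and de-identified metadata shown on this page.

```python
original = {
    "00080018": {"vr": "UI", "Value": [
        "1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695"]},
    "00100010": {"vr": "PN", "Value": [{"Alphabetic": "Ann Johnson"}]},
    "00100020": {"vr": "LO", "Value": ["S1214223-1"]},
    "00100030": {"vr": "DA", "Value": ["19880812"]},
}
deidentified = {
    "00080018": {"vr": "UI", "Value": [
        "1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00100010": {"vr": "PN"},
    "00100020": {"vr": "LO", "Value": ["S1214223-1"]},
    "00100030": {"vr": "DA"},
}


def redacted_tags(before, after):
    """Tags that had a value before and have none afterwards."""
    return sorted(tag for tag, element in before.items()
                  if "Value" in element and "Value" not in after.get(tag, {}))


def changed_tags(before, after):
    """Tags that have a value in both documents, but a different one."""
    return sorted(tag for tag, element in before.items()
                  if "Value" in element
                  and "Value" in after.get(tag, {})
                  and element["Value"] != after[tag]["Value"])


redacted = redacted_tags(original, deidentified)
changed = changed_tags(original, deidentified)
print("redacted:", redacted)
print("changed:", changed)
```

In this excerpt, PatientID (00100020) appears in neither list because it was named in the keeplist and kept its value.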

PowerShell

To de-identify a dataset containing DICOM data using keeplist tags, make a POST request and provide the name of the destination dataset, a set of keeplist tags for the data you want to retain, and an access token. The following sample shows how to make a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'keepList': {
          'tags': [
            'PatientID'
          ]
        }
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so you first need to search the destination dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:

UID (tag) | Original instance metadata | De-identified instance metadata
Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029

Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS"},
    "00080070":{"vr":"LO"},
    "00080090":{"vr":"PN"},
    "00081090":{"vr":"LO"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS"},
    "00282114":{"vr":"CS"}
  }
]

Go

import (
	"context"
	"fmt"
	"io"
	"time"

	healthcare "google.golang.org/api/healthcare/v1beta1"
)

// deidentifyDataset creates a new dataset containing de-identified data from the source dataset.
func deidentifyDataset(w io.Writer, projectID, location, sourceDatasetID, destinationDatasetID string) error {
	ctx := context.Background()

	healthcareService, err := healthcare.NewService(ctx)
	if err != nil {
		return fmt.Errorf("healthcare.NewService: %v", err)
	}

	datasetsService := healthcareService.Projects.Locations.Datasets

	parent := fmt.Sprintf("projects/%s/locations/%s", projectID, location)

	req := &healthcare.DeidentifyDatasetRequest{
		DestinationDataset: fmt.Sprintf("%s/datasets/%s", parent, destinationDatasetID),
		Config: &healthcare.DeidentifyConfig{
			Dicom: &healthcare.DicomConfig{
				KeepList: &healthcare.TagFilterList{
					Tags: []string{
						"PatientID",
					},
				},
			},
		},
	}

	sourceName := fmt.Sprintf("%s/datasets/%s", parent, sourceDatasetID)
	resp, err := datasetsService.Deidentify(sourceName, req).Do()
	if err != nil {
		return fmt.Errorf("Deidentify: %v", err)
	}

	// Wait for the deidentification operation to finish.
	operationService := healthcareService.Projects.Locations.Datasets.Operations
	for {
		op, err := operationService.Get(resp.Name).Do()
		if err != nil {
			return fmt.Errorf("operationService.Get: %v", err)
		}
		if !op.Done {
			time.Sleep(1 * time.Second)
			continue
		}
		if op.Error != nil {
			return fmt.Errorf("deidentify operation error: %v", *op.Error)
		}
		// resp.Name is the operation name, so report the destination dataset instead.
		fmt.Fprintf(w, "Created de-identified dataset %s from %s\n", req.DestinationDataset, sourceName)
		return nil
	}
}

Java

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.HttpHeaders;
import com.google.api.client.http.HttpRequestInitializer;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.healthcare.v1beta1.CloudHealthcare;
import com.google.api.services.healthcare.v1beta1.CloudHealthcare.Projects.Locations.Datasets;
import com.google.api.services.healthcare.v1beta1.CloudHealthcareScopes;
import com.google.api.services.healthcare.v1beta1.model.DeidentifyConfig;
import com.google.api.services.healthcare.v1beta1.model.DeidentifyDatasetRequest;
import com.google.api.services.healthcare.v1beta1.model.DicomConfig;
import com.google.api.services.healthcare.v1beta1.model.Operation;
import com.google.api.services.healthcare.v1beta1.model.TagFilterList;
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;

public class DatasetDeIdentify {
  private static final String DATASET_NAME = "projects/%s/locations/%s/datasets/%s";
  private static final JsonFactory JSON_FACTORY = new JacksonFactory();
  private static final NetHttpTransport HTTP_TRANSPORT = new NetHttpTransport();

  public static void datasetDeIdentify(String srcDatasetName, String destDatasetName)
      throws IOException {
    // String srcDatasetName =
    //     String.format(DATASET_NAME, "your-project-id", "your-region-id", "your-src-dataset-id");
    // String destDatasetName =
    //    String.format(DATASET_NAME, "your-project-id", "your-region-id", "your-dest-dataset-id");

    // Initialize the client, which will be used to interact with the service.
    CloudHealthcare client = createClient();

    // Configure what information needs to be De-Identified.
    // For more information on de-identifying using tags, please see the following:
    // https://cloud.google.com/healthcare/docs/how-tos/dicom-deidentify#de-identification_using_tags
    TagFilterList tags = new TagFilterList().setTags(Arrays.asList("PatientID"));
    DicomConfig dicomConfig = new DicomConfig().setKeepList(tags);
    DeidentifyConfig config = new DeidentifyConfig().setDicom(dicomConfig);

    // Create the de-identify request and configure any parameters.
    DeidentifyDatasetRequest deidentifyRequest =
        new DeidentifyDatasetRequest().setDestinationDataset(destDatasetName).setConfig(config);
    Datasets.Deidentify request =
        client.projects().locations().datasets().deidentify(srcDatasetName, deidentifyRequest);

    // Execute the request, wait for the operation to complete, and process the results.
    try {
      Operation operation = request.execute();
      while (operation.getDone() == null || !operation.getDone()) {
        // Update the status of the operation with another request.
        Thread.sleep(500); // Pause for 500ms between requests.
        operation =
            client
                .projects()
                .locations()
                .datasets()
                .operations()
                .get(operation.getName())
                .execute();
      }
      System.out.println(
          "De-identified Dataset created. Response content: " + operation.getResponse());
    } catch (Exception ex) {
      System.out.printf("Error during request execution: %s", ex.toString());
      ex.printStackTrace(System.out);
    }
  }

  private static CloudHealthcare createClient() throws IOException {
    // Use Application Default Credentials (ADC) to authenticate the requests
    // For more information see https://cloud.google.com/docs/authentication/production
    GoogleCredential credential =
        GoogleCredential.getApplicationDefault(HTTP_TRANSPORT, JSON_FACTORY)
            .createScoped(Collections.singleton(CloudHealthcareScopes.CLOUD_PLATFORM));

    // Create a HttpRequestInitializer, which will provide a baseline configuration to all requests.
    HttpRequestInitializer requestInitializer =
        request -> {
          credential.initialize(request);
          request.setHeaders(new HttpHeaders().set("X-GFE-SSL", "yes"));
          request.setConnectTimeout(60000); // 1 minute connect timeout
          request.setReadTimeout(60000); // 1 minute read timeout
        };

    // Build the client for interacting with the service.
    return new CloudHealthcare.Builder(HTTP_TRANSPORT, JSON_FACTORY, requestInitializer)
        .setApplicationName("your-application-name")
        .build();
  }
}

Node.js

const {google} = require('googleapis');
const healthcare = google.healthcare('v1beta1');

const deidentifyDataset = async () => {
  const auth = await google.auth.getClient({
    scopes: ['https://www.googleapis.com/auth/cloud-platform'],
  });
  google.options({auth});

  // TODO(developer): uncomment these lines before running the sample
  // const cloudRegion = 'us-central1';
  // const projectId = 'adjective-noun-123';
  // const sourceDatasetId = 'my-source-dataset';
  // const destinationDatasetId = 'my-destination-dataset';
  // const keeplistTags = 'PatientID'
  const sourceDataset = `projects/${projectId}/locations/${cloudRegion}/datasets/${sourceDatasetId}`;
  const destinationDataset = `projects/${projectId}/locations/${cloudRegion}/datasets/${destinationDatasetId}`;
  const request = {
    sourceDataset: sourceDataset,
    resource: {
      // destinationDataset belongs in the request body, alongside config.
      destinationDataset: destinationDataset,
      config: {
        dicom: {
          keepList: {
            tags: [keeplistTags],
          },
        },
      },
    },
  };

  await healthcare.projects.locations.datasets.deidentify(request);
  console.log(
    `De-identified data written from dataset ${sourceDatasetId} to dataset ${destinationDatasetId}`
  );
};

deidentifyDataset();

Python

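from googleapiclient.errors import HttpError


# get_client() is a helper defined elsewhere in the sample; it is assumed to
# return an authorized Cloud Healthcare API client.
def deidentify_dataset(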
        service_account_json,
        project_id,
        cloud_region,
        dataset_id,
        destination_dataset_id,
        keeplist_tags):
    """Creates a new dataset containing de-identified data
    from the source dataset.
    """
    client = get_client(service_account_json)
    source_dataset = 'projects/{}/locations/{}/datasets/{}'.format(
        project_id, cloud_region, dataset_id)
    destination_dataset = 'projects/{}/locations/{}/datasets/{}'.format(
        project_id, cloud_region, destination_dataset_id)

    body = {
        'destinationDataset': destination_dataset,
        'config': {
            'dicom': {
                'keepList': {
                    'tags': [
                        'Columns',
                        'NumberOfFrames',
                        'PixelRepresentation',
                        'MediaStorageSOPClassUID',
                        'MediaStorageSOPInstanceUID',
                        'Rows',
                        'SamplesPerPixel',
                        'BitsAllocated',
                        'HighBit',
                        'PhotometricInterpretation',
                        'BitsStored',
                        'PatientID',
                        'TransferSyntaxUID',
                        'SOPInstanceUID',
                        'StudyInstanceUID',
                        'SeriesInstanceUID',
                        'PixelData'
                    ]
                }
            }
        }
    }

    request = client.projects().locations().datasets().deidentify(
        sourceDataset=source_dataset, body=body)

    try:
        response = request.execute()
        print(
            'Data in dataset {} de-identified. '
            'De-identified data written to {}'.format(
                dataset_id,
                destination_dataset_id))
        return response
    except HttpError as e:
        print('Error, data could not be deidentified: {}'.format(e))
        return ""

De-identification using removelist tags

You can specify a removelist in the DicomConfig object. The deidentify operation will redact only the tags specified in the list. If no removelist tags are provided, then the de-identification operation proceeds as normal, but no DICOM tags in the destination dataset are redacted.

When you specify a removelist, the OverlayData tag is added by default.

Tags that are added to a keeplist by default cannot be added to a removelist.
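A client-side check can catch such a conflict before the request is sent. The following is a minimal sketch; the function name is hypothetical, the list of protected tags is an abbreviated subset used for illustration, and how the service itself reports a conflicting removelist is not described on this page.

```python
def check_removelist(remove_tags, default_keeplist_tags):
    """Return a removeList config, or raise if a protected tag is listed."""
    protected = set(default_keeplist_tags)
    conflicts = [tag for tag in remove_tags if tag in protected]
    if conflicts:
        raise ValueError("cannot remove default keeplist tags: {}".format(
            ", ".join(conflicts)))
    return {"removeList": {"tags": list(remove_tags)}}


# Abbreviated subset of the default keeplist tags, for illustration only.
DEFAULTS_SUBSET = ["StudyInstanceUID", "SeriesInstanceUID",
                   "SOPInstanceUID", "PixelData"]

config = check_removelist(["PatientBirthDate", "PatientAge"], DEFAULTS_SUBSET)
print(config)

try:
    check_removelist(["PatientBirthDate", "PixelData"], DEFAULTS_SUBSET)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```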

The following samples show how to de-identify a dataset containing DICOM stores and DICOM data while leaving any tags not in the removelist unchanged.

After submitting the image to the Cloud Healthcare API, the image appears as follows. Out of the tags provided in the removelist, only PatientBirthDate is removed in the image, as it's the only tag from the removelist that corresponds to metadata that is visible in the image.

While the PatientBirthDate in the top corner of the image has been redacted according to the configuration in the removelist, the burnt-in PHI at the bottom of the image remains. To also remove the burnt-in text, see Redacting burnt-in text from images.

[Image: dicom_removelist]

curl command

To de-identify a dataset containing DICOM data using removelist tags, make a POST request and provide the name of the destination dataset, a set of removelist tags for the data you want to redact, and an access token. The following sample shows how to make a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'removeList': {
            'tags': [
              'PatientBirthName',
              'PatientBirthDate',
              'PatientAge',
              'PatientSize',
              'PatientWeight',
              'PatientAddress',
              'PatientMotherBirthName'
            ]
          }
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
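Checking the operation repeatedly until it reports "done": true can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `fetch_operation` is a hypothetical caller-supplied function that performs the GET request shown above and returns the parsed JSON, and the `error` field follows the general shape of long-running operations rather than anything shown on this page.

```python
import time


def wait_for_operation(fetch_operation, interval_seconds=5.0, max_attempts=60):
    """Poll until the operation reports done, then return its final JSON.

    fetch_operation is a hypothetical callable that issues the GET request
    and returns the response body as a dict.
    """
    for _ in range(max_attempts):
        operation = fetch_operation()
        if operation.get("done"):
            # Assumption: a failed operation carries an "error" field.
            if "error" in operation:
                raise RuntimeError("de-identification failed: {}".format(operation["error"]))
            return operation
        time.sleep(interval_seconds)
    raise TimeoutError("operation did not finish after {} attempts".format(max_attempts))


# Stand-in fetcher that reports completion on the third poll.
_replies = iter([
    {"name": "op"},
    {"name": "op"},
    {"name": "op", "done": True, "response": {"successStoreCount": "1"}},
])
result = wait_for_operation(lambda: next(_replies), interval_seconds=0)
print(result["response"])
```

Passing the fetcher in as a function keeps the polling logic independent of the HTTP client you use.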
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so you first need to search the new dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
| UID | Original instance metadata | De-identified instance metadata |
| --- | --- | --- |
| Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763 |
| Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710 |
| Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029 |
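Turning a search result into the path of the metadata request can be sketched in Python. The function name `metadata_path` is hypothetical. The three hex tags are the ones listed for the study, series, and instance UIDs, and the path layout follows the metadata request used on this page.

```python
# Hex tags that hold the study, series, and instance UIDs.
STUDY_UID, SERIES_UID, INSTANCE_UID = "0020000D", "0020000E", "00080018"


def metadata_path(instance):
    """Build the DICOMweb metadata path for one entry of a search response."""
    def first_value(tag):
        values = instance.get(tag, {}).get("Value")
        if not values:
            raise KeyError("tag {} has no value in the search result".format(tag))
        return values[0]

    return "studies/{}/series/{}/instances/{}/metadata".format(
        first_value(STUDY_UID), first_value(SERIES_UID), first_value(INSTANCE_UID)
    )


# Excerpt of the search response for the de-identified instance.
search_result = [{
    "00080018": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "0020000D": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
}]
path = metadata_path(search_result[0])
print(path)
```

Append the resulting path to the `dicomWeb/` URL of the destination DICOM store.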
Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110909"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"John Doe"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"Ann Johnson"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]
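One way to compare the new metadata with the original is sketched below in Python. The function name `compare_metadata` is hypothetical, and the original PatientBirthDate value in the example is made up for illustration because the original metadata is not reproduced in this section. Both inputs are single instance objects in the DICOM JSON shape shown in the responses.

```python
def compare_metadata(original, deidentified):
    """Return (removed, changed) hex tags between two DICOM JSON instances."""
    removed, changed = [], []
    for tag, element in sorted(original.items()):
        before = element.get("Value")
        after = deidentified.get(tag, {}).get("Value")
        if before and not after:
            # The element had a value and is now empty or absent.
            removed.append(tag)
        elif before != after:
            # The element still has a value, but a different one.
            changed.append(tag)
    return removed, changed


original = {
    "00100010": {"vr": "PN", "Value": [{"Alphabetic": "Ann Johnson"}]},
    "00100030": {"vr": "DA", "Value": ["19800101"]},  # illustrative value only
    "0020000D": {"vr": "UI", "Value": ["2.25.70541616638819138568043293671559322355"]},
}
deidentified = {
    "00100010": {"vr": "PN", "Value": [{"Alphabetic": "Ann Johnson"}]},
    "00100030": {"vr": "DA"},
    "0020000D": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
}
removed, changed = compare_metadata(original, deidentified)
print("removed:", removed)
print("changed:", changed)
```

In this example the birth date is reported as removed and the study UID as changed, while the patient name is untouched because it was not in the removelist.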

PowerShell

To de-identify a dataset containing DICOM data using removelist tags, make a POST request and provide the name of the destination dataset, a set of removelist tags for the data you want to redact, and an access token. The following sample shows how to make a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'removeList': {
          'tags': [
            'PatientBirthName',
            'PatientBirthDate',
            'PatientAge',
            'PatientSize',
            'PatientWeight',
            'PatientAddress',
            'PatientMotherBirthName'
          ]
        }
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so you first need to search the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
| UID | Original instance metadata | De-identified instance metadata |
| --- | --- | --- |
| Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763 |
| Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710 |
| Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029 |
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110909"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"John Doe"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"Ann Johnson"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]
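The same POST request can also be assembled in Python using only the standard library. This is a minimal sketch, not an official client sample: the function name `deidentify_request` is hypothetical, the access token is assumed to be obtained separately (for example, from `gcloud auth print-access-token`), and the function only builds the request without sending it.

```python
import json
import urllib.request

API_ROOT = "https://healthcare.googleapis.com/v1beta1"


def deidentify_request(source_dataset, body, access_token):
    """Build, but do not send, the POST request for the dataset deidentify call."""
    return urllib.request.Request(
        url="{}/{}:deidentify".format(API_ROOT, source_dataset),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer {}".format(access_token),
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = deidentify_request(
    "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID",
    {
        "destinationDataset": "projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID",
        "config": {"dicom": {"removeList": {"tags": ["PatientBirthDate", "PatientAge"]}}},
    },
    access_token="ACCESS_TOKEN",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and parse the returned JSON to obtain the operation name.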

De-identification using a tag filter profile

Rather than specifying which tags to keep or remove, you can configure a TagFilterProfile in the DicomConfig object. A tag filter profile is a predefined profile that determines which tags to keep or remove. See the TagFilterProfile documentation for available profiles.

The following samples show how to de-identify a dataset containing DICOM stores and DICOM data using the tag filter profile ATTRIBUTE_CONFIDENTIALITY_BASIC_PROFILE. This tag filter profile removes tags based on the DICOM Standard's Attribute Confidentiality Basic Profile. The Cloud Healthcare API does not fully conform to the Attribute Confidentiality Basic Profile. For example, the Cloud Healthcare API does not check for Information Object Definition (IOD) restrictions when selecting an action for a tag.
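Because a tag filter profile is used instead of a keeplist or removelist, a small Python helper can guard against setting more than one of them. This is a sketch under assumptions: the function name `dicom_config` is hypothetical, the `keepList` field name is assumed by analogy with `removeList`, and treating the three options as mutually exclusive reflects how this page presents them rather than a verified API constraint.

```python
def dicom_config(filter_profile=None, keep_list=None, remove_list=None):
    """Return a DicomConfig dict with exactly one tag filter option set."""
    options = {
        "filterProfile": filter_profile,
        # Assumed field name, by analogy with removeList.
        "keepList": {"tags": list(keep_list)} if keep_list else None,
        "removeList": {"tags": list(remove_list)} if remove_list else None,
    }
    chosen = {name: value for name, value in options.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError("set exactly one of filter_profile, keep_list, remove_list")
    return chosen


profile_config = dicom_config(filter_profile="ATTRIBUTE_CONFIDENTIALITY_BASIC_PROFILE")
print({"config": {"dicom": profile_config}})
```

The returned dictionary is the value of the `dicom` field in the request body.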

After submitting the image to the Cloud Healthcare API using the ATTRIBUTE_CONFIDENTIALITY_BASIC_PROFILE tag filter profile, the image appears as follows. While the metadata displayed in the top corners of the image has been redacted, the burnt-in PHI at the bottom of the image remains. To also remove the burnt-in text, see Redacting burnt-in text from images.

dicom_attribute_confidentiality_basic_profile

curl command

To de-identify a dataset containing DICOM data using a tag filter profile, make a POST request and provide the name of the destination dataset, the tag filter profile for the data you want to redact, and an access token. The following sample shows how to make a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'ATTRIBUTE_CONFIDENTIALITY_BASIC_PROFILE'
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so you first need to search the new dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
| UID | Original instance metadata | De-identified instance metadata |
| --- | --- | --- |
| Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763 |
| Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710 |
| Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029 |
Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN"},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO"},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]
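To read a response like this more easily, you can list the elements that come back without a value and label them with their DICOM keywords. The Python sketch below is illustrative only: the function name `empty_tags` is hypothetical, and the keyword table covers just a few of the tags that appear in the response.

```python
# Standard DICOM keywords for a few of the tags in the response.
KEYWORDS = {
    "00080020": "StudyDate",
    "00080030": "StudyTime",
    "00080050": "AccessionNumber",
    "00080090": "ReferringPhysicianName",
    "00100010": "PatientName",
    "00100020": "PatientID",
    "00100030": "PatientBirthDate",
    "00100040": "PatientSex",
}


def empty_tags(instance):
    """Return keywords (or hex tags when unknown) of elements that have no Value."""
    return [
        KEYWORDS.get(tag, tag)
        for tag, element in sorted(instance.items())
        if not element.get("Value")
    ]


# Excerpt of the metadata returned after applying the profile.
instance = {
    "00080020": {"vr": "DA"},
    "00080070": {"vr": "LO", "Value": ["Manufacturer"]},
    "00100010": {"vr": "PN"},
    "00100020": {"vr": "LO"},
    "00200010": {"vr": "SH"},
}
empty = empty_tags(instance)
print(empty)
```

An element without a value is not necessarily one that was redacted, because it might have been empty in the original instance. Compare against the original metadata to confirm.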

PowerShell

To de-identify a dataset containing DICOM data using a tag filter profile, make a POST request and provide the name of the destination dataset, the tag filter profile for the data you want to redact, and an access token. The following sample shows how to make a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'ATTRIBUTE_CONFIDENTIALITY_BASIC_PROFILE'
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so you first need to search the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
| UID | Original instance metadata | De-identified instance metadata |
| --- | --- | --- |
| Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763 |
| Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710 |
| Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029 |
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN"},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO"},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

Redacting burnt-in text from images

The Cloud Healthcare API can redact sensitive burnt-in text from images. The API detects sensitive data such as PHI and obscures it with an opaque rectangle. The API returns the same DICOM images you gave it, in the same format, but with any text identified as sensitive according to your criteria redacted.

You can redact burnt-in text from images by specifying a TextRedactionMode option inside an ImageConfig object. See the TextRedactionMode documentation for possible values.

Redacting all burnt-in text from an image

The following samples show how to redact all burnt-in text from DICOM images in a dataset. To do this, specify REDACT_ALL_TEXT in the TextRedactionMode field.
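The configuration for image text redaction can be sketched in Python as follows. The function name `image_redaction_config` is hypothetical, and the set of accepted modes is limited to the two values named on this page. The TextRedactionMode documentation might list others.

```python
# Modes named on this page; the TextRedactionMode reference may define more.
KNOWN_MODES = {"REDACT_ALL_TEXT", "REDACT_SENSITIVE_TEXT"}


def image_redaction_config(mode):
    """Return a DeidentifyConfig dict with an empty dicom field and the given image mode."""
    if mode not in KNOWN_MODES:
        raise ValueError("unsupported text redaction mode in this sketch: {}".format(mode))
    return {"dicom": {}, "image": {"textRedactionMode": mode}}


config = image_redaction_config("REDACT_ALL_TEXT")
print(config)
```

The returned dictionary is the value of the `config` field in the request body, next to `destinationDataset`.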

After submitting the image to the Cloud Healthcare API using the REDACT_ALL_TEXT option, the image appears as follows. While the burnt-in text at the bottom of the image has been removed, the metadata in the top corners of the image remains. To also remove the metadata, see De-identification using tags.

xray_redact_all_text

curl command

To redact all burnt-in text from a DICOM image, make a POST request and provide the name of the destination dataset, a DeidentifyConfig object with an empty dicom field and image.text_redaction_mode set to REDACT_ALL_TEXT, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {},
        'image': {
          'textRedactionMode': 'REDACT_ALL_TEXT'
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. To track the status of the operation, you can use the Operation get method:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "..."
  }
}

PowerShell

To redact all burnt-in text from a DICOM image, make a POST request and provide the name of the destination dataset, a DeidentifyConfig object with an empty dicom field and image.text_redaction_mode set to REDACT_ALL_TEXT, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {},
      'image': {
        'textRedactionMode': 'REDACT_ALL_TEXT'
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. To track the status of the operation, you can use the Operation get method:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}

Redacting only sensitive burnt-in text from an image

The following samples show how to redact sensitive burnt-in text from DICOM images in a dataset. This is done by specifying REDACT_SENSITIVE_TEXT in the TextRedactionMode field.

When REDACT_SENSITIVE_TEXT is specified, burnt-in text that matches any of the default DICOM infoTypes is redacted. An additional custom infoType for patient identifiers, such as medical record numbers (MRNs), is also applied, so patient identifiers are redacted as well.

The following image shows an unredacted x-ray of a patient:

[Image: xray2_unredacted]

After submitting the image to the Cloud Healthcare API using the REDACT_SENSITIVE_TEXT option, the image appears as follows:

[Image: xray2_redact_sensitive_text]

You can see that the following occurred:

  • The PERSON_NAME in the bottom left of the image was redacted
  • The DATE in the bottom left of the image was redacted

The patient's sex was not redacted because it is not considered to be sensitive text according to the default DICOM infoTypes.
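
The request for either redaction mode can also be assembled programmatically. The following Python sketch only builds the URL and JSON body; it does not send anything. The helper names `dataset_path` and `build_redaction_request` are hypothetical, and the set of accepted modes is limited to the two discussed on this page (the API may define others).

```python
import json

API_ROOT = "https://healthcare.googleapis.com/v1beta1"
# Only the two modes discussed on this page; the API may define others.
REDACTION_MODES = {"REDACT_ALL_TEXT", "REDACT_SENSITIVE_TEXT"}


def dataset_path(project_id: str, region: str, dataset_id: str) -> str:
    """Return the resource name of a dataset."""
    return f"projects/{project_id}/locations/{region}/datasets/{dataset_id}"


def build_redaction_request(project_id, region, source_id, destination_id, mode):
    """Return (url, body) for a dataset-level burnt-in text redaction request."""
    if mode not in REDACTION_MODES:
        raise ValueError(f"unsupported textRedactionMode: {mode}")
    url = f"{API_ROOT}/{dataset_path(project_id, region, source_id)}:deidentify"
    body = {
        "destinationDataset": dataset_path(project_id, region, destination_id),
        "config": {
            "dicom": {},  # left empty, as in the samples on this page
            "image": {"textRedactionMode": mode},
        },
    }
    return url, body


url, body = build_redaction_request(
    "PROJECT_ID", "REGION", "SOURCE_DATASET_ID", "DESTINATION_DATASET_ID",
    "REDACT_SENSITIVE_TEXT",
)
print(url)
print(json.dumps(body, indent=2))
```

Validating the mode before building the body catches typos locally instead of waiting for the server to reject the request.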

curl command

To redact sensitive burnt-in text from a DICOM image, make a POST request and provide the name of the destination dataset, a DeidentifyConfig object with an empty dicom field and image.textRedactionMode set to REDACT_SENSITIVE_TEXT, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {},
        'image': {
          'textRedactionMode': 'REDACT_SENSITIVE_TEXT'
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. To track the status of the operation, you can use the Operation get method:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "..."
  }
}

PowerShell

To redact sensitive burnt-in text from a DICOM image, make a POST request and provide the name of the destination dataset, a DeidentifyConfig object with an empty dicom field and image.textRedactionMode set to REDACT_SENSITIVE_TEXT, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {},
      'image': {
        'textRedactionMode': 'REDACT_SENSITIVE_TEXT'
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. To track the status of the operation, you can use the Operation get method:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}

Combining tag de-identification and burnt-in text redaction

You can combine de-identification using tags with redaction of burnt-in text from images to de-identify DICOM instances at a more granular level. For example, by combining REDACT_ALL_TEXT in the TextRedactionMode field with DEIDENTIFY_TAG_CONTENTS in the TagFilterProfile field, you can do the following:

  • REDACT_ALL_TEXT: Redact all burnt-in text in the image.
  • DEIDENTIFY_TAG_CONTENTS: Inspect tag contents and transform sensitive text. For more information on the behavior of DEIDENTIFY_TAG_CONTENTS, see Default configuration.

After submitting the image to the Cloud Healthcare API using the REDACT_ALL_TEXT and DEIDENTIFY_TAG_CONTENTS options, the image appears as follows:

[Image: xray_redact_all_text_deidentify_tag_contents]

curl command

To redact all burnt-in text from a DICOM image and transform sensitive text in its tags, make a POST request and provide the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, image.textRedactionMode set to REDACT_ALL_TEXT, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'image': {
          'textRedactionMode': 'REDACT_ALL_TEXT'
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. You can use the Operation get method to track the status of the operation:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}

After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new study UID, series UID, and instance UID, so first search the destination dataset for the de-identified instance:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Accept: application/dicom+json" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]

The following table shows how the study UID, series UID, and instance UID changed:

| Identifier (tag) | Original instance metadata | De-identified instance metadata |
| --- | --- | --- |
| Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763 |
| Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710 |
| Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029 |

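
Rather than copying the new UIDs by hand, they can be read from the search response. The following Python sketch assumes the response has already been parsed into a list of DICOM JSON objects; `first_value` and `metadata_path` are hypothetical helper names, and the returned path is relative to the store's dicomWeb/ endpoint.

```python
from typing import Dict, List

# DICOM tags that carry the three identifiers listed in the table above.
STUDY_UID_TAG = "0020000D"
SERIES_UID_TAG = "0020000E"
INSTANCE_UID_TAG = "00080018"


def first_value(instance: Dict, tag: str) -> str:
    """Return the first value stored under a tag in a DICOM JSON object."""
    return instance[tag]["Value"][0]


def metadata_path(instance: Dict) -> str:
    """Build the path, relative to .../dicomWeb/, of an instance's metadata."""
    return (
        f"studies/{first_value(instance, STUDY_UID_TAG)}"
        f"/series/{first_value(instance, SERIES_UID_TAG)}"
        f"/instances/{first_value(instance, INSTANCE_UID_TAG)}/metadata"
    )


# Trimmed copy of the search response shown above.
search_results: List[Dict] = [
    {
        "00080018": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
        "0020000D": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
        "0020000E": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    }
]

for instance in search_results:
    print(metadata_path(instance))
```
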
Using the new values, retrieve the metadata for the instance:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Accept: application/dicom+json" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"

If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.

200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110728"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"vTbECsOCTSOejDB1nAoHuGAn2tGaqXY4OJP6uZLPWMc"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"4LuwWK6k3Z/K/Ity4whf6YHYrm9an103tWL1EnCGwIk"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880630"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]
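
The comparison between original and de-identified metadata can be automated. The following Python sketch lists every tag whose Value differs between two DICOM JSON objects. The function name `changed_tags` is hypothetical, and the two small dictionaries are illustrative excerpts: the study UID values come from the table above, while the unchanged 00100040 entry in the "original" is an assumption included only to show a tag that is not reported.

```python
from typing import Dict, List, Tuple


def changed_tags(original: Dict, deidentified: Dict) -> List[Tuple[str, object, object]]:
    """List (tag, old_value, new_value) for every tag whose Value differs."""
    changes = []
    for tag in sorted(set(original) | set(deidentified)):
        old = original.get(tag, {}).get("Value")
        new = deidentified.get(tag, {}).get("Value")
        if old != new:
            changes.append((tag, old, new))
    return changes


# Illustrative excerpts; real metadata contains many more tags.
original = {
    "00100040": {"vr": "CS", "Value": ["F"]},  # assumed original value
    "0020000D": {"vr": "UI", "Value": ["2.25.70541616638819138568043293671559322355"]},
}
deidentified = {
    "00100040": {"vr": "CS", "Value": ["F"]},
    "0020000D": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
}

for tag, old, new in changed_tags(original, deidentified):
    print(f"{tag}: {old} -> {new}")
```

Tags that are present in only one of the two objects are reported with None on the missing side.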

PowerShell

To redact all burnt-in text from a DICOM image and transform sensitive text in its tags, make a POST request and provide the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, image.textRedactionMode set to REDACT_ALL_TEXT, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'image': {
        'textRedactionMode': 'REDACT_ALL_TEXT'
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. You can use the Operation get method to track the status of the operation:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}

After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new study UID, series UID, and instance UID, so first search the destination dataset for the de-identified instance:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]

The following table shows how the study UID, series UID, and instance UID changed:

| Identifier (tag) | Original instance metadata | De-identified instance metadata |
| --- | --- | --- |
| Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763 |
| Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710 |
| Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029 |

Using the new values, retrieve the metadata for the instance:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.

200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110728"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"vTbECsOCTSOejDB1nAoHuGAn2tGaqXY4OJP6uZLPWMc"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"4LuwWK6k3Z/K/Ity4whf6YHYrm9an103tWL1EnCGwIk"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880630"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

Using infoTypes and primitive transformations with DICOM tags

The Cloud Healthcare API can use information types (infoTypes) to define what data it scans for when performing de-identification on tags. An infoType is a type of sensitive data, such as a patient name, email address, telephone number, identification number, or credit card number.

Primitive transformations are rules that you use for transforming an input value. You can customize how DICOM tags are de-identified by applying a primitive transformation to each tag's infoType. For example, you could de-identify a patient's last name and replace it with a series of asterisks by specifying the LAST_NAME infoType with the CharacterMaskConfig primitive transformation.
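
As a rough illustration of the last-name example, the following Python sketch builds a request body that pairs the LAST_NAME infoType with a character mask. The JSON field names `text`, `transformations`, `characterMaskConfig`, and `maskingCharacter` are assumptions about how TextConfig, InfoTypeTransformation, and CharacterMaskConfig are spelled in a REST request, and `masked_tag_config` is a hypothetical helper; check the API reference before relying on them.

```python
import json


def masked_tag_config(info_types, masking_character="*"):
    """Build a DeidentifyConfig dict that masks the given infoTypes in DICOM tags.

    Field names are assumed spellings of the objects named on this page.
    """
    return {
        "dicom": {"filterProfile": "DEIDENTIFY_TAG_CONTENTS"},
        "text": {
            "transformations": [
                {
                    "infoTypes": list(info_types),
                    "characterMaskConfig": {"maskingCharacter": masking_character},
                }
            ]
        },
    }


body = {
    "destinationDataset": "projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID",
    "config": masked_tag_config(["LAST_NAME"]),
}
print(json.dumps(body, indent=2))
```
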

Default DICOM infoTypes

The default DICOM infoTypes used when de-identifying metadata are:

  • AGE
  • CREDIT_CARD_NUMBER
  • DATE
  • EMAIL_ADDRESS
  • IP_ADDRESS
  • LOCATION
  • MAC_ADDRESS
  • PASSPORT
  • PERSON_NAME
  • PHONE_NUMBER
  • SWIFT_CODE
  • US_DRIVERS_LICENSE_NUMBER
  • US_SOCIAL_SECURITY_NUMBER
  • US_VEHICLE_IDENTIFICATION_NUMBER
  • US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER

When you de-identify sensitive text in images using REDACT_SENSITIVE_TEXT, the Cloud Healthcare API uses the infoTypes listed above and also applies an additional custom infoType for patient identifiers, such as medical record numbers (MRNs).

Primitive transformation options

The Cloud Healthcare API primitive transformation options include:

  • RedactConfig: Redacts a value by removing it.
  • CharacterMaskConfig: Masks a string either fully or partially by replacing input characters with a specified fixed character.
  • DateShiftConfig: Shifts dates by a random number of days, with the option to be consistent for the same context.
  • CryptoHashConfig: Uses SHA-256 to replace input values with a base64-encoded representation of a hashed output string generated using a given data encryption key.
  • ReplaceWithInfoTypeConfig: Replaces an input value with the name of its infoType.
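
To make the effect of some of these options concrete, the following Python sketch imitates three of them locally. It is not the service's implementation: the function names are hypothetical, and using HMAC with SHA-256 for the keyed hash is an assumption, since this page says only that SHA-256 and a data encryption key are involved.

```python
import base64
import hashlib
import hmac


def redact(value: str) -> str:
    """Imitate RedactConfig: remove the value entirely."""
    return ""


def character_mask(value: str, masking_character: str = "*") -> str:
    """Imitate a full CharacterMaskConfig: one fixed character per input character."""
    return masking_character * len(value)


def crypto_hash(value: str, key: bytes) -> str:
    """Imitate CryptoHashConfig: keyed SHA-256 digest, base64-encoded without padding.

    HMAC is an assumed construction, chosen only for illustration.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


print(character_mask("Johnson"))
print(crypto_hash("Ann Johnson", b"example-key"))
```

A 32-byte SHA-256 digest encodes to 43 base64 characters once padding is stripped, which matches the length of the hashed person names in the sample metadata on this page.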

Specifying configurations in TextConfig

InfoTypes and primitive transformations are specified within an InfoTypeTransformation, which is an object inside of TextConfig. InfoTypes are entered into the infoTypes array as comma-separated values.

Specifying an infoType is optional. If you do not specify at least one infoType, the transformation applies to the default DICOM infoTypes found in the Cloud Healthcare API.

If you specify any infoTypes in InfoTypeTransformation, you must specify at least one primitive transformation.

You can apply an InfoTypeTransformation only to the DEIDENTIFY_TAG_CONTENTS profile. An InfoTypeTransformation cannot be applied to the other profiles listed in TagFilterProfile.
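
The rules above can be checked before a request is sent. The following Python sketch is a local pre-flight check, not part of the API: `validate_transformation` is a hypothetical helper, and the camelCase keys in `PRIMITIVES` are assumed JSON spellings of the primitive transformation names listed earlier.

```python
# Assumed JSON field names for the five primitive transformations.
PRIMITIVES = {
    "redactConfig",
    "characterMaskConfig",
    "dateShiftConfig",
    "cryptoHashConfig",
    "replaceWithInfoTypeConfig",
}


def validate_transformation(transformation: dict, filter_profile: str) -> list:
    """Return a list of problems; an empty list means the rules above are met."""
    problems = []
    if filter_profile != "DEIDENTIFY_TAG_CONTENTS":
        problems.append(
            "InfoTypeTransformation requires the DEIDENTIFY_TAG_CONTENTS profile"
        )
    chosen = PRIMITIVES & set(transformation)
    if transformation.get("infoTypes") and not chosen:
        problems.append("infoTypes are listed but no primitive transformation is set")
    return problems


print(validate_transformation({"infoTypes": ["DATE"]}, "DEIDENTIFY_TAG_CONTENTS"))
```
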

The following sections show how to use the primitive transformations available in InfoTypeTransformation along with infoTypes to customize how DICOM tags are de-identified. The samples use the sample image provided in DICOM de-identification overview and the sample metadata provided in De-identification using tags.

Default configuration

By default, when the DEIDENTIFY_TAG_CONTENTS profile is set without providing any configuration in the TextConfig object, the Cloud Healthcare API replaces sensitive data using the default DICOM infoTypes. However, there is different behavior for the DATE and PERSON_NAME infoTypes, as shown below:

  • A DateShiftConfig is applied to text that is classified as a DATE infoType. The DateShiftConfig uses a date shifting technique with a 100-day differential.
  • A CryptoHashConfig is applied to text that is classified as a PERSON_NAME infoType. The CryptoHashConfig performs tokenization by generating a surrogate value using cryptographic hashing.

The following behavior also applies:

  • Any patient ages that have a value greater than 90 are converted to 90.
  • If a transformation cannot be applied due to DICOM format restrictions, a placeholder value is supplied that corresponds to the tag's Value Representation (VR).
  • Any other values that correspond to one of the default DICOM infoTypes in the Cloud Healthcare API are replaced by their infoType. For example, if the PatientComments tag contained the string "Ann Johnson went to Anytown Hospital," then "Anytown" would be replaced with the LOCATION infoType.
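
Two of the default behaviors listed above, the age cap and date shifting, can be sketched locally in Python. This is an illustration under stated assumptions, not the service's algorithm: the function names are hypothetical, the 100-day differential is interpreted here as a shift of up to 100 days in either direction, and deriving the shift from an HMAC of a key and a context string is just one way to make the shift repeatable.

```python
import hashlib
import hmac
from datetime import datetime, timedelta

MAX_SHIFT_DAYS = 100  # assumed reading of the 100-day differential
MAX_AGE = 90


def cap_age(age_years: int) -> int:
    """Ages above 90 are reported as 90."""
    return min(age_years, MAX_AGE)


def shift_for(key: bytes, context: str) -> int:
    """Derive a repeatable shift in [-100, 100] days from a key and a context."""
    digest = hmac.new(key, context.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % (2 * MAX_SHIFT_DAYS + 1) - MAX_SHIFT_DAYS


def shift_da(value: str, days: int) -> str:
    """Shift a DICOM DA value (YYYYMMDD) by a number of days."""
    shifted = datetime.strptime(value, "%Y%m%d") + timedelta(days=days)
    return shifted.strftime("%Y%m%d")


days = shift_for(b"example-key", "patient-1")
print(cap_age(97), shift_da("20200101", days))
```

Because the same key and context always yield the same shift, every date belonging to one context moves by the same number of days, so intervals between those dates are preserved.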

The following samples show the output of using the DEIDENTIFY_TAG_CONTENTS default profile on a dataset containing DICOM stores and DICOM data. You can compare this default output with the outputs when using the various primitive transformations with infoType combinations. The samples use a single DICOM instance, but you can also de-identify multiple instances.

After submitting the image to the Cloud Healthcare API using the DEIDENTIFY_TAG_CONTENTS profile, the image appears as follows:

[Image: dicom_infotype_default]

curl command

To de-identify a dataset containing DICOM data using the default DEIDENTIFY_TAG_CONTENTS profile with no infoTypes or primitive transformations specified, make a POST request and provide the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. You can use the Operation get method to track the status of the operation:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}

After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it changed. The de-identified instance has a new study UID, series UID, and instance UID, so first search the destination dataset for the de-identified instance:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Accept: application/dicom+json" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]

The following table shows how the study UID, series UID, and instance UID changed:

| Identifier (tag) | Original instance metadata | De-identified instance metadata |
| --- | --- | --- |
| Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763 |
| Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710 |
| Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029 |

Using the new values, retrieve the metadata for the instance:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Accept: application/dicom+json" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"

If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.

200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20111006"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"pzv2lYqFu6wap3PXXi0y3c6VcjAsWQY/TcW0AjanOY4"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"uGSY8u8934To+NRbmD6HtthMxgQ91rCK6UqIYeO0UkA"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880908"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

PowerShell

To de-identify a dataset containing DICOM data using the default DEIDENTIFY_TAG_CONTENTS profile, with no infoTypes or primitive transformations specified, make a POST request that provides the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
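Checking the operation repeatedly until it reports "done": true can be sketched in Python as follows. This is a minimal illustration, not an official client: the `get_operation` callable is a hypothetical stand-in for whatever code issues the GET request shown above and returns the parsed JSON, and the polling interval and retry budget are arbitrary assumptions.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(get_operation: Callable[[], Dict[str, Any]],
                       poll_seconds: float = 5.0,
                       max_polls: int = 60) -> Dict[str, Any]:
    """Call get_operation until the result contains "done": true."""
    for _ in range(max_polls):
        operation = get_operation()
        if operation.get("done"):
            # A finished long-running operation carries either a
            # "response" or an "error" field.
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        time.sleep(poll_seconds)
    raise TimeoutError("Operation did not finish within the polling budget")


# Stand-in for the HTTP GET: two unfinished replies, then a finished one.
# The successStoreCount value "1" is made up for this illustration.
replies = iter([
    {"name": "operations/OPERATION_ID"},
    {"name": "operations/OPERATION_ID"},
    {"name": "operations/OPERATION_ID", "done": True,
     "response": {"successStoreCount": "1"}},
])
finished = wait_for_operation(lambda: next(replies), poll_seconds=0)
print(finished["response"]["successStoreCount"])
```

In real use, `get_operation` would wrap an authenticated request to the operations URL, and a longer `poll_seconds` value would avoid unnecessary requests.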
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance will have a new studies UID, series UID, and instances UID, so you first need to search in the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the studies UID, series UID, and instances UID changed:
Tag | Original instance metadata | De-identified instance metadata
Studies UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20111006"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"pzv2lYqFu6wap3PXXi0y3c6VcjAsWQY/TcW0AjanOY4"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"uGSY8u8934To+NRbmD6HtthMxgQ91rCK6UqIYeO0UkA"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880908"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]
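
The two-step lookup used in these samples (search the destination DICOM store for the instance, then request its metadata with the new UIDs) can be sketched in Python. The helper name `metadata_path` is hypothetical, and the input is an excerpt of the search response shown above. The sketch only builds the path that follows `dicomWeb/` and does not send any request.

```python
from typing import Any, Dict

# Tags that carry the three identifiers in the DICOM JSON search response.
STUDY_UID_TAG = "0020000D"
SERIES_UID_TAG = "0020000E"
INSTANCE_UID_TAG = "00080018"


def metadata_path(instance: Dict[str, Any]) -> str:
    """Build the path (relative to dicomWeb/) for an instance's metadata."""
    study = instance[STUDY_UID_TAG]["Value"][0]
    series = instance[SERIES_UID_TAG]["Value"][0]
    sop_instance = instance[INSTANCE_UID_TAG]["Value"][0]
    return f"studies/{study}/series/{series}/instances/{sop_instance}/metadata"


# Excerpt of the search response for the de-identified instance.
search_result = [
    {
        "00080018": {"vr": "UI", "Value": [
            "1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
        "00100010": {"vr": "PN"},
        "0020000D": {"vr": "UI", "Value": [
            "1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
        "0020000E": {"vr": "UI", "Value": [
            "1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    }
]
path = metadata_path(search_result[0])
print(path)
```

Appending the printed path to the destination DICOM store's `dicomWeb/` URL gives the metadata request URL used in the samples.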

RedactConfig

Specifying redactConfig redacts a given value by removing it completely. The redactConfig message has no arguments; including it in the request is enough to enable the transformation.

The following samples extend the default configuration by applying the redactConfig transformation to the PERSON_NAME infoType. Sending this request redacts all names from the DICOM instance.
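
As a rough illustration of how this request body is put together, the following Python sketch assembles the same JSON that the samples send. The helper name `build_deidentify_body` is hypothetical. The field names are the ones used in the request bodies on this page.

```python
import json
from typing import Any, Dict, List


def build_deidentify_body(destination_dataset: str,
                          info_types: List[str],
                          transformation: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a deidentify request body with one text transformation."""
    return {
        "destinationDataset": destination_dataset,
        "config": {
            "dicom": {"filterProfile": "DEIDENTIFY_TAG_CONTENTS"},
            "text": {
                "transformations": [
                    # The transformation key (for example redactConfig)
                    # sits next to the infoTypes it applies to.
                    {"infoTypes": list(info_types), **transformation}
                ]
            },
        },
    }


body = build_deidentify_body(
    "projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID",
    ["PERSON_NAME"],
    {"redactConfig": {}},  # empty message: redact matching values
)
print(json.dumps(body, indent=2))
```

Passing a different transformation dictionary, such as one keyed by characterMaskConfig, would produce the body for the other transformations described on this page.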

After submitting the image to the Cloud Healthcare API using the redactConfig transformation, the image appears as follows:

[Image: dicom_redactconfig]

curl command

To de-identify a dataset containing DICOM data and remove names entirely, make a POST request that provides the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, text.transformations.infoTypes set to PERSON_NAME, text.transformations.redactConfig set to an empty object, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [
                'PERSON_NAME'
              ],
              'redactConfig': {}
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance will have a new studies UID, series UID, and instances UID, so you first need to search in the new dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the studies UID, series UID, and instances UID changed:
Tag | Original instance metadata | De-identified instance metadata
Studies UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110909"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN"},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880812"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

PowerShell

To de-identify a dataset containing DICOM data and remove names entirely, make a POST request that provides the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, text.transformations.infoTypes set to PERSON_NAME, text.transformations.redactConfig set to an empty object, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [
              'PERSON_NAME'
            ],
            'redactConfig': {}
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance will have a new studies UID, series UID, and instances UID, so you first need to search in the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the studies UID, series UID, and instances UID changed:
Tag | Original instance metadata | De-identified instance metadata
Studies UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110909"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN"},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880812"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

The output shows that the values of ReferringPhysicianName (00080090) and PatientName (00100010) have been removed. This contrasts with the sample in the Default configuration, where these values were transformed using cryptographic hashing.
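
One way to check this kind of difference programmatically is to compare two metadata responses tag by tag. The Python sketch below uses a hypothetical helper, `values_removed`, and excerpts of the default-configuration and redactConfig outputs shown on this page. It only reports tags that had a Value in the first response and none in the second.

```python
from typing import Any, Dict, List


def values_removed(before: Dict[str, Any], after: Dict[str, Any]) -> List[str]:
    """Return tags that have a Value in `before` but not in `after`."""
    return sorted(
        tag
        for tag, element in before.items()
        if "Value" in element and "Value" not in after.get(tag, {})
    )


# Excerpt of the metadata returned with the default configuration.
default_output = {
    "00080090": {"vr": "PN", "Value": [
        {"Alphabetic": "pzv2lYqFu6wap3PXXi0y3c6VcjAsWQY/TcW0AjanOY4"}]},
    "00100010": {"vr": "PN", "Value": [
        {"Alphabetic": "uGSY8u8934To+NRbmD6HtthMxgQ91rCK6UqIYeO0UkA"}]},
    "00100020": {"vr": "LO", "Value": ["S1214223-1"]},
}
# Excerpt of the metadata returned with redactConfig for PERSON_NAME.
redact_output = {
    "00080090": {"vr": "PN"},
    "00100010": {"vr": "PN"},
    "00100020": {"vr": "LO", "Value": ["S1214223-1"]},
}
removed = values_removed(default_output, redact_output)
print(removed)  # ['00080090', '00100010']
```

The same comparison could be run between the original instance metadata and any de-identified output to list which tags lost their values.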

CharacterMaskConfig

Specifying characterMaskConfig replaces strings that correspond to the given infoTypes with a specified fixed character. For example, rather than redacting a patient's name or transforming it using cryptographic hashing, you can replace the name with a series of asterisks (*). You can specify the fixed character as a value to the maskingCharacter field.

The following samples extend the default configuration by applying the characterMaskConfig transformation to the LAST_NAME infoType. No masking character is provided, so the masking defaults to asterisks.

The samples use a single DICOM instance, but you can also de-identify multiple instances.
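
To make the effect of character masking concrete, the following Python sketch applies the same idea locally. It is only an illustration and not how the Cloud Healthcare API is implemented: the function `mask_span` is hypothetical, the input names are made-up examples, and the position of the last name is passed in by hand, whereas the API detects it through the LAST_NAME infoType.

```python
def mask_span(text: str, start: int, end: int,
              masking_character: str = "") -> str:
    """Replace text[start:end] with a fixed character, one per character."""
    # Mirror the behavior described above: an empty masking character
    # falls back to an asterisk.
    character = masking_character or "*"
    return text[:start] + character * (end - start) + text[end:]


# Hypothetical names; the spans cover the last name only.
default_mask = mask_span("John Doe", 5, 8)
custom_mask = mask_span("Ann Johnson", 4, 11, masking_character="#")
print(default_mask)  # John ***
print(custom_mask)   # Ann #######
```

The masked string keeps its original length, so the number of masking characters still reveals the length of the replaced value.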

After submitting the image to the Cloud Healthcare API using the characterMaskConfig transformation, the image appears as follows:

[Image: dicom_charactermaskconfig]

curl command

To de-identify a dataset containing DICOM data and replace last names with asterisk characters, make a POST request that provides the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, text.transformations.infoTypes set to LAST_NAME, text.transformations.characterMaskConfig with maskingCharacter left empty, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [
                'LAST_NAME'
              ],
              'characterMaskConfig': {
                'maskingCharacter': ''
              }
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance will have a new studies UID, series UID, and instances UID, so you first need to search in the new dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the studies UID, series UID, and instances UID changed:
Tag | Original instance metadata | De-identified instance metadata
Studies UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110909"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"John ***"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"Ann *******"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880812"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

PowerShell

To de-identify a dataset containing DICOM data and replace last names with asterisks, make a POST request that provides the following: the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, transformations.infoTypes set to LAST_NAME, transformations.characterMaskConfig with an empty maskingCharacter, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [
              'LAST_NAME'
            ],
            'characterMaskConfig': {
              'maskingCharacter': ''
            }
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so first search the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
UID | Original instance metadata | De-identified instance metadata
Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110909"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"John ***"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"Ann *******"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880812"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

You can see from the output that the last names in ReferringPhysicianName (00080090) and PatientName (00100010) have been replaced with asterisks. This is in contrast to the sample in the Default configuration, where these values were transformed using cryptographic hashing.
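To make the effect concrete, the following minimal Python sketch reproduces the masked values from the output above. It only illustrates the observed result and is not the Cloud Healthcare API's implementation. The helper name mask_last_name is hypothetical, and the sketch assumes that the last whitespace-separated token of a name is the last name. The unmasked names, John Doe and Ann Johnson, are the ones that appear in the DateShiftConfig output on this page.

```python
def mask_last_name(full_name: str, masking_character: str = "*") -> str:
    """Replace every character of the final name token with the masking character."""
    # Assumption: the last whitespace-separated token is the last name.
    first_names, _, last_name = full_name.rpartition(" ")
    masked = masking_character * len(last_name)
    return f"{first_names} {masked}" if first_names else masked


print(mask_last_name("John Doe"))     # John ***
print(mask_last_name("Ann Johnson"))  # Ann *******
```

The mask keeps the length of the original last name, which matches the three and seven asterisks in the sample output.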

DateShiftConfig

The Cloud Healthcare API can transform dates by shifting them within a preset range. To keep date transformations consistent across de-identification runs, use DateShiftConfig and specify a base64-encoded AES 128-, 192-, or 256-bit key. The Cloud Healthcare API uses this key to compute how far each date, such as a patient's birthdate, is shifted within a 100-day differential.

If you don't provide a key, the Cloud Healthcare API generates its own key each time the de-identification operation runs on date values. This can result in inconsistent date outputs between runs.

The following samples show how to set the DATE and DATE_OF_BIRTH infoTypes with the DateShiftConfig transform on a DICOM instance. After sending the de-identification request to the Cloud Healthcare API, the date values in the instance will shift within plus or minus 100 days of their original values.

The provided cryptoKey, U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU=, is a 256-bit, base64-encoded key produced by AES encryption with the following command. When the command prompts for a password, an empty password is provided:

echo -n "test" | openssl enc -e -aes-256-ofb -a -salt
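Before sending a key, it can help to confirm that it decodes to one of the sizes listed above. The following Python sketch is one way to do that locally, using only the standard library. The helper name key_size_in_bits is hypothetical, and the check covers only the decoded length, not how the key was generated.

```python
import base64

CRYPTO_KEY = "U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU="


def key_size_in_bits(encoded_key: str) -> int:
    """Decode a base64 string and return the size of the decoded value in bits."""
    raw = base64.b64decode(encoded_key, validate=True)
    return len(raw) * 8


bits = key_size_in_bits(CRYPTO_KEY)
# The text above lists 128-, 192-, and 256-bit keys as the accepted sizes.
print(bits, bits in (128, 192, 256))  # 256 True
```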

After submitting the image to the Cloud Healthcare API using the dateShiftConfig transformation, the image appears as follows:

[Image: dicom_dateshiftconfig]

curl command

To de-identify a dataset containing DICOM data and keep date transformations consistent across runs, make a POST request that provides the following: the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, transformations.infoTypes set to DATE and DATE_OF_BIRTH, transformations.dateShiftConfig.cryptoKey set to a base64-encoded AES 128-, 192-, or 256-bit key, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [
                'DATE',
                'DATE_OF_BIRTH'
              ],
              'dateShiftConfig': {
                'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
              }
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
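Because the operation can take some time, a client typically repeats the status request until the response contains "done": true. The following Python sketch shows one possible polling loop. The function name wait_for_operation and its parameters are hypothetical, and the HTTP call is passed in as a callable so that the sketch stays independent of any particular client library. The check for an "error" field assumes the standard long-running operation response format.

```python
import time
from typing import Callable, Dict


def wait_for_operation(
    get_operation: Callable[[], Dict],
    poll_seconds: float = 5.0,
    max_attempts: int = 60,
) -> Dict:
    """Call get_operation until the returned operation has "done": true."""
    for _ in range(max_attempts):
        operation = get_operation()
        if operation.get("done"):
            # Assumption: a failed operation reports its failure in "error".
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        time.sleep(poll_seconds)
    raise TimeoutError("Operation did not finish within the allotted attempts")


# Stand-in for the GET request shown above: pending first, then finished.
_responses = iter([{"name": "OPERATION_ID"}, {"name": "OPERATION_ID", "done": True}])
result = wait_for_operation(lambda: next(_responses), poll_seconds=0)
print(result["done"])  # True
```

In real use, get_operation would issue the authenticated GET request against the operation name returned by the deidentify call.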
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so first search the new dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
UID | Original instance metadata | De-identified instance metadata
Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
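If you script this step, the new UIDs can be read from the search response rather than copied by hand. The following Python sketch parses an excerpt of the response shown above and assembles the path that follows dicomWeb/ in the metadata request. The function name metadata_path is hypothetical, and the sketch assumes that each of the three tags carries exactly one value.

```python
import json

# Excerpt of the search response shown above (UID tags only).
SEARCH_RESPONSE = """[
  {
    "00080018": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "0020000D": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E": {"vr": "UI", "Value": ["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]}
  }
]"""


def metadata_path(instance: dict) -> str:
    """Build the DICOMweb metadata path for one entry of a search response."""
    study_uid = instance["0020000D"]["Value"][0]     # Study Instance UID
    series_uid = instance["0020000E"]["Value"][0]    # Series Instance UID
    instance_uid = instance["00080018"]["Value"][0]  # SOP Instance UID
    return f"studies/{study_uid}/series/{series_uid}/instances/{instance_uid}/metadata"


for entry in json.loads(SEARCH_RESPONSE):
    print(metadata_path(entry))
```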
Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110820"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"John Doe"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"Ann Johnson"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880723"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]
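Comparing two metadata documents by eye is error-prone, so a small script can list only the tags whose values differ. The following Python sketch is one way to do that. The function name changed_tags is hypothetical. The "original" excerpt uses the date values from the CharacterMaskConfig output on this page, on the assumption that those dates were left untransformed because only LAST_NAME was configured there.

```python
def changed_tags(original: dict, deidentified: dict) -> dict:
    """Return {tag: (old_value, new_value)} for tags whose Value differs."""
    changes = {}
    for tag in sorted(set(original) | set(deidentified)):
        old = original.get(tag, {}).get("Value")
        new = deidentified.get(tag, {}).get("Value")
        if old != new:
            changes[tag] = (old, new)
    return changes


# Excerpts only; see the assumption about the original values above.
original = {
    "00080020": {"vr": "DA", "Value": ["20110909"]},
    "00080030": {"vr": "TM", "Value": ["110032"]},
    "00100030": {"vr": "DA", "Value": ["19880812"]},
}
deidentified = {
    "00080020": {"vr": "DA", "Value": ["20110820"]},
    "00080030": {"vr": "TM", "Value": ["110032"]},
    "00100030": {"vr": "DA", "Value": ["19880723"]},
}
for tag, (old, new) in changed_tags(original, deidentified).items():
    print(tag, old, "->", new)
```

With these excerpts, only StudyDate (00080020) and PatientBirthDate (00100030) are reported as changed.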

PowerShell

To de-identify a dataset containing DICOM data and keep date transformations consistent across runs, make a POST request that provides the following: the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, transformations.infoTypes set to DATE and DATE_OF_BIRTH, transformations.dateShiftConfig.cryptoKey set to a base64-encoded AES 128-, 192-, or 256-bit key, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [
              'DATE',
              'DATE_OF_BIRTH'
            ],
            'dateShiftConfig': {
              'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
            }
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so first search the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
UID | Original instance metadata | De-identified instance metadata
Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["20110820"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"John Doe"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"Ann Johnson"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19880723"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

You can see from the output that the StudyDate (00080020) and PatientBirthDate (00100030) have new values. These transformations occurred as a result of combining the 100-day differential with the provided cryptoKey value. The new date values are consistent for this instance between de-identification runs as long as the same cryptoKey is provided.
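As a quick check of the shift, the following Python sketch computes the signed difference in days between the shifted dates in the output and the untransformed values, 20110909 and 19880812. Those untransformed values are taken from the CharacterMaskConfig output on this page and are assumed to be the original dates. The helper name shift_in_days is hypothetical.

```python
from datetime import datetime


def shift_in_days(original: str, shifted: str) -> int:
    """Return the signed day difference between two DICOM DA values (YYYYMMDD)."""
    fmt = "%Y%m%d"
    return (datetime.strptime(shifted, fmt) - datetime.strptime(original, fmt)).days


study_shift = shift_in_days("20110909", "20110820")  # StudyDate
birth_shift = shift_in_days("19880812", "19880723")  # PatientBirthDate
print(study_shift, birth_shift)  # -20 -20
```

In this sample both dates move 20 days earlier, which is within the 100-day differential, and the interval between the two dates is unchanged.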

CryptoHashConfig

The Cloud Healthcare API can transform data by replacing values with cryptographic hashes (also called surrogate values). To do so, specify a cryptoHashConfig message.

You can leave cryptoHashConfig empty, or you can provide a base64-encoded AES 128-, 192-, or 256-bit key. The Cloud Healthcare API uses the key to generate surrogate values. If you provide the same key for each run, the surrogate values are consistent across runs. If you don't provide a key, the Cloud Healthcare API generates a new key each time the operation runs, so the surrogate values differ between runs.
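The relationship between the key and the surrogate values can be illustrated with a generic keyed hash. The following Python sketch uses HMAC-SHA256 from the standard library purely as a stand-in. This page does not state which construction the Cloud Healthcare API uses, so the function name surrogate, the choice of HMAC-SHA256, and the second key are all assumptions, and the printed values will not match the API's output.

```python
import base64
import hashlib
import hmac


def surrogate(value: str, encoded_key: str) -> str:
    """Illustrative keyed hash: HMAC-SHA256 of the value, base64 without padding."""
    key = base64.b64decode(encoded_key)
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


KEY_A = "U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU="
# Hypothetical second key (32 zero bytes), used only to show a different result.
KEY_B = base64.b64encode(bytes(32)).decode("ascii")

print(surrogate("Ann Johnson", KEY_A) == surrogate("Ann Johnson", KEY_A))  # True
print(surrogate("Ann Johnson", KEY_A) == surrogate("Ann Johnson", KEY_B))  # False
```

The same key and input always give the same surrogate, while a different key gives a different one, which is why a fixed key is needed for consistent results across runs.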

The following samples show how to apply a cryptoHashConfig transform to all default DICOM infoTypes supported in the Cloud Healthcare API. After sending the de-identification request, the values with a corresponding DICOM infoType in the Cloud Healthcare API are replaced with surrogate values.

The sample also shows how to provide a cryptoKey to generate consistent surrogate values between de-identification runs.

The provided cryptoKey, U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU=, is a 256-bit, base64-encoded key produced by AES encryption with the following command. When the command prompts for a password, an empty password is provided:

echo -n "test" | openssl enc -e -aes-256-ofb -a -salt

After submitting the image to the Cloud Healthcare API using the cryptoHashConfig transformation, the image appears as follows:

[Image: dicom_cryptohashconfig]

curl command

To de-identify a dataset containing DICOM data and cryptographically hash all available default DICOM infoTypes, make a POST request that provides the following: the name of the destination dataset, dicom.filterProfile set to DEIDENTIFY_TAG_CONTENTS, transformations.infoTypes set to an empty list, transformations.cryptoHashConfig.cryptoKey set to a base64-encoded AES 128-, 192-, or 256-bit key, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [],
              'cryptoHashConfig': {
                'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
              }
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance has a new study UID, series UID, and instance UID, so first search the new dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the study UID, series UID, and instance UID changed:
UID | Original instance metadata | De-identified instance metadata
Study UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instance UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["19000101"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"TgUF3V/7IfiYXOvA63tpPnsFrc+j1YBenF/9E4+B1CE"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"YTQB+AfSXJUQsIip8odrSfntGe4yWgGyFjq/lI/e+Jk"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19000101"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

PowerShell

To de-identify a dataset containing DICOM data and cryptographically hash all available default DICOM infoTypes, make a POST request and provide the name of the destination dataset, the dicom.filter_profile set to DEIDENTIFY_TAG_CONTENTS, transformations.infoTypes set to an empty object, transformations.cryptoHashConfig set to an AES 128/192/256-bit key, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [],
            'cryptoHashConfig': {
              'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
            }
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance will have a new studies UID, series UID, and instances UID, so you first need to search in the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the studies UID, series UID, and instances UID changed:
UID (tag) | Original instance metadata | De-identified instance metadata
Studies UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["19000101"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"TgUF3V/7IfiYXOvA63tpPnsFrc+j1YBenF/9E4+B1CE"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"YTQB+AfSXJUQsIip8odrSfntGe4yWgGyFjq/lI/e+Jk"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19000101"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

The transformations in the output are consistent for this instance between de-identification runs as long as the same cryptoKey is provided.
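
Because consistent output depends on supplying the same key on every run, it can help to generate the key once and store it securely. The following is a minimal sketch, assuming a 256-bit key taken from /dev/urandom and base64-encoded for use as the cryptoKey value in the JSON request body. The variable name CRYPTO_KEY is only illustrative.

```shell
# Generate 32 random bytes (a 256-bit key) and base64-encode them.
# tr removes any line wrapping that some base64 implementations add.
CRYPTO_KEY="$(head -c 32 /dev/urandom | base64 | tr -d '\n')"

# Print the key so it can be pasted into 'cryptoKey' in the request body.
# Treat this value as a secret and reuse it for later de-identification runs.
echo "${CRYPTO_KEY}"
```

A 128-bit or 192-bit key can be produced the same way by reading 16 or 24 bytes instead of 32.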

ReplaceWithInfoTypeConfig

Specifying replaceWithInfoTypeConfig replaces input values with the name of the value's infoType.

The following samples show how to apply a replaceWithInfoTypeConfig transformation to all default DICOM infoTypes supported in the Cloud Healthcare API. The replaceWithInfoTypeConfig message has no arguments; specifying it as an empty object enables the transformation.

After submitting the image to the Cloud Healthcare API using the replaceWithInfoTypeConfig transformation, the image appears as follows:

[Image: dicom_replacewithinfotypeconfig]

curl command

To de-identify a dataset containing DICOM data and replace all relevant values with their infoTypes, make a POST request and provide the name of the destination dataset, the dicom.filter_profile set to DEIDENTIFY_TAG_CONTENTS, an empty transformations.infoTypes, an empty transformations.replaceWithInfoTypeConfig, and an access token. The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [],
              'replaceWithInfoTypeConfig': {}
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance will have a new studies UID, series UID, and instances UID, so you first need to search in the new dataset for the de-identified instance:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/dicom+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the studies UID, series UID, and instances UID changed:
UID (tag) | Original instance metadata | De-identified instance metadata
Studies UID (0020000D) | 2.25.70541616638819138568043293671559322355 | 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) | 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 | 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID (00080018) | 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 | 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata"
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["19000101"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"[PERSON_NAME]"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"[PERSON_NAME]"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19000101"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

PowerShell

To de-identify a dataset containing DICOM data and replace all relevant values with their infoTypes, make a POST request and provide the name of the destination dataset, the dicom.filter_profile set to DEIDENTIFY_TAG_CONTENTS, an empty transformations.infoTypes, an empty transformations.replaceWithInfoTypeConfig, and an access token. The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [],
            'replaceWithInfoTypeConfig': {}
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successStoreCount": "SUCCESS_STORE_COUNT"
  }
}
After the de-identification succeeds, you can retrieve the metadata for the de-identified instance to see how it was changed. The de-identified instance will have a new studies UID, series UID, and instances UID, so you first need to search in the new dataset for the de-identified instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/instances" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
[
  {
    "00080005":{"vr":"CS"},
    "00080016":{"vr":"UI"},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA"},
    "00080030":{"vr":"TM"},
    "00080050":{"vr":"SH"},
    "00080090":{"vr":"PN"},
    "00100010":{"vr":"PN"},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA"},
    "00100040":{"vr":"CS"},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200013":{"vr":"IS"},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]}
  }
]
The following table shows how the studies UID, series UID, and instances UID changed:
  Original instance metadata De-identified instance metadata
Studies UID (0020000D) 2.25.70541616638819138568043293671559322355 1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763
Series UID (0020000E) 1.2.276.0.7230010.3.1.3.8323329.78.1531234558.523694 1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710
Instances UID (00080018) 1.2.276.0.7230010.3.1.4.8323329.78.1539083058.523695 1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029
Using the new values, retrieve the metadata for the instance:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DICOM_STORE_ID/dicomWeb/studies/1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763/series/1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710/instances/1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029/metadata" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the new metadata in JSON format. You can compare the new metadata with the original metadata to see the effect of the transformation.
200 OK
[
  {
    "00020002":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00020003":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00020010":{"vr":"UI","Value":["1.2.840.10008.1.2.4.50"]},
    "00020012":{"vr":"UI","Value":["1.2.40.0.13.1.3"]},
    "00020013":{"vr":"SH","Value":["dcm4che-null"]},
    "00080005":{"vr":"CS","Value":["ISO_IR 100"]},
    "00080016":{"vr":"UI","Value":["1.2.840.10008.5.1.4.1.1.7"]},
    "00080018":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.97415866390999888717168863957686758029"]},
    "00080020":{"vr":"DA","Value":["19000101"]},
    "00080030":{"vr":"TM","Value":["110032"]},
    "00080050":{"vr":"SH"},
    "00080064":{"vr":"CS","Value":["WSD"]},
    "00080070":{"vr":"LO","Value":["Manufacturer"]},
    "00080090":{"vr":"PN","Value":[{"Alphabetic":"[PERSON_NAME]"}]},
    "00081090":{"vr":"LO","Value":["ABC1"]},
    "00100010":{"vr":"PN","Value":[{"Alphabetic":"[PERSON_NAME]"}]},
    "00100020":{"vr":"LO","Value":["S1214223-1"]},
    "00100030":{"vr":"DA","Value":["19000101"]},
    "00100040":{"vr":"CS","Value":["F"]},
    "0020000D":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.201854290391432893460946240745559593763"]},
    "0020000E":{"vr":"UI","Value":["1.3.6.1.4.1.11129.5.1.303327499491957026103380014864616068710"]},
    "00200010":{"vr":"SH"},
    "00200011":{"vr":"IS"},
    "00200013":{"vr":"IS"},
    "00200020":{"vr":"CS"},
    "00280002":{"vr":"US","Value":[3]},
    "00280004":{"vr":"CS","Value":["YBR_FULL_422"]},
    "00280006":{"vr":"US","Value":[0]},
    "00280010":{"vr":"US","Value":[1024]},
    "00280011":{"vr":"US","Value":[1024]},
    "00280100":{"vr":"US","Value":[8]},
    "00280101":{"vr":"US","Value":[8]},
    "00280102":{"vr":"US","Value":[7]},
    "00280103":{"vr":"US","Value":[0]},
    "00282110":{"vr":"CS","Value":["01"]},
    "00282114":{"vr":"CS","Value":["ISO_10918_1"]}
  }
]

De-identifying data at the DICOM store level

The preceding samples show how to de-identify DICOM data at the dataset level. To change a dataset de-identification request to a DICOM store de-identification request, make the following changes:

  • Modify the destinationDataset in the request body to destinationStore
  • Add dicomStores/DESTINATION_DICOM_STORE_ID at the end of the value in destinationStore when specifying the destination
  • Add dicomStores/SOURCE_DICOM_STORE_ID when specifying the location of the source data

For example:

Dataset level de-identification:

'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID'
…
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"

DICOM store level de-identification:

'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID'
…
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify"

The following sample expands on Combining tag de-identification and burnt-in text redaction, but the de-identification occurs on a single DICOM store and the de-identified data is copied to a new DICOM store. Note that the dataset referenced by DESTINATION_DATASET_ID must already exist.

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'image': {
          'textRedactionMode': 'REDACT_ALL_TEXT'
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation ID. You can use the Operation get method to track the status of the operation:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyDicomStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}
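
The status check above can be repeated until the operation finishes. The following is a rough sketch rather than an official sample: the function names operation_done and wait_for_operation and the 10-second interval are assumptions made for illustration. It only looks for "done": true in the response, so you would still need to inspect the final response for an error field, because a failed operation also reports done.

```shell
# Succeeds (exit status 0) when the operation JSON read from stdin
# contains "done": true.
operation_done() {
  grep -Eq '"done"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical helper: polls the operation URL passed as $1 every
# 10 seconds until the response reports "done": true.
wait_for_operation() {
  until curl -s \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      "$1" | operation_done; do
    sleep 10
  done
}
```

For example, wait_for_operation could be called with the same operations/OPERATION_ID URL used in the GET request above.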

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'image': {
        'textRedactionMode': 'REDACT_ALL_TEXT'
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation ID. You can use the Operation get method to track the status of the operation:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyDicomStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}

De-identifying a subset of a DICOM store

When you de-identify DICOM data at the DICOM store level, you can de-identify a subset of the data by specifying a filter.

The filter takes the form of a filter file that you specify as a value for the resourcePathsGcsUri field in the DicomFilterConfig object. The filter file must exist in a Cloud Storage bucket; you cannot specify a filter file that exists on your local machine or any other source. The location of the file must be in the format gs://BUCKET/PATH/TO/FILE.

Creating a filter file

A filter file defines which DICOM files to de-identify. You can filter files at the following levels:

  • At the study level
  • At the series level
  • At the instance level

The filter file is made up of one line per study, series, or instance you want to de-identify. Each line uses the format /studies/STUDY_UID[/series/SERIES_UID[/instances/INSTANCE_UID]]. At the end of each line is a newline character: either \n or \r\n.

If a study, series, or instance is not specified in the filter file you passed in when calling the de-identify operation, that study, series, or instance will not be de-identified and will not be present in the destination DICOM store.

Only the /studies/STUDY_UID portion of the path is required. This means that you can de-identify a study by specifying /studies/STUDY_UID, or you can de-identify a series by specifying /studies/STUDY_UID/series/SERIES_UID.
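
As an illustration of the line format, the following sketch builds filter file lines from UIDs. The function name filter_line, the output file name filters.txt, and the UIDs are placeholders chosen for this example, not part of the API.

```shell
# Print one filter file line for a study, a series, or an instance.
# Usage: filter_line STUDY_UID [SERIES_UID [INSTANCE_UID]]
filter_line() {
  line="/studies/$1"
  if [ -n "${2:-}" ]; then line="${line}/series/$2"; fi
  if [ -n "${3:-}" ]; then line="${line}/instances/$3"; fi
  # printf appends the newline that must end every line in the file.
  printf '%s\n' "$line"
}

# Write a small filter file: one study, one series, and one instance.
{
  filter_line 1.123.456.789
  filter_line 1.666.333.111 123.456
  filter_line 1.888.999.222 123.456 111
} > filters.txt
```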

Consider the following filter file. The filter file causes one study, two series, and three individual instances to be de-identified:

/studies/1.123.456.789
/studies/1.666.333.111/series/123.456\n
/studies/1.666.333.111/series/567.890\n
/studies/1.888.999.222/series/123.456/instances/111\n
/studies/1.888.999.222/series/123.456/instances/222\n
/studies/1.888.999.222/series/123.456/instances/333\n
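
Before uploading a filter file, it can help to check it locally. The following sketch counts how many studies, series, and instances a filter file selects and rejects lines that do not match the path format. The function name summarize_filter is hypothetical, and the pattern assumes that UIDs contain only digits and dots. In an actual file, each \n shown above is a real newline character rather than literal text.

```python
import re
from collections import Counter

# Matches /studies/STUDY_UID[/series/SERIES_UID[/instances/INSTANCE_UID]].
# Assumption: UIDs consist of digits and dots only.
_FILTER_LINE = re.compile(
    r"^/studies/(?P<study>[0-9.]+)"
    r"(?:/series/(?P<series>[0-9.]+)"
    r"(?:/instances/(?P<instance>[0-9.]+))?)?$"
)


def summarize_filter(text: str) -> Counter:
    """Count the study-, series-, and instance-level lines in a filter file."""
    counts = Counter()
    # splitlines() accepts both \n and \r\n line endings.
    for number, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue  # ignore blank lines
        match = _FILTER_LINE.match(line)
        if match is None:
            raise ValueError(f"line {number} is not a valid filter path: {line!r}")
        if match.group("instance"):
            counts["instance"] += 1
        elif match.group("series"):
            counts["series"] += 1
        else:
            counts["study"] += 1
    return counts
```

For the filter file above, the counts would be one study, two series, and three instances.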

Creating a filter file using BigQuery

You typically create a filter file by first exporting the metadata from a DICOM store to BigQuery. This lets you use BigQuery to view the study, series, and instance UIDs of the DICOM data in your DICOM store. You can then do the following:

  1. Query for the study, series, and instance UIDs you are interested in. For example, after exporting the metadata to BigQuery, you could run the following query to concatenate the study, series, and instance UIDs into a format that's compatible with the filter file requirements:

    SELECT CONCAT
      ('/studies/', StudyInstanceUID, '/series/', SeriesInstanceUID, '/instances/', SOPInstanceUID)
    FROM
      [PROJECT_ID:BIGQUERY_DATASET.BIGQUERY_TABLE]
    
  2. If the query returns a large result set, you can materialize a new table by saving the query results to a destination table in BigQuery.

  3. After saving the query results to the destination table, you can save the contents of the destination table to a file and export it to Cloud Storage. For steps on how to do so, see Exporting table data. The exported file is your filter file. You will use the location of the filter file in Cloud Storage when specifying the filter in the de-identify operation.
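
If you prefer to assemble the filter file in your own code rather than in the query, the same concatenation can be sketched in Python. This example assumes the rows are already available as dictionaries keyed by the StudyInstanceUID, SeriesInstanceUID, and SOPInstanceUID columns used in the query above. The function name rows_to_filter_text is hypothetical, and how you fetch the rows is left out.

```python
from typing import Iterable, Mapping


def rows_to_filter_text(rows: Iterable[Mapping[str, str]]) -> str:
    """Build filter-file content from rows of exported DICOM metadata.

    Hypothetical helper: each row must provide the StudyInstanceUID,
    SeriesInstanceUID, and SOPInstanceUID columns.
    """
    lines = []
    seen = set()
    for row in rows:
        line = (
            f"/studies/{row['StudyInstanceUID']}"
            f"/series/{row['SeriesInstanceUID']}"
            f"/instances/{row['SOPInstanceUID']}"
        )
        if line not in seen:  # skip duplicate rows, keep first-seen order
            seen.add(line)
            lines.append(line)
    # Terminate every line, including the last one, with a newline.
    return "".join(line + "\n" for line in lines)
```

The returned text can be written to a local file and uploaded to Cloud Storage as the filter file.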

Creating a filter file manually

You can create a filter file with custom content and upload it to a Cloud Storage bucket. You will use the location of the filter file in Cloud Storage when specifying the filter in the de-identify operation. The following sample shows how to upload a filter file to a Cloud Storage bucket using the gsutil cp command:

gsutil cp PATH/TO/FILTER_FILE gs://BUCKET/DIRECTORY

For example:

gsutil cp /home/user/Desktop/filters.txt gs://my-bucket/my-directory

Using a filter

After you have your filter file configured, you can pass it in as a value to the resourcePathsGcsUri field in the filterConfig object.
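
As a rough illustration of where the field sits in the request, the following sketch builds the JSON request body in Python. The function name build_deidentify_body is hypothetical. The field names and values mirror the request samples on this page, and sending the request is left out.

```python
import json


def build_deidentify_body(destination_store: str, filter_uri: str) -> str:
    """Return a JSON request body for dicomStores.deidentify with a filter.

    Hypothetical helper: it only assembles the body and does not call the API.
    """
    if not filter_uri.startswith("gs://"):
        # The filter file must be in a Cloud Storage bucket.
        raise ValueError("filter_uri must look like gs://BUCKET/PATH/TO/FILE")

    body = {
        "destinationStore": destination_store,
        "config": {
            "dicom": {"filterProfile": "DEIDENTIFY_TAG_CONTENTS"},
            "image": {"textRedactionMode": "REDACT_ALL_TEXT"},
        },
        # The filter file location goes in filterConfig.resourcePathsGcsUri.
        "filterConfig": {"resourcePathsGcsUri": filter_uri},
    }
    return json.dumps(body, indent=2)
```

The check on the gs:// prefix catches a local path early, before a request is made.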

The following sample expands on De-identifying data at the DICOM store level by providing a filter file in Cloud Storage that determines which DICOM resources are de-identified.

curl command

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID',
      'config': {
        'dicom': {
          'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
        },
        'image': {
          'textRedactionMode': 'REDACT_ALL_TEXT'
        }
      },
      'filterConfig': {
        'resourcePathsGcsUri': 'gs://BUCKET/PATH/TO/FILE'
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation ID. You can use the Operation get method to track the status of the operation:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyDicomStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/dicomStores/DESTINATION_DICOM_STORE_ID',
    'config': {
      'dicom': {
        'filterProfile': 'DEIDENTIFY_TAG_CONTENTS'
      },
      'image': {
        'textRedactionMode': 'REDACT_ALL_TEXT'
      }
    },
    'filterConfig': {
      'resourcePathsGcsUri': 'gs://BUCKET/PATH/TO/FILE'
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/dicomStores/SOURCE_DICOM_STORE_ID:deidentify" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation ID. You can use the Operation get method to track the status of the operation:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyDicomStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}
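
Because de-identification runs as a long-running operation, you might check its status repeatedly until "done" is true. The following is a minimal polling sketch under stated assumptions. The function name wait_for_operation and its parameters are hypothetical, and get_operation stands in for whatever code performs the Operation get request and returns the parsed JSON. The sketch also assumes that a failed operation carries an "error" field, as standard Google long-running operations do.

```python
import time
from typing import Callable, Dict


def wait_for_operation(get_operation: Callable[[], Dict],
                       interval_seconds: float = 5.0,
                       max_attempts: int = 60,
                       sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll until the operation reports "done": true, then return it.

    get_operation is a caller-supplied function (hypothetical here) that
    fetches the operation and returns the parsed JSON response.
    """
    for attempt in range(max_attempts):
        operation = get_operation()
        if operation.get("done"):
            if "error" in operation:
                # Assumption: failures are reported in an "error" field.
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next status check
    raise TimeoutError(f"operation not done after {max_attempts} attempts")
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of any particular HTTP client and makes it straightforward to test without network access.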

Troubleshooting DICOM de-identification operations

If errors occur during a DICOM de-identification operation, the errors are logged to Stackdriver Logging. For more information, see Viewing error logs in Stackdriver Logging.