De-identifying FHIR data

This page explains how to de-identify sensitive data in FHIR resources using the Cloud Healthcare API at the following levels:

  • Dataset level
  • FHIR store level

This page also explains how to apply filters when de-identifying data at the FHIR store level.

De-identification overview

Dataset level de-identification

To de-identify FHIR data at the dataset level, call the datasets.deidentify operation. The de-identification API call has the following components:

  • The source dataset: A dataset containing FHIR stores with one or more resources that have sensitive data.
  • What to de-identify: Configuration parameters that specify how to process the dataset. Specify these parameters in a FhirConfig and/or TextConfig inside the DeidentifyConfig object.
  • The destination dataset: De-identification does not impact the original dataset or its data. Instead, de-identified copies of the original data are written to a new dataset, called the destination dataset.

The majority of samples in this guide show how to de-identify FHIR data at the dataset level.
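
The three components above map onto a single HTTP request. The following is a minimal Python sketch, not an official client, that assembles the URL and JSON body for a datasets.deidentify call in the same shape as the curl samples in this guide. The helper names and the example project, region, and dataset IDs are hypothetical, and sending the request (authentication, HTTP client) is left out.

```python
import json

API_ROOT = "https://healthcare.googleapis.com/v1beta1"


def dataset_name(project_id, region, dataset_id):
    """Return the full resource name of a dataset."""
    return f"projects/{project_id}/locations/{region}/datasets/{dataset_id}"


def build_dataset_deidentify_request(project_id, region, source_id,
                                     destination_id, fhir_config=None):
    """Return (url, body) for a datasets.deidentify call.

    The source dataset goes in the URL, the destination dataset and the
    de-identification configuration go in the body.
    """
    url = f"{API_ROOT}/{dataset_name(project_id, region, source_id)}:deidentify"
    body = {
        "destinationDataset": dataset_name(project_id, region, destination_id),
        "config": {"fhir": fhir_config or {}},
    }
    return url, body


# Hypothetical identifiers, used only for illustration.
url, body = build_dataset_deidentify_request(
    "my-project", "us-central1", "source-dataset", "destination-dataset"
)
print(url)
print(json.dumps(body, indent=2))
```

Keeping the body as a Python dictionary and serializing it with json.dumps avoids hand-written quoting in the request payload.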

FHIR store level de-identification

De-identifying FHIR data at the FHIR store level gives you more control over which FHIR data is de-identified.

To de-identify FHIR data in a FHIR store, call the fhirStores.deidentify method. The de-identification API call has the following components:

  • The source FHIR store: A FHIR store containing one or more resources that have sensitive data.
  • What to de-identify: Configuration parameters that specify how to process the FHIR store. Specify these parameters in a FhirConfig and/or TextConfig inside the DeidentifyConfig object.
  • The destination FHIR store: De-identification does not impact the original FHIR store or its data. Instead, de-identified copies of the original data are written to a new or existing FHIR store.

For an example of how to de-identify FHIR data at the FHIR store level, see De-identifying data at the FHIR store level.

Filters

You can de-identify a subset of data in a FHIR store by specifying a list of FHIR resource IDs in the fhirStores.deidentify request. For an example, see De-identifying a subset of a FHIR store.
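
As a rough illustration, the following Python sketch assembles a fhirStores.deidentify request that targets a subset of resources. The body field names destinationStore and resourceFilter, the nested resources list, and the Patient/ID form of each entry are assumptions that are not shown in this guide; check them against the fhirStores.deidentify reference before relying on them. The helper name and placeholder IDs are hypothetical.

```python
API_ROOT = "https://healthcare.googleapis.com/v1beta1"


def build_fhir_store_deidentify_request(source_store, destination_store,
                                        resource_ids=None):
    """Return (url, body) for a fhirStores.deidentify call.

    source_store and destination_store are full FHIR store resource names.
    """
    body = {
        "destinationStore": destination_store,  # assumed field name
        "config": {"fhir": {}},
    }
    if resource_ids:
        # Assumed filter shape: only the listed resources are de-identified.
        body["resourceFilter"] = {"resources": {"resources": list(resource_ids)}}
    return f"{API_ROOT}/{source_store}:deidentify", body


dataset = "projects/PROJECT_ID/locations/REGION/datasets/DATASET_ID"
url, body = build_fhir_store_deidentify_request(
    f"{dataset}/fhirStores/SOURCE_FHIR_STORE_ID",
    f"{dataset}/fhirStores/DESTINATION_FHIR_STORE_ID",
    resource_ids=["Patient/PATIENT_ID"],
)
print(url)
print(body)
```

Omitting resource_ids leaves the filter out of the body, which corresponds to de-identifying the whole store.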

Sample FHIR resource used in this guide

The samples in this guide use a Patient resource in a FHIR store. The Patient has the properties shown in the following sample. The id value is generated by the server. If you create the Patient resource in your own FHIR store, the id value returned differs from the value shown in the sample Patient.

{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "CA",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "name": [
    {
      "family": "Smith",
      "given": [
        "Darcy"
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "status": "generated",
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>"
  }
}

Default FHIR data de-identification

You can de-identify FHIR data using a "default" method that redacts common protected health information (PHI) in the resources in a FHIR store.

When using the default method, use an empty FhirConfig inside the DeidentifyConfig object. The default method de-identifies the infoTypes listed in Default FHIR infoTypes.

The following samples show how to de-identify the Patient resource using the FHIR default method.

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'fhir': {}
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/fhir+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732"
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "",
      "district": "",
      "line": [
        ""
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "",
      "state": "CA",
      "text": "",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1981-02-24",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMA"
  },
  "name": [
    {
      "family": "",
      "given": [
        ""
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: [PERSON_NAME][PERSON_NAME][PERSON_NAME]</p><p><b>DateOfBirth</b>: 1981-02-24</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}
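
The status check in the flow above can be repeated until the operation reports "done": true. The following Python sketch shows one way to structure that loop. The function name, the polling interval, and the attempt limit are hypothetical choices, and the operation is fetched through a caller-supplied function so that the sketch stays independent of any HTTP client.

```python
import time


def wait_for_operation(get_operation, interval_seconds=5.0, max_attempts=60,
                       sleep=time.sleep):
    """Poll until the operation reports done, then return it.

    get_operation is any zero-argument callable that returns the parsed
    JSON of the Operation get response as a dictionary.
    """
    for _ in range(max_attempts):
        operation = get_operation()
        if operation.get("done"):
            # A finished long-running operation carries either an error
            # or a response.
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        sleep(interval_seconds)
    raise TimeoutError("Operation did not finish within the allowed attempts")


# Simulated responses standing in for two successive status checks.
responses = iter([
    {"name": "operations/OPERATION_ID"},
    {"name": "operations/OPERATION_ID", "done": True,
     "response": {"successStoreCount": "1", "successResourceCount": "1"}},
])
result = wait_for_operation(lambda: next(responses), interval_seconds=0)
print(result["response"])
```

In real use, get_operation would issue the GET request shown in the samples and return the decoded JSON.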

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'fhir': {}
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-RestMethod `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/fhir+json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732" | ConvertTo-Json
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "",
      "district": "",
      "line": [
        ""
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "",
      "state": "CA",
      "text": "",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1981-02-24",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMA"
  },
  "name": [
    {
      "family": "",
      "given": [
        ""
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: [PERSON_NAME][PERSON_NAME][PERSON_NAME]</p><p><b>DateOfBirth</b>: 1981-02-24</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

You can see that the following values were transformed to de-identify the resource:

  • A new value was provided in the birthDate field using a date shifting technique with a 100-day differential. In this sample, 1980-12-05 became 1981-02-24, a shift of 81 days.
  • The value in address.city was redacted.
  • The value in address.district was redacted.
  • The value in address.line was redacted.
  • The value in address.postalCode was redacted.
  • The value in address.text was redacted.
  • The value in name.family was redacted.
  • The value in name.given was redacted.
  • The free text in the text.div field was modified to replace the patient's name with its infoType, [PERSON_NAME]. The patient's birthdate value was transformed in the same way as the value in the birthDate field was transformed.
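
The size of the date shift in this sample can be checked with a few lines of Python. This is only arithmetic on the two birthDate values shown above; it says nothing about how the service chooses the shift.

```python
from datetime import date, timedelta

# birthDate before and after de-identification, taken from the samples above.
original = date.fromisoformat("1980-12-05")
shifted = date.fromisoformat("1981-02-24")

shift_days = (shifted - original).days
print(shift_days)  # 81

# Applying the same offset to the original date reproduces the shifted value,
# which is also the date that appears in the text.div narrative.
assert original + timedelta(days=shift_days) == shifted
```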

De-identifying specific FHIR paths

To specify which FHIR paths to de-identify and how to transform them, configure the fieldMetadataList in the FhirConfig object.

Inside the fieldMetadataList, you specify a period-separated list of field names or FHIR resource type names in a paths list. Next, you specify an Action value to apply to everything listed in paths. See the Action documentation for the possible values.

See the FHIRPath documentation for information on how to format the possible paths values. The Cloud Healthcare API does not support all features of FHIRPath.
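
A fieldMetadataList is a list of entries, each pairing a paths list with one action. The following Python sketch builds that structure in the shape used by the samples in this section. The helper names are hypothetical, and the action value is passed through unchecked because the set of valid values is defined in the Action documentation rather than here.

```python
def field_metadata(paths, action):
    """Return one fieldMetadataList entry: a list of paths and one action."""
    paths = list(paths)
    if not paths:
        raise ValueError("At least one path is required")
    return {"paths": paths, "action": action}


def fhir_config(*entries):
    """Return a FhirConfig dictionary holding the given entries."""
    return {"fieldMetadataList": list(entries)}


config = fhir_config(field_metadata(["Patient.HumanName"], "TRANSFORM"))
print(config)
```

Each entry applies its single action to every path it lists, so paths that need different actions belong in separate entries.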

The following samples show how to configure the Patient resource de-identification with the following criteria:

  • The TRANSFORM action is applied to the HumanName values of the Patient resource, which redacts them. For the sample patient, the HumanName values are "family": "Smith" and "given": [ "Darcy" ].

No other values are provided in the paths list inside the fieldMetadataList, so the remaining data is unchanged.

The following samples show how to de-identify the Patient resource's HumanName values:

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'fhir': {
          'fieldMetadataList': [
            {
              'paths': [
                'Patient.HumanName'
              ],
              'action': 'TRANSFORM'
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/fhir+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732"
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "CA",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "",
      "given": [
        ""
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'fhir': {
        'fieldMetadataList': [
          {
            'paths': [
              'Patient.HumanName'
            ],
            'action': 'TRANSFORM'
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-RestMethod `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/fhir+json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732" | ConvertTo-Json
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "CA",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "",
      "given": [
        ""
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

You can see that the following values were transformed to de-identify the resource:

  • The value in name.family was redacted.
  • The value in name.given was redacted.

However, unlike the sample in default FHIR de-identification, which transformed common PHI, the patient's address, birthDate, and the free text in text.div were not transformed because they were not added to the paths list in fieldMetadataList.
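
One way to confirm which fields a de-identification run changed is to compare the source resource with its de-identified copy. The following Python sketch walks two parsed JSON documents and lists the paths whose values differ. The function name is hypothetical, and the two dictionaries are trimmed versions of the sample Patient, with server-added fields such as meta left out.

```python
def changed_paths(before, after, prefix=""):
    """Return the paths at which two parsed JSON values differ."""
    changes = []
    if isinstance(before, dict) and isinstance(after, dict):
        for key in sorted(set(before) | set(after)):
            child = f"{prefix}.{key}" if prefix else key
            changes += changed_paths(before.get(key), after.get(key), child)
    elif (isinstance(before, list) and isinstance(after, list)
          and len(before) == len(after)):
        for index, (old, new) in enumerate(zip(before, after)):
            changes += changed_paths(old, new, f"{prefix}[{index}]")
    elif before != after:
        changes.append(prefix)
    return changes


original = {
    "birthDate": "1980-12-05",
    "name": [{"family": "Smith", "given": ["Darcy"], "use": "official"}],
}
deidentified = {
    "birthDate": "1980-12-05",
    "name": [{"family": "", "given": [""], "use": "official"}],
}
print(changed_paths(original, deidentified))
# ['name[0].family', 'name[0].given[0]']
```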

Using infoTypes and primitive transformations with FHIR resources

The Cloud Healthcare API can use information types (infoTypes) to define the data it scans when performing de-identification on FHIR resources. An infoType is a type of sensitive data, such as a patient name, email address, telephone number, identification number, or credit card number. The infoTypes used in the Cloud Healthcare API de-identification operation include those found in Cloud Data Loss Prevention.

Primitive transformations are rules for transforming an input value.

Default FHIR infoTypes

The default infoTypes used when de-identifying FHIR data are:

  • AGE
  • CREDIT_CARD_NUMBER
  • DATE
  • EMAIL_ADDRESS
  • IP_ADDRESS
  • LOCATION
  • MAC_ADDRESS
  • PASSPORT
  • PERSON_NAME
  • PHONE_NUMBER
  • SWIFT_CODE
  • US_DRIVERS_LICENSE_NUMBER
  • US_SOCIAL_SECURITY_NUMBER
  • US_VEHICLE_IDENTIFICATION_NUMBER
  • US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER

Primitive transformation options

The Cloud Healthcare API primitive transformation options include:

  • RedactConfig: Redacts a value by removing it.
  • CharacterMaskConfig: Masks a string either fully or partially by replacing input characters with a specified fixed character.
  • DateShiftConfig: Shifts dates by a random number of days, with the option to be consistent for the same context.
  • CryptoHashConfig: Uses SHA-256 to replace input values with a base64-encoded representation of a hashed output string generated using a given data encryption key.
  • ReplaceWithInfoTypeConfig: Replaces an input value with the name of its infoType.
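
To make the effect of these options concrete, the following Python sketch imitates four of them locally on a single string. These functions are illustrations only and are not the service's implementation. In particular, using HMAC with SHA-256 for the keyed hash is an assumption, and the key shown is a throwaway example value.

```python
import base64
import hashlib
import hmac


def redact(value):
    """Imitate RedactConfig: remove the value."""
    return ""


def character_mask(value, masking_character="*"):
    """Imitate CharacterMaskConfig with full masking."""
    return masking_character * len(value)


def replace_with_info_type(value, info_type):
    """Imitate ReplaceWithInfoTypeConfig: substitute the infoType name."""
    return f"[{info_type}]"


def crypto_hash(value, key):
    """Imitate CryptoHashConfig: keyed SHA-256, base64-encoded.

    HMAC is an assumption made for this sketch.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


print(repr(redact("Smith")))
print(character_mask("Smith"))
print(replace_with_info_type("Smith", "PERSON_NAME"))
print(crypto_hash("Smith", b"example-key-not-for-production"))
```

Redaction and masking discard the original value, while the keyed hash maps equal inputs to equal outputs, which keeps records linkable without exposing the value.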

Specifying configurations in TextConfig

InfoTypes and primitive transformations are specified within an InfoTypeTransformation, which is an object inside the TextConfig. Specify infoTypes in the infoTypes array as comma-separated values.

Specifying an infoType is optional. If you do not specify at least one infoType, the transformation applies to all built-in infoTypes in the data.

If you specify any infoTypes in InfoTypeTransformation, specify at least one primitive transformation.
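
The rules above can be captured in a small builder. The following Python sketch creates one transformations entry for a TextConfig and always requires a primitive transformation, so an entry with infoTypes but no transformation cannot be produced. The helper names are hypothetical. The field names redactConfig and characterMaskConfig appear in the samples in this guide; the remaining three are assumed to follow the same naming pattern.

```python
PRIMITIVES = (
    "redactConfig",
    "characterMaskConfig",
    "dateShiftConfig",            # assumed field name
    "cryptoHashConfig",           # assumed field name
    "replaceWithInfoTypeConfig",  # assumed field name
)


def info_type_transformation(primitive_name, primitive_config=None,
                             info_types=None):
    """Return one entry for the TextConfig transformations list."""
    if primitive_name not in PRIMITIVES:
        raise ValueError(f"Unknown primitive transformation: {primitive_name}")
    entry = {primitive_name: primitive_config or {}}
    if info_types:
        # Omitting infoTypes applies the transformation to all built-in
        # infoTypes, as described above.
        entry["infoTypes"] = list(info_types)
    return entry


def text_config(*transformations):
    """Return a TextConfig dictionary holding the given entries."""
    return {"transformations": list(transformations)}


config = text_config(
    info_type_transformation("redactConfig", info_types=["US_STATE"])
)
print(config)
```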

The following sections show how to use the primitive transformations available in InfoTypeTransformation along with infoTypes to customize how FHIR resources are de-identified.

RedactConfig

Specifying redactConfig redacts a given value by removing it completely. The redactConfig message has no arguments; specifying it enables transformation.

The following samples show how to redact the Patient resource's US state. This task is accomplished by setting the US_STATE infoType with the Patient.address.state path and the redactConfig transform. After sending the de-identification request to the Cloud Healthcare API, the Patient.address.state value is redacted.

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'fhir': {
          'fieldMetadataList': [
            {
              'paths': [
                'Patient.address.state'
              ],
              'action': 'TRANSFORM'
            }
          ]
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [
                'US_STATE'
              ],
              'redactConfig': {}
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/fhir+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732"
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "Smith",
      "given": [
        "Darcy"
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'fhir': {
        'fieldMetadataList': [
          {
            'paths': [
              'Patient.address.state'
            ],
            'action': 'TRANSFORM'
          }
        ]
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [
              'US_STATE'
            ],
            'redactConfig': {}
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-RestMethod `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/fhir+json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732" | ConvertTo-Json
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "Smith",
      "given": [
        "Darcy"
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

The output shows that the value in address.state has been removed. This contrasts with the sample in Default FHIR data de-identification, where the default configuration left address.state unchanged.

CharacterMaskConfig

Specifying characterMaskConfig replaces strings that correspond to the given infoTypes with a specified fixed character. For example, rather than redacting a patient's name or transforming it using cryptographic hashing, you can replace the name with a series of asterisks (*). Specify the fixed character as a value to the maskingCharacter field.

The following samples show how to expand on the sample used in De-identifying specific FHIR paths, but they now include setting the PERSON_NAME infoType with the characterMaskConfig transform. No fixed character is provided, so the masking defaults to using an asterisk. After sending the de-identification request to the Cloud Healthcare API, the values in name.family and name.given are replaced with asterisks.

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'fhir': {
          'fieldMetadataList': [
            {
              'paths': [
                'Patient.HumanName'
              ],
              'action': 'TRANSFORM'
            }
          ]
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [
                'PERSON_NAME'
              ],
              'characterMaskConfig': {
                'maskingCharacter': ''
              }
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
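The status check above can be repeated in a loop until the operation finishes. The following is a minimal Python sketch, not an official client: wait_for_operation and its fetch_operation parameter are hypothetical names, and the caller is assumed to supply a function that performs the authenticated GET request and returns the parsed JSON response.

```python
import time


def wait_for_operation(fetch_operation, interval_seconds=5.0, max_attempts=60):
    """Poll until the operation reports "done": true, then return it.

    fetch_operation is a caller-supplied callable (hypothetical) that issues
    the Operation get request and returns the response as a dict.
    """
    for _ in range(max_attempts):
        operation = fetch_operation()
        if operation.get("done"):
            return operation
        time.sleep(interval_seconds)
    raise TimeoutError("operation did not finish within the allotted attempts")


# Stand-in for the real HTTP call: reports "done" on the third poll.
_responses = iter([
    {"name": "operations/OPERATION_ID"},
    {"name": "operations/OPERATION_ID"},
    {"name": "operations/OPERATION_ID", "done": True},
])
result = wait_for_operation(lambda: next(_responses), interval_seconds=0)
print(result["done"])
```

Passing the fetch function in as a parameter keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.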
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/fhir+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732"
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "CA",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "*****",
      "given": [
        "*****"
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'fhir': {
        'fieldMetadataList': [
          {
            'paths': [
              'Patient.HumanName'
            ],
            'action': 'TRANSFORM'
          }
        ]
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [
              'PERSON_NAME'
            ],
            'characterMaskConfig': {
              'maskingCharacter': ''
            }
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-RestMethod `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/fhir+json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732" | ConvertTo-Json
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "CA",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "*****",
      "given": [
        "*****"
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

The output shows that the values in name.family and name.given have been replaced with asterisks. This is in contrast to the sample in De-identifying specific FHIR paths where the values in name.family and name.given were redacted.

DateShiftConfig

The Cloud Healthcare API can transform dates by shifting them within a preset range. To keep date transformations consistent across de-identification runs, use DateShiftConfig and specify an AES 128/192/256-bit base64-encoded key. The Cloud Healthcare API uses this key to compute the amount by which dates, such as a patient's birthdate, are shifted within a 100-day differential.

If you don't provide a key, the Cloud Healthcare API generates its own key each time the de-identification operation runs on date values. This can result in inconsistent date outputs between runs.
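To make the role of the key concrete, the following Python sketch derives a repeatable shift from a key and a patient ID. It illustrates the general idea only: this page doesn't describe how the Cloud Healthcare API derives the shift, and the use of HMAC-SHA256, the shift_date helper, and its parameters are assumptions made for this example.

```python
import base64
import hashlib
import hmac
from datetime import date, timedelta

MAX_SHIFT_DAYS = 100  # this page describes a 100-day differential


def shift_date(original, patient_id, crypto_key_b64):
    """Shift a date by a key-dependent offset in [-100, +100] days.

    Illustrative only: HMAC-SHA256 is an assumed stand-in, not the API's
    documented algorithm.
    """
    key = base64.b64decode(crypto_key_b64)
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).digest()
    # Map the first 4 bytes of the digest onto the range -100..100.
    offset = int.from_bytes(digest[:4], "big") % (2 * MAX_SHIFT_DAYS + 1) - MAX_SHIFT_DAYS
    return original + timedelta(days=offset)


key = "U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU="
first = shift_date(date(1980, 12, 5), "PATIENT_ID", key)
second = shift_date(date(1980, 12, 5), "PATIENT_ID", key)
print(first == second)  # the same key and ID give the same shifted date
```

Because the offset depends only on the key and the ID, repeating the run with the same key reproduces the same date, whereas a freshly generated key would generally produce a different one.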

The following samples show how to set the DATE infoType with the DateShiftConfig transform on the Patient.birthDate path. After sending the de-identification request to the Cloud Healthcare API, the birthDate value will shift within plus or minus 100 days of the original birthdate, 1980-12-05.

The cryptoKey value provided in the example, U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU=, is an AES-encrypted, 256-bit, base64-encoded key generated using the following command. When prompted, provide a password of your choosing:

echo -n "test" | openssl enc -e -aes-256-ofb -a -salt
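As an alternative sketch, a random 256-bit key can be generated and base64-encoded in Python. This assumes that any 128-, 192-, or 256-bit value in base64 form is acceptable as a cryptoKey, as the description above suggests; generate_crypto_key and is_valid_key_length are hypothetical helper names.

```python
import base64
import secrets


def generate_crypto_key(num_bytes=32):
    """Return a random key of num_bytes bytes, base64-encoded (32 bytes = 256 bits)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def is_valid_key_length(crypto_key_b64):
    """Check that the decoded key is 128, 192, or 256 bits long."""
    return len(base64.b64decode(crypto_key_b64)) in (16, 24, 32)


new_key = generate_crypto_key()
print(is_valid_key_length(new_key))
# The sample key used on this page decodes to 32 bytes as well.
print(is_valid_key_length("U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU="))
```

Store the generated key somewhere safe: the same key must be supplied on later runs to get consistent output.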

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'fhir': {
          'fieldMetadataList': [
            {
              'paths': [
                'Patient.birthDate'
              ],
              'action': 'TRANSFORM'
            }
          ]
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [
                'DATE'
              ],
              'dateShiftConfig': {
                'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
              }
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/fhir+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732"
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1981-02-19",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "Smith",
      "given": [
        "Darcy"
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'fhir': {
        'fieldMetadataList': [
          {
            'paths': [
              'Patient.birthDate'
            ],
            'action': 'TRANSFORM'
          }
        ]
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [
              'DATE'
            ],
            'dateShiftConfig': {
              'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
            }
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-RestMethod `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/fhir+json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732" | ConvertTo-Json
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1981-02-19",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "Smith",
      "given": [
        "Darcy"
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

The output shows that the value in birthDate has been transformed to a new value of 1981-02-19. This transformation occurred as a result of combining the 100-day differential with the Patient ID and the provided cryptoKey value. The new birthDate value is consistent for this Patient between de-identification runs as long as the same cryptoKey is provided.
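As a quick check of the sample output, the difference between the original and shifted birth dates can be computed directly. The variable names here are illustrative.

```python
from datetime import date

original = date(1980, 12, 5)
shifted = date(1981, 2, 19)
shift_days = (shifted - original).days
print(shift_days)  # 76, which is within the 100-day differential
```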

CryptoHashConfig

The Cloud Healthcare API can transform data by replacing values with cryptographic hashes (also called surrogate values). To do so, specify a cryptoHashConfig message.

You can leave the cryptoHashConfig empty, or you can provide it with an AES 128/192/256-bit base64-encoded key. Supplying the same key on each run generates surrogate values that are consistent among de-identification runs. If you do not provide a key, the Cloud Healthcare API generates a new key each time the operation runs. Using a different key yields different surrogate values.
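The effect of the key on surrogate values can be sketched with a keyed hash. This is an illustration under stated assumptions, not the API's documented algorithm: HMAC-SHA256 and the surrogate helper are choices made for this example only.

```python
import base64
import hashlib
import hmac


def surrogate(value, crypto_key_b64):
    """Return a base64-encoded keyed hash of value.

    Illustrative only: HMAC-SHA256 is an assumption, not the API's
    documented algorithm.
    """
    key = base64.b64decode(crypto_key_b64)
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


key_a = "U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU="
key_b = base64.b64encode(bytes(32)).decode("ascii")  # a different, all-zero key

print(surrogate("Smith", key_a) == surrogate("Smith", key_a))  # same key, same surrogate
print(surrogate("Smith", key_a) == surrogate("Smith", key_b))  # different key, different surrogate
```

A keyed hash maps equal inputs to equal outputs, so records that share a value keep sharing a surrogate after de-identification as long as the key stays the same.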

The following samples expand on the sample used in De-identifying specific FHIR paths by applying the cryptoHashConfig transform to the PERSON_NAME infoType on the Patient.HumanName path. After you send the de-identification request to the Cloud Healthcare API, the name.family and name.given values are replaced with surrogate values.

The cryptoKey value provided in the example, U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU=, is an AES-encrypted, 256-bit, base64-encoded key generated using the following command. When prompted, provide a password of your choosing:

echo -n "test" | openssl enc -e -aes-256-ofb -a -salt

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
      'config': {
        'fhir': {
          'fieldMetadataList': [
            {
              'paths': [
                'Patient.HumanName'
              ],
              'action': 'TRANSFORM'
            }
          ]
        },
        'text': {
          'transformations': [
            {
              'infoTypes': [
                'PERSON_NAME'
              ],
              'cryptoHashConfig': {
                'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
              }
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
curl -X GET \
     -H "Authorization: Bearer "$(gcloud auth print-access-token) \
     -H "Content-Type: application/fhir+json; charset=utf-8" \
     "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732"
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "CA",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "NlVBV12Hhb5DD8WNqlTpXboFxzlUSlqAmYDet/jIViQ=",
      "given": [
        "FSH4D/IGb80a1rS0L0kqfC3DCDt6//17VPhIkOzH2pk="
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID',
    'config': {
      'fhir': {
        'fieldMetadataList': [
          {
            'paths': [
              'Patient.HumanName'
            ],
            'action': 'TRANSFORM'
          }
        ]
      },
      'text': {
        'transformations': [
          {
            'infoTypes': [
              'PERSON_NAME'
            ],
            'cryptoHashConfig': {
              'cryptoKey': 'U2FsdGVkX19bS2oZsdbK9X5zi2utBn22uY+I2Vo0zOU='
            }
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}
The response contains an operation name. You can use the Operation get method to track the status of the operation:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.
200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.dataset.DatasetService.DeidentifyDataset",
    "createTime": "2018-01-01T00:00:00Z",
    "endTime": "2018-01-01T00:00:00Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.deidentify.DeidentifySummary",
    "successStoreCount": "1",
    "successResourceCount": "1"
  }
}
Next, using the Patient ID, you can get the details for the Patient resource in the new destination dataset:
$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-RestMethod `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/fhir+json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/FHIR_STORE_ID/fhir/Patient/r77433dd-dkeuc-633743nfd-383nfdsjds732" | ConvertTo-Json
The server returns the following response:
200 OK
{
  "address": [
    {
      "city": "Anycity",
      "district": "Anydistrict",
      "line": [
        "123 Main Street"
      ],
      "period": {
        "start": "1990-12-05"
      },
      "postalCode": "12345",
      "state": "CA",
      "text": "123 Main Street Anycity, Anydistrict, CA 12345",
      "type": "both",
      "use": "home"
    }
  ],
  "birthDate": "1980-12-05",
  "gender": "female",
  "id": "r77433dd-dkeuc-633743nfd-383nfdsjds732",
  "meta": {
    "lastUpdated": "2018-01-01T00:00:00+00:00",
    "versionId": "MTU0MDU4NTcxNjI2MTUxNDAwMAA"
  },
  "name": [
    {
      "family": "NlVBV12Hhb5DD8WNqlTpXboFxzlUSlqAmYDet/jIViQ=",
      "given": [
        "FSH4D/IGb80a1rS0L0kqfC3DCDt6//17VPhIkOzH2pk="
      ],
      "use": "official"
    }
  ],
  "resourceType": "Patient",
  "text": {
    "div": "<div><p><b>Patient</b></p><p><b>Name</b>: Smith, Darcy</p><p><b>DateOfBirth</b>: 1980-12-05</p><p><b>Gender</b>: Female</p></div>",
    "status": "generated"
  }
}

The output shows that the values for name.family and name.given have been transformed using cryptographic hashing. This transformation occurred as a result of combining the Patient ID and the provided cryptoKey value. The new name.family and name.given values are consistent for this Patient between de-identification runs as long as the same cryptoKey is provided.

De-identifying data at the FHIR store level

The preceding examples show how to de-identify FHIR data at the dataset level. To change a dataset de-identification request to a FHIR store de-identification request, make the following changes:

  • Change destinationDataset in the request body to destinationStore.
  • Add fhirStores/DESTINATION_FHIR_STORE_ID at the end of the value in destinationStore.
  • Add fhirStores/SOURCE_FHIR_STORE_ID when specifying the location of the source data.
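The three changes above can be expressed as a small transformation on a dataset-level request. The following Python sketch is illustrative only; to_store_level_request and its parameters are hypothetical names, and the resource paths are the placeholders used throughout this page.

```python
def to_store_level_request(body, url, source_store_id, destination_store_id):
    """Convert a dataset-level deidentify request into a FHIR store-level one.

    Hypothetical helper that applies the three changes listed above.
    """
    suffix = ":deidentify"
    if not url.endswith(suffix):
        raise ValueError("expected a URL ending in :deidentify")
    store_body = {k: v for k, v in body.items() if k != "destinationDataset"}
    # Changes 1 and 2: rename the field and append the destination FHIR store.
    store_body["destinationStore"] = (
        f"{body['destinationDataset']}/fhirStores/{destination_store_id}"
    )
    # Change 3: address the source FHIR store instead of the source dataset.
    store_url = f"{url[:-len(suffix)]}/fhirStores/{source_store_id}{suffix}"
    return store_body, store_url


dataset_body = {
    "destinationDataset": "projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID",
    "config": {},
}
dataset_url = (
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/"
    "locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"
)
store_body, store_url = to_store_level_request(
    dataset_body, dataset_url, "SOURCE_FHIR_STORE_ID", "DESTINATION_FHIR_STORE_ID"
)
print(store_body["destinationStore"])
print(store_url)
```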

For example:

Dataset level de-identification:

'destinationDataset': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID'
…
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID:deidentify"

FHIR store level de-identification:

'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/DESTINATION_FHIR_STORE_ID'
…
"https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/fhirStores/SOURCE_FHIR_STORE_ID:deidentify"

The following samples expand on De-identifying specific FHIR paths, but de-identification occurs on a single FHIR store and the de-identified data is copied to a new FHIR store. Note that the dataset referenced by DESTINATION_DATASET_ID must already exist.

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/DESTINATION_FHIR_STORE_ID',
      'config': {
        'fhir': {
          'fieldMetadataList': [
            {
              'paths': [
                'Patient.HumanName'
              ],
              'action': 'TRANSFORM'
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/fhirStores/SOURCE_FHIR_STORE_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. You can use the Operation get method to track the status of the operation:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyFhirStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/DESTINATION_FHIR_STORE_ID',
    'config': {
      'fhir': {
        'fieldMetadataList': [
          {
            'paths': [
              'Patient.HumanName'
            ],
            'action': 'TRANSFORM'
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/fhirStores/SOURCE_FHIR_STORE_ID:deidentify" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. You can use the Operation get method to track the status of the operation:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyFhirStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}

De-identifying a subset of a FHIR store

When you de-identify FHIR data at the FHIR store level, you can de-identify a subset of the data by specifying a filter.

The filter takes the form of a list of FHIR resource IDs. You specify the IDs in a Resources object inside of the FhirFilter object.

The following samples expand on De-identifying data at the FHIR store level by providing a list of two FHIR resource IDs (one for a Patient and one for an Observation) that determines which resources are de-identified.
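The nesting of the filter is easy to get wrong by hand, so the following Python sketch builds the request body programmatically. build_filtered_request and its parameters are hypothetical names; the field names (resourceFilter and the two nested resources fields) come from the samples on this page.

```python
import json


def build_filtered_request(destination_store, resource_ids, config):
    """Build a fhirStores.deidentify body limited to the given resource IDs.

    Hypothetical helper: resource_ids are strings such as "Patient/PATIENT_ID".
    """
    return {
        "destinationStore": destination_store,
        "resourceFilter": {"resources": {"resources": list(resource_ids)}},
        "config": config,
    }


filtered_body = build_filtered_request(
    "projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID"
    "/fhirStores/DESTINATION_FHIR_STORE_ID",
    ["Patient/PATIENT_ID", "Observation/OBSERVATION_ID"],
    {
        "fhir": {
            "fieldMetadataList": [
                {"paths": ["Patient.HumanName"], "action": "TRANSFORM"}
            ]
        }
    },
)
print(json.dumps(filtered_body, indent=2))
```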

curl command

curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/DESTINATION_FHIR_STORE_ID',
      'resourceFilter': {
        'resources': {
          'resources': [
            'Patient/PATIENT_ID',
            'Observation/OBSERVATION_ID'
          ]
        }
      },
      'config': {
        'fhir': {
          'fieldMetadataList': [
            {
              'paths': [
                'Patient.HumanName'
              ],
              'action': 'TRANSFORM'
            }
          ]
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/fhirStores/SOURCE_FHIR_STORE_ID:deidentify"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. You can use the Operation get method to track the status of the operation:

curl -X GET \
    -H "Authorization: Bearer "$(gcloud auth print-access-token) \
    -H "Content-Type: application/json; charset=utf-8" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyFhirStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}
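Because the operation is long-running, the get request usually has to be repeated until the response contains "done": true. The following Python sketch shows one way to structure that loop. The function wait_for_operation and its parameters are hypothetical, and get_operation stands for any callable that performs the GET request shown above and returns the parsed JSON as a dictionary. The interval and attempt limit are arbitrary assumptions, not values recommended by the API.

```python
import time


def wait_for_operation(get_operation, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Call get_operation() repeatedly until the result contains "done": true.

    Hypothetical helper: get_operation is assumed to perform the Operation
    get request and return the parsed JSON response as a dict.
    """
    for attempt in range(max_attempts):
        operation = get_operation()
        if operation.get("done"):
            return operation
        # Do not sleep after the final attempt; fail right away instead.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"operation not done after {max_attempts} attempts")


# Stand-in for the HTTP GET: two in-progress responses, then a finished one.
_responses = iter([
    {"name": "OPERATION_NAME"},
    {"name": "OPERATION_NAME"},
    {"name": "OPERATION_NAME", "done": True},
])
result = wait_for_operation(lambda: next(_responses), sleep=lambda seconds: None)
print(result)
```

Passing the request function and the sleep function as arguments keeps the loop independent of any particular HTTP client and lets it be exercised without network access.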

PowerShell

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'destinationStore': 'projects/PROJECT_ID/locations/REGION/datasets/DESTINATION_DATASET_ID/fhirStores/DESTINATION_FHIR_STORE_ID',
    'resourceFilter': {
      'resources': {
        'resources': [
          'Patient/PATIENT_ID',
          'Observation/OBSERVATION_ID'
        ]
      }
    },
    'config': {
      'fhir': {
        'fieldMetadataList': [
          {
            'paths': [
              'Patient.HumanName'
            ],
            'action': 'TRANSFORM'
          }
        ]
      }
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/fhirStores/SOURCE_FHIR_STORE_ID:deidentify" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. You can use the Operation get method to track the status of the operation:

$cred = gcloud auth print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. After the de-identification process finishes, the response contains "done": true.

200 OK
{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/SOURCE_DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1beta1.deidentify.DeidentifyService.DeidentifyFhirStore",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME"
  },
  "done": true,
  "response": {
    "@type": "...",
    "successResourceCount": "SUCCESS_RESOURCE_COUNT"
  }
}
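A finished operation can be read programmatically along the following lines. This Python sketch assumes the usual long-running operation shape, in which a finished operation carries either a response field, as in the output above, or an error field with a message. The helper name summarize_operation is hypothetical, and treating successResourceCount as a string-encoded integer is an assumption based on the quoted placeholder in the sample output.

```python
def summarize_operation(operation):
    """Return a short status string for a parsed operation response (a dict)."""
    if not operation.get("done"):
        return "running"
    if "error" in operation:
        # Assumption: failed operations carry an "error" object with a "message".
        message = operation["error"].get("message", "unknown error")
        return f"failed: {message}"
    count = operation.get("response", {}).get("successResourceCount")
    if count is None:
        return "finished"
    # Assumption: the count is a string-encoded integer in the JSON response.
    return f"finished: {int(count)} resources processed successfully"


print(summarize_operation({"done": True, "response": {"successResourceCount": "2"}}))
```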

Troubleshooting FHIR de-identification operations

If errors occur during a FHIR de-identification operation, they are logged to Cloud Logging (formerly Stackdriver Logging). For more information, see Viewing error logs in Stackdriver Logging.
