Method: datasets.deidentify

Full name: projects.locations.datasets.deidentify

Creates a new dataset containing de-identified data from the source dataset. The metadata field type is OperationMetadata. If the request is successful, the response field type is DeidentifySummary. If errors occur, the details field (google.longrunning.Operation.error.details) has type DeidentifyErrorDetails.

HTTP request

POST https://healthcare.googleapis.com/v1alpha2/{sourceDataset=projects/*/locations/*/datasets/*}:deidentify

The URL uses gRPC Transcoding syntax.

Path parameters

Parameters
sourceDataset

string

Source dataset resource name (for example, projects/{projectId}/locations/{locationId}/datasets/{datasetId}).

Request body

The request body contains data with the following structure:

JSON representation
{
  "destinationDataset": string,
  "config": {
    object (DeidentifyConfig)
  }
}
Fields
destinationDataset

string

The name of the dataset resource to create and write the redacted data to (e.g., projects/{projectId}/locations/{locationId}/datasets/{datasetId}).

  • The destination dataset must not exist.
  • The destination dataset must be in the same project as the source dataset. De-identifying data across multiple projects is not supported.

config

object (DeidentifyConfig)

Deidentify configuration.
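As a rough illustration, the request URL and JSON body described above could be assembled as follows. The project, location, and dataset IDs are placeholders, the helper names are hypothetical, and nothing is sent over the network; authentication is out of scope for this sketch.

```python
import json

BASE_URL = "https://healthcare.googleapis.com/v1alpha2"


def dataset_name(project_id, location_id, dataset_id):
    """Build a dataset resource name in the documented format."""
    return f"projects/{project_id}/locations/{location_id}/datasets/{dataset_id}"


def build_deidentify_request(source, destination, config):
    """Return (url, json_body) for a datasets.deidentify POST request."""
    # The destination must be in the same project as the source dataset.
    if source.split("/")[1] != destination.split("/")[1]:
        raise ValueError("source and destination must be in the same project")
    url = f"{BASE_URL}/{source}:deidentify"
    body = json.dumps({"destinationDataset": destination, "config": config})
    return url, body


# Placeholder IDs, for illustration only.
src = dataset_name("my-project", "us-central1", "raw-data")
dst = dataset_name("my-project", "us-central1", "deidentified-data")
url, body = build_deidentify_request(src, dst, {"text": {"transformations": []}})
print(url)
print(body)
```

The same-project check mirrors the documented restriction so that a mismatch is caught before any request is made.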

Response body

If successful, the response body contains an instance of Operation.
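A minimal sketch of how a client might interpret the returned Operation once it has been parsed from JSON. It relies only on the standard google.longrunning.Operation fields (name, done, error, response); the sample payloads are hand-written, not real API output, and the function name is hypothetical.

```python
def summarize_operation(operation):
    """Describe the state of a long-running Operation given as a dict."""
    name = operation.get("name", "<unnamed>")
    if not operation.get("done", False):
        return f"{name}: still running"
    if "error" in operation:
        err = operation["error"]
        # For this method, err["details"] would carry DeidentifyErrorDetails.
        return f"{name}: failed with code {err.get('code')}: {err.get('message')}"
    # For this method, operation["response"] would carry a DeidentifySummary.
    return f"{name}: succeeded"


# Hand-written sample payloads; "operations/example" is a placeholder name.
pending = {"name": "operations/example", "done": False}
failed = {
    "name": "operations/example",
    "done": True,
    "error": {"code": 3, "message": "invalid config"},
}
finished = {"name": "operations/example", "done": True, "response": {}}

for op in (pending, failed, finished):
    print(summarize_operation(op))
```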

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-healthcare
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeidentifyConfig

Configures de-identification options specific to different types of content. Each submessage customizes the handling of a media type or subtype as defined in RFC 6838 (https://tools.ietf.org/html/rfc6838). Configs are applied in a nested manner at runtime.

JSON representation
{
  "dicom": {
    object (DicomConfig)
  },
  "fhir": {
    object (FhirConfig)
  },
  "image": {
    object (ImageConfig)
  },
  "text": {
    object (TextConfig)
  }
}
Fields
dicom

object (DicomConfig)

Configures de-id of application/DICOM content.

fhir

object (FhirConfig)

Configures de-id of application/FHIR content.

image

object (ImageConfig)

Configures de-identification of image pixels wherever they are found in the sourceDataset.

text

object (TextConfig)

Configures de-identification of text wherever it is found in the sourceDataset.
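The sketch below shows one way the submessages might be combined into a single DeidentifyConfig, written as a Python dict and serialized to JSON. The particular combination is chosen for illustration and has not been verified against the service; it pairs DEIDENTIFY_TAG_CONTENTS with a text config because that profile is documented as configurable through TextConfig.

```python
import json

deidentify_config = {
    # Inspect DICOM tag contents; sensitive text is handled per the text config.
    "dicom": {"filterProfile": "DEIDENTIFY_TAG_CONTENTS"},
    # Redact only sensitive text found in image pixels.
    "image": {"textRedactionMode": "REDACT_SENSITIVE_TEXT"},
    # No infoTypes listed, so the transformation applies to any infoType.
    "text": {"transformations": [{"replaceWithInfoTypeConfig": {}}]},
}

print(json.dumps(deidentify_config, indent=2))
```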

DicomConfig

Specifies the parameters needed for de-identification of DICOM stores.

JSON representation
{

  // Union field tag_filter can be only one of the following:
  "keepList": {
    object (TagFilterList)
  },
  "removeList": {
    object (TagFilterList)
  },
  "filterProfile": enum (TagFilterProfile)
  // End of list of possible types for union field tag_filter.
}
Fields
Union field tag_filter. Determines tag filtering method (meaning which tags to keep/remove). tag_filter can be only one of the following:
keepList

object (TagFilterList)

List of tags to keep. Remove all other tags.

removeList

object (TagFilterList)

List of tags to remove. Keep all other tags.

filterProfile

enum (TagFilterProfile)

Tag filtering profile that determines which tags to keep/remove.
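Because tag_filter is a union field, a client may want to check a DicomConfig before sending it. The following is a minimal sketch of such a check; the function name is hypothetical and the service's own validation may differ.

```python
TAG_FILTER_FIELDS = ("keepList", "removeList", "filterProfile")


def validate_dicom_config(dicom_config):
    """Raise ValueError if more than one tag_filter member is set."""
    present = [field for field in TAG_FILTER_FIELDS if field in dicom_config]
    if len(present) > 1:
        raise ValueError(f"tag_filter accepts only one member, got {present}")


# A single member is fine.
validate_dicom_config({"keepList": {"tags": ["PatientID"]}})

# Two members at once violate the union constraint.
conflicting = {
    "keepList": {"tags": ["PatientID"]},
    "filterProfile": "KEEP_ALL_PROFILE",
}
try:
    validate_dicom_config(conflicting)
except ValueError as exc:
    print(exc)
```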

TagFilterList

List of tags to be filtered.

JSON representation
{
  "tags": [
    string
  ]
}
Fields
tags[]

string

Tags to be filtered. Tags must be DICOM Data Elements, File Meta Elements, or Directory Structuring Elements, as defined at http://dicom.nema.org/medical/dicom/current/output/html/part06.html#table_6-1. They may be provided by "Keyword" or "Tag", for example "PatientID" or "00100010".
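As a sketch, a client could distinguish the two accepted forms before building a TagFilterList. The patterns below are assumptions inferred from the two examples given (an alphanumeric keyword and an 8-digit hexadecimal tag); the exact syntax the service accepts is not specified here, and no check is made that the element exists in the DICOM dictionary.

```python
import re

# Assumed shapes, inferred from the documented examples.
HEX_TAG = re.compile(r"^[0-9A-Fa-f]{8}$")
KEYWORD = re.compile(r"^[A-Za-z][A-Za-z0-9]*$")


def classify_tag(entry):
    """Return "tag" or "keyword" for an entry, or raise ValueError."""
    if HEX_TAG.match(entry):
        return "tag"
    if KEYWORD.match(entry):
        return "keyword"
    raise ValueError(f"unrecognized tag entry: {entry!r}")


tag_filter_list = {"tags": ["PatientID", "00100010"]}
for entry in tag_filter_list["tags"]:
    print(entry, "->", classify_tag(entry))
```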

TagFilterProfile

Profile that determines which tags to keep/remove.

Enums
TAG_FILTER_PROFILE_UNSPECIFIED No tag filtration profile provided. Same as KEEP_ALL_PROFILE.
MINIMAL_KEEP_LIST_PROFILE Keep only tags required to produce valid DICOM.
ATTRIBUTE_CONFIDENTIALITY_BASIC_PROFILE Remove tags based on DICOM Standard's Attribute Confidentiality Basic Profile (DICOM Standard Edition 2018e).
KEEP_ALL_PROFILE Keep all tags.
DEIDENTIFY_TAG_CONTENTS Inspects within tag contents and replaces sensitive text. The process can be configured using the TextConfig. Applies to all tags with the following Value Representation names: AE, LO, LT, PN, SH, ST, UC, UT, DA, DT, AS.

FhirConfig

Specifies how de-identification of a FHIR store should be handled.

JSON representation
{
  "fieldMetadataList": [
    {
      object (FieldMetadata)
    }
  ]
}
Fields
fieldMetadataList[]

object (FieldMetadata)

Specifies FHIR paths to match and how to transform them. Any field that is not matched by a FieldMetadata will be passed through to the output dataset unmodified. All extensions are removed in the output.

FieldMetadata

Specifies FHIR paths to match, and how to handle de-identification of matching fields.

JSON representation
{
  "paths": [
    string
  ],
  "action": enum (Action)
}
Fields
paths[]

string

List of paths to FHIR fields to be redacted. Each path is a period-separated list where each component is either a field name or FHIR type name, for example: Patient, HumanName. For "choice" types (those defined in the FHIR spec with the form field[x]), two separate components are used; for example, "deceasedAge.unit" is matched by "Deceased.Age.unit". Supported types are: AdministrativeGenderCode, Code, Date, DateTime, Decimal, HumanName, Id, LanguageCode, Markdown, MimeTypeCode, Oid, String, Uri, Uuid, Xhtml.

action

enum (Action)

Deidentify action for one field.

Action

Whether this field should be redacted or not, or if it should be inspected for PHI.

Enums
ACTION_UNSPECIFIED No action specified.
TRANSFORM Transform the entire field.
INSPECT_AND_TRANSFORM Inspect and transform any found PHI. When AnnotationConfig is provided, annotations of PHI will be generated, except for Date and Datetime.
DO_NOT_TRANSFORM Do not transform.
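A minimal sketch of assembling a FhirConfig from FieldMetadata entries, with the action checked against the documented enum values. The helper name is hypothetical, and the paths are taken from the examples above purely for illustration.

```python
ACTIONS = {
    "ACTION_UNSPECIFIED",
    "TRANSFORM",
    "INSPECT_AND_TRANSFORM",
    "DO_NOT_TRANSFORM",
}


def field_metadata(paths, action):
    """Build one FieldMetadata entry, rejecting unknown actions."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return {"paths": list(paths), "action": action}


fhir_config = {
    "fieldMetadataList": [
        # Transform every matched HumanName field in full.
        field_metadata(["HumanName"], "TRANSFORM"),
        # Leave the unit of a deceasedAge choice field untouched.
        field_metadata(["Deceased.Age.unit"], "DO_NOT_TRANSFORM"),
    ]
}
print(fhir_config)
```

Fields not matched by any entry are documented as passing through unmodified, so only the paths that need special handling have to be listed.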

ImageConfig

Specifies how de-identification of image pixels should be handled.

JSON representation
{
  "textRedactionMode": enum (TextRedactionMode)
}
Fields
textRedactionMode

enum (TextRedactionMode)

Determines how to redact text from images.

TextRedactionMode

How to redact text found in images (if at all).

Enums
TEXT_REDACTION_MODE_UNSPECIFIED No text redaction specified. Same as REDACT_NO_TEXT.
REDACT_ALL_TEXT Redact all text.
REDACT_SENSITIVE_TEXT Redact sensitive text.
REDACT_NO_TEXT Do not redact text.

TextConfig

JSON representation
{
  "transformations": [
    {
      object (InfoTypeTransformation)
    }
  ],
  "experimentalConfig": string
}
Fields
transformations[]

object (InfoTypeTransformation)

The transformations to apply to the detected data.

experimentalConfig

string

Experimental de-identification config to use. For internal use only. If not specified, it is ignored and standard DLP is used.

InfoTypeTransformation

A transformation to apply to text that is identified as a specific infoType.

JSON representation
{
  "infoTypes": [
    string
  ],

  // Union field config can be only one of the following:
  "redactConfig": {
    object (RedactConfig)
  },
  "characterMaskConfig": {
    object (CharacterMaskConfig)
  },
  "dateShiftConfig": {
    object (DateShiftConfig)
  },
  "cryptoHashConfig": {
    object (CryptoHashConfig)
  },
  "replaceWithInfoTypeConfig": {
    object (ReplaceWithInfoTypeConfig)
  }
  // End of list of possible types for union field config.
}
Fields
infoTypes[]

string

InfoTypes to apply this transformation to. If this is not specified, the transformation applies to any infoType.

Union field config.

config can be only one of the following:

redactConfig

object (RedactConfig)

Config for text redaction.

characterMaskConfig

object (CharacterMaskConfig)

Config for character mask.

dateShiftConfig

object (DateShiftConfig)

Config for date shift.

cryptoHashConfig

object (CryptoHashConfig)

Config for crypto hash.

replaceWithInfoTypeConfig

object (ReplaceWithInfoTypeConfig)

Config for replace with InfoType.
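One way to respect the union field is to build each InfoTypeTransformation through a helper that accepts exactly one config kind. This is a sketch with hypothetical helper names; PERSON_NAME is used as the infoType because it is the one named on this page.

```python
CONFIG_KINDS = (
    "redactConfig",
    "characterMaskConfig",
    "dateShiftConfig",
    "cryptoHashConfig",
    "replaceWithInfoTypeConfig",
)


def transformation(kind, settings=None, info_types=None):
    """Build an InfoTypeTransformation holding a single config member."""
    if kind not in CONFIG_KINDS:
        raise ValueError(f"unknown config kind: {kind!r}")
    result = {kind: dict(settings or {})}
    # Omitting infoTypes means the transformation applies to any infoType.
    if info_types:
        result["infoTypes"] = list(info_types)
    return result


text_config = {
    "transformations": [
        transformation("replaceWithInfoTypeConfig", info_types=["PERSON_NAME"]),
        transformation("characterMaskConfig", {"maskingCharacter": "#"}),
    ]
}
print(text_config)
```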

RedactConfig

Defines how to redact sensitive values. The default behavior is to erase them; for example, "My name is Jake." becomes "My name is ."

CharacterMaskConfig

Mask a string by replacing its characters with a fixed character.

JSON representation
{
  "maskingCharacter": string
}
Fields
maskingCharacter

string

Character to mask the sensitive values. If not supplied, defaults to "*".
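The effect of character masking can be pictured with a small local simulation. This is not the service's implementation: the spans are supplied by hand rather than detected, and requiring a single masking character is an assumption based on the field name.

```python
def mask_spans(text, spans, masking_character="*"):
    """Replace every character inside the given (start, end) spans."""
    if len(masking_character) != 1:
        raise ValueError("masking_character must be a single character")
    chars = list(text)
    for start, end in spans:
        for index in range(start, end):
            chars[index] = masking_character
    return "".join(chars)


# "Jake" occupies offsets 11..14 of the sample sentence.
print(mask_spans("My name is Jake.", [(11, 15)]))       # My name is ****.
print(mask_spans("My name is Jake.", [(11, 15)], "#"))  # My name is ####.
```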

DateShiftConfig

Shift a date forward or backward in time by a random amount which is consistent for a given patient and crypto key combination.

JSON representation
{
  "cryptoKey": string
}
Fields
cryptoKey

string (bytes format)

An AES 128/192/256 bit key. Causes the shift to be computed based on this key and the patient ID. A default key is generated for each datasets.deidentify operation and is used wherever cryptoKey is not specified.

A base64-encoded string.
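The sketch below generates a key of a permitted size, base64-encodes it for the cryptoKey field, and illustrates what "consistent for a given patient and crypto key combination" can mean. The shift derivation (HMAC-SHA256 over the patient ID) and the 100-day bound are assumptions made for this example only; the service's actual algorithm and shift range are not described here.

```python
import base64
import hashlib
import hmac
import secrets
from datetime import date, timedelta

MAX_SHIFT_DAYS = 100  # assumed bound for this sketch, not a documented value


def new_crypto_key(bits=256):
    """Return a random AES-sized key as a base64 string."""
    if bits not in (128, 192, 256):
        raise ValueError("key size must be 128, 192, or 256 bits")
    return base64.b64encode(secrets.token_bytes(bits // 8)).decode("ascii")


def shift_days(crypto_key_b64, patient_id):
    """Derive a repeatable day offset from the key and patient ID (illustrative)."""
    key = base64.b64decode(crypto_key_b64)
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).digest()
    span = 2 * MAX_SHIFT_DAYS + 1
    return int.from_bytes(digest[:8], "big") % span - MAX_SHIFT_DAYS


crypto_key = new_crypto_key()
date_shift_config = {"cryptoKey": crypto_key}

# The same key and patient always yield the same offset.
offset = shift_days(crypto_key, "patient-123")
print(date(2020, 1, 15) + timedelta(days=offset))
```

Supplying the same cryptoKey across operations is what would keep shifts comparable between runs, since a fresh default key is otherwise generated for each operation.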

CryptoHashConfig

Pseudonymization method that generates surrogates via cryptographic hashing. Uses SHA-256. Outputs a base64-encoded representation of the hashed output (for example, L7k0BHmF1ha5U3NfGykjro4xWi1MPVQPjhMAZbSV9mM=).

JSON representation
{
  "cryptoKey": string
}
Fields
cryptoKey

string (bytes format)

An AES 128/192/256 bit key. Causes the hash to be computed based on this key. A default key is generated for each datasets.deidentify operation and is used wherever cryptoKey is not specified.

A base64-encoded string.
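To give a feel for the output shape, the following sketch produces a keyed SHA-256 surrogate and base64-encodes it. Using HMAC-SHA256 is an assumption: this page only states that SHA-256 is used, so the values produced here will not necessarily match the service's surrogates. The key is a fixed sample value, not a secret to reuse.

```python
import base64
import hashlib
import hmac

# Sample 128-bit key, base64-encoded as the cryptoKey field expects.
SAMPLE_KEY_B64 = base64.b64encode(b"0123456789abcdef").decode("ascii")


def surrogate(value, crypto_key_b64):
    """Return a base64-encoded keyed SHA-256 digest of the value."""
    key = base64.b64decode(crypto_key_b64)
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


crypto_hash_config = {"cryptoKey": SAMPLE_KEY_B64}
token = surrogate("Jake", crypto_hash_config["cryptoKey"])
print(token, len(token))
```

A 32-byte digest always encodes to 44 base64 characters, which matches the length of the example surrogate shown above.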

ReplaceWithInfoTypeConfig

When the INSPECT_AND_TRANSFORM action is used, each match is replaced with the name of the infoType. For example, "My name is Jake" becomes "My name is [PERSON_NAME]". The TRANSFORM action is equivalent to redacting.
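The replacement behavior can be simulated locally as follows. This is an illustrative sketch, not the service's implementation: the findings (offsets and infoType names) are supplied by hand rather than detected.

```python
def replace_with_info_type(text, findings):
    """Replace each (start, end, info_type) finding with "[info_type]"."""
    # Work from right to left so earlier offsets stay valid.
    for start, end, info_type in sorted(findings, reverse=True):
        text = text[:start] + f"[{info_type}]" + text[end:]
    return text


# "Jake" occupies offsets 11..14 of the sample sentence.
print(replace_with_info_type("My name is Jake", [(11, 15, "PERSON_NAME")]))
# My name is [PERSON_NAME]
```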
