Streaming DICOM metadata to BigQuery

This page explains how to configure a DICOM store to export DICOM instance metadata to a BigQuery table each time a DICOM instance is inserted into the DICOM store. DICOM instances can be inserted either by using the Store transaction or by importing them from Cloud Storage.

Before configuring streaming, see Exporting DICOM metadata to BigQuery for an overview of how exporting DICOM metadata to BigQuery works.

Setting BigQuery permissions

Before exporting DICOM metadata to BigQuery, you must grant extra permissions to the Cloud Healthcare Service Agent service account. For more information, see DICOM store BigQuery permissions.
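
For example, if you manage IAM with the gcloud CLI, bindings similar to the following sketch grant the service agent access to BigQuery. The roles shown (roles/bigquery.dataEditor and roles/bigquery.jobUser) and the service agent address format are assumptions based on the DICOM store BigQuery permissions page; PROJECT_ID and PROJECT_NUMBER are placeholders for your project.

# Grant the Cloud Healthcare Service Agent the BigQuery roles assumed above.
# PROJECT_ID and PROJECT_NUMBER are placeholders for your project.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-healthcare.iam.gserviceaccount.com" \
    --role="roles/bigquery.dataEditor"

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-healthcare.iam.gserviceaccount.com" \
    --role="roles/bigquery.jobUser"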

Configuring the DICOM store

To enable streaming to BigQuery, configure the streamConfigs field in your DICOM store. In each StreamConfig object, set bigqueryDestination to the fully qualified URI of the BigQuery table where DICOM instance metadata will be streamed. Because streamConfigs is an array, you can specify multiple BigQuery destinations; you can stream metadata from a single DICOM store to up to five BigQuery tables in a BigQuery dataset.
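
For example, a streamConfigs value with two destinations in the same BigQuery dataset might look like the following sketch. The table names dicom_metadata_1 and dicom_metadata_2 are hypothetical; the dataset must already exist.

{
  "streamConfigs": [
    {
      "bigqueryDestination": {
        "tableUri": "bq://PROJECT_ID.BIGQUERY_DATASET_ID.dicom_metadata_1"
      }
    },
    {
      "bigqueryDestination": {
        "tableUri": "bq://PROJECT_ID.BIGQUERY_DATASET_ID.dicom_metadata_2"
      }
    }
  ]
}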

When you delete DICOM instances in the Cloud Healthcare API, BigQuery rows containing the metadata for those instances are not deleted.
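
If you need the table to reflect deletions, remove the corresponding rows yourself. The following is a minimal sketch using the bq command-line tool; it assumes the streamed metadata schema includes a top-level SOPInstanceUID column, and SOP_INSTANCE_UID is a placeholder for the UID of the deleted instance.

# Delete the metadata row for an instance that was removed from the DICOM
# store. SOP_INSTANCE_UID is a placeholder; the SOPInstanceUID column name
# is an assumption about the streamed metadata schema.
bq query --use_legacy_sql=false '
DELETE FROM `PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID`
WHERE SOPInstanceUID = "SOP_INSTANCE_UID"'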

The following samples show how to update a DICOM store to enable BigQuery streaming. In these samples, the DICOM store and the BigQuery table are in the same project. To export DICOM metadata to another project, see Exporting DICOM metadata to a different project.

curl

To enable BigQuery streaming on an existing DICOM store, make a PATCH request and specify the following information:

  • The name of the parent dataset
  • The name of the DICOM store
  • The name of an existing BigQuery dataset
  • A name for each BigQuery destination table (you can specify up to five). The name can contain only letters (uppercase or lowercase), numbers, and underscores. The BigQuery dataset must already exist, but the Cloud Healthcare API can update an existing table or create a new one.
  • An update mask
  • An access token

The following sample shows a PATCH request using curl.

curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'streamConfigs': [
        {
          'bigqueryDestination': {
            'tableUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID'
          }
        }
      ]
    }" \
    "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/datasets/DATASET_ID/dicomStores/DICOM_STORE_ID?updateMask=streamConfigs"

If the request is successful, the server returns the response in JSON format:

{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/DATASET_ID/dicomStores/DICOM_STORE_ID",
  "streamConfigs": [
    {
      "bigqueryDestination": {
        "tableUri": "bq://PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID"
      }
    }
  ]
}

PowerShell

To enable BigQuery streaming on an existing DICOM store, make a PATCH request and specify the following information:

  • The name of the parent dataset
  • The name of the DICOM store
  • The name of an existing BigQuery dataset
  • A name for the BigQuery export table. The name can contain only letters (uppercase or lowercase), numbers, and underscores. The BigQuery dataset must already exist, but the Cloud Healthcare API can update an existing table or create a new one.
  • An update mask
  • An access token

The following sample shows a PATCH request using Windows PowerShell.

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Patch `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
      'streamConfigs': [
        {
          'bigqueryDestination': {
            'tableUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID'
        }
      }
    ]
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/dicomStores/DICOM_STORE_ID?updateMask=streamConfigs" | Select-Object -Expand Content

If the request is successful, the server returns the response in JSON format:

{
  "name": "projects/PROJECT_ID/locations/REGION/datasets/DATASET_ID/dicomStores/DICOM_STORE_ID",
  "streamConfigs": [
    {
      "bigqueryDestination": {
        "tableUri": "bq://PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID"
      }
    }
  ]
}
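
After streaming is enabled and instances are stored, you can confirm that metadata is arriving by inspecting the destination table, for example with the bq command-line tool. This is a sketch; the count query makes no assumptions about the table schema.

# Show the schema that the Cloud Healthcare API created or updated.
bq show --schema --format=prettyjson PROJECT_ID:BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID

# Count the rows streamed so far.
bq query --use_legacy_sql=false \
  'SELECT COUNT(*) AS instance_count
   FROM `PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID`'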

Limitations and additional behavior

  • DICOM tags are listed in a separate column (called DroppedTags.TagName) in the destination BigQuery table if either of the following occurs (see the query sketch after this list):

    • The DICOM tag's size is equal to or greater than 1 MB
    • The DICOM tag does not have a supported type in BigQuery (listed in Excluded VRs)
  • If you delete an instance in a DICOM store, the instance won't be deleted in BigQuery.
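
For example, the following sketch shows one way to list the dropped tags for each instance. It assumes the streamed metadata schema includes a top-level SOPInstanceUID column and that DroppedTags is a repeated record with a TagName field.

# List dropped tags per instance (assumes DroppedTags is a repeated record).
bq query --use_legacy_sql=false '
SELECT SOPInstanceUID, dropped.TagName
FROM `PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID`,
  UNNEST(DroppedTags) AS dropped'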

Troubleshooting DICOM streaming requests

If errors occur while streaming DICOM metadata to BigQuery, the errors are logged to Cloud Logging. For more information, see Viewing error logs in Cloud Logging.

To filter logs related to streaming DICOM metadata, select healthcare.googleapis.com/dicom_stream from the second list under Filter by label or text search.
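
To retrieve these logs from the command line, you can use gcloud logging read. The filter below is a sketch; it assumes the streaming errors are written under a log name that contains dicom_stream, matching the label shown in the Logs Explorer.

# Read recent DICOM streaming log entries. The logName substring filter is an
# assumption based on the healthcare.googleapis.com/dicom_stream label.
gcloud logging read 'logName:"dicom_stream"' --limit=10 --format=json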