Streaming FHIR resource changes to BigQuery

Stay organized with collections Save and categorize content based on your preferences.

This page explains how to configure the Cloud Healthcare API to export resource changes to BigQuery each time a FHIR resource is created, updated, patched, or deleted in a FHIR store.

You can use BigQuery streaming to synchronize the data in a FHIR store with a BigQuery dataset in near real time. You can perform complex queries on FHIR data without needing to export the latest version of your FHIR store to BigQuery every time you want to analyze your data.

Setting BigQuery permissions

To stream FHIR resource changes to BigQuery, you must grant additional permissions to the Cloud Healthcare Service Agent service account. For more information, see FHIR store BigQuery permissions.

Streaming FHIR resource changes

The following samples show how to stream FHIR resource changes to a BigQuery dataset.

For more information, see the streamConfigs field in the FHIR store resource.

Console

To configure BigQuery streaming in the Google Cloud console, complete the following steps:

  1. In the Google Cloud console, go to the Datasets page.
    Go to Datasets

  2. Select the dataset containing the FHIR store you want to edit.

  3. In the Data Stores list, click the FHIR store you want to edit.

  4. To edit the store's configuration, click the FHIR Store Configuration edit button.

    For more information on the FHIR store's configuration options, see Creating a FHIR store.

  5. To export resource changes to BigQuery each time the FHIR resources in your store are created, updated, patched, or deleted, click Add new streaming config.

  6. In the Dataset list, select a BigQuery dataset where resource changes will be streamed.

  7. To set the depth for all recursive structures in the output schema, click the relevant depth in the Recursive Structure Depth slider. By default, the recursive depth is 2.

  8. To export resource changes only for specific resource types, select the resource types from the Select FHIR resource types list.

  9. Click Done to save the streaming configuration.

API

curl

To enable BigQuery streaming, you must update the FHIR store's streamConfigs field. To update the FHIR store, make a PATCH request with the following information:

  • The parent dataset
  • The FHIR store
  • The BigQuery dataset
  • The BigQuery project
  • The schemaType field set to one of the following:

    • ANALYTICS: the FHIR analytics schema
    • ANALYTICS_V2: the FHIR analytics schema with support for repeated extension columns and contained resources as JSON strings

  • An update mask

  • An access token

The following sample shows a PATCH request using curl.

curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'streamConfigs': [
        {
          'bigqueryDestination': {
            'datasetUri': 'bq://BIGQUERY_PROJECT_ID.BIGQUERY_DATASET_ID',
            'schemaConfig': {
              'schemaType': '{ANALYTICS|ANALYTICS_V2}'
            }
          }
        }
      ]
    }" "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID?updateMask=streamConfigs"

If the request is successful, the server returns the response in JSON format:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID",
  'streamConfigs': [
   {
     'bigqueryDestination': {
       'datasetUri': 'bq://BIGQUERY_PROJECT_ID.BIGQUERY_DATASET_ID',
       'schemaConfig': {
         'schemaType': '{ANALYTICS|ANALYTICS_V2}'
       }
     }
   }
  ]
}

After you configure BigQuery streaming, the Cloud Healthcare API inserts a new row into a BigQuery table each time a FHIR resource is created, updated, patched, or deleted.

PowerShell

To enable BigQuery streaming, you must update the FHIR store's streamConfigs field. To update the FHIR store, make a PATCH request with the following information:

  • The parent dataset
  • The FHIR store
  • The BigQuery dataset
  • The BigQuery project
  • The schemaType field set to one of the following:

    • ANALYTICS: the FHIR analytics schema
    • ANALYTICS_V2: the FHIR analytics schema with support for repeated extension columns and contained resources as JSON strings

  • An update mask

  • An access token

The following sample shows a PATCH request using Windows PowerShell.

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Patch `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
      'streamConfigs': [
        {
          'bigqueryDestination': {
            'datasetUri': 'bq://BIGQUERY_PROJECT_ID.BIGQUERY_DATASET_ID',
            'schemaConfig': {
              'schemaType': '{ANALYTICS|ANALYTICS_V2}'
            }
          }
        }
      ]
  }" `
  -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID?updateMask=streamConfigs" | Select-Object -Expand Content

If the request is successful, the server returns the response in JSON format:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID",
  'streamConfigs': [
   {
     'bigqueryDestination': {
       'datasetUri': 'bq://BIGQUERY_PROJECT_ID.BIGQUERY_DATASET_ID',
       'schemaConfig': {
         'schemaType': '{ANALYTICS|ANALYTICS_V2}'
       }
     }
   }
  ]
}

After you configure BigQuery streaming, the Cloud Healthcare API inserts a new row into a BigQuery table each time a FHIR resource is created, updated, patched, or deleted.

Troubleshooting FHIR streaming

If errors occur when resource changes are sent to BigQuery, the errors are logged to Cloud Logging. For more information, see Viewing error logs in Cloud Logging.

Missing FHIR resource view creation

By default, when you stream FHIR resource changes to BigQuery, a view is created for each resource streamed. The view has the following properties:

  • Has the same name as the resource and the resource's table in the BigQuery dataset
  • Only contains the current resource version, rather than all historical versions

If you bulk export a FHIR resource to BigQuery before streaming that FHIR resource, BigQuery does not create views for the FHIR resource.

For example, you might not see any views for Encounter resources in the following situation:

  1. You configure BigQuery streaming on a FHIR store, and then use the REST API to create a Patient resource.

    BigQuery creates a table and a view for the Patient resource.

  2. You bulk export Encounter resources to the same BigQuery dataset as the previous step.

    BigQuery creates a table for the Encounter resources.

  3. You use the REST API to create an Encounter resource.

    After this step, BigQuery views are not created for the Encounter resource.

To resolve this issue, use the following query to create a view:

SELECT
    * EXCEPT (_resource_row_id)
FROM (
  SELECT
    ROW_NUMBER() OVER(PARTITION BY id ORDER BY meta.lastUpdated DESC) as _resource_row_id,
    *
    FROM `PROJECT_ID.BIGQUERY_DATASET_ID.RESOURCE_TABLE` AS p
) AS p
WHERE
  p._resource_row_id=1
  AND
  NOT EXISTS (
  SELECT
    *
  FROM
    UNNEST(p.meta.tag)
  WHERE
    code = 'DELETE');

Replace the following:

  • PROJECT_ID: the ID of your project
  • BIGQUERY_DATASET_ID: the ID of the BigQuery dataset where you bulk exported the FHIR resource
  • RESOURCE_TABLE: the name of the table corresponding to the FHIR resource that you want to create views for

After creating the view, you can continue to stream changes to the FHIR resource and the view is updated accordingly.

What's next

For a tutorial on a use case for streaming FHIR resource changes, see Stream and synchronize FHIR resources with BigQuery.