Exporting FHIR resources to BigQuery

This page explains how to export FHIR resources to BigQuery for exploration and analysis.

Setting BigQuery permissions

Before exporting FHIR resources to BigQuery, you must grant additional permissions to the Cloud Healthcare Service Agent service account. For more information, see FHIR store BigQuery permissions.
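
For example, the following sketch grants the BigQuery roles that are typically required to the Cloud Healthcare Service Agent using gcloud. The service agent address format and the specific roles shown are assumptions; confirm them in FHIR store BigQuery permissions.

# Sketch: grant BigQuery roles to the Cloud Healthcare Service Agent.
# PROJECT_ID and PROJECT_NUMBER are placeholders; the service agent address
# and the exact roles required are assumptions -- verify them in the
# FHIR store BigQuery permissions documentation.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-healthcare.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-healthcare.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"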

Exporting FHIR resources

The following samples show how to export FHIR resources to a BigQuery table.

When setting the BigQuery destination, use the fully qualified URI, like so:

bq://PROJECT_ID.BIGQUERY_DATASET_ID

The behavior of the export operation can vary depending on the following:

  • Whether the destination table already exists
  • Whether you have set the force field
  • Whether you have set a value for the writeDisposition enum. If you use the writeDisposition enum, do not set the force field.

The behavior in each of these cases is as follows:

  • If the destination table already exists and force is set to true, the export operation overwrites the existing table.
  • If the destination table already exists and force is set to false, an error occurs.
  • If the destination table does not already exist, the export operation creates a new table, regardless of whether you specify the force field.
  • When using writeDisposition, if the destination table already exists and is empty, the export operation completes successfully instead of returning an error.
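
For example, a request body that overwrites any existing destination tables could set force on the BigQuery destination, as in the following minimal sketch (placeholder values; confirm the field's placement in the export API reference, and do not combine it with writeDisposition):

{
  "bigqueryDestination": {
    "datasetUri": "bq://PROJECT_ID.BIGQUERY_DATASET_ID",
    "force": true
  }
}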

Console

To export FHIR resources to BigQuery, complete the following steps:

  1. In the Cloud Console, go to the Datasets page.
    Go to the Datasets page
  2. Click the dataset that contains the FHIR store from which you are exporting FHIR resources.
  3. In the list of data stores, choose Export from the Actions list for the FHIR store.
  4. On the Export FHIR resources page that appears, select BigQuery Table.
  5. In BigQuery output options, choose how the data is exported to BigQuery.
  6. In BigQuery output options, click Browse to choose the BigQuery project and dataset.
  7. In Select dataset, search for the BigQuery project and dataset to which you want to export FHIR resources.
  8. To set the depth for all recursive structures in the output schema, click the relevant depth level on the Recursive Structure Depth slider. The default depth is 2.

    For more information, see projects.locations.datasets.fhirStores.SchemaConfig. A request-body sketch of the corresponding field appears after these steps.
  9. Click Export to export FHIR resources to BigQuery.
  10. To track the status of the operation, click the Operations tab. After the operation completes, the following indications appear:
    • The Long-running operation status section has a green check mark under the OK heading.
    • The Overview section has a green check mark and an OK indicator in the same row as the operation ID.
    If you encounter any errors, click Actions, and then click View details in Cloud Logging.
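
The Recursive Structure Depth setting corresponds to the recursiveStructureDepth field of SchemaConfig. As a minimal sketch, the equivalent request-body fragment looks like the following (placeholder values):

{
  "bigqueryDestination": {
    "datasetUri": "bq://PROJECT_ID.BIGQUERY_DATASET_ID",
    "schemaConfig": {
      "schemaType": "ANALYTICS",
      "recursiveStructureDepth": 2
    }
  }
}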

gcloud

To export FHIR resources to BigQuery, use the gcloud healthcare fhir-stores export bq command. Specify the following information:

  • The name of the parent dataset
  • The name of the FHIR store
  • The name of an existing BigQuery dataset
  • The name of the project that owns the BigQuery dataset
  • The --schema-type flag set to analytics
  • One of the following values for the --write-disposition flag:
    • write-empty: only export data if the destination tables are empty. This is the default.
    • write-truncate: erase all existing data in the tables before writing the instances.
    • write-append: append data to the existing tables.

The following sample shows the gcloud healthcare fhir-stores export bq command.

gcloud healthcare fhir-stores export bq FHIR_STORE_ID \
  --dataset=DATASET_ID \
  --location=LOCATION \
  --bq-dataset=bq://PROJECT_ID.BIGQUERY_DATASET_ID \
  --schema-type=analytics \
  --write-disposition={write-empty|write-truncate|write-append}

The command line displays the operation ID and, after the operation completes, done:

Request issued for: [FHIR_STORE_ID]
Waiting for operation [projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/OPERATION_ID] to complete...done.
name: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID

To view more details of the operation, run the gcloud healthcare operations describe command, providing the OPERATION_ID from the response:

gcloud healthcare operations describe OPERATION_ID \
  --dataset=DATASET_ID

The response includes done: true.

done: true
metadata:
  '@type': type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata
  apiMethodName: google.cloud.healthcare.v1.fhir.FhirService.ExportResources
  createTime: 'CREATE_TIME'
  endTime: 'END_TIME'
name: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID
response:
  '@type': type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse
  fhirStore: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID

API

For more information, see projects.locations.datasets.fhirStores.export.

curl

To export FHIR resources, make a POST request and supply the following information:

  • The name of the parent dataset
  • The name of the FHIR store
  • The name of an existing BigQuery dataset
  • The name of the project that owns the BigQuery dataset
  • The schemaType field set to the value ANALYTICS
  • An access token
  • One of the following values for the writeDisposition enum:
    • WRITE_EMPTY: Only export data if the destination tables are empty. This is the default.
    • WRITE_TRUNCATE: Erase all existing data in the tables before writing the instances.
    • WRITE_APPEND: Append data to the existing tables.
  • (Only available in v1beta1): An optional _type field that exports only the specified FHIR resource types
  • (Only available in v1beta1): An optional _since field that exports only resources updated after a specific time, defined as YYYY-MM-DDThh:mm:ss.sss+zz:zz

The operation outputs one BigQuery table for each resource type in the FHIR store.

The following sample shows a POST request using curl.

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      'bigqueryDestination': {
        'datasetUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID',
        'schemaConfig': {
          'schemaType': 'ANALYTICS'
        },
        'writeDisposition': '{WRITE_EMPTY|WRITE_TRUNCATE|WRITE_APPEND}'
      }
    }" "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID:export"

If the request is successful, the server returns the response in JSON format:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. To track the status of the operation, you can use the Operation get method:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID"

If the request is successful, the server returns a response with the status of the operation in JSON format:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1.fhir.FhirService.ExportResources",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME",
    "logsUrl": "https://console.cloud.google.com/logs/viewer/CLOUD_LOGGING_URL",
    "counter": {
      "success": "SUCCESS_COUNT"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse",
  }
}
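
If you need the v1beta1-only _type and _since fields, the request targets the v1beta1 endpoint instead of v1. The following is a minimal sketch with placeholder values; the exact field names and value formats are assumptions that you should confirm in the v1beta1 API reference.

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "{
      '_type': 'Patient,Observation',
      '_since': '2020-01-01T00:00:00.000+00:00',
      'bigqueryDestination': {
        'datasetUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID',
        'schemaConfig': {
          'schemaType': 'ANALYTICS'
        }
      }
    }" "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID:export"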

PowerShell

To export FHIR resources, make a POST request and supply the following information:

  • The name of the parent dataset
  • The name of the FHIR store
  • The name of an existing BigQuery dataset
  • The name of the project that owns the BigQuery dataset
  • The schemaType field set to the value ANALYTICS
  • An access token
  • One of the following values for the writeDisposition enum:
    • WRITE_EMPTY: Only export data if the destination tables are empty. This is the default.
    • WRITE_TRUNCATE: Erase all existing data in the tables before writing the instances.
    • WRITE_APPEND: Append data to the existing tables.
  • (Only available in v1beta1): An optional _type field that exports only the specified FHIR resource types
  • (Only available in v1beta1): An optional _since field that exports only resources updated after a specific time, defined as YYYY-MM-DDThh:mm:ss.sss+zz:zz

The operation outputs one BigQuery table for each resource type in the FHIR store.

The following sample shows a POST request using Windows PowerShell.

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'bigqueryDestination': {
      'datasetUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID',
      'schemaConfig': {
        'schemaType': 'ANALYTICS'
      },
      'writeDisposition': '{WRITE_EMPTY|WRITE_TRUNCATE|WRITE_APPEND}'
    }
  }" `
  -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID:export" | Select-Object -Expand Content

If the request is successful, the server returns the response in JSON format:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID"
}

The response contains an operation name. To track the status of the operation, you can use the Operation get method:

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Get `
  -Headers $headers `
  -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content

If the request is successful, the server returns a response with the status of the operation in JSON format:

{
  "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata",
    "apiMethodName": "google.cloud.healthcare.v1.fhir.FhirService.ExportResources",
    "createTime": "CREATE_TIME",
    "endTime": "END_TIME",
    "logsUrl": "https://console.cloud.google.com/logs/viewer/CLOUD_LOGGING_URL",
    "counter": {
      "success": "SUCCESS_COUNT"
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse",
  }
}

Querying and analyzing FHIR data in BigQuery

After exporting the FHIR resources to BigQuery, see the Analyzing FHIR data in BigQuery solution for information on how to query and analyze the exported data. The solution uses the Synthea™ Generated Synthetic Data in FHIR public dataset, which hosts over 1 million synthetic patient records generated in the Synthea™ and FHIR formats.
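
For a quick sanity check after an export, you can also query the exported tables directly. The following sketch assumes that Patient resources were exported and counts the rows in the resulting Patient table (placeholder project and dataset IDs):

# Sketch: count the exported Patient resources.
# Assumes the export created a Patient table in BIGQUERY_DATASET_ID.
bq query --use_legacy_sql=false \
  'SELECT COUNT(*) AS patient_count
   FROM `PROJECT_ID.BIGQUERY_DATASET_ID.Patient`'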

Troubleshooting FHIR export requests

If errors occur during a FHIR export request, the errors are logged to Cloud Logging. For more information, see Viewing error logs in Cloud Logging.
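
As a starting point, you can also read recent error entries from the command line. The following sketch assumes the healthcare_fhir_store monitored resource type; adjust the filter to match the log entries you see in Cloud Logging.

# Sketch: list recent FHIR store error log entries for the project.
# The resource type "healthcare_fhir_store" is an assumption; adjust the
# filter if your export errors appear under a different resource type.
gcloud logging read 'resource.type="healthcare_fhir_store" AND severity>=ERROR' \
  --project=PROJECT_ID \
  --limit=10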

What's next