This page explains how to export FHIR resources to BigQuery for exploration and analysis.
Setting BigQuery permissions
Before exporting FHIR resources to BigQuery, you must grant additional permissions to the Cloud Healthcare Service Agent service account. For more information, see FHIR store BigQuery permissions.
Exporting FHIR resources
The following samples show how to export FHIR resources to a BigQuery table.
When setting the BigQuery destination, use the fully qualified URI, like so:
bq://PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID
The behavior of the export operation can vary depending on the following:
- Whether the destination table already exists
- Whether you have set the
force
field - Whether you have set a value for the
writeDisposition
enum. If you use thewriteDisposition
enum, do not set theforce
field.
The behavior in each of these cases is as follows:
- If the destination table already exists and
force
is set totrue
, the export operation overwrites the existing table. - If the destination table already exists and
force
is set tofalse
, an error occurs. - If the destination table does not already exist then the export operation creates a new table, regardless of whether you specify the
force
field. - When using
writeDisposition
, if the destination table already exists and is empty, the export operation completes successfully instead of returning an error.
gcloud
To export FHIR resources to BigQuery, use the
gcloud healthcare fhir-stores export bq
command. Specify the following information:
- The name of the parent dataset
- The name of the FHIR store
- The name of an existing BigQuery dataset
- The name of the project that owns the BigQuery dataset
- The
schema-type
set to theanalytics
- One of the following values for the
writeDisposition
enum:WRITE_EMPTY
: only export data if the destination tables are empty. This is the default.WRITE_TRUNCATE
: erase all existing data in the tables before writing the instances.WRITE_APPEND
: append data to the existing tables.
The following sample shows the gcloud healthcare fhir-stores export bq
command.
gcloud healthcare fhir-stores export bq FHIR_STORE_ID \ --dataset=DATASET_ID \ --location=LOCATION \ --bq-dataset=bq://PROJECT_ID.BIGQUERY_DATASET_ID \ --schema-type=analytics \ --write-disposition=WRITE_DISPOSITION_TYPE
The command line displays the operation ID and, after the operation completes,
done
:
Request issued for: [FHIR_STORE_ID] Waiting for operation [projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/OPERATION_ID] to complete...done. name: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID
To view more details of the operation, run the
gcloud healthcare operations describe
command, providing the OPERATION_ID from the response:
gcloud healthcare operations describe OPERATION_ID \ --dataset=DATASET_ID
The response includes done: true
.
done: true metadata: '@type': type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata apiMethodName: google.cloud.healthcare.v1.fhir.FhirService.ExportResources createTime: 'CREATE_TIME' endTime: 'END_TIME' name: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID response: '@type': type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse fhirStore: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID
API
For more information, see
projects.locations.datasets.fhirStores.export
.
curl
To export FHIR resources, make a POST
request and supply the following
information:
- The name of the parent dataset
- The name of the FHIR store
- The name of an existing BigQuery dataset
- The name of the project that owns the BigQuery dataset
- The
schemaType
field set to the valueANALYTICS
- An access token
- One of the following values for the
writeDisposition
enum:write-empty
: only export data if the destination tables are empty. This is the default.write-truncate
: erase all existing data in the tables before writing the instances.write-append
: append data to the existing tables.
The operation outputs one BigQuery table for each resource type in the FHIR store.
The following sample shows a POST
request using curl
.
curl -X POST \ -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \ -H "Content-Type: application/json; charset=utf-8" \ --data "{ 'bigqueryDestination': { 'datasetUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID', 'schemaConfig': { 'schemaType': 'ANALYTICS' } 'writeDisposition': 'WRITE_DISPOSITION_TYPE' } }" "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID:export"
If the request is successful, the server returns the response in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID" }
The response contains an operation name. To track the status of the operation,
you can use the
Operation get
method:
curl -X GET \ -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \ "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a response with the status of the operation in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata", "apiMethodName": "google.cloud.healthcare.v1.fhir.FhirService.ExportResources", "createTime": "CREATE_TIME", "endTime": "END_TIME", "logsUrl": "https://console.cloud.google.com/logs/viewer/CLOUD_LOGGING_URL", "counter": { "success": "SUCCESS_COUNT" } }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse", } }
PowerShell
To export FHIR resources, make a POST
request and supply the following
information:
- The name of the parent dataset
- The name of the FHIR store
- The name of an existing BigQuery dataset
- The name of the project that owns the BigQuery dataset
- The
schemaType
field set to the valueANALYTICS
- One of the following values for the
writeDisposition
enum:WRITE_EMPTY
: only export data if the destination tables are empty. This is the default.WRITE_TRUNCATE
: erase all existing data in the tables before writing the instances.WRITE_APPEND
: append data to the existing tables.
- An access token
The operation outputs one BigQuery table for each resource type in the FHIR store.
The following sample shows a POST
request using Windows PowerShell.
$cred = gcloud auth application-default print-access-token $headers = @{ Authorization = "Bearer $cred" } Invoke-WebRequest ` -Method Post ` -Headers $headers ` -ContentType: "application/json; charset=utf-8" ` -Body "{ 'bigqueryDestination': { 'datasetUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID', 'schemaConfig': { 'schemaType': 'ANALYTICS' }, 'writeDisposition': 'WRITE_DISPOSITION_TYPE' } }" ` -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID:export" | Select-Object -Expand Content
If the request is successful, the server returns the response in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID" }
The response contains an operation name. To track the status of the operation,
you can use the
Operation get
method:
$cred = gcloud auth application-default print-access-token $headers = @{ Authorization = "Bearer $cred" } Invoke-WebRequest ` -Method Get ` -Headers $headers ` -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a response with the status of the operation in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata", "apiMethodName": "google.cloud.healthcare.v1.fhir.FhirService.ExportResources", "createTime": "CREATE_TIME", "endTime": "END_TIME", "logsUrl": "https://console.cloud.google.com/logs/viewer/CLOUD_LOGGING_URL", "counter": { "success": "SUCCESS_COUNT" } }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse", } }
Querying and analyzing FHIR data in BigQuery
After exporting the FHIR resources to BigQuery, see the Analyzing FHIR data in BigQuery solution for information on how to query and analyze the exported data. The solution uses the Synthea™ Generated Synthetic Data in FHIR public dataset, which hosts over 1 million synthetic patient records generated in the Synthea™ and FHIR formats.
Troubleshooting FHIR export requests
If errors occur during a FHIR export request, the errors are logged to Cloud Logging. For more information, see Viewing error logs in Cloud Logging.
What's next
- If you are a researcher, data scientist, or IT team, see the Storing healthcare data in BigQuery solution for additional information on analyzing and storing data in BigQuery.
- The Managing healthcare data access in BigQuery solution explains strategies for securing clinical and operational healthcare data in BigQuery.