This page explains how to export FHIR resources to BigQuery for exploration and analysis.
Setting BigQuery permissions
Before exporting FHIR resources to BigQuery, you must grant additional permissions to the Cloud Healthcare Service Agent service account. For more information, see FHIR store BigQuery permissions.
Exporting FHIR resources
The following samples show how to export FHIR resources to a BigQuery table.
When setting the BigQuery destination, use the fully qualified URI, like so:
bq://PROJECT_ID.BIGQUERY_DATASET_ID.BIGQUERY_TABLE_ID
The behavior of the export operation can vary depending on the following:
- Whether the destination table already exists
- Whether you have set the
force
field - Whether you have set a value for the
writeDisposition
enum. If you use thewriteDisposition
enum, do not set theforce
field.
The behavior in each of these cases is as follows:
- If the destination table already exists and
force
is set totrue
, the export operation overwrites the existing table. - If the destination table already exists and
force
is set tofalse
, an error occurs. - If the destination table does not already exist then the export operation creates a new table, regardless of whether you specify the
force
field. - When using
writeDisposition
, if the destination table already exists and is empty, the export operation completes successfully instead of returning an error.
Console
To export FHIR resources to BigQuery, complete the following steps:
- In the Cloud console, go to the Datasets page.
Go to the Datasets page - Click the dataset that contains the FHIR store from which you are exporting FHIR resources.
- In the list of data stores, choose Export from the Actions list for the FHIR store.
- On the Export FHIR resources page that appears, select BigQuery Table.
- In BigQuery output options, choose how the data is exported to BigQuery.
- In BigQuery output options, click Browse to choose the BigQuery project and dataset.
- In Select dataset, search for the BigQuery project and dataset to which you want to export FHIR resources.
- To set the depth for all recursive structures in the output schema,
click the relevant depth level in the Recursive Structure Depth
slider. By default, the recursive value is 2.
For more information, seeprojects.locations.datasets.fhirStores.SchemaConfig
. - Click Export to export FHIR resources to BigQuery.
- To track the status of the operation, click the Operations tab. After the operation
completes, the following indications appear:
- The Long-running operation status section has a green check mark under the OK heading.
- The Overview section has a green check mark and an OK indicator in the same row as the operation ID.
gcloud
To export FHIR resources to BigQuery, use the
gcloud healthcare fhir-stores export bq
command. Specify the following information:
- The name of the parent dataset
- The name of the FHIR store
- The name of an existing BigQuery dataset
- The name of the project that owns the BigQuery dataset
- The
schema-type
set to theanalytics
- One of the following values for the
--writeDisposition
flag:write-empty
: only export data if the destination tables are empty. This is the default.write-truncate
: erase all existing data in the tables before writing the instances.write-append
: append data to the existing tables.
The following sample shows the gcloud healthcare fhir-stores export bq
command.
gcloud healthcare fhir-stores export bq FHIR_STORE_ID \ --dataset=DATASET_ID \ --location=LOCATION \ --bq-dataset=bq://PROJECT_ID.BIGQUERY_DATASET_ID \ --schema-type=analytics \ --write-disposition={write-empty|write-truncate|write-append}
The command line displays the operation ID and, after the operation completes,
done
:
Request issued for: [FHIR_STORE_ID] Waiting for operation [projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/OPERATION_ID] to complete...done. name: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID
To view more details of the operation, run the
gcloud healthcare operations describe
command, providing the OPERATION_ID from the response:
gcloud healthcare operations describe OPERATION_ID \ --dataset=DATASET_ID
The response includes done: true
.
done: true metadata: '@type': type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata apiMethodName: google.cloud.healthcare.v1.fhir.FhirService.ExportResources createTime: 'CREATE_TIME' endTime: 'END_TIME' name: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID response: '@type': type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse fhirStore: projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID
API
For more information, see
projects.locations.datasets.fhirStores.export
.
curl
To export FHIR resources, make a POST
request and supply the following
information:
- The name of the parent dataset
- The name of the FHIR store
- The name of an existing BigQuery dataset
- The name of the project that owns the BigQuery dataset
The
schemaType
field set to one of the following:ANALYTICS
: the FHIR analytics schemaANALYTICS_V2
: the FHIR analytics schema with support for repeated extension columns and contained resources as JSON strings
An access token
One of the following values for the
writeDisposition
enum:write-empty
: only export data if the destination tables are empty. This is the default.write-truncate
: erase all existing data in the tables before writing the instances.write-append
: append data to the existing tables.
(Only available in v1beta1): An optional field,
_type
, that only exports one or more types of FHIR resources(Only available in v1beta1): An optional field,
_since
, that only exports resources updated after a specific time, defined asYYYY-MM-DDThh:mm:ss.sss+zz:zz
The operation outputs one BigQuery table for each resource type in the FHIR store.
The following sample shows a POST
request using curl
.
curl -X POST \ -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \ -H "Content-Type: application/json; charset=utf-8" \ --data "{ 'bigqueryDestination': { 'datasetUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID', 'schemaConfig': { 'schemaType': '{ANALYTICS|ANALYTICS_V2}' }, 'writeDisposition': '{write-empty|write-truncate|write-append}' } }" "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID:export"
If the request is successful, the server returns the response in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID" }
The response contains an operation name. To track the status of the operation,
you can use the
Operation get
method:
curl -X GET \ -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \ "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID"
If the request is successful, the server returns a response with the status of the operation in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata", "apiMethodName": "google.cloud.healthcare.v1.fhir.FhirService.ExportResources", "createTime": "CREATE_TIME", "endTime": "END_TIME", "logsUrl": "https://console.cloud.google.com/logs/query/CLOUD_LOGGING_URL", "counter": { "success": "SUCCESS_COUNT" } }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse", } }
PowerShell
To export FHIR resources, make a POST
request and supply the following
information:
- The name of the parent dataset
- The name of the FHIR store
- The name of an existing BigQuery dataset
- The name of the project that owns the BigQuery dataset
The
schemaType
field set to one of the following:ANALYTICS
: the FHIR analytics schemaANALYTICS_V2
: the FHIR analytics schema with support for repeated extension columns and contained resources as JSON strings
An access token
One of the following values for the
writeDisposition
enum:write-empty
: only export data if the destination tables are empty. This is the default.write-truncate
: erase all existing data in the tables before writing the instances.write-append
: append data to the existing tables.
(Only available in v1beta1): An optional field,
_type
, that only exports one or more types of FHIR resources(Only available in v1beta1): An optional field,
_since
, that only exports resources updated after a specific time, defined asYYYY-MM-DDThh:mm:ss.sss+zz:zz
The operation outputs one BigQuery table for each resource type in the FHIR store.
The following sample shows a POST
request using Windows PowerShell.
$cred = gcloud auth application-default print-access-token $headers = @{ Authorization = "Bearer $cred" } Invoke-WebRequest ` -Method Post ` -Headers $headers ` -ContentType: "application/json; charset=utf-8" ` -Body "{ 'bigqueryDestination': { 'datasetUri': 'bq://PROJECT_ID.BIGQUERY_DATASET_ID', 'schemaConfig': { 'schemaType': '{ANALYTICS|ANALYTICS_V2}' }, 'writeDisposition': '{write-empty|write-truncate|write-append}' } }" ` -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/fhirStores/FHIR_STORE_ID:export" | Select-Object -Expand Content
If the request is successful, the server returns the response in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID" }
The response contains an operation name. To track the status of the operation,
you can use the
Operation get
method:
$cred = gcloud auth application-default print-access-token $headers = @{ Authorization = "Bearer $cred" } Invoke-WebRequest ` -Method Get ` -Headers $headers ` -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the server returns a response with the status of the operation in JSON format:
{ "name": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.OperationMetadata", "apiMethodName": "google.cloud.healthcare.v1.fhir.FhirService.ExportResources", "createTime": "CREATE_TIME", "endTime": "END_TIME", "logsUrl": "https://console.cloud.google.com/logs/query/CLOUD_LOGGING_URL", "counter": { "success": "SUCCESS_COUNT" } }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.healthcare.v1.fhir.rest.ExportResourcesResponse", } }
Querying and analyzing FHIR data in BigQuery
After exporting the FHIR resources to BigQuery, see the Analyzing FHIR data in BigQuery solution for information on how to query and analyze the exported data. The solution uses the Synthea™ Generated Synthetic Data in FHIR public dataset, which hosts over 1 million synthetic patient records generated in the Synthea™ and FHIR formats.
Troubleshooting FHIR export requests
If errors occur during a FHIR export request, the errors are logged to Cloud Logging. For more information, see Viewing error logs in Cloud Logging.
If the entire operation returns an error, see Troubleshooting long-running operations.
What's next
- If you are a researcher, data scientist, or IT team, see the Storing healthcare data in BigQuery solution for additional information on analyzing and storing data in BigQuery.
- The Managing healthcare data access in BigQuery solution explains strategies for securing clinical and operational healthcare data in BigQuery.