Export query results to Amazon S3
This document describes how to export the result of a query that runs against a BigLake table to your Amazon Simple Storage Service (Amazon S3) bucket.
For information about how data flows between BigQuery and Amazon S3, see Data flow when exporting data.
Limitations
For a full list of limitations that apply to BigLake tables based on Amazon S3 and Blob Storage, see Limitations.
Before you begin
Ensure that you have the following resources:
- A connection to access your Amazon S3 bucket.
- An Amazon S3 BigLake table.
- The correct Amazon Web Services (AWS) Identity and Access Management (IAM)
policy:
- You must have the
PutObject
permission to write data into the Amazon S3 bucket. For more information, see Create an AWS IAM policy for BigQuery.
- You must have the
- If you are on the capacity-based pricing model, then ensure that you have enabled the BigQuery Reservation API for your project. For information about pricing, see BigQuery Omni pricing.
Export query results
BigQuery Omni writes to the specified Amazon S3 location regardless of any existing content. The export query can overwrite existing data or mix the query result with existing data. We recommend that you export the query result to an empty Amazon S3 bucket.
To run a query, select one of the following options:
SQL
In the Query editor field, enter a GoogleSQL export query. GoogleSQL is the default syntax in the Google Cloud console.
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
EXPORT DATA WITH CONNECTION `CONNECTION_REGION.CONNECTION_NAME` OPTIONS(uri="s3://BUCKET_NAME/PATH", format="FORMAT", ...) AS QUERY
Replace the following:
CONNECTION_REGION
: the region where the connection was created.CONNECTION_NAME
: the connection name that you created with the necessary permission to write to the Amazon S3 bucket.BUCKET_NAME
: the Amazon S3 bucket where you want to write the data.PATH
: the path where you want to write the exported file to. It must contain exactly one wildcard*
anywhere in the leaf directory of the path string, for example,../aa/*
,../aa/b*c
,../aa/*bc
, and../aa/bc*
. BigQuery replaces*
with0000..N
depending on the number of files exported. BigQuery determines the file count and sizes. If BigQuery decides to export two files, then*
in the first file's filename is replaced by000000000000
, and*
in the second file's filename is replaced by000000000001
.FORMAT
: supported formats areJSON
,AVRO
,CSV
, andPARQUET
.QUERY
: the query to analyze the data that is stored in a BigLake table.Click
Run.
For more information about how to run queries, see Run an interactive query.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Troubleshooting
If you get an error related to quota failure
, then check if you have reserved
capacity for your queries. For more information about slot reservations, see
Before you begin in this document.
What's next
- Learn about BigQuery Omni.
- Learn how to export table data.
- Learn how to query data stored in Amazon S3.
- Learn how to set up VPC Service Controls for BigQuery Omni.