Federated queries
This page provides an overview of how to use federated queries in BigQuery.
Overview
A federated query is a way to send a query statement to an external database
and get the result back as a temporary table. Federated queries use the
BigQuery Connection API to establish a connection with the external database.
In your standard SQL query, you use the EXTERNAL_QUERY
function to send a
query statement to the external database, using that database's SQL dialect.
The results are converted to BigQuery standard SQL data types.
You can use federated queries with the following external databases:
After the initial one-time setup, you can write a query with the
EXTERNAL_QUERY
SQL function.
Workflow
- Choose a Google Cloud project that includes the data source that you want to query.
- A
bigquery.admin
user creates a connection resource in BigQuery. - The admin user grants permission to use the connection resource
to user B.
- If the admin and user B are the same person, there is no need to grant permission.
- User B writes a query in BigQuery with the new
EXTERNAL_QUERY
function.
Supported regions
Federated queries are only supported in regions that support both the external data source and BigQuery.
Locations or region types
There are two types of locations:
A region is a specific geographic place, such as London.
A multi-region is a large geographic area, such as the United States, that contains two or more geographic places.
You can create a connection and execute a federated query across regions according to the following rules.
Multi-regions
A BigQuery multi-region can query any data source region in the same large geographic area (US, EU), for example:
- A query that runs in the BigQuery US multi-region can query any
single region in the US geographic area, such as
us-central1
,us-east4
,us-west2
, etc. - A query that runs in the BigQuery EU multi-region can query any
single region in member states
of the European Union, such as
europe-north1
,europe-west3
, etc. The location where the query runs must be the same as the location of the connection resource. For example, queries executed from the
US
multi-region must use a connection located in theUS
multi-region.
The
query processing location
is the multi-region location, either US
or EU
.
For more information about regions and multi-regions, see the Dataset locations page.
Single regions
A BigQuery single region can only query a resource in the same region. For example:
- The BigQuery single region
us-east4
can only query Cloud SQL inus-east4
.
In this example, the query processing location is the BigQuery single region.
For more information, see Cloud SQL federated queries.
Data type mappings
When you execute a federated query, the data from the external data source is converted to BigQuery standard SQL types. For more information, see Cloud SQL federated queries.
Quotas and limits
- Cross-region federated querying: if the
BigQuery query processing location
and the external data source location are different, this is a cross-region
query. You can run up to 1 TB in cross-region queries per project per
day. The following is an example of a cross-region query.
- The Cloud SQL instance is in
us-west1
while the BigQuery connection is based in theUS
multi-region. The BigQuery query processing location isUS
.
- The Cloud SQL instance is in
- Quota: users should control query quota in the external data source, such as Cloud SQL. There is no extra quota setting for federated querying. To achieve workload isolation, it's recommended to only query a database read replica.
- Maximum bytes billed allowed: this field isn't supported for federated queries at this time. Calculating the bytes billed before actually executing the federated queries is not possible at this time.
- Number of connections: a federated query can have at most 10 unique connections.
Pricing
When executing federated queries from BigQuery, you are charged for the number of bytes returned from the external query. For more information, see On-demand analysis pricing.