Introduction to external data sources
This page provides an overview of querying data stored outside of BigQuery.
An external data source is a data source that you can query directly from BigQuery, even though the data is not stored in BigQuery storage.
BigQuery supports the following external data sources:
Use cases for external data sources include the following:
- For ELT (extract-load-transform) workloads, loading and cleaning your data
in one pass and writing the cleaned result into BigQuery
storage, by using a
CREATE TABLE ... AS SELECTquery.
- Joining BigQuery tables with frequently changing data from an external data source. By querying the external data source directly, you don't need to reload the data into BigQuery storage every time it changes.
BigQuery has two different mechanisms for querying external data:
BigLake tables and external tables
BigLake tables and external tables look like standard BigQuery tables, in that these tables store their metadata and schema in BigQuery storage. However, their data resides in an external source.
These tables can be temporary or permanent. Permanent BigLake and external tables are contained inside a dataset, and you manage them in the same way that you manage a standard BigQuery table. For example, you can view the table's properties, set access controls, and so forth. You can query these tables and join them with other tables.
You can use BigLake tables and external tables with the following data sources:
A federated query is a way to send a query statement to an external database and get the result back as a temporary table. Federated queries use the BigQuery Connection API to establish a connection with the external database. In your standard SQL query, you use the
EXTERNAL_QUERYfunction to send a query statement to the external database, using that database's SQL dialect. The results are converted to Google Standard SQL data types.
You can use federated queries with the following external databases: