Introduction to external data sources

This page provides an overview of querying data stored outside of BigQuery.

Overview

An external data source is a data source that you can query directly from BigQuery, even though the data is not stored in BigQuery storage.

BigQuery supports the following external data sources:

Use cases for external data sources include:

  • For ETL workloads, loading and cleaning your data in one pass and writing the cleaned result into BigQuery storage.
  • Joining BigQuery tables with frequently changing data from an external data source. By querying the external data source directly, you don't need to reload the data into BigQuery storage every time it changes.

BigQuery has two different mechanisms for querying external data:

  • External tables

    An external table is a table that acts like a standard BigQuery table. The table metadata, including the table schema, is stored in BigQuery storage, but the data itself resides in the external source.

    External tables can be temporary or permanent. A permanent external table is contained inside a dataset, and you manage it in the same way that you manage a standard BigQuery table. For example, you can view the table properties, set access controls, and so forth. You can query the table and join it with other tables.

    You can use external tables with the following data sources:

  • Federated queries

    A federated query is a way to send a query statement to an external database and get the result back as a temporary table. Federated queries use the BigQuery Connection API to establish a connection with the external database. In your standard SQL query, you use the EXTERNAL_QUERY function to send a query statement to the external database, using that database's SQL dialect. The results are converted to BigQuery standard SQL data types.

    You can use federated queries with the following external databases:

Next steps