Stay organized with collections Save and categorize content based on your preferences.

Introduction to external data sources

This page provides an overview of querying data stored outside of BigQuery.

Overview

An external data source is a data source that you can query directly from BigQuery, even though the data is not stored in BigQuery storage.

BigQuery supports the following external data sources:

Use cases for external data sources include the following:

  • For ELT (extract-load-transform) workloads, loading and cleaning your data in one pass and writing the cleaned result into BigQuery storage, by using a CREATE TABLE ... AS SELECT query.
  • Joining BigQuery tables with frequently changing data from an external data source. By querying the external data source directly, you don't need to reload the data into BigQuery storage every time it changes.

BigQuery has two different mechanisms for querying external data:

  • BigLake tables and external tables

    BigLake tables and external tables look like standard BigQuery tables, in that these tables store their metadata and schema in BigQuery storage. However, their data resides in an external source.

    These tables can be temporary or permanent. Permanent BigLake and external tables are contained inside a dataset, and you manage them in the same way that you manage a standard BigQuery table. For example, you can view the table's properties, set access controls, and so forth. You can query these tables and join them with other tables.

    BigLake tables extend external tables with fine-grained security, including row-level, column-level, and dynamic data masking.

    You can use BigLake tables and external tables with the following data sources:

  • Federated queries

    A federated query is a way to send a query statement to an external database and get the result back as a temporary table. Federated queries use the BigQuery Connection API to establish a connection with the external database. In your standard SQL query, you use the EXTERNAL_QUERY function to send a query statement to the external database, using that database's SQL dialect. The results are converted to Google Standard SQL data types.

    You can use federated queries with the following external databases:

What's next