Querying External Data Sources

An external data source (also known as a federated data source) is a data source that you can query directly even though the data is not stored in BigQuery. Instead of loading or streaming the data, you create a table that references the external data source.
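As a rough illustration of a table that references external data instead of storing it, the sketch below builds the JSON body such a table definition might use. The field names (`externalDataConfiguration`, `sourceFormat`, `sourceUris`, `autodetect`, `csvOptions.skipLeadingRows`) follow the BigQuery REST API table resource as we understand it. The helper `external_table_resource`, the project, dataset, and table names, and the bucket path are all hypothetical. The code only constructs the request body and does not call BigQuery.

```python
import json


def external_table_resource(project_id, dataset_id, table_id, source_uris,
                            source_format="CSV", skip_leading_rows=1):
    """Build a table definition whose data stays in external storage.

    Hypothetical helper: it only assembles a dictionary and sends nothing.
    """
    config = {
        "sourceFormat": source_format,
        "sourceUris": list(source_uris),
        # Assumption: let BigQuery infer the schema from the files.
        "autodetect": True,
    }
    if source_format == "CSV":
        # Skip the header row(s) of each CSV file.
        config["csvOptions"] = {"skipLeadingRows": skip_leading_rows}
    return {
        "tableReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        # Presence of this block is what makes the table external.
        "externalDataConfiguration": config,
    }


# All identifiers below are placeholders, not real resources.
body = external_table_resource(
    "my-project", "staging", "daily_rates", ["gs://my-bucket/rates/*.csv"]
)
print(json.dumps(body, indent=2))
```

The body contains no rows, only a pointer to where the data lives, which is why the table never needs to be reloaded when the underlying files change.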

BigQuery supports querying data directly from external storage such as Google Cloud Storage and Google Drive.

Use cases for external data sources include:

  • Loading and cleaning your data in one pass by querying the data from an external data source and writing the cleaned result into BigQuery storage.
  • Having a small amount of frequently changing data that you join with other tables. As an external data source, the frequently changing data does not need to be reloaded every time it is updated.
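The first use case above (clean and load in one pass) could be sketched as a query job that reads from an external table and writes the result to a native table. The job fields (`configuration.query.query`, `useLegacySql`, `destinationTable`, `writeDisposition`) follow the BigQuery REST API job resource as we understand it. The helper `cleaning_job`, the table names, and the columns `name` and `amount` are hypothetical, and the code only assembles the request body.

```python
def cleaning_job(project_id, external_table, destination_dataset,
                 destination_table):
    """Build a query job that cleans external data and stores the result.

    Hypothetical helper: it returns a request body and does not run the job.
    """
    # Assumed columns: `name` (string) and `amount` (numeric stored as text).
    query = (
        "SELECT TRIM(name) AS name, SAFE_CAST(amount AS FLOAT64) AS amount\n"
        f"FROM `{external_table}`\n"
        "WHERE name IS NOT NULL"
    )
    return {
        "configuration": {
            "query": {
                "query": query,
                "useLegacySql": False,
                # The cleaned rows land in native BigQuery storage.
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": destination_dataset,
                    "tableId": destination_table,
                },
                # Replace the destination contents on every run.
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


# Placeholder identifiers, not real resources.
job = cleaning_job(
    "my-project", "my-project.staging.daily_rates", "warehouse", "rates_clean"
)
print(job["configuration"]["query"]["query"])
```

For the second use case, the same external table name could appear in a JOIN against native tables, with no reload step between updates to the external files.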

Limitations

External data source limitations include the following:

  • BigQuery does not guarantee data consistency for external data sources. Changes to the underlying data while a query is running can result in unexpected behavior.
  • Query performance for external data sources may not be as high as for data in a native BigQuery table. If query speed is a priority, load the data into BigQuery instead of setting up an external data source. The performance of a query that includes an external data source depends on the external storage type. For example, querying data stored in Google Cloud Storage is faster than querying data stored in Google Drive. In general, query performance for an external data source should be equivalent to reading the data directly from the external storage.
  • You cannot use the TableDataList JSON API method to retrieve data from tables that reside in an external data source. For more information, see Tabledata: list.
  • You cannot run a BigQuery job that exports data from an external data source.
  • You cannot reference an external data source in a wildcard table query.

Limits

The limits for external data sources are the same as the limits for load jobs, as described in the Load jobs section on the Quota Policy page.

Pricing

When querying an external data source, you are charged for the number of bytes read by the query. For more information, see Query pricing.
