This page provides an overview of querying data stored outside of BigQuery.
An external data source (also known as a federated data source) is a data source that you can query directly even though the data is not stored in BigQuery. Instead of loading or streaming the data, you create a table that references the external data source.
BigQuery offers support for querying data directly from:
Use cases for external data sources include:
- Loading and cleaning your data in one pass by querying the data from an external data source (a location external to BigQuery) and writing the cleaned result into BigQuery storage.
- Having a small amount of frequently changing data that you join with other tables. As an external data source, the frequently changing data does not need to be reloaded every time it is updated.
External data source limitations include the following:
- BigQuery does not guarantee data consistency for external data sources. Changes to the underlying data while a query is running can result in unexpected behavior.
- Query performance for external data sources may not be as high as querying data in a native BigQuery table. If query speed is a priority, load the data into BigQuery instead of setting up an external data source. The performance of a query that includes an external data source depends on the external storage type. For example, querying data stored in Google Cloud Storage is faster than querying data stored in Google Drive. In general, query performance for external data sources should be equivalent to reading the data directly from the external storage.
- You cannot use the
TableDataListJSON API method to retrieve data from tables that reside in an external data source. For more information, see Tabledata: list.
- You cannot run a BigQuery job that exports data from an external data source.
- You cannot reference an external data source in a wildcard table query.
- When you query an external data source, the results are not cached. You are charged for each query against an external table even if you issue the same query multiple times. If you need to repeatedly issue a query against an external table that does not change frequently, consider writing the query results to a permanent table and run the queries against the permanent table instead.
The limits for external data sources are the same as the limits for load jobs, as described in the Load jobs section on the Quota Policy page.
When querying an external data source, you are charged for the number of bytes read by the query. For more information, see: Query pricing.