An external data source (also known as a federated data source) is a data source that you can query directly even though the data is not stored in BigQuery. Instead of loading or streaming the data, you create a table that references the external data source.
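As a rough illustration, a table that references an external data source can be described by a table resource whose `externalDataConfiguration` points at the source files. The sketch below only builds that request body in Python and does not call BigQuery. The project, dataset, table, and bucket names are hypothetical placeholders, and the field names follow the BigQuery REST API v2 table resource as understood here, so verify them against the current API reference before relying on them.

```python
import json


def external_table_resource(project_id, dataset_id, table_id, source_uris,
                            source_format="CSV", skip_leading_rows=1):
    """Build a table resource body that references external files.

    No data is loaded: the table only stores the definition built here.
    """
    config = {
        "sourceUris": list(source_uris),
        "sourceFormat": source_format,
        # Ask BigQuery to infer the schema instead of listing fields by hand.
        "autodetect": True,
    }
    if source_format == "CSV":
        # Skip the header row(s) of each CSV file.
        config["csvOptions"] = {"skipLeadingRows": skip_leading_rows}
    return {
        "tableReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        "externalDataConfiguration": config,
    }


# Hypothetical names, for illustration only.
resource = external_table_resource(
    "my-project", "my_dataset", "sales_external",
    ["gs://my-bucket/sales/*.csv"],
)
print(json.dumps(resource, indent=2))
```

A body like this would be sent when creating the table; afterwards the table can be named in queries like any other, while the data stays in the external location.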
BigQuery supports querying data directly from external storage such as Google Cloud Storage and Google Drive.
Use cases for external data sources include:
- Loading and cleaning your data in one pass by querying the data from an external data source (a location external to BigQuery) and writing the cleaned result into BigQuery storage.
- Having a small amount of frequently changing data that you join with other tables. As an external data source, the frequently changing data does not need to be reloaded every time it is updated.
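The first use case above (load and clean in one pass) can be sketched as a single query job that reads from a temporary external table definition and writes the result to a destination table in BigQuery storage. The example only constructs the job body and does not submit it. The table alias `raw`, the column names, the project, dataset, and bucket names are all hypothetical, and the field names follow the BigQuery REST API v2 query job configuration as understood here.

```python
import json


def load_and_clean_job(project_id, dataset_id, destination_table,
                       source_uris, sql):
    """Build a query job body that reads external files and stores the result."""
    return {
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": False,
                # Temporary external table, visible to this query as `raw`.
                "tableDefinitions": {
                    "raw": {
                        "sourceUris": list(source_uris),
                        "sourceFormat": "CSV",
                        "autodetect": True,
                    }
                },
                # The cleaned rows are written into BigQuery storage here.
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": destination_table,
                },
                # Replace the destination table contents on each run.
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


# Hypothetical cleaning query: trim names and drop rows without an amount.
CLEAN_SQL = (
    "SELECT TRIM(name) AS name, SAFE_CAST(amount AS FLOAT64) AS amount "
    "FROM raw WHERE amount IS NOT NULL"
)

job = load_and_clean_job(
    "my-project", "my_dataset", "sales_clean",
    ["gs://my-bucket/sales/*.csv"], CLEAN_SQL,
)
print(json.dumps(job, indent=2))
```

The same pattern fits the second use case: a small, frequently changing file can be declared under `tableDefinitions` and joined with stored tables in the query text, so it is read fresh on each run rather than reloaded.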
External data source limitations include the following:
- BigQuery does not guarantee data consistency for external data sources. Changes to the underlying data while a query is running can result in unexpected behavior.
- If query speed is a priority, load the data into BigQuery instead of setting up an external data source. The performance of a query that includes an external data source depends on the external storage type. For example, querying data stored in Google Cloud Storage is faster than querying data stored in Google Drive. In general, query performance for external data sources should be equivalent to reading the data directly from the external storage.
- You cannot run a BigQuery job that exports data from an external data source.
- You cannot use the TableDataList JSON API method to retrieve data from tables that reside in an external data source. For more information, see Tabledata: list.
The limits for external data sources are the same as the limits for load jobs, as described in the Load jobs section on the Quota Policy page.
When you query an external data source, you are charged for the number of bytes read. For more information, see Query pricing.