Introduction to materialized views
In BigQuery, materialized views are precomputed views that periodically cache the results of a query for increased performance and efficiency. BigQuery leverages precomputed results from materialized views and whenever possible reads only delta changes from the base tables to compute up-to-date results. Materialized views can be queried directly or can be used by the BigQuery optimizer to process queries to the base tables.
Queries that use materialized views are generally faster and consume fewer resources than queries that retrieve the same data only from the base tables. Materialized views can significantly improve the performance of workloads that run the same or similar queries repeatedly.
The following are key characteristics of BigQuery Materialized Views:
- Zero maintenance. Materialized views are recomputed in the background when the base tables change. Any incremental data changes from the base tables are automatically added to the materialized views, with no user action required.
- Fresh data. Materialized views return fresh data. If changes to base tables might invalidate the materialized view, then data is read directly from the base tables. If the changes to the base tables do not invalidate the materialized view, then the rest of the data is read from the materialized view and only the changes are read from the base tables.
- Smart tuning. If any part of a query against the source table can be resolved by querying the materialized view, then BigQuery reroutes the query to use the materialized view for better performance and efficiency.
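As a rough illustration of smart tuning, the sketch below defines a materialized view and then runs a query that names only the base table. The names my_dataset, my_base_table, date, and net_paid follow the example in the storage cost section of this page; daily_paid_mv and the date filter are hypothetical. Whether BigQuery rewrites a particular query depends on the supported query patterns, so treat the rewrite as something to verify (for example, with a dry run) rather than a guarantee.

```sql
-- Hypothetical materialized view, created in the same dataset as its base
-- table so that it can be considered for automatic query rewrite.
CREATE MATERIALIZED VIEW my_dataset.daily_paid_mv AS
SELECT
  date,
  SUM(net_paid) AS total_paid
FROM my_dataset.my_base_table
GROUP BY date;

-- This query references only the base table. Its grouping and aggregation
-- match the view, which makes it a candidate for being answered from
-- daily_paid_mv plus any base-table changes since the last refresh.
SELECT
  date,
  SUM(net_paid) AS total_paid
FROM my_dataset.my_base_table
WHERE date >= DATE '2024-01-01'
GROUP BY date;
```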
BigQuery Materialized Views can optimize queries with high computation cost and small dataset results. Processes that benefit from materialized views include online analytical processing (OLAP) operations that require significant processing with predictable and repeated queries, like those from extract, transform, load (ETL) processes or business intelligence (BI) pipelines.
The following use cases highlight the value of materialized views. Materialized views can improve query performance if you frequently require the following:
- Pre-aggregate data. Aggregation of streaming data.
- Pre-filter data. Run queries that only read a particular subset of the table.
- Pre-join data. Query joins, especially between large and small tables.
- Recluster data. Run queries that would benefit from a clustering scheme that differs from the base tables.
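The four use cases above could be sketched as follows. This is a minimal illustration under assumptions, not a recommended schema: my_stores and the columns status, store_id, region, and customer_id are hypothetical, and whether each query shape is accepted depends on the supported materialized view syntax.

```sql
-- Pre-aggregate: keep per-day totals ready for repeated reads.
CREATE MATERIALIZED VIEW my_dataset.paid_by_day_mv AS
SELECT date, SUM(net_paid) AS total_paid, COUNT(*) AS row_count
FROM my_dataset.my_base_table
GROUP BY date;

-- Pre-filter: materialize only the subset that queries usually read.
-- (status is a hypothetical column.)
CREATE MATERIALIZED VIEW my_dataset.completed_by_day_mv AS
SELECT date, SUM(net_paid) AS total_paid
FROM my_dataset.my_base_table
WHERE status = 'COMPLETED'
GROUP BY date;

-- Pre-join: join the large table to a small (hypothetical) dimension table.
CREATE MATERIALIZED VIEW my_dataset.paid_by_region_mv AS
SELECT s.region, SUM(b.net_paid) AS total_paid
FROM my_dataset.my_base_table AS b
INNER JOIN my_dataset.my_stores AS s
  ON b.store_id = s.store_id
GROUP BY s.region;

-- Recluster: cluster the view on a column the base table is not clustered on.
CREATE MATERIALIZED VIEW my_dataset.paid_by_customer_mv
CLUSTER BY customer_id
AS
SELECT customer_id, date, SUM(net_paid) AS total_paid
FROM my_dataset.my_base_table
GROUP BY customer_id, date;
```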
Comparison to other BigQuery techniques
The following table summarizes the similarities and differences between BigQuery caching, scheduled queries, standard views, and materialized views.
|Component||Caching||Scheduled queries||Standard views||Materialized views|
|Partitioning and clustering||No||Yes||N/A||Yes|
Interaction with other BigQuery features
The following BigQuery features work transparently with BigQuery Materialized Views:
Query plan explanation: The query plan reflects which materialized views are scanned (if any), and shows how many bytes are read from the materialized views and base tables combined.
Query caching: The results of a query that BigQuery rewrites using a materialized view can be cached, subject to the usual limitations (use of deterministic functions, no streaming into the base tables, and so on).
Cost restriction: If you have set a value for maximum bytes billed, and a query would read a number of bytes beyond the limit, the query fails without incurring a charge, whether the query uses materialized views, the base tables, or both.
Cost estimation using dry run: A dry run repeats query rewrite logic using the available materialized views and provides a cost estimate. You can use this feature as a way to test whether a specific query uses any materialized views.
Manipulating materialized view data directly is not supported. This includes the following actions:
- Copy a materialized view, either as a source or destination of a copy job.
- Export a materialized view.
- Load data into a materialized view.
- Write a query result into a materialized view.
- Run DML statements over a materialized view.
A materialized view must reside in the same organization as the base tables, or in the same project if the project does not belong to an organization.
Each base table can be referenced by up to 20 materialized views from the same dataset, up to 100 materialized views from the same project, and up to 500 materialized views from the whole organization.
Only materialized views from the same dataset are considered for automatic query rewrite (or smart tuning).
Materialized views use a restricted SQL syntax and a limited set of aggregation functions. For more information, see Supported materialized views.
Materialized views cannot be nested on other materialized views.
Materialized views cannot query external tables.
Only the standard SQL dialect is supported for materialized views.
If you delete a base table without first deleting the materialized view, queries over the materialized view fail, as do refreshes. If you decide to recreate the base table, you must also recreate the materialized view.
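One way to avoid leaving a materialized view pointing at a missing table is to drop the view before the base table, and to recreate both in the opposite order. The sketch below assumes the view and table names from the storage cost example on this page; the two-column schema is hypothetical.

```sql
-- Drop the dependent materialized view first, then the base table.
DROP MATERIALIZED VIEW IF EXISTS my_dataset.my_mv_table;
DROP TABLE IF EXISTS my_dataset.my_base_table;

-- Recreate the base table (hypothetical schema) ...
CREATE TABLE my_dataset.my_base_table (
  date DATE,
  net_paid NUMERIC
);

-- ... and then recreate the materialized view on top of it.
CREATE MATERIALIZED VIEW my_dataset.my_mv_table AS
SELECT date, AVG(net_paid) AS avg_paid
FROM my_dataset.my_base_table
GROUP BY date;
```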
Materialized views pricing
Costs are associated with the following aspects of BigQuery Materialized Views:
- Querying materialized views.
- Maintaining materialized views, such as when materialized views are refreshed. The cost for automatic refresh is billed to the project where the view resides. The cost for manual refresh is billed to the project in which the manual refresh job is run. For more information about controlling maintenance cost, see Refresh job maintenance.
- Storing materialized view tables.
|Component||On-demand pricing||Flat-rate pricing|
|Querying||Bytes processed by materialized views and any necessary portions of the base tables.1||Slots are consumed during query time.|
|Maintenance||Bytes processed during refresh time.||Slots are consumed during refresh time.|
|Storage||Bytes stored in materialized views.||Bytes stored in materialized views.|
1 Where possible, BigQuery reads only the changes since the last time the view was refreshed. For more information, see Incremental updates.
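Because automatic and manual refreshes are billed differently, it can help to see how each is controlled. The statements below are a sketch only: the option names enable_refresh and refresh_interval_minutes and the BQ.REFRESH_MATERIALIZED_VIEW system procedure are not described on this page and should be checked against the current BigQuery reference. The 60-minute interval is an arbitrary example value.

```sql
-- Turn off automatic refresh, so maintenance cost is incurred only when a
-- refresh is started manually.
ALTER MATERIALIZED VIEW my_dataset.my_mv_table
SET OPTIONS (enable_refresh = false);

-- Alternatively, keep automatic refresh but cap how often it runs.
ALTER MATERIALIZED VIEW my_dataset.my_mv_table
SET OPTIONS (enable_refresh = true, refresh_interval_minutes = 60);

-- Start a manual refresh. Per the pricing notes above, the cost is billed to
-- the project in which this job runs.
CALL BQ.REFRESH_MATERIALIZED_VIEW('project-id.my_dataset.my_mv_table');
```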
Storage cost details
For aggregate values such as APPROX_COUNT_DISTINCT in a materialized view, the final value is not directly stored. Instead, BigQuery internally stores an intermediate sketch in the materialized view, which is used to produce the final value.
As an example, consider a materialized view that's created with the following command:
CREATE MATERIALIZED VIEW `project-id.my_dataset.my_mv_table` AS
SELECT
  date,
  AVG(net_paid) AS avg_paid
FROM `project-id.my_dataset.my_base_table`
GROUP BY date
Although the avg_paid column is rendered as FLOAT64 to the user, internally it is stored as BYTES, with its content being an intermediate sketch in a proprietary format. For data size calculation, the column is treated as