This document provides an overview of BigQuery support for materialized views. Before you read this document, familiarize yourself with BigQuery and BigQuery's standard views.
Overview
In BigQuery, materialized views are precomputed views that periodically cache the results of a query for increased performance and efficiency. BigQuery leverages precomputed results from materialized views and whenever possible reads only delta changes from the base table to compute up-to-date results. Materialized views can be queried directly or can be used by the BigQuery optimizer to process queries to the base tables.
Queries that use materialized views are generally faster and consume fewer resources than queries that retrieve the same data only from the base table. Materialized views can significantly improve the performance of workloads that have the characteristic of common and repeated queries.
Why use materialized views?
BigQuery Materialized Views can be beneficial in the following ways:
Reduction in the execution time and cost for queries with aggregate functions. The largest benefit is gained when a query's computation cost is high and the resulting dataset is small. If you have a raw table and you perform online analytical processing (OLAP) style aggregation that requires significant processing, and you have predictable and repeated queries, such as from extract, transform, load (ETL) or business intelligence (BI) pipelines, you should use materialized views.
Automatic and transparent BigQuery query optimization. The BigQuery optimizer uses a materialized view, if available, to improve the query execution plan. This optimization does not require any changes to the queries.
Aggregation of real-time data. If you want access to data for making decisions in real-time, you can use BigQuery Materialized Views with BigQuery streaming to perform aggregations in real-time, providing up-to-date information.
Required permissions
For information about the permissions that are required to perform operations on materialized views, see required permissions for working with materialized views.
Comparison to other BigQuery techniques
The following table summarizes the similarities and differences between BigQuery caching, scheduled queries, standard views, and materialized views.
Component | Caching | Scheduled queries | Standard views | Materialized views |
---|---|---|---|---|
Performance | Yes | No | No | Yes |
Query supported | All | All | All | Aggregation — filtering and grouping |
Partitioning and clustering | No | Yes | N/A | Yes |
Incremental refresh | No | No | No | Yes |
Additional storage | No | Yes | No | Yes |
Query rewrite | No | No | No | Yes |
Maintenance costs | No | Yes | N/A | Yes |
Data staleness | Never | Yes | Never | Never |
Limited to a single table | No | No | No | Yes |
Interaction with other BigQuery features
The following BigQuery features work transparently with BigQuery Materialized Views:
Query plan explanation: The query plan reflects which materialized views will be scanned (if any), and shows how many bytes will be read from the materialized views and base table combined.
Query caching: The results of a query that BigQuery rewrites using a materialized view can be cached subject to the usual limitations (using of deterministic functions, no streaming into the base table, etc.).
Cost calculation: With on-demand billing, if a query is rewritten using one or more materialized views and as a result, scans fewer bytes than scanning the entire base table, then the query is less expensive.
Cost restriction: If you have set a value for maximum bytes billed, and a query would read a number of bytes beyond the limit, the query fails without incurring a charge, whether the query uses materialized views, the base table, or both.
Cost estimation using dry run: A dry run repeats query rewrite logic using the available materialized views and provides a cost estimate. You can use this feature as a way to test whether a specific query uses any materialized views.
Limitations
You cannot directly manipulate, or read raw content from, a materialized view. This means the following actions are not supported:
- Copying a materialized view, either as a source or destination of a copy job.
- Exporting a materialized view.
- Loading data into a materialized view
- Writing a query result into a materialized view
- Running DML statements over a materialized view.
A materialized view must reside in the same organization as the base table, or in the same project if the project does not belong to an organization.
Each base table can be referenced by up to 20 materialized views from the same dataset, up to 100 materialized views from the same project, and up to 500 materialized views from the whole organization.
Only materialized views from the same dataset are considered for automatic query rewrite (or smart tuning).
Materialized views use a restricted SQL syntax and a limited set of aggregation functions. For more information, see Supported materialized views.
A materialized view can reference only a single table, and cannot use joins.
Materialized views cannot be nested on other materialized views.
You can use only the standard SQL dialect for materialized views.
SHOW
DDL statements are not supported for materialized views.ALTER
DDL statements, other thanALTER
withSET OPTIONS
, are not supported for materialized views.If you delete the base table without first deleting the materialized view, queries over the materialized view fail, as do refreshes. If you decide to recreate the base table, you must also recreate the materialized view.
After a materialized view is created on top of a partitioned base table, the table's partition expiration cannot be changed. To change the partition expiration of the base table, you need to first delete all materialized views that are based on the partitioned table.
There are some limitations on the use of
ARRAY_AGG
andJOIN
statements in materialized views. For more information, seeARRAY_AGG
considerations andJOIN
considerations.
Materialized views pricing
Costs are associated with the following aspects of BigQuery Materialized Views:
- Querying materialized views.
- Maintaining materialized views, such as when materialized views are refreshed.
- Storing materialized view tables.
Component | On-demand pricing | Flat-rate pricing |
---|---|---|
Querying | Bytes processed by materialized views and any necessary portions of the base tables.1 | Slots are consumed during query time. |
Maintenance | Bytes processed during refresh time. | Slots are consumed during refresh time. |
Storage | Bytes stored in materialized views. | Bytes stored in materialized views. |
1If the base table is append-only, then only the delta since the last materialized view refresh is processed from the base table. For more information, see Incremental updates.
Storage cost details
For AVG
, ARRAY_AGG
, and APPROX_COUNT_DISTINCT
aggregate values in a
materialized view, the final value is not directly stored. Instead,
BigQuery internally stores a materialized view as an intermediate
sketch, which is used to produce the final value.
As an example, consider a materialized view that's created with the following command:
CREATE MATERIALIZED VIEW project-id.my_dataset.my_mv_table AS SELECT date, AVG(net_paid) AS avg_paid FROM project-id.my_dataset.my_base_table GROUP BY date
While the avg_paid
column is rendered as NUMERIC
or FLOAT64
to the user,
internally it is stored as BYTES
, with its content being an intermediate
sketch in proprietary format. For the purpose of data size calculation, the
column is treated as BYTES
.
Next steps
- Learn how to Create and use materialized views.
- Learn about Materialized views best practices.
- Read the Materialized views FAQ.