This page provides answers to frequently asked questions about BigQuery Materialized Views.
When should I use scheduled queries versus materialized views?
Scheduled queries are a convenient way to run arbitrarily complex calculations periodically. Each time the query runs, it is being run fully. The previous results are not used, and you pay the full price for the query. Scheduled queries are great when you don't need the freshest data and you have a high tolerance for data staleness.
Materialized views are suited for when you need to query the latest data while cutting down latency and cost by reusing the previously computed result. You can use materialized views as pseudo-indexes, accelerating queries to the base table without updating any existing workflows.
As a general guideline, whenever possible and if you are not running arbitrarily complex calculations, use materialized views.
How fast is a change in the base table reflected when querying the materialized view?
Immediately. Whether you stream into the base table, bulk load data into it, or run a query that writes the result into the base table, the new table state will be reflected immediately when you perform a query. This applies whether you are querying a materialized view over the base table, or directly querying the base table and the execution plan leverages the materialized view.
How frequently is a materialized view being refreshed?
Each materialized view is automatically refreshed (recomputed) when the base table changes, but not more often than a system-defined interval (currently 30 minutes). In addition, users can also control the refresh frequency by setting up an automatic refresh, and also by forcing a manual refresh. Automatic refresh is more efficient and less error-prone than manually maintaining the equivalent of a materialized view at the application level. For details about refreshes, see Maintaining materialized views.
How do I find out if a query was rewritten by the optimizer to utilize a materialized view?
Inspect the query plan. If a query was rewritten by BigQuery
to leverage a materialized view, then the query plan will contain a
my_mv_table step, where
my_mv_table is the name of the materialized view used
by the query rewrite.
How do I manage materialized views through the Cloud Console?
Materialized views are listed on the Dataset page within the Cloud Console. As an example, the following shows how a materialized view's schema appears:
How do I disable query rewrite for a particular query?
The BigQuery query optimizer will automatically rewrite your query to leverage a materialized view. One reason you might want to disable query rewrite is to compare results or performance, between a query with and without a materialized view.
To disable query rewrite, add a
"WHERE RAND()<1" filter to your query over the
RAND() is a non-deterministic function, so the query will not be
RAND()<1 always evaluates to true, it will not affect your
Does a query trigger an automatic refresh of the materialized view?
No, a query doesn't automatically trigger a materialized view refresh.
Why am I getting these errors?
The following lists details on common errors.
Error: "Materialized view query contains unsupported feature"
This error can occur during when you create a materialized view. Some causes are:
The query contains a
-- Results in an error SELECT date, COUNT(event) count FROM base_table GROUP BY date HAVING count >= 100
The query contains a computation on top of an aggregation function:
-- Results in an error SELECT date, ROUND(SUM(net_paid)) as sum_paid FROM base_table GROUP BY date
Materialized views must contain only bare aggregation functions and must not use
HAVING clause. Any post-filtering and post-computation must be put on top
of a materialized view. For example, by creating a logical view:
CREATE MATERIALIZED VIEW my_dataset.my_mv AS SELECT date, SUM(net_paid) sum_paid FROM my_project.my_dataset.my_base_table GROUP BY date CREATE VIEW my_dataset.my_view AS SELECT date, ROUND(sum_paid) sum_paid FROM my_project.my_dataset.my_mv WHERE sum_paid > 10.0 -- Now you can query the regular view SELECT ... FROM my_dataset.my_view
Error "An internal error occurred and the request could not be completed"
This error can occur when you create a materialized view. A common reason for
this error is a post-computation present on top of certain aggregate
functions, such as
Example of a statement that produces the error:
-- Results in an error CREATE MATERIALIZED VIEW my_dataset.my_mv AS SELECT ROUND(AVG(x)) avg_x, ARRAY_AGG(x ORDER BY z LIMIT 1)[OFFSET(0)] top_x, APPROX_COUNT_DISTINCT(x) * 10 approx_cnt_x FROM my_project.my_dataset.my_table
[OFFSET(0)] into a regular view on top of the materialized view:
CREATE MATERIALIZED VIEW my_dataset.my_mv AS SELECT AVG(x) avg_x, ARRAY_AGG(x ORDER BY z LIMIT 1) top_x_arr, APPROX_COUNT_DISTINCT(x) approx_cnt_x FROM my_project.my_dataset.my_table CREATE VIEW my_dataset.my_view AS SELECT ROUND(avg_x) as avg_x, top_x_arr[OFFSET(0)] as top_x, approx_cnt_x * 10 as approx_cnt_x FROM my_project.my_dataset.my_mv -- Now you can query the regular view SELECT ... FROM my_dataset.my_view