Materialized views FAQ

This page provides answers to frequently asked questions about BigQuery Materialized Views.

When should I use scheduled queries versus materialized views?

Scheduled queries are a convenient way to run arbitrarily complex calculations periodically. Each time the query runs, it is being run fully. The previous results are not used, and you pay the full price for the query. Scheduled queries are great when you don't need the freshest data and you have a high tolerance for data staleness.

Materialized views are suited for when you need to query the latest data while cutting down latency and cost by reusing the previously computed result. You can use materialized views as pseudo-indexes, accelerating queries to the base table without updating any existing workflows.

As a general guideline, whenever possible and if you are not running arbitrarily complex calculations, use materialized views.

How fast is a change in the base table reflected when querying the materialized view?

Immediately. Whether you stream into the base table, bulk load data into it, or run a query that writes the result into the base table, the new table state will be reflected immediately when you perform a query. This applies whether you are querying a materialized view over the base table, or directly querying the base table and the execution plan leverages the materialized view.

How frequently is a materialized view being refreshed?

Each materialized view is automatically refreshed (recomputed) when the base table changes, but not more often than a system-defined interval (currently 30 minutes). In addition, users can also control the refresh frequency by setting up an automatic refresh, and also by forcing a manual refresh. Automatic refresh is more efficient and less error-prone than manually maintaining the equivalent of a materialized view at the application level. For details about refreshes, see Maintaining materialized views.

How do I find out if a query was rewritten by the optimizer to utilize a materialized view?

Inspect the query plan. If a query was rewritten by BigQuery to leverage a materialized view, then the query plan will contain a READ my_mv_table step, where my_mv_table is the name of the materialized view used by the query rewrite.

How do I manage materialized views through the Cloud Console?

Materialized views are listed on the Dataset page within the Cloud Console. As an example, the following shows how a materialized view's schema appears:

Materialized view schema in Google Cloud Console

How do I disable query rewrite for a particular query?

The BigQuery query optimizer will automatically rewrite your query to leverage a materialized view. One reason you might want to disable query rewrite is to compare results or performance, between a query with and without a materialized view.

To disable query rewrite, add a "WHERE RAND()<1" filter to your query over the base table. RAND() is a non-deterministic function, so the query will not be rewritten. Because RAND()<1 always evaluates to true, it will not affect your query result.

Does a query trigger an automatic refresh of the materialized view?

No, a query doesn't automatically trigger a materialized view refresh.

Why am I getting these errors?

The following lists details on common errors.

Error: "Materialized view query contains unsupported feature"

This error can occur during when you create a materialized view. Some causes are:

  • The query contains a HAVING clause:

    -- Results in an error
    SELECT date, COUNT(event) count FROM base_table GROUP BY date HAVING count >= 100
    
  • The query contains a computation on top of an aggregation function:

    -- Results in an error
    SELECT date, ROUND(SUM(net_paid)) as sum_paid FROM base_table GROUP BY date
    

Materialized views must contain only bare aggregation functions and must not use the HAVING clause. Any post-filtering and post-computation must be put on top of a materialized view. For example, by creating a logical view:

CREATE MATERIALIZED VIEW my_dataset.my_mv AS
SELECT date, SUM(net_paid) sum_paid
FROM my_project.my_dataset.my_base_table
GROUP BY date

CREATE VIEW my_dataset.my_view AS
SELECT date, ROUND(sum_paid) sum_paid
FROM my_project.my_dataset.my_mv
WHERE sum_paid > 10.0

-- Now you can query the regular view
SELECT ... FROM my_dataset.my_view

Error "An internal error occurred and the request could not be completed"

This error can occur when you create a materialized view. A common reason for this error is a post-computation present on top of certain aggregate functions, such as AVG, ARRAY_AGG, and APPROX_COUNT_DISTINCT.

Example of a statement that produces the error:

-- Results in an error
CREATE MATERIALIZED VIEW my_dataset.my_mv AS
SELECT ROUND(AVG(x)) avg_x,
  ARRAY_AGG(x ORDER BY z LIMIT 1)[OFFSET(0)] top_x,
    APPROX_COUNT_DISTINCT(x) * 10 approx_cnt_x
FROM my_project.my_dataset.my_table

Instead, move [OFFSET(0)] into a regular view on top of the materialized view:

CREATE MATERIALIZED VIEW my_dataset.my_mv AS
SELECT AVG(x) avg_x,
  ARRAY_AGG(x ORDER BY z LIMIT 1) top_x_arr,
  APPROX_COUNT_DISTINCT(x) approx_cnt_x
FROM my_project.my_dataset.my_table

CREATE VIEW my_dataset.my_view AS
SELECT ROUND(avg_x) as avg_x,
  top_x_arr[OFFSET(0)] as top_x,
  approx_cnt_x * 10 as approx_cnt_x
FROM my_project.my_dataset.my_mv

-- Now you can query the regular view
SELECT ... FROM my_dataset.my_view