Creating and using materialized views

This document describes how to create and use materialized views using BigQuery Materialized Views. Before you read this document, familiarize yourself with Introduction to materialized views.

Required permissions

Before you can work with a materialized view, you must have the required permissions for the materialized view. The following table lists the types of operations that you can perform on materialized views, the commands and methods that you can use to perform those operations, the required permissions for each operation, and the default roles that include those permissions.

For more information about BigQuery Identity and Access Management (IAM), see Predefined roles and permissions.

Operation type Command or method Required permissions Default roles
Create CREATE MATERIALIZED VIEW bq mk --materialized view tables.insert bigquery.tables.create bigquery.dataEditor bigquery.dataOwner bigquery.admin
Query Standard SQL query bq query tables.getData bigquery.dataViewer bigquery.dataEditor bigquery.dataOwner bigquery.admin
Update ALTER MATERIALIZED VIEW bq query bq update tables.patch tables.update bigquery.tables.get bigquery.tables.update bigquery.dataEditor bigquery.dataOwner bigquery.admin
Delete DROP MATERIALIZED VIEW bq query bq rm tables.delete bigquery.tables.get bigquery.tables.delete bigquery.dataEditor bigquery.dataOwner bigquery.admin
Manual refresh CALL BQ.REFRESH_MATERIALIZED_VIEW bigquery.tables.getData bigquery.tables.update bigquery.tables.updateData bigquery.dataEditor bigquery.dataOwner bigquery.admin

Creating materialized views

You can create BigQuery materialized views through the Google Cloud Console, the bq command-line tool, or the BigQuery API.

For purposes of these examples, assume your base table is named my_base_table and has this schema:

Column name Type
product_id integer
clicks integer

Also assume that you want a materialized view that summarizes the number of clicks per product_id. The following steps create a materialized view named my_mv_table that has the following schema.

Column name Type
product_id integer
sum_clicks integer

To create a materialized view:

Console

Data definition language (DDL) statements allow you to create and modify tables and views using standard SQL query syntax.

See more on Using data definition language statements.

To create a materialized view in the Cloud Console by using a DDL statement:

  1. In the Cloud Console, go to the BigQuery page.

    Go to BigQuery

  2. Click Compose new query.

  3. Type your CREATE MATERIALIZED VIEW DDL statement into the Query editor text area.

    CREATE MATERIALIZED VIEW  project-id.my_dataset.my_mv_table
    AS SELECT product_id, SUM(clicks) AS sum_clicks
    FROM  project-id.my_dataset.my_base_table
    GROUP BY 1
    

    where:

    • project-id is your project ID.
    • my_dataset is the ID of a dataset in your project.
    • my_mv_table is the ID of the materialized view that you're creating.
    • my_base_table is the ID of a table in your dataset that serves as the base table for your materialized view.
    • product_id is a column from the base table.
    • clicks is a column from the base table.
    • sum_clicks is a column in the materialized view that you are creating.
  4. Click Run.

Unless you disable automatic refreshes, BigQuery starts an asynchronous full refresh for the materialized view. The query might return success right away, but the initial refresh might still be running. When the materialized view is successfully created, it appears in the Datasets pane.

bq

Use the bq query command and supply the DDL statement as the query parameter.

bq query --use_legacy_sql=false '
CREATE MATERIALIZED VIEW project-id.my_dataset.my_mv_table
AS SELECT product_id, SUM(clicks) AS sum_clicks
FROM  project-id.my_dataset.my_base_table
GROUP BY 1'

where:

  • project-id is your project ID.
  • my_dataset is the ID of a dataset in your project.
  • my_mv_table is the ID of the materialized view that you're creating.
  • my_base_table is the ID of a table in your dataset that serves as the base table for your materialized view.
  • product_id is a column from the base table.
  • clicks is a column from the base table.
  • sum_clicks is a column in the materialized view that you are creating.

Unless you disable automatic refreshes, BigQuery starts an asynchronous full refresh for the materialized view. The query might return success right away, but the initial refresh might still be running. When the materialized view is successfully created, it appears in the Datasets pane.

Alternatively, you can use the bq mk command with the --materialized_view argument to create a materialized view. The following arguments work with the --materialized_view argument:

  • --enable_refresh: Specifies whether automatic refresh is enabled.
  • --refresh_interval_ms: Specifies the time, in milliseconds, for the refresh interval.

API

Call the tables.insert method with a defined materializedView resource as part of your API request. The materializedView resource contains a query field. For example:

{
  "kind": "bigquery#table",
  "tableReference": {
    "projectId": "project-id",
    "datasetId": "my_dataset",
    "tableId": "my_mv_table"
  },
  "materializedView": {
    "query": "select product_id,sum(clicks) as
                sum_clicks from project-id.my_dataset.my_base_table
                group by 1"
  }
}

where:

  • project-id is your project ID.
  • my_dataset is the ID of a dataset in your project.
  • my_mv_table is the ID of the materialized view that you're creating.
  • my_base_table is the ID of a table in your dataset that serves as the base table for your materialized view.
  • product_id is a column from the base table.
  • clicks is a column from the base table.
  • sum_clicks is a column in the materialized view that you are creating.

Unless you disable automatic refreshes, BigQuery starts an asynchronous full refresh for the materialized view. The query might return success right away, but the initial refresh might still be running. When the materialized view is successfully created, it appears in the Datasets pane.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.MaterializedViewDefinition;
import com.google.cloud.bigquery.TableId;
import com.google.cloud.bigquery.TableInfo;

// Sample to create materialized view
public class CreateMaterializedView {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    String materializedViewName = "MY_MATERIALIZED_VIEW_NAME";
    String query =
        String.format(
            "SELECT MAX(TimestampField) AS TimestampField, StringField, "
                + "MAX(BooleanField) AS BooleanField "
                + "FROM %s.%s GROUP BY StringField",
            datasetName, tableName);
    createMaterializedView(datasetName, materializedViewName, query);
  }

  public static void createMaterializedView(
      String datasetName, String materializedViewName, String query) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(datasetName, materializedViewName);

      MaterializedViewDefinition materializedViewDefinition =
          MaterializedViewDefinition.newBuilder(query).build();

      bigquery.create(TableInfo.of(tableId, materializedViewDefinition));
      System.out.println("Materialized view created successfully");
    } catch (BigQueryException e) {
      System.out.println("Materialized view was not created. \n" + e.toString());
    }
  }
}

Access control

You can grant access to a materialized view at the dataset level, the view level, or the column level. You can also set access at a higher level in the IAM resource hierarchy.

Querying materialized views

Query your materialized views directly, the same way you query a regular table or standard view.

When you query a base table that has a materialized view, the query optimizer might automatically rewrite the query to use the cached result that is stored in the materialized view. The query plan shows that the query was rewritten to use the materialized view.

If multiple materialized views can be used to rewrite a query, the materialized view with the smallest estimated number of rows to be scanned is used.

If you query a materialized view that is not up-to-date, data from the materialized view is combined with delta changes from the base table to compute up-to-date results. Reasons that a materialized view might not be up-to-date include:

  • The materialized view is not set up for automatic refresh because the user decided to control refreshes manually.

  • The base table is updated too frequently, so the frequency refresh cap is being enforced.

  • The base table was changed only seconds ago, so the materialized view hasn't yet caught up with the base table.

Even if a materialized view is not up-to-date due to one or more of these reasons, queries on the materialized view are still up-to-date, because changes from the base table are also incorporated in the query results.

For information about running queries, see Overview of querying BigQuery data.

Altering a materialized view

You can alter a materialized view through the Cloud Console or the bq command-line tool, by using DDL with ALTER MATERIALIZED VIEW and SET OPTIONS.

The following shows an example that sets enable_refresh to true. Adjust as needed for your use case.

Console

To alter a materialized view in the Cloud Console by using a DDL statement:

  1. In the Cloud Console, open the BigQuery page.

Go to BigQuery

  1. Click Compose new query.

  2. Type your ALTER MATERIALIZED VIEW DDL statement into the Query editor text area.

    ALTER MATERIALIZED VIEW project-id.my_dataset.my_mv_table
    SET OPTIONS (enable_refresh=true)
    

    where:

    • project-id is your project ID.
    • my_dataset is the ID of a dataset in your project.
    • my_mv_table is the ID of the materialized view that you're altering.
  3. Click Run.

bq

Use the bq query command and supply the DDL statement as the query parameter.

ALTER MATERIALIZED VIEW project-id.my_dataset.my_mv_table
SET OPTIONS (enable_refresh=true)

where:

  • project-id is your project ID.
  • my_dataset is the ID of a dataset in your project.
  • my_mv_table is the ID of the materialized view that you're altering.

Alternatively, you can run the bq update command.

bq update \
--enable_refresh=true \
--refresh_interval_ms= \
project-id.my_dataset.my_mv_table

Java

Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

public class QueryMaterializedView {

  public static void main(String[] args) throws InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String materializedViewName = "MY_MATERIALIZED_VIEW_NAME";
    String query = String.format("SELECT * FROM %s.%s", datasetName, materializedViewName);
    queryMaterializedView(query);
  }

  public static void queryMaterializedView(String query) throws InterruptedException {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableResult results = bigquery.query(QueryJobConfiguration.of(query));
      results
          .iterateAll()
          .forEach(row -> row.forEach(val -> System.out.printf("%s\n", val.toString())));

      System.out.println("Query performed successfully.");
    } catch (BigQueryException e) {
      System.out.println("Query was not performed. \n" + e.toString());
    }
  }
}

Manipulating a materialized view

A materialized view can be manipulated only by the CREATE, DROP, or ALTER statements, along with their API equivalents, and a manual refresh.

The following operations are not allowed on a materialized view.

  • Running Copy, Import, or Export jobs where either the source or the destination is a materialized view.

  • Writing query results into a materialized view.

  • Calling tabledata.list.

  • Using the BigQuery Storage Read API.

The following example alters a material view:

Java

Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.MaterializedViewDefinition;
import com.google.cloud.bigquery.Table;
import com.google.cloud.bigquery.TableId;

// Sample to alter materialized view
public class AlterMaterializedView {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String materializedViewName = "MY_MATERIALIZED_VIEW_NAME";
    alterMaterializedView(datasetName, materializedViewName);
  }

  public static void alterMaterializedView(String datasetName, String materializedViewName) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(datasetName, materializedViewName);

      // Get existing materialized view
      Table table = bigquery.getTable(tableId);
      MaterializedViewDefinition materializedViewDefinition = table.getDefinition();
      // Alter materialized view
      materializedViewDefinition
          .toBuilder()
          .setEnableRefresh(true)
          .setRefreshIntervalMs(1000L)
          .build();
      table.toBuilder().setDefinition(materializedViewDefinition).build().update();
      System.out.println("Materialized view altered successfully");
    } catch (BigQueryException e) {
      System.out.println("Materialized view was not altered. \n" + e.toString());
    }
  }
}

Deleting a materialized view

You can delete a materialized view through the Cloud Console, the bq command-line tool, or the API.

Console

To delete a materialized view in the Cloud Console by using a DDL statement:

  1. In the Cloud Console, open the BigQuery page.

Go to BigQuery

  1. Click Compose new query.

  2. Type your DELETE MATERIALIZED VIEW DDL statement into the Query editor text area.

    DROP MATERIALIZED VIEW project-id.my_dataset.my_mv_table
    

    where:

    • project-id is your project ID.
    • my_dataset is the ID of a dataset in your project.
    • my_mv_table is the ID of the materialized view that you're deleting.
  3. Click Run.

bq

Use the bq query command and supply the DDL statement as the query parameter.

bq query --use_legacy_sql=false '
DROP MATERIALIZED VIEW
project-id.my_dataset.my_mv_table'

where:

  • project-id is your project ID.
  • my_dataset is the ID of a dataset in your project.
  • my_mv_table is the ID of the materialized view that you're deleting.

Alternatively, you can use the bq rm command to delete the materialized view.

API

Call the tables.delete method and specify values for the projectId, datasetId, and tableId parameters:

  • Assign the projectId parameter to your project ID.
  • Assign the datasetId parameter to your dataset ID.
  • Assign the tableId parameter to the table ID of the materialized view that you're deleting.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.TableId;

// Sample to delete materialized view
public class DeleteMaterializedView {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String materializedViewName = "MY_MATERIALIZED_VIEW_NAME";
    deleteMaterializedView(datasetName, materializedViewName);
  }

  public static void deleteMaterializedView(String datasetName, String materializedViewName) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(datasetName, materializedViewName);

      boolean success = bigquery.delete(tableId);
      if (success) {
        System.out.println("Materialized view deleted successfully");
      } else {
        System.out.println("Materialized view was not found");
      }
    } catch (BigQueryException e) {
      System.out.println("Materialized view was not found. \n" + e.toString());
    }
  }
}

Monitoring materialized views

You can get information about materialized views and materialized views refresh jobs by using the BigQuery API.

Materialized views monitoring

For discovery of materialized views, call the tables.list method or query the INFORMATION_SCHEMA.TABLES table.

To retrieve the properties of a materialized view, call the tables.get method, or query the INFORMATION_SCHEMA.TABLE_OPTIONS table.

Materialized views are not listed in the INFORMATION_SCHEMA.VIEWS table.

Materialized views refresh jobs monitoring

For discovery of materialized view refresh jobs, list them by calling the jobs.list method. To retrieve details about the jobs, call the jobs.get method. Automatic refresh jobs contain the materialized_view_refresh prefix within the job ID, and they are started by a BigQuery admin account.

For example:

SELECT job_id, total_slot_ms, total_bytes_processed
FROM region-US.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE job_id LIKE "%materialized_view_refresh_%"
LIMIT 10

Supported materialized views

Materialized views must be an aggregation over a single table. The use of GROUP BY is optional.

Materialized views use a restricted SQL syntax. Queries must use the following pattern:

SELECT
  expression [ [ AS ] alias ] ] [, ...]
FROM from_item [, ...]
[ WHERE bool_expression ]
[ GROUP BY expression [, ...] ]

Only the following aggregation functions are currently supported:

  • ANY_VALUE (but not over STRUCT)
  • APPROX_COUNT_DISTINCT
  • ARRAY_AGG (but not over ARRAY or STRUCT)
  • AVG
  • BIT_AND
  • BIT_OR
  • BIT_XOR
  • COUNT
  • COUNTIF
  • HLL_COUNT.INIT
  • LOGICAL_AND
  • LOGICAL_OR
  • MAX
  • MIN
  • SUM

Certain subqueries are supported, as long as they don't add any computation or filtering on top of an aggregation. For example, the following SQL queries are supported:

WITH tmp AS (
  SELECT TIMESTAMP_TRUNC(ts, HOUR) as ts_hour, *
  FROM dataset.table
)
SELECT ts_hour, COUNT(*) as cnt
FROM tmp
GROUP BY ts_hour

WITH tmp AS (
  SELECT TIMESTAMP_TRUNC(ts, HOUR) as ts_hour, COUNT(*) as cnt
  FROM dataset.table
  GROUP BY 1
)
SELECT * FROM tmp

SELECT ts_hour, COUNT(*) as cnt
FROM (
  SELECT TIMESTAMP_TRUNC(ts, HOUR) as ts_hour, *
  FROM dataset.table
)
GROUP BY ts_hour

SELECT * FROM (
  SELECT TIMESTAMP_TRUNC(ts, HOUR) as ts_hour, COUNT(*) as cnt
  FROM dataset.table
  GROUP BY 1
)

The following SQL query is not supported for materialized views, because it applies computation on top of an aggregation:

-- Not supported for a materialized view
WITH tmp AS (
  SELECT TIMESTAMP_TRUNC(ts, HOUR) as ts_hour, COUNT(*) as cnt
  FROM dataset.table
  GROUP BY 1
)
SELECT TIMESTAMP_TRUNC(ts_hour, DAY) as ts_day, cnt+1 as cnt FROM tmp

The following SQL query is not supported for materialized views, because it applies filtering on top of an aggregation:

-- Not supported for a materialized view
SELECT ts_hour, cnt
FROM (
  SELECT TIMESTAMP_TRUNC(ts, HOUR) as ts_hour, COUNT(*) as cnt
  FROM dataset.table
  GROUP BY 1
)
WHERE ts_hour != TIMESTAMP(DATE('2020-01-01'))

FROM clause

The FROM clause must include a single table and can unnest one or more array expressions.

Examples

FROM mytable

FROM mytable AS t, t.struct_column.array_field AS x

FROM mytable AS t LEFT JOIN UNNEST(t.array_column) AS x

FROM mytable AS t, t.array_column AS x, x.array_field AS y

FROM mytable AS t, UNNEST(SPLIT(t.string_column)) AS x

WITH OFFSET is not supported.

Supported query rewrite patterns

The following query rewrite patterns are supported for base table queries and materialized view definitions.

Pattern 1

This pattern shows that grouping keys and aggregators in the query are a subset of those in the materialized view.

For this base table query:

SELECT
  ss_sold_date_sk,
  SUM(ss_net_profit) AS sum_profit
FROM store_sales
GROUP BY 1

The following is a materialized view query that shows the base table query's grouping keys and aggregators as a subset of the materialized view's query.

SELECT
  ss_store_sk,
  ss_sold_date_sk,
  SUM(ss_net_paid) AS sum_paid,
  SUM(ss_net_profit) AS sum_profit,
  COUNT(*) AS cnt_sales
FROM store_sales
GROUP BY 1, 2

Pattern 2

This pattern shows grouping keys in the base table query can be computed from grouping keys in the materialized view.

For this base table query:

SELECT
  DATE_TRUNC(DATE(l_shipdate) MONTH) as shipmonth,
  SUM(l_extendedprice) as sum_price
FROM lineitem
GROUP BY 1

The following is a materialized view query that computes grouping keys in the base table:

SELECT
  DATE(l_shipdate) as shipdate,
  SUM(l_extendedprice) as sum_price
FROM lineitem
GROUP BY 1

Pattern 3

This patterns shows filter expressions in the query can be derived from grouping keys in the materialized view, or match the materialized view exactly.

For this base table query:

SELECT
  DATE_TRUNC(DATE(l_shipdate) MONTH) as shipmonth,
  SUM(l_extendedprice) as sum_price
FROM lineitem
WHERE DATE(l_shipdate) >= DATE(2016, 01, 01) AND l_shipmode = "AIR"
GROUP BY 1

The following is a materialized view query that derives a filter expression from the base table:

SELECT
  DATE(l_shipdate) as shipdate,
  SUM(l_extendedprice) as sum_price
FROM lineitem
WHERE l_shipmode = "AIR"
GROUP BY 1

Pattern 4

This pattern shows expressions in the base table query select a subset of the data in the materialized view.

For this base table query:

SELECT
  l_discount,
  COUNT(*) as cnt
FROM lineitem
WHERE l_discount > 20.0
GROUP BY 1

The following is a materialized view query whose results are a superset of the base table query:

SELECT
  l_discount,
  COUNT(*) as cnt
FROM lineitem
WHERE l_discount IS NOT NULL
GROUP BY 1

Incremental updates

In cases where the base table keeps changing with appends only, the query that uses the materialized view scans all of the materialized view data, plus a delta in the base table since the last refresh. This applies whether the materialized view is explicitly referenced or selected by the query optimizer, resulting in faster and less expensive queries in both cases.

However, if there were any updates or deletions in the base table since the last materialized view refresh, then the materialized view might not be scanned.

The following are examples of actions that cause an update or deletion:

  • DML UPDATE
  • DML MERGE
  • DML DELETE
  • truncation
  • partition expiration
  • console, bq command-line, and API equivalents of the preceding items in this list

If these actions occur, the materialized view query might not achieve any savings, until the next view refresh occurs. In effect, any update or deletion in the base table invalidates some portion of the materialized view state. For unpartitioned views, the entire view is invalidated. For partitioned views, only the affected partitions are invalidated in most cases.

Note that data from the BigQuery streaming buffer of the base table is not saved into a materialized view. A streaming buffer is still being scanned fully, regardless of whether a materialized view is used.

Working with partitioned and clustered tables

If the base table is partitioned, then you can partition a materialized view on the same partitioning column. For time-based partitions, the granularity must match (hourly, daily, monthly, or yearly). For integer-range partitions, the range specification must exactly match. You cannot partition a materialized view over a non-partitioned base table.

If the base table is partitioned by ingestion time, then a materialized view can group by the _PARTITIONDATE column of the base table, and also partition by it.

Materialized views can be clustered by arbitrary columns, subject to the BigQuery clustered table limitations.

Example 1

In this example, the base table is partitioned on the transaction_time column with daily partitions. The materialized view is partitioned on the same column and clustered on the employee_id column.

CREATE TABLE project-id.my_dataset.my_base_table(
  employee_id INT64,
  transaction_time TIMESTAMP)
PARTITION BY DATE(transaction_time)
OPTIONS (partition_expiration_days = 2);

CREATE MATERIALIZED VIEW project-id.my_dataset.my_mv_table
PARTITION BY DATE(transaction_time)
CLUSTER BY employee_id
AS SELECT
  employee_id,
  transaction_time,
  COUNT(1) AS cnt
FROM my_dataset.my_base_table
GROUP BY 1, 2;

Example 2

In this example, the base table is partitioned by ingestion time with daily partitions. The materialized view selects the ingestion time as a column named date. The materialized view is grouped by the date column and partitioned by the same column.

CREATE TABLE project-id.my_dataset.my_base_table(
  employee_id INT64)
PARTITION BY _PARTITIONDATE
OPTIONS (partition_expiration_days = 2);

CREATE MATERIALIZED VIEW project-id.my_dataset.my_mv_table
PARTITION BY date
CLUSTER BY employee_id
AS SELECT
  employee_id,
  _PARTITIONDATE as date,
  COUNT(1) AS count
FROM my_dataset.my_base_table
GROUP BY 1, 2;

Example 3

In this example, the base table is partitioned on a TIMESTAMP column named transaction_time, with daily partitions. The materialized view defines a column named transaction_hour, using the TIMESTAMP_TRUNC function to truncate the value to the nearest hour. The materialized view is grouped by transaction_hour and also partitioned by it.

Note the following:

  • The truncation function that is applied to the partitioning column must be at least as granular as the partitioning of the base table. For example, if the base table uses daily partitions, the truncation function cannot use MONTH or YEAR granularity.

  • In the materialized view's partition specification, the granularity has to match the base table.

CREATE TABLE project-id.my_dataset.my_base_table(
  employee_id INT64,
  transaction_time TIMESTAMP)
PARTITION BY DATE(transaction_time);

CREATE MATERIALIZED VIEW project-id.my_dataset.my_mv_table
PARTITION BY DATE(transaction_hour)
AS SELECT
  employee_id,
  TIMESTAMP_TRUNC(transaction_time, HOUR) as transaction_hour,
  COUNT(1) AS cnt
FROM my_dataset.my_base_table
GROUP BY 1, 2;

Partition alignment

If a materialized view is partitioned, BigQuery ensures that its partitions are aligned with the partitions of the base table. For example, a row from partition 2020-01-01 of the base table is being aggregated to a row from partition 2020-01-01 of the materialized view.

Partition alignment helps guarantee that a materialized view is maintained efficiently. During refresh, the only partitions of the materialized view that are updated are those that have appends, updates, or deletions in the corresponding partitions of the base table. (Although in rare cases, a deletion or update in the base table might cause the entire view to be recomputed.) At query time, partitioning alignment ensures that a partition filter is applied to both the materialized view scan and the base table scan.

Partition expiration

Partition expiration can't be set on materialized views. A materialized view implicitly inherits the partition expiration time from the base table. Materialized view partitions are aligned with the base table partitions, so they expire synchronously.

After a materialized view is created on top of a partitioned base table, the table's partition expiration cannot be changed. To change the partition expiration of the base table, you must first delete all materialized views that you created over that table.

Refreshing materialized views

You can specify whether BigQuery uses automatic or manual refreshes to update precomputed results from a base table. If you don't want to use the default values, then you can configure the refresh settings when you create a materialized view. You can also alter the refresh settings after the materialized view is created.

You can manually refresh a materialized view at any time.

Automatic refresh

By default, materialized views are automatically refreshed within 5 minutes of a change to the base table, but no more frequently than every 30 minutes. Examples of changes include row insertions or row deletions.

Automatic refresh can be enabled or disabled at any time.

To turn automatic refresh off when you create a table, set enable_refresh to false.

CREATE MATERIALIZED VIEW project-id.my_dataset.my_mv_table
PARTITION BY RANGE_BUCKET(column, buckets)
OPTIONS (enable_refresh = false)
AS SELECT ...

For an existing materialized view, you can modify the enable_refresh value using ALTER MATERIALIZED VIEW.

ALTER MATERIALIZED VIEW project-id.my_dataset.my_mv_table
SET OPTIONS (enable_refresh = true)

Even if automatic refresh is disabled, the materialized view can be manually refreshed at any time.

Setting the frequency cap

You can configure a frequency cap on how often automatic refresh is run. By default, materialized views are refreshed no more often than every 30 minutes.

The refresh frequency cap can be changed at any time.

To set a refresh frequency cap when you create a materialized view, set refresh_interval_minutes in DDL (or refresh_interval_ms in the API and bq command-line tool), to the value you want.

CREATE MATERIALIZED VIEW project-id.my_dataset.my_mv_table
OPTIONS (enable_refresh = true, refresh_interval_minutes = 60)
AS SELECT ...

Similarly, you can set the frequency cap when you modify a table. This example assumes you have already enabled automatic refresh, and just want to change the frequency cap:

ALTER MATERIALIZED VIEW project-id.my_dataset.my_mv_table
SET OPTIONS (refresh_interval_minutes = 60)

The minimum refresh frequency cap is 1 minute. The maximum refresh frequency cap is 7 days.

You can perform a manual refresh of a materialized view at any time, and its timing is not subject to the frequency cap.

Best-effort

Automatic refresh is performed on a best-effort basis. BigQuery attempts to start a refresh within 5 minutes of a change in the base table (if the previous refresh was done earlier than 30 minutes ago), but it doesn't guarantee that the refresh will be started at that time, nor does it guarantee when it will complete. (Querying materialized views reflects the latest state of the base table, but if the view wasn't refreshed recently, the query cost/latency can be higher than expected.)

Automatic refresh is treated similarly to a query with batch priority. If the materialized view's project does not have the capacity at the moment, the refresh is delayed. If the project contains many views whose refresh turns out to be expensive, each individual view may lag significantly relative to its base table.

Manual refresh

To update the data in the materialized view, call the BQ.REFRESH_MATERIALIZED_VIEW system procedure at any time. When this procedure is called, BigQuery identifies the changes that have taken place in the base table and applies those changes to the materialized view. The query to run BQ.REFRESH_MATERIALIZED_VIEW finishes when the refresh is complete.

CALL BQ.REFRESH_MATERIALIZED_VIEW('project-id.my_dataset.my_mv_table')

Next steps