Maintain freshness of in-memory column store data

This page describes how to maintain the freshness of data in the columnar engine's column store in AlloyDB Omni in a container.

As table rows are added, deleted, or updated, the columnar engine doesn't immediately update data in the column store. Instead, it marks affected content in the column store as invalid, and queries are planned and executed using both column-oriented data from the column store and row-oriented data from the row cache and database storage.

Over time, as more and more data blocks become invalid, query performance improvements provided by the columnar engine are reduced to the point where the columnar data needs to be refreshed to regain query performance improvements. When this refresh threshold is reached, the columnar engine uses a background job to refresh the invalid data in the column store. By default, the columnar engine uses a refresh threshold of 50, indicating that given content in the column store gets marked for refresh when 50% of its content has become invalid. You can change this refresh threshold to adjust the overhead of the automatic refresh and potentially gain better ongoing query performance for your application workload.

Additionally, you can manually refresh a table's data in the column store at any time. Usually, you perform a manual refresh when you want changes to a table to be reflected in the column store as soon as possible after making the changes.

Change the refresh threshold of the columnar engine

Edit the value of the google_columnar_engine.refresh_threshold_percentage database flag to change the refresh threshold of the columnar engine.

    ALTER SYSTEM SET google_columnar_engine.refresh_threshold_percentage=THRESHOLD;

THRESHOLD: an integer value between 1 and 100. This value specifies the percentage of data blocks in content that must become invalid for the content to be marked for refresh.

Manually refresh a table's data in the column store

Execute the google_columnar_engine_refresh SQL function to manually refresh a table's data in the in-memory column store.

SELECT google_columnar_engine_refresh(TABLE_NAME);

TABLE_NAME: a string containing the name of the table or the materialized view. If the table or the view is in a schema other than public, specify the name of the schema using the format SCHEMA_NAME.TABLE_NAME—for example, myschema.mytable.