This page provides an overview of the columnar engine that AlloyDB for PostgreSQL offers and shows how to use it.
The AlloyDB columnar engine accelerates SQL query processing of scans, joins, and aggregates by providing these components:
A column store that contains table and materialized-view data for selected columns, reorganized into a column-oriented format.
A columnar query planner and execution engine to support use of the column store in queries.
The columnar engine can be used on the primary instance, a read pool instance, or both. You can also use auto-columnarization to analyze your workload and automatically populate the column store with the columns that provide the best performance gain.
To use the columnar engine with a specific query, all columns in that query fragments, such as joins, scans, etc., must be in the column store.
By default, the columnar engine is set to use 30% of your instance's memory. Depending on your workload, memory usage, and if you have a read pool configured, you may choose to reduce the columnar engine memory allocation on your primary instance and allocate more memory to the read pool instance. To view and monitor memory usage by the columnar engine, see View column store memory usage. To modify the memory size used by the column store, see Configure the size of the column store. To find the recommended columnar engine memory size for your instance, see Recommend column store memory size.
Query types that benefit from the columnar engine
Certain queries can benefit from the columnar engine. The following is a list of operations and their query patterns that benefit most from the columnar engine:
Table scans
- It has selective filters, such as
WHERE
clauses. - It uses a small number of columns from a larger table or materialized view.
- It uses expressions such as
LIKE
,SUBSTR
, or `TRIM.
- It has selective filters, such as
Aggregation functions
- They only use the following expressions:
SUM
,MIN
,MAX
,AVG
, andCOUNT
. - They are at the beginning of the query of a columnar scan.
- They are ungrouped, or are grouped-by columns.
- They only use the following expressions:
ORDER-BY
: only if the operator is at the beginning of the query of a columnar scan.SORT
: only if the operator is at the beginning of the query of a columnar scan and sorts only on the base columns of the table or the materialized view.LIMIT
: only if the operator is at the beginning of the query of a columnar scan and is before anySORT
orGROUP BY
operators.INNER HASH JOIN
only if the keys used are columns and no join qualifiers are used.Selective joins only if the joins are at the beginning of the query of a columnar scan.
For more information on which queries work best with the columnar engine, whether
the columnar engine was used by a query, and how it was used, see
Verify usage of the columnar engine using EXPLAIN
.
How to use the columnar engine
To use the columnar engine in an AlloyDB instance, you perform these high-level steps:
Enable the engine on the instance.
Enabling the engine is a one-time operation and requires a restart.
Add columns to the column store.
To add columns to the column store, use one of the following methods:
Use auto-columnarization, which analyzes your workload and automatically adds columns.
Add the columns manually based on your knowledge of the workload on the databases in the instance.
You can track what's in the column store using the
g_columnar_relations
view, and, after columns have been added, you can use theEXPLAIN
statement to verify usage of the columnar engine in SQL queries.
For detailed instructions on how to use the columnar engine, see Configure the columnar engine.
What data you can add to the column store
There are some limitations on the data types and data sources you can use when adding columns to the column store.
Supported data types
The columnar engine supports only columns with the following built-in data types:
array
bigint
boolean
bytea
char
date
decimal
double precision
enum
float4
float8
integer
json
jsonb
numeric
real
serial
short
smallint
text
timestamp
timestamptz
uuid
varchar
The columnar engine ignores any attempts to manually add columns with unsupported data types to the column store.
Unsupported data sources
The columnar engine does not support tables or materialized views with the following attributes as data sources:
Non-leaf partitioned tables
Foreign tables
Tables or views with fewer than 5,000 rows
Columnar Engine limitations
- If you're running an analytical query on a column that has an index, the AlloyDB optimizer may choose to use row-store.
- Columns added manually to the column store aren't automatically removed. To force remove
manually added columns, use
google_columnar_engine_drop()
on your instance. - Auto-columnarization may dynamically add and remove columns based on query usage.
- Not all data types are supported by the columnar engine. To see supported data types, see supported data types.
Frequent updates to rows invalidate columnar data. To validate a table or a materialized view in the columnar store, you can either reduce the update frequency, or schedule more frequent the columnar engine refreshes.
You can compare the
invalid_block_count
andtotal_block_count
columns in g_columnar_relations to check if your table or view is impacted. If you have a frequent or high-volume changes to your table or view, theinvalid_block_count
will be high.