AlloyDB Omni metrics

AlloyDB Omni on Kubernetes supports various metrics for monitoring the health and performance of the database. These metrics are exposed in a format suitable for scraping by Prometheus.
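As an illustration of what a scraper receives, the following minimal sketch parses Prometheus text exposition lines and keeps only the samples whose names start with alloydb_omni. The scrape text, the label values, and the helper name parse_samples are assumptions made for this example, not output captured from a real deployment. The parser handles only simple sample lines and does not unescape label values.

```python
import re

# Hypothetical scrape output. The label values are invented for illustration.
SAMPLE_SCRAPE = """\
# TYPE alloydb_omni_database_postgresql_up gauge
alloydb_omni_database_postgresql_up{dbnamespace="default",dbcluster="my-cluster",dbnode="my-node-0"} 1
# TYPE alloydb_omni_database_postgresql_backends gauge
alloydb_omni_database_postgresql_backends{dbnamespace="default",dbcluster="my-cluster",dbnode="my-node-0"} 12
# TYPE promhttp_metric_handler_errors_total counter
promhttp_metric_handler_errors_total{cause="encoding"} 0
"""

# metric_name, optional {label="value",...} block, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, prefix="alloydb_omni"):
    """Return (name, labels, value) tuples for samples whose name starts with prefix."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, raw_labels, value = match.groups()
        if not name.startswith(prefix):
            continue
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_samples(SAMPLE_SCRAPE):
    print(name, labels.get("dbnode"), value)
```

In practice a Prometheus server or an existing client library does this parsing for you; the sketch only shows the shape of the data.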

Labels

AlloyDB Omni on Kubernetes exposes the following types of labels.

Resource labels

AlloyDB Omni on Kubernetes exposes the following resource labels, which uniquely identify the database container that the metrics belong to. Each label value matches the name of the Kubernetes resource that owns the database container:

| Label key | Label value |
| --- | --- |
| dbnamespace | Namespace of the dbcluster CR. |
| dbcluster | Name of the dbcluster CR. |
| dbinstance | Name of the dbinstance CR. Only the dbinstance of ReadPool type is supported. If the database container does not belong to a ReadPool dbinstance, this value is n/a. |
| dbnode | Name of the instance CR. Every instance CR has a one-to-one mapping to a database container. |
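Because the four resource labels together identify one database container, they can serve as a grouping key when you process scraped samples. The following sketch assumes samples are already available as (name, labels, value) tuples; the sample values and the helper names container_key and group_by_container are hypothetical.

```python
from collections import defaultdict

# The four resource labels listed in the table above.
RESOURCE_LABELS = ("dbnamespace", "dbcluster", "dbinstance", "dbnode")


def container_key(labels):
    """Build a hashable key from the resource labels (missing labels become "")."""
    return tuple(labels.get(key, "") for key in RESOURCE_LABELS)


def group_by_container(samples):
    """Group (name, labels, value) samples by the database container they belong to."""
    grouped = defaultdict(list)
    for name, labels, value in samples:
        grouped[container_key(labels)].append((name, value))
    return dict(grouped)


# Hypothetical samples for two database containers in the same dbcluster.
node_0 = {"dbnamespace": "default", "dbcluster": "my-cluster",
          "dbinstance": "n/a", "dbnode": "my-node-0"}
node_1 = {"dbnamespace": "default", "dbcluster": "my-cluster",
          "dbinstance": "n/a", "dbnode": "my-node-1"}
samples = [
    ("alloydb_omni_database_postgresql_up", node_0, 1.0),
    ("alloydb_omni_database_postgresql_backends", node_0, 12.0),
    ("alloydb_omni_database_postgresql_up", node_1, 1.0),
]

grouped = group_by_container(samples)
for key, metrics in grouped.items():
    print(key, metrics)
```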

System metadata labels

System metadata labels change dynamically when the role of the database container changes. For example, when your dbcluster is promoted from secondary to primary, dbcluster_type changes from Secondary to Primary.

| Label key | Label value |
| --- | --- |
| dbcluster_type | Disaster recovery (DR) role of the dbcluster CR. Can be Primary or Secondary. |
| dbinstance_type | Type of the dbinstance CR. If the container belongs to a ReadPool dbinstance, this value is ReadPool; otherwise, this value is n/a. |
| dbnode_type | High availability (HA) role of the dbnode. Can be Primary or Standby. |
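Because these label values can change between scrapes, it can be useful to compare two scrapes and report which roles changed. The following sketch assumes each scrape has been reduced to a dictionary that maps a dbnode name to its label set; the node names and the helper name role_changes are hypothetical.

```python
# The three system metadata labels listed in the table above.
SYSTEM_METADATA_LABELS = ("dbcluster_type", "dbinstance_type", "dbnode_type")


def role_changes(before, after):
    """List (dbnode, label, old_value, new_value) for labels that differ between scrapes."""
    changes = []
    for dbnode, new_labels in sorted(after.items()):
        old_labels = before.get(dbnode)
        if old_labels is None:
            continue  # node was not present in the earlier scrape
        for key in SYSTEM_METADATA_LABELS:
            old, new = old_labels.get(key), new_labels.get(key)
            if old != new:
                changes.append((dbnode, key, old, new))
    return changes


# Hypothetical scrapes taken before and after a secondary dbcluster is promoted.
before = {
    "my-node-0": {"dbcluster_type": "Secondary", "dbinstance_type": "n/a", "dbnode_type": "Primary"},
    "my-node-1": {"dbcluster_type": "Secondary", "dbinstance_type": "n/a", "dbnode_type": "Standby"},
}
after = {
    "my-node-0": {"dbcluster_type": "Primary", "dbinstance_type": "n/a", "dbnode_type": "Primary"},
    "my-node-1": {"dbcluster_type": "Primary", "dbinstance_type": "n/a", "dbnode_type": "Standby"},
}

changes = role_changes(before, after)
for change in changes:
    print(change)
```

One consequence of this behavior is that a time series keyed on system metadata labels splits into a new series when a role changes, so long-term aggregation is usually safer when keyed on the resource labels alone.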

Metric labels

The labels specific to each metric are listed in the metric tables that follow. For example, the database label holds the name of a Postgres database hosted inside the AlloyDB Omni database container.

Metrics

AlloyDB Omni on Kubernetes exposes the following metrics. The tables list only metric-specific labels; the resource and system metadata labels are described in Labels. All AlloyDB Omni metric names start with alloydb_omni. To learn more about metric types, see Metric types.

Database container-level metrics

These metrics are collected per AlloyDB Omni database container. All these metrics have resource and system metadata labels.

| Name | Description | Labels | Unit | Type |
| --- | --- | --- | --- | --- |
| alloydb_omni_database_postgresql_backends | The number of active and idle connections to the AlloyDB Omni instance. | | | gauge |
| alloydb_omni_database_postgresql_max_connections | The current value of the Postgres max_connections runtime parameter. | | | gauge |
| alloydb_omni_database_postgresql_up | Whether the Postgres main process is running: 1 if running, 0 if down. | | | gauge |
| alloydb_omni_database_postgresql_uptime_second | Time elapsed since the Postgres main process started. | | second | gauge |
| alloydb_omni_database_postgresql_vacuum_oldest_transaction_age | The current age of the oldest uncommitted transaction that is blocking the vacuum operation, measured in the number of transactions that started after the oldest transaction. | type: one of [running, prepared, replication_slot, replica] | | gauge |
| alloydb_omni_database_postgresql_vacuum_transaction_id_utilization_percentage | The ratio of transaction ID space consumed. 1 means 100%. | | | gauge |
| alloydb_omni_instance_postgresql_backends_by_state | Current number of connections. | state: State of the connections, one of [idle, active, idle_in_transaction, idle_in_transaction_aborted, disabled, fastpath_function_call] | | gauge |
| alloydb_omni_instance_postgresql_backends_for_top_applications | Current number of connections per application. | application_name: Name of the application | | gauge |
| alloydb_omni_instance_postgresql_blks_hit_count_total | Total number of times Postgres found the requested block in the buffer cache. | | | counter |
| alloydb_omni_instance_postgresql_blks_read_count_total | Total number of blocks read by Postgres that were not in the Postgres buffer cache. | | | counter |
| alloydb_omni_instance_postgresql_committed_transactions_count_total | Total number of transactions committed. | | | counter |
| alloydb_omni_instance_postgresql_deadlock_count_total | Number of deadlocks detected. | | | counter |
| alloydb_omni_instance_postgresql_new_connections_count_total | Total number of new connections. | | | counter |
| alloydb_omni_instance_postgresql_rolledback_transactions_count_total | Total number of transactions rolled back. | | | counter |
| alloydb_omni_instance_postgresql_temp_bytes_written_count_total | Total amount of data written to temporary files by queries. | | byte | counter |
| alloydb_omni_instance_postgresql_temp_files_written_count_total | Total number of temporary files used for writing data while performing internal algorithms. | | | counter |
| alloydb_omni_instance_postgresql_tuples_deleted_count_total | Total number of rows deleted. | | | counter |
| alloydb_omni_instance_postgresql_tuples_fetched_count_total | Total number of rows fetched. | | | counter |
| alloydb_omni_instance_postgresql_tuples_inserted_count_total | Total number of rows inserted. | | | counter |
| alloydb_omni_instance_postgresql_tuples_returned_count_total | Total number of rows returned. | | | counter |
| alloydb_omni_instance_postgresql_tuples_updated_count_total | Total number of rows updated. | | | counter |
| alloydb_omni_instance_postgresql_wait_count_total | Total wait count for a wait event. | wait_event_name: Name of the wait event; wait_event_type: Type of the wait event | | counter |
| alloydb_omni_instance_postgresql_wait_time_second_total | Total time elapsed on a wait event. | wait_event_name: Name of the wait event; wait_event_type: Type of the wait event | second | counter |
| alloydb_omni_instance_postgresql_replication_flush_lag_ms | Time elapsed between flushing recent WAL locally and receiving notification that the replica server has written and flushed it (but not yet applied it). | application_name: application_name in the replica's connection string to the primary, which matches the name of the replica instance CR; client_addr: IP address of the replica pod | ms | gauge |
| alloydb_omni_instance_postgresql_replication_replay_lag_ms | Time elapsed between flushing recent WAL locally and receiving notification that the replica server has written, flushed, and applied it. | application_name: application_name in the replica's connection string to the primary, which matches the name of the replica instance CR; client_addr: IP address of the replica pod | ms | gauge |
| alloydb_omni_instance_postgresql_replication_write_lag_ms | Time elapsed between flushing recent WAL locally and receiving notification that the replica server has written it (but not yet flushed or applied it). | application_name: application_name in the replica's connection string to the primary, which matches the name of the replica instance CR; client_addr: IP address of the replica pod | ms | gauge |
| alloydb_omni_memory_available_byte | Estimate of the amount of memory available for allocation. | | byte | gauge |
| alloydb_omni_node_cpu_mcpu | Number of mCPUs allocated. 1000 mCPU = 1 CPU. | | mCPU | gauge |
| alloydb_omni_node_cpu_usage_second_total | Total CPU seconds used. | | second | counter |
| alloydb_omni_node_network_received_bytes_count_total | Number of bytes received over the network by the AlloyDB Omni pod. | | byte | counter |
| alloydb_omni_node_network_sent_bytes_count_total | Number of bytes sent over the network by the AlloyDB Omni pod. | | byte | counter |
| alloydb_omni_node_storage_limit_per_disk_byte | Storage limit in bytes per disk. | disk: Name of the disk | byte | gauge |
| alloydb_omni_node_storage_read_bytes_count_total | Number of bytes read from disk. | | byte | counter |
| alloydb_omni_node_storage_read_ops_count_total | Number of disk read IO operations. | | | counter |
| alloydb_omni_node_storage_usage_per_disk_byte | Storage used in bytes per disk. | disk: Name of the disk | byte | gauge |
| alloydb_omni_node_storage_write_bytes_count_total | Number of bytes written to disk. | | byte | counter |
| alloydb_omni_node_storage_write_ops_count_total | Number of disk write IO operations. | | | counter |
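Several of these metrics are most useful in combination. The following sketch derives two common indicators: a buffer cache hit ratio from the blks_hit and blks_read counters, and connection utilization from the backends and max_connections gauges. The formulas are conventional Postgres monitoring calculations rather than something this page defines, and the function names and input numbers are hypothetical. Counters are compared across two scrapes, and a decrease is treated as a counter reset.

```python
def counter_delta(previous, current):
    """Increase of a counter between two scrapes, treating a decrease as a reset."""
    if current < previous:
        return current  # counter restarted from zero since the previous scrape
    return current - previous


def cache_hit_ratio(hit_delta, read_delta):
    """Fraction of block requests served from the buffer cache, or None if idle."""
    total = hit_delta + read_delta
    return hit_delta / total if total else None


def connection_utilization(backends, max_connections):
    """Fraction of max_connections currently in use, or None if the limit is 0."""
    return backends / max_connections if max_connections else None


# Hypothetical values from two consecutive scrapes of one database container.
# alloydb_omni_instance_postgresql_blks_hit_count_total: 1000 -> 1090
# alloydb_omni_instance_postgresql_blks_read_count_total: 200 -> 210
hits = counter_delta(1000.0, 1090.0)
reads = counter_delta(200.0, 210.0)
print("cache hit ratio:", cache_hit_ratio(hits, reads))

# alloydb_omni_database_postgresql_backends = 50
# alloydb_omni_database_postgresql_max_connections = 200
print("connection utilization:", connection_utilization(50.0, 200.0))
```

If you query through a Prometheus server, its rate and increase functions handle counter resets for you; the manual delta here is only needed when you work with raw scrapes.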

Database-level metrics

These metrics are collected per Postgres database within each AlloyDB Omni database container. You can create multiple Postgres databases in one database container. All of these metrics have the resource labels, the system metadata labels, and a database label, which holds the name of the Postgres database that the metric belongs to.

| Name | Description | Labels | Unit | Type |
| --- | --- | --- | --- | --- |
| alloydb_omni_database_postgresql_backends_for_top_databases | The current number of connections per database. | | | gauge |
| alloydb_omni_database_postgresql_blks_hit_for_top_databases_count_total | Total number of times Postgres found the requested block in the buffer cache per database. | | | counter |
| alloydb_omni_database_postgresql_blks_read_for_top_databases_count_total | Total number of blocks read by Postgres that were not in the Postgres buffer cache per database. | | | counter |
| alloydb_omni_database_postgresql_committed_transactions_for_top_databases_count_total | Total number of transactions committed per database. | | | counter |
| alloydb_omni_database_postgresql_deadlock_for_top_databases_count_total | The number of deadlocks per database. | | | counter |
| alloydb_omni_database_postgresql_insights_aggregate_execution_time_us_total | Total execution time across all queries. | user: Postgres user that ran the queries; client_addr: IP address of the client if available, otherwise empty | us | counter |
| alloydb_omni_database_postgresql_insights_aggregate_io_time_us_total | Total time spent doing IO across all queries. | user: Postgres user that ran the queries; io_type: read or write | us | counter |
| alloydb_omni_database_postgresql_new_connections_for_top_databases_count_total | The number of new connections per database. | | | counter |
| alloydb_omni_database_postgresql_rolledback_transactions_for_top_databases_count_total | Total number of transactions rolled back per database. | | | counter |
| alloydb_omni_database_postgresql_size_byte | Size of the database. | | byte | gauge |
| alloydb_omni_database_postgresql_statements_executed_count_total | Total count of statements executed per database. | operation_type: Name of the operation, one of [SELECT, UPDATE, INSERT, DELETE, MERGE, UTILITY, NOTHING, UNKNOWN] | | counter |
| alloydb_omni_database_postgresql_temp_bytes_written_for_top_databases_count_total | Total amount of data written to temporary files by queries per database. | | byte | counter |
| alloydb_omni_database_postgresql_temp_files_written_for_top_databases_count_total | Total number of temporary files used for writing data while performing internal algorithms per database. | | | counter |
| alloydb_omni_database_postgresql_tuples | Number of rows in the database. | state: one of [live, dead] | | gauge |
| alloydb_omni_database_postgresql_tuples_deleted_for_top_databases_count_total | The total number of rows deleted per database. | | | counter |
| alloydb_omni_database_postgresql_tuples_fetched_for_top_databases_count_total | The total number of rows fetched per database. | | | counter |
| alloydb_omni_database_postgresql_tuples_inserted_for_top_databases_count_total | The total number of rows inserted per database. | | | counter |
| alloydb_omni_database_postgresql_tuples_returned_for_top_databases_count_total | The total number of rows returned per database. | | | counter |
| alloydb_omni_database_postgresql_tuples_updated_for_top_databases_count_total | The total number of rows updated per database. | | | counter |
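As one example of using the database label together with a metric-specific label, the following sketch computes the fraction of dead rows per database from alloydb_omni_database_postgresql_tuples, which reports live and dead rows through its state label. The sample values, database names, and the helper name dead_tuple_fraction are hypothetical, and the samples are assumed to come from a single database container.

```python
TUPLES_METRIC = "alloydb_omni_database_postgresql_tuples"


def dead_tuple_fraction(samples):
    """Map each database to dead / (live + dead) rows, or None if it has no rows."""
    counts = {}
    for name, labels, value in samples:
        if name != TUPLES_METRIC:
            continue
        per_db = counts.setdefault(labels["database"], {"live": 0.0, "dead": 0.0})
        state = labels.get("state")
        if state in per_db:
            per_db[state] += value
    fractions = {}
    for database, per_db in counts.items():
        total = per_db["live"] + per_db["dead"]
        fractions[database] = per_db["dead"] / total if total else None
    return fractions


# Hypothetical samples from one database container hosting two databases.
samples = [
    (TUPLES_METRIC, {"database": "orders", "state": "live"}, 900.0),
    (TUPLES_METRIC, {"database": "orders", "state": "dead"}, 100.0),
    (TUPLES_METRIC, {"database": "empty_db", "state": "live"}, 0.0),
    (TUPLES_METRIC, {"database": "empty_db", "state": "dead"}, 0.0),
]

fractions = dead_tuple_fraction(samples)
print(fractions)
```

A persistently high dead-row fraction is often a hint that vacuum is falling behind, though the threshold that matters depends on the workload.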

Metrics collection metrics

These metrics indicate the status of each metric collection cycle. They have the resource labels mentioned in Labels.

| Name | Description | Unit | Type |
| --- | --- | --- | --- |
| alloydb_omni_monitor_collect_ms | Number of milliseconds spent to collect metrics. | ms | gauge |
| alloydb_omni_monitor_error_count | Number of errors encountered while trying to collect metrics this cycle. | | gauge |
| alloydb_omni_monitor_metric_count | Number of metrics collected successfully this cycle. | | gauge |
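These three gauges can be combined into a simple health check for the collector itself. The following sketch is one possible approach; the function name collection_warnings and the default threshold of 1000 ms are assumptions chosen for illustration, not recommended values.

```python
def collection_warnings(collect_ms, error_count, metric_count, max_collect_ms=1000.0):
    """Return human-readable warnings for one metric collection cycle.

    collect_ms, error_count, and metric_count are the latest values of
    alloydb_omni_monitor_collect_ms, alloydb_omni_monitor_error_count, and
    alloydb_omni_monitor_metric_count. max_collect_ms is a hypothetical threshold.
    """
    warnings = []
    if error_count > 0:
        warnings.append(f"{int(error_count)} collection error(s) in the last cycle")
    if metric_count == 0:
        warnings.append("no metrics collected in the last cycle")
    if collect_ms > max_collect_ms:
        warnings.append(
            f"collection took {collect_ms:.0f} ms (threshold {max_collect_ms:.0f} ms)"
        )
    return warnings


# Hypothetical values for a healthy cycle and a degraded cycle.
print(collection_warnings(120.0, 0.0, 85.0))
print(collection_warnings(2500.0, 3.0, 40.0))
```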

Prometheus metric handler metrics

These metrics are generated automatically by the Prometheus metric handler (promhttp) for each collection cycle.

| Name | Description | Label | Type |
| --- | --- | --- | --- |
| promhttp_metric_handler_errors_total | Total number of internal errors encountered by the promhttp metric handler. | cause: Cause of the error | counter |

What's next