This page helps you monitor AlloyDB instances by using the following methods:
- Monitor an instance by using the AlloyDB System Insights dashboard.
- Monitor an instance by using the Cloud Monitoring dashboard.
- Monitor an instance by using AlloyDB active queries.
- View metrics on the AlloyDB cluster Overview page.
Before you begin
To access the System Insights dashboard, enable access to AlloyDB for PostgreSQL in your Google Cloud project.
Required roles
To use AlloyDB System Insights, you need permissions to do the following:
- To access System Insights, you need permissions to access the AlloyDB System Insights dashboard.
- To edit System Insights, you need permissions to update AlloyDB instances.
To get these permissions, ask your administrator to grant you one of the following roles:
- Basic viewer (
roles/viewer
) - Database Insights viewer (
roles/databaseinsights.eventsViewer
)
For more information about granting roles, see Manage access.
Monitor an instance using the AlloyDB System Insights dashboard
The AlloyDB System Insights dashboard displays metrics of the resources that you use and lets you monitor these resources and metrics.
To view the AlloyDB System Insights dashboard, do the following:
In the Google Cloud console, open the AlloyDB page.
Click the name of a cluster to open its Overview page.
Select the System insights tab.
The AlloyDB System Insights dashboard opens. It shows details about the cluster at the top followed by charts for the key metrics.
The dashboard offers the following high-level options:
By default, the Cluster (all instances) view is selected. You can choose to view metrics for the primary instance or for read pool instances.
To view either two charts side-by-side or only one chart per row, click the toggle button.
The time selector shows
1 day
selected by default. To change the period, select one of the other predefined periods or click Custom and define a start and end time. Data is available for the last 30 days.
The summary cards show the latest or aggregated values for some metrics, and provide an overview of the database health and performance during the selected period.
The metrics charts show information about some important metrics that help you gain insights into resource usage such as CPU, memory, connections, replication, and storage, and respond proactively as your application needs change.
The system events show information that helps you evaluate the impact of system events on the health and performance of the instance.
Summary cards
The following table describes the summary cards displayed on the AlloyDB System Insights dashboard. These cards provide an overview of the database health and performance during the selected period.
Summary cards | Description | Dashboard availability |
---|---|---|
Storage | The total storage that the AlloyDB cluster uses during the selected period. | Cluster dashboard |
Active node count | The latest number of active nodes that can serve user traffic in the cluster during the selected period. |
|
CPU utilization | The 99th and 50th percentile values of CPU utilization during the selected period. |
|
Peak connections | The ratio of peak connections to maximum connections across all serving nodes of the instance during the selected period. This includes both active and idle connections. |
|
Log errors | The total number of error logs generated during the selected period. |
|
Transactions per second | The latest value of transactions per second that are committed or rolled back across all serving nodes of the instance during the selected period. |
|
Max replication lag | The average value of the maximum time taken for replication from a primary instance to a read pool during the selected period. | Primary instance dashboard |
Metrics charts
A chart card for a sample metric appears as follows.
The toolbar on each chart card provides the following set of standard options:
To hide or display the legend, click legend_toggle.
To view metric values for a specific moment in the selected period, move the cursor over the chart.
To view a chart in full-screen mode, click fullscreen. To exit full-screen mode, click Esc.
To zoom a chart, click the chart and drag horizontally along the x-axis or vertically along the y-axis. To revert the zoom operation, click youtube_searched_for. Zoom operations apply at the same time to all charts on a dashboard.
To view additional options, click more_vert. Most charts offer these options:
- Download a PNG image.
- Add to custom dashboard. Provide a name for the dashboard or select an existing custom dashboard. View and edit the dashboard using Cloud Monitoring.
- View in Metrics Explorer. View the metric in Metrics Explorer. You can view other AlloyDB metrics in the Metrics Explorer by selecting the AlloyDB instance resource type.
The following table describes the metrics charts that are displayed on the
AlloyDB System Insights dashboard and in the Metrics Explorer.
The metric type strings must be prefixed with alloydb.googleapis.com/
. For example,
alloydb.googleapis.com/Instance
.
Launch stages of these metrics: BETA, GA
Cluster dashboard metrics
Display name and metric name | Description |
---|---|
Rows in database by state BETAdatabase/postgresql/tuples |
The number of live and dead rows in the instance. Only available for instances with the number of databases less than 50. |
Query latency BETAdatabase/postgresql/insights/aggregate/latencies |
The aggregated P99 query latency distribution for each instance. Only available for instances with Query Insights enabled. |
Database load BETAdatabase/postgresql/insights/aggregate/execution_time |
The accumulated query execution time for each instance. This is the sum of CPU time, I/O wait time, lock wait time, process context switch, and scheduling for all processes involved in query execution. Only available for instances with Query Insights enabled. |
Max CPU utilization GAinstance/cpu/maximum_utilization
|
The maximum CPU utilization calculated across all serving nodes of the instance, represented as a percentage for each instance. |
Mean CPU utilization GAinstance/cpu/average_utilization |
The mean CPU utilization calculated as an average across all serving nodes of the instance, represented as a percentage for each instance. |
Minimum available memory GAinstance/memory/min_available_memory |
The minimum available memory across all serving nodes of the instance. The available memory is an estimate of memory in bytes available for allocation on the VM including memory that is currently used, but can be potentially freed. |
Cluster storage GAcluster/storage/usage |
The billable storage that the AlloyDB cluster uses. It does not include the storage that automatic and on-demand backups use. |
Queries IO wait time BETAdatabase/postgresql/insights/aggregate/io_time |
The aggregated queries I/O wait time for each instance. Only available for instances with Query Insights enabled. |
Number of connections BETAinstance/postgresql/backends_by_state |
The number of connections to each instance across all serving nodes of the
instance. This includes connections in all states such as idle ,
active , idle_in_transaction , idle_in_transaction_aborted ,
disabled , and fastpath_function_call .
|
Rows fetched vs. rows returned vs. rows written BETA
|
|
Rows processed by operation BETA
|
The number of rows processed in the primary instance per operation such
as insert , update , or delete .
Only available for instances with the number of databases less than 5000.
|
Oldest transaction age BETApostgresql/vacuum/oldest_transaction_age |
The current age of the oldest uncommitted transaction for each instance that is blocking the vacuum operation. It's measured in the number of transactions that started after the oldest transaction. |
Transaction ID utilization BETApostgresql/vacuum/transaction_id_utilization |
The current percentage of transaction ID space that's consumed by each instance. It records up to a maximum of two billion unvacuumed transactions as a percentage. |
Wait time BETAinstance/postgresql/wait_time |
The aggregated accumulated elapsed wait time per wait event name and wait event type across the cluster. |
Wait count BETAinstance/postgresql/wait_count |
The number of times processes waited per wait event name and wait event type across the cluster. |
Temp data size used BETAinstance/postgresql/temp_bytes_written_count |
The amount of data (bytes) used in each instance for executing queries
and performing algorithms such as join and sort .
Only available for instances with the number of databases less than 5000.
|
Temp files used BETAinstance/postgresql/temp_files_written_count |
The number of temporary files that are used in each instance for executing
queries and performing algorithms such as join and sort .
Only available for instances with the number of databases less than 5000.
|
New connection count BETAinstance/postgresql/new_connections_count |
The number of new connections that are added to each instance. Only available for instances with the number of databases less than 5000. |
Statements executed count BETAdatabase/postgresql/statements_executed_count |
The number of statements that are executed in each instance. Only available for instances with Query Insights enabled. |
Transaction count GAinstance/postgres/transaction_count |
The number of transactions that have occurred in each instance. Only available for instances with the number of databases less than 5000. |
Block read count BETA
|
The number of blocks read from the disk and from the buffer cache across the cluster. Only available for instances with the number of databases less than 5000. |
Deadlock count BETAinstance/postgresql/deadlock_count |
The number of deadlocks in each instance. Only available for instances with the number of databases less than 5000. |
Read pool replication lag GAinstance/postgres/replication/maximum_lag |
The maximum time taken for replication from a primary instance to a read pool. The time lag is calculated across all serving read pool nodes per read pool instance in the cluster. |
Active nodes GAinstance/postgres/instances |
The number of nodes that can serve user traffic in the cluster. |
Instance dashboard metrics
Display name and metric name | Description |
---|---|
Rows in database by state BETAdatabase/postgresql/tuples |
The number of live and dead rows in the instance. Only available for instances with the number of databases less than 50. |
Query latency BETAdatabase/postgresql/insights/aggregate/latencies |
The aggregated P99, P95, and P50 query latency distribution. Only available for instances with Query Insights enabled. |
Database load per database, user, or client address BETAdatabase/postgresql/insights/aggregate/execution_time |
The accumulated query execution time per database, user, or client address. This is the sum of CPU time, I/O wait time, lock wait time, process context switch, and scheduling for all processes involved in query execution. Only available for instances with Query Insights enabled. |
CPU utilization GAinstance/cpu/maximum_utilization
|
The current CPU utilization, represented as a percentage of the reserved CPU that's currently in use. |
Available memory GAinstance/memory/min_available_memory |
The available memory is an estimate of memory in bytes available for allocation on the VM including memory that is currently used, but can be potentially freed. |
Queries IO wait breakdown by type BETAdatabase/postgresql/insights/aggregate/io_time |
The breakdown of I/O wait time for SQL statements by IO type (read/write). Only available for instances with Query Insights enabled. |
Connections by status BETAinstance/postgresql/backends_by_state |
The number of connections grouped by the following statuses: idle ,
active , idle_in_transaction , idle_in_transaction_aborted ,
disabled , and fastpath_function_call .
For information about these statuses, see the state text row in the
pg_stat_activity documentation.
|
Connections per database BETAdatabase/postgresql/backends_for_top_databases |
The number of connections to the instance per database. |
Connections per application BETAinstance/postgresql/backends_for_top_applications |
The number of connections to the instance, grouped by applications.
Connections without an application name are grouped as Unknown .
|
Rows fetched vs. rows returned vs. rows written BETA
|
0 because it's negligible compared to the value of
rows returned.
|
Rows processed by operation BETA
|
The number of rows processed in the primary instance per operation such
as insert , update , or delete .
|
Oldest transaction age BETApostgresql/vacuum/oldest_transaction_age |
The current age of the oldest uncommitted transaction that's blocking the vacuum operation. It's measured in the number of transactions that started after the oldest transaction. |
Transaction ID utilization BETApostgresql/vacuum/transaction_id_utilization |
The current percentage of transaction ID space that is consumed by the instance. It records up to a maximum of two billion unvacuumed transactions as a percentage. |
Wait time BETAinstance/postgresql/wait_time |
The accumulated elapsed wait time in the system per wait event name and wait event type. |
Wait count BETAinstance/postgresql/wait_count |
The number of times processes waited per wait event name and wait event type in the instance. |
Temp data size used per database BETAdatabase/postgresql/temp_bytes_written_for_top_databases |
The amount of data (bytes) used in each database for executing queries
and performing algorithms such as join and sort .
|
Temp files used per database BETAdatabase/postgresql/temp_files_written_for_top_databases |
The number of temporary files that are used in each database for executing
queries and performing algorithms such as join and sort .
|
New connection count per database BETAdatabase/postgresql/new_connections_for_top_databases |
The number of new connections that are added to the instance per database. |
Statements executed count BETAdatabase/postgresql/statements_executed_count |
The number of statements that are executed in each instance per operation
such as SELECT , UPDATE , INSERT , DELETE ,
MERGE , UTILITY , NOTHING , and UNKNOWN .
Only available for instances with Query Insights
enabled.
|
Transaction count GAdatabase/postgresql/transaction_count |
The number of committed and rolled-back transactions that have occurred in the instance. |
Block read count BETA
|
The number of blocks read in the instance from the disk and from the buffer cache. |
Deadlock count per database BETAdatabase/postgresql/deadlock_count_for_top_databases |
The number of deadlocks in the instance per database. |
Read pool replication lag GAinstance/postgres/replication/maximum_lag |
The maximum time taken for replication from a primary instance to a read pool. The time lag is calculated across all serving read pool nodes per read pool instance in the cluster. |
Read pool dashboard metrics
Display name and metric name | Description |
---|---|
Query latency BETAdatabase/postgresql/insights/aggregate/latencies |
The aggregated P99, P95, and P50 query latency distribution. Only available for instances with Query Insights enabled. |
Database load BETAdatabase/postgresql/insights/aggregate/execution_time |
The accumulated query execution time per database, user, or client address across all serving nodes of the instance. This is the sum of CPU time, I/O wait time, lock wait time, process context switch, and scheduling for all processes involved in query execution. Only available for instances with Query Insights enabled. |
CPU utilization GA
|
|
Minimum available memory BETAinstance/memory/min_available_memory |
The minimum available memory across all serving nodes of the instance. The available memory is an estimate of memory in bytes available for allocation on the VM including memory that is currently used, but can be potentially freed. |
Connections by status BETAinstance/postgresql/backends_by_state |
The number of connections across all serving nodes of the read pool instance
grouped by the following statuses: idle , active ,
idle_in_transaction , idle_in_transaction_aborted ,
disabled , and fastpath_function_call .
For information about these statuses, see the state text row in the
pg_stat_activity documentation.
|
Connections per database BETAdatabase/postgresql/backends_for_top_databases |
The number of connections per database across all serving nodes of the read pool instance. |
Connections per application BETAinstance/postgresql/backends_for_top_applications |
The number of connections per application name across all serving nodes
of the read pool instance.
Connections without an application name are grouped as Unknown .
|
Rows fetched vs. rows returned BETA
|
0 because it's negligible compared to the value of
rows returned.
|
Oldest transaction age BETApostgresql/vacuum/oldest_transaction_age |
The current age of the oldest uncommitted transaction that's blocking the vacuum operation. It's measured in the number of transactions that started after the oldest transaction. |
Transaction ID utilization BETApostgresql/vacuum/transaction_id_utilization |
The current percentage of transaction ID space that is consumed by the instance. It records up to a maximum of two billion unvacuumed transactions as a percentage. |
Wait time BETAinstance/postgresql/wait_time |
The aggregated accumulated elapsed wait time per wait event name and wait event type across all serving nodes of the read pool instance. |
Wait count BETAinstance/postgresql/wait_count |
The number of times processes waited per wait event name and wait event type across all serving nodes of the read pool instance. |
Temp data size used per database BETAdatabase/postgresql/temp_bytes_written_for_top_databases |
The amount of data (bytes) used in each database across all serving nodes
of the read pool instance for executing queries and performing algorithms
such as join and sort .
|
Temp files used per database BETAdatabase/postgresql/temp_files_written_for_top_databases |
The number of temporary files that are used in each database across all
serving nodes of the read pool instance for executing queries and performing
algorithms such as join and sort .
|
New connection count per database BETAdatabase/postgresql/new_connections_for_top_databases |
The number of new connections that are added per database to the serving nodes of the read pool instance. |
Statements executed count BETAdatabase/postgresql/statements_executed_count |
The number of statements that are executed in the serving nodes of the
read pool instance per operation such as SELECT , UPDATE ,
INSERT , DELETE , MERGE , UTILITY ,
NOTHING , and UNKNOWN .
Only available for instances with Query Insights
enabled.
|
Transaction count GAdatabase/postgresql/transaction_count |
The number of committed and rolledback transactions across all the serving nodes of the read pool instance. |
Block read count BETA
|
The number of blocks read from the disk and from the buffer cache across all the serving nodes of the read pool instance. |
Deadlock count per database BETAdatabase/postgresql/deadlock_count_for_top_databases |
The number of deadlocks across all the serving nodes of the read pool instance. |
Read pool replication lag GAinstance/postgres/replication/maximum_lag |
The maximum time taken for replication from a primary instance to a read pool. The time lag is calculated across all serving read pool nodes in the read pool instance. |
Nodes and replication status GAinstance/postgres/replication/replicas |
The number of read nodes connected to the primary instance per read pool
along with one of the following states: startup , catchup ,
streaming , backup , and stopping .
|
Active nodes BETAinstance/postgres/instances |
The number of nodes that can serve user traffic in the read pool instance. |
Additionally, the Cloud Logging metric, Log entries by severity (logging.googleapis.com/log_entry_count
),
shows the total numbers of error and warning log entries per instance. These
are extracted from postgres.log
, which is the database log, and
pgaudit.log
, which contains data access information.
For more information, see AlloyDB metrics.
Events timeline
The dashboard provides details about the following events. System events appear in Events timeline within approximately 5 minutes after the event occurs.
Event name | Description | Operation type |
---|---|---|
Instance create |
Creates an AlloyDB instance. | INSTANCE_CREATE |
Instance update |
Updates an AlloyDB instance. | INSTANCE_UPDATE |
Instance restart |
Restarts the AlloyDB instance. | INSTANCE_RESTART |
Instance failover |
Initiates a manual failover of a highly available primary instance for a standby instance to become the primary instance. | INSTANCE_FAILOVER |
Cluster maintenance |
Indicates that the cluster is currently in maintenance, and the instances become unavailable for a few minutes. | MAINTENANCE |
Backup create |
Creates a backup of an AlloyDB instance. | BACKUP_CREATE |
Backup update |
Updates a backup of an AlloyDB instance. | BACKUP_UPDATE |
Cluster create |
Creates an AlloyDB cluster. | CLUSTER_CREATE |
Cluster update |
Update an AlloyDB cluster. | CLUSTER_UPDATE |
Cluster promote |
Promotes a secondary cluster to primary cluster. | CLUSTER_PROMOTE |
User create |
Creates a new user. | USER_CREATE |
User update |
Update a user. | USER_UPDATE |
User delete |
Deletes a user. | USER_DELETE |
Monitor an instance by using the Cloud Monitoring dashboard
Cloud Monitoring offers predefined dashboards for several Google Cloud products. You can also create your own custom dashboards to display data that is of interest to you and to monitor the general health of your primary and read pool instances.
Set up alerts
You can use Cloud Monitoring to set up alerts for a project or a specified instance.
For example, you can set up an alert for a message to be sent to specific email IDs when the Memory usage metric for a AlloyDB instance exceeds the threshold of 80%.
Monitor an instance using AlloyDB active queries
This page describes how to monitor and troubleshoot the queries that are active in your database. Active queries are long-running queries on your database that can affect performance. Monitoring these queries can help identify causes of system latency and high CPU usage.
You can view active queries on the Query insights dashboard. These queries are displayed by the transaction start time. If there is a large number of queries running, the results may be limited to a subset of total queries due to the memory constraints the system enforces on the data collection.
View active queries
To view your active queries, do the following:
Create a custom role and add the
databaseinsights.monitoringViewer
Identity and Access Management (IAM) permission.Add this role for each user that is viewing active queries.
In the Google Cloud console, open the AlloyDB page.
Click the name of a cluster to open its Overview page.
Select the Query insights tab. The Query insights dashboard shows details about the selected cluster.
Click the Active query view tab. The tab contains a summary of active queries running along with the top 50 longest running transactions.
You can filter the information by either of the following:
- Database: filters query load on a specific database or all databases.
- User: filters query load from a specific database user or all of the users.
Summary scorecard of all normalized active queries: provides an overview of all the active queries by displaying the total connections based on the following parameters:
- Distribution of active connections categorized by connection state.
- Distribution of active connections based on different wait event types.
- Distribution of transaction durations for queries that are not in an idle state.
Longest running transactions: provides an overview of the top 50 running queries in the
active
andidle in transaction
state based on descending execution time.
You can view a list of top fifty normalized active queries on the Query insights dashboard.
A normalized active query removes sensitive data and returns a digest. The digest is the same for different values used in the following example:
Regular queries
select * from test_table where id=1;
select * from test_table_where id=2;
Digest or normalized query
select * from test_table where id=?;
A query that runs in different sessions is displayed as different entries on the dashboard.
View top longest running transactions
The following table describes the columns of the Longest running transactions table on the Query insights dashboard:
Column name | Description |
---|---|
Process ID | Process ID is the unique identifier of the database connection. |
Query | Normalized SQL query text. |
Status | The current status of the connection. Status values include active and idle in transaction . |
Session duration | The duration of current session (in seconds). |
Transaction duration | The duration of current transaction (in seconds). |
Query duration | The duration of current active Query (in seconds). Query duration of queries in the idle in transaction state are 0 as they are not actively running. |
Wait event type | The currently occurring wait event type during query execution. |
Wait event | The currently occurring wait event during query execution. |
Database | Database name on which this connection is running. |
Application name | Application name on which this connection is running. |
User name | Name of the database user connected to the database. |
Client address | Specific IP address of the client that sent a query. |
Action | Contains a link for terminating a transaction. |
The display is refreshed every 60 seconds automatically.
Terminate a query or transaction
To terminate a query or a transaction, you must have the databaseinsights.operationsAdmin
role. For more information, see IAM basic and predefined roles reference.
A transaction can contain multiple queries. To identify which of the queries are active at the moment, see the Query column in the Longest running transaction table.
To terminate a query or transaction, complete the following steps:
- Select the query in the Longest running transaction table.
- Scroll right to the Action column.
- Click Terminate connection.
In the Terminate connection window, click Confirm.
The page refreshes after initiating termination. If termination fails, it is silent and the query continues showing up in the Active Query list. If termination is successful, the query no longer displays in the list.
View metrics on the AlloyDB cluster Overview page
In the cluster Overview page, the default metrics chart appears at the top of the page.
Select another metric from the Chart drop-down list. The chart shows the data for the selected metric.
What's next
- View the list of AlloyDB metrics.
- Use Query Insights to improve query performance.
- Learn more about Cloud Logging and Cloud Monitoring.