Use System insights to improve system performance

This page describes how you can use the Cloud SQL System insights dashboard. The System insights dashboard displays metrics for the resources that your instance uses and helps you detect and analyze system performance problems.

View the System insights dashboard

To view the System insights dashboard, do the following:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. Click the name of an instance.
  3. Select the System Insights tab from the SQL navigation panel on the left.

The System insights dashboard opens. It shows details about the instance at the top, followed by summary cards for metrics, and then charts for some of the key metrics.

Shows the Cloud SQL Monitoring dashboard. It has a filter for setting
         a time range. Charts of important metrics are shown.

The dashboard offers the following high-level options:

  • To view either two charts side-by-side or only one chart per row, click the toggle button.

  • The time selector shows 1 hour selected by default. To change the period, select one of the other predefined periods or click Custom and define a start and end time. Data is available for the last 30 days.

  • To create an absolute link to the dashboard, click the Copy Link button. You can share this link with other Cloud SQL users who have the same permissions.

The summary cards show the latest or aggregated values for some of the metrics and provide an overview of the database health and performance.

The metrics charts show information about some important metrics that help you gain insights into several issues, such as throughput, latency, and cost. This information helps you respond proactively as your application needs change.

Summary cards

The following table describes the summary cards displayed at the top of the System insights dashboard. These cards provide an overview of the database health and performance during the selected period.

Summary cardDescription
CPU utilization - P99 | P50 The P99 and P50 CPU utilization values over the selected period.
Disk utilization The latest disk utilization value over the selected period.
Peak connections The ratio of the peak connections to the maximum connections for the selected period. The peak connection count might be higher than the maximum count in case the maximum count has recently changed, such as due to instance scaling or manually changing the max_connections setting.
Log errors The number of errors that were logged during the selected period.

Metrics charts

A chart card for a sample metric appears as follows.

Graph showing sample Cloud SQL metric data.

The toolbar on each chart card provides the following set of standard options:

  • To hide or display the legend, click .

  • To view metric values for a specific moment in the selected period, move the cursor over the chart.

  • To view a chart in full-screen mode, click . To exit full-screen mode, click Esc.

  • To zoom a chart, click the chart and drag horizontally along the x-axis or vertically along the y-axis. To revert the zoom operation, click . Zoom operations apply at the same time to all charts on a dashboard.

  • To view additional options, click . Most charts offer these options:

    • Download a PNG image.
    • Add to custom dashboard. Provide a name for the dashboard or select an existing custom dashboard. View and edit the dashboard using Cloud Monitoring.
    • View in Metrics Explorer. View the metric in Metrics Explorer. You can view other Cloud SQL metrics in the Metrics Explorer after selecting the Cloud SQL Database resource type.

The following table describes the Cloud SQL charts that appear by default on the Cloud SQL System insights dashboard. The metric type strings follow this prefix:

Metric name and typeDescription
Query latency

The aggregated query latency distribution by P99, P95, and P50 per user and database.

Only available for instances with Query Insights enabled.

Database load per database/user/client address

The accumulated query execution time per database, user, or client address. This is the sum of CPU time, IO wait time, lock wait time, process context switch, and scheduling for all processes involved in query execution.

Only available for instances with Query insights enabled.

CPU utilization

The current CPU utilization represented as a percentage of the reserved CPU that's currently in use.

Disk storage by type

The breakdown of instance disk usage by data types, including data, binlog, and tmp_data.

This metric helps you understand your storage costs. For more information about storage usage charges, see Storage and networking pricing.

Point-in-time recovery (PITR) uses write-ahead log (WAL) archiving. These logs update regularly and use storage space. Write-ahead logs are automatically deleted with their associated automatic backup, which generally happens after about 7 days.

If the size of your write-ahead logs is causing an issue for your instance, you can increase your storage size, but the write-ahead log size increase in disk usage might be temporary. To avoid unexpected storage issues, Google recommends enabling automatic storage increases when using PITR.

To delete the logs and recover storage, you can disable point-in-time recovery. Note, however, that decreasing the storage used does not shrink the size of the storage provisioned for the instance.

Temp data is included in the storage usage metric. Temp data is removed as part of maintenance and is allowed to increase beyond user-defined capacity limits to avoid a disk full event, at no charge to the user.

A newly created database uses about 100 MB for system tables and files.

Disk read/write ops
disk/read_ops_count, disk/write_ops_count

The Number of Reads metric indicates the number of read operations served from disk that do not come from cache. You can use this metric to understand whether your instance is correctly sized for your environment. If needed, you can move to a larger machine type to serve more requests from cache and reduce latency.

The Number of Writes metric indicates the number of write operations to disk. Write activity is generated even if your application is not active because Cloud SQL instances, barring replicas, write to a system table approximately every second.

Connections by status

The number of connections grouped by these statuses: idle, active, idle_in_transaction, idle_in_transaction_aborted, disabled, and fastpath_function_call.

For information about these statuses, see the state text row in the pg_stat_activity documentation.

Connections per database

The number of connections held by the database instance.

Ingress/egress bytes
network/received_bytes_count, network/sent_bytes_count

The network traffic in terms of the number of ingress bytes (bytes received) and egress bytes (bytes sent) to and from the instance, respectively.

IO wait breakdown by type

The breakdown of IO wait time for SQL statements by read and write types.

Only available for instances with Query Insights enabled.

Deadlock count by database

The number of deadlocks per database.

Block read count

The number of blocks read per second from the disk and from the buffer cache.

Rows processed by operation

The number of rows processed per operation per second.

Rows in database by state

The number of rows per state in the database.

Oldest transaction by age

The age of the oldest transaction that's blocking the vacuum operation.

WAL archiving
replication/log_archive_success_count, replication/log_archive_failure_count

The number of write-ahead log files that were successfully or unsuccessfully archived per minute.

Also, the Cloud Logging metric, Log entries by severity (, shows the total numbers of error and warning log entries. These are extracted from postgres.log, which is the database log, and pgaudit.log, which contains data access information.

For more information, see Cloud SQL metrics.

What's next