CPU utilization metrics

This page describes the CPU utilization metrics that Cloud Spanner provides. You can view these metrics in the Google Cloud Platform Console and in the Stackdriver Monitoring console.

CPU utilization and task priority

When Cloud Spanner measures CPU utilization, it organizes tasks into the following categories:

  • High-priority user tasks: Tasks that your application initiates and that Cloud Spanner handles as a high priority. A read or commit request is usually high priority.
  • High-priority system tasks: Tasks that Cloud Spanner initiates and handles as a high priority. Examples include backfilling an index and data splitting.
  • Low-priority user tasks: Tasks that your application initiates, and that do not need to be completed as quickly as high-priority tasks. Examples include batch reads and batch queries.
  • Low-priority system tasks: Tasks that Cloud Spanner initiates, and that do not need to be completed as quickly as high-priority tasks. Examples include database compaction and schema change validation.

High-priority tasks immediately preempt low-priority tasks. If necessary, Cloud Spanner stops all low-priority tasks and allows high-priority tasks to utilize up to 100% of the available CPU resources. While low-priority system tasks can be delayed in the short term, they must run eventually for optimal performance. Therefore, you must provision your instance with enough nodes to handle both high- and low-priority tasks.

If there are no high-priority tasks, Cloud Spanner will utilize up to 100% of the available CPU resources to complete low-priority tasks more quickly. Spikes in background usage are not a sign of a problem. Low-priority tasks can yield to high-priority tasks, including user tasks, almost instantly.
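The sharing behavior described above can be pictured with a small model. This is only an illustrative sketch, not Cloud Spanner's actual scheduler: the function name `allocate_cpu` and its parameters are hypothetical, and it assumes demand and capacity are expressed as percentages of the instance's CPU resources.

```python
def allocate_cpu(high_demand, low_demand, capacity=100.0):
    """Split CPU capacity between the two priority classes (toy model).

    High-priority work is served first; low-priority work receives
    whatever capacity remains. Returns (high_share, low_share).
    """
    high = min(max(high_demand, 0.0), capacity)
    low = min(max(low_demand, 0.0), capacity - high)
    return high, low


# Low-priority work fills the capacity that high-priority work leaves idle.
print(allocate_cpu(30.0, 90.0))
# High-priority work that needs everything preempts low-priority work entirely.
print(allocate_cpu(100.0, 90.0))
# With no high-priority work, background tasks may use all of the CPU.
print(allocate_cpu(0.0, 150.0))
```

In this model, a spike to 100% total utilization with little high-priority load is simply background work using idle capacity, which matches the point that such spikes are not a sign of a problem.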

Available metrics

Cloud Spanner provides the following metrics for CPU utilization:

  • Rolling average 24 hour: A rolling average of total CPU utilization, as a percentage of the instance's CPU resources, for each database. Each data point is an average for the previous 24 hours.
  • High priority: The CPU utilization, as a percentage of the instance's CPU resources, for high-priority tasks.
  • Total: The total CPU utilization, as a percentage of the instance's CPU resources.

    For instances, you can view total CPU utilization by database or by task priority.

    For databases, you can view total CPU utilization by task priority.
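To make the rolling 24-hour average concrete, here is a minimal sketch of how such a value could be derived from a series of utilization samples. It is an assumption-laden illustration, not how Cloud Spanner computes the metric: the function name `rolling_average`, the sample format (timestamp, percentage), and the choice of an unweighted mean are all hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


def rolling_average(samples, window=timedelta(hours=24)):
    """Yield (timestamp, mean of the samples in the preceding window).

    `samples` must be an iterable of (timestamp, value) pairs sorted by
    timestamp. Each output averages the samples in (ts - window, ts].
    """
    recent = deque()
    total = 0.0
    for ts, value in samples:
        recent.append((ts, value))
        total += value
        # Discard samples that have fallen out of the window.
        while recent[0][0] <= ts - window:
            _, old_value = recent.popleft()
            total -= old_value
        yield ts, total / len(recent)


# Hypothetical hourly samples: 20% for one day, then 60% for the next day.
start = datetime(2019, 1, 1)
hourly = [(start + timedelta(hours=h), 20.0 if h < 24 else 60.0) for h in range(48)]
averages = [avg for _, avg in rolling_average(hourly)]
print(averages[23], averages[35], averages[47])
```

The example shows why this metric reacts slowly: twelve hours after the load triples, the rolling average has moved only halfway toward the new level.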

You can view charts for these metrics in the GCP Console or in the Stackdriver Monitoring console. You can also use the Stackdriver Monitoring console to create alerts for high CPU utilization, as described below.

To ensure that your Cloud Spanner instance has enough CPU resources to support your workload, we recommend that you keep CPU utilization below the following maximum values:

Metric                      Maximum for single-region instances  Maximum per region for multi-region instances
High priority total         65%                                  45%
24-hour smoothed aggregate  90%                                  90%

To help you stay below the recommended maximums, create alerts in Stackdriver Monitoring that track high-priority CPU utilization and the average CPU utilization over 24 hours.
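The comparison that such alerts perform can be sketched as follows. The alerts themselves are configured in Stackdriver Monitoring; this snippet only illustrates the threshold logic using the recommended maximums from the table above. The names `RECOMMENDED_MAX` and `exceeded_limits` are hypothetical, and for multi-region instances the inputs are assumed to be per-region values.

```python
# Recommended maximum CPU utilization (percent), taken from the table above.
RECOMMENDED_MAX = {
    "single-region": {"high_priority": 65.0, "rolling_24h": 90.0},
    "multi-region": {"high_priority": 45.0, "rolling_24h": 90.0},
}


def exceeded_limits(config, high_priority, rolling_24h):
    """Return the names of the metrics that exceed their recommended maximum."""
    limits = RECOMMENDED_MAX[config]
    observed = {"high_priority": high_priority, "rolling_24h": rolling_24h}
    return [name for name, value in observed.items() if value > limits[name]]


# The same load can be acceptable for a single-region instance
# and too high for a region of a multi-region instance.
print(exceeded_limits("single-region", 50.0, 80.0))
print(exceeded_limits("multi-region", 50.0, 80.0))
```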

If you exceed the recommended maximums, we strongly recommend provisioning more nodes for your instance so that it can continue to operate effectively. If you want to automate this process, you can create an application that monitors CPU utilization, then adds and removes nodes as needed, using either a client library or the gcloud command-line tool.
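As one possible starting point for such automation, the sketch below builds and optionally runs the `gcloud spanner instances update` command to change an instance's node count. The helper names `resize_command` and `resize`, the `dry_run` flag, and the instance ID `my-instance` are hypothetical; the sketch also assumes the gcloud tool is installed and authenticated, and it leaves out the monitoring side entirely.

```python
import subprocess
from typing import List


def resize_command(instance_id: str, node_count: int) -> List[str]:
    """Build the gcloud invocation that sets an instance's node count."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [
        "gcloud", "spanner", "instances", "update",
        instance_id, "--nodes={}".format(node_count),
    ]


def resize(instance_id: str, node_count: int, dry_run: bool = True) -> List[str]:
    """Run the resize command, or only return it when dry_run is True."""
    command = resize_command(instance_id, node_count)
    if not dry_run:
        # Raises CalledProcessError if gcloud exits with a non-zero status.
        subprocess.run(command, check=True)
    return command


# Dry run with a hypothetical instance ID: prints the command without running it.
print(" ".join(resize("my-instance", 5)))
```

Defaulting to a dry run is a deliberate choice here: changing the node count affects billing and capacity, so an automated tool should make the real call an explicit opt-in.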

To determine the number of nodes you need, consider both the peak high-priority CPU utilization and the 24-hour smoothed average. Always allocate enough nodes to keep the CPU utilization below the recommended maximums. We recommend allocating extra resources to accommodate workload spikes, especially for performance-sensitive applications.
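A rough way to turn this guidance into a number is sketched below. It rests on a simplifying assumption that is not stated in this page: that CPU utilization scales inversely with the node count, so doubling the nodes halves the utilization for the same workload. The function name `nodes_needed` and its parameters are hypothetical, and the defaults are the single-region maximums.

```python
import math


def nodes_needed(current_nodes, peak_high_priority, rolling_24h,
                 max_high_priority=65.0, max_rolling_24h=90.0):
    """Estimate the node count that keeps both metrics under their maximums.

    Assumes utilization is inversely proportional to the number of nodes.
    All utilization values are percentages of the instance's CPU resources.
    """
    by_high_priority = current_nodes * peak_high_priority / max_high_priority
    by_rolling_average = current_nodes * rolling_24h / max_rolling_24h
    # Size for whichever metric is the tighter constraint.
    return max(1, math.ceil(max(by_high_priority, by_rolling_average)))


# A 3-node instance peaking at 80% high-priority CPU needs a fourth node.
print(nodes_needed(3, 80.0, 60.0))
# Here the 24-hour average, not the peak, is the binding constraint.
print(nodes_needed(4, 40.0, 95.0))
```

To leave room for workload spikes, you could pass maximums lower than the recommended values, for example `max_high_priority=45.0` for a multi-region instance or a stricter target of your own.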

If you do not have enough nodes, Cloud Spanner postpones tasks by priority level. Low-priority system tasks, like database compaction and schema change validation, can be deferred in favor of user tasks. However, these tasks are critical to the health of your instance, and Cloud Spanner cannot defer them indefinitely. If insufficient compute resources prevent Cloud Spanner from completing its low-priority system tasks within a certain time window (on the order of several hours to a day), Cloud Spanner might increase their priority. When this happens, the elevated system tasks compete with user tasks for CPU resources, which affects the performance of user tasks.

