Reviewing persistent disk performance metrics

The following persistent disk metrics are available in Cloud Monitoring, Google Cloud's integrated monitoring solution.

Metric | Description
Peak disk read bytes (instance/disk/max_read_bytes_count), Peak disk write bytes (instance/disk/max_write_bytes_count) | The maximum number of bytes read or written per second over a period of time specified by the user*.
Peak disk read ops (instance/disk/max_read_ops_count), Peak disk write ops (instance/disk/max_write_ops_count) | The maximum number of read/write operations per second over a period of time specified by the user*.
Disk read bytes (instance/disk/read_bytes_count), Disk write bytes (instance/disk/write_bytes_count) | The average number of bytes read or written over a period of time specified by the user*.
Disk read operations (instance/disk/read_ops_count), Disk write operations (instance/disk/write_ops_count) | The average number of read/write operations over a period of time specified by the user*.
Throttled read bytes (instance/disk/throttled_read_bytes_count), Throttled write bytes (instance/disk/throttled_write_bytes_count) | The average number of throttled read or written bytes over a period of time specified by the user*.
Throttled read operations (instance/disk/throttled_read_ops_count), Throttled write operations (instance/disk/throttled_write_ops_count) | The average number of throttled read/write operations over a period of time specified by the user*.

*The period must be one minute or longer.

To use these metrics to view historic IOPS and throughput rates, see Graph observed disk performance. For a full list of metrics and detailed documentation, see Compute Engine metrics.
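When you query these metrics programmatically rather than through the console, the short names listed above are normally prefixed with compute.googleapis.com/ to form the full metric type. The following is a minimal sketch of that naming convention; the helper names full_metric_type and write_counterpart are hypothetical and exist only for illustration.

```python
# Prefix shared by Compute Engine metric types in Cloud Monitoring.
COMPUTE_PREFIX = "compute.googleapis.com/"


def full_metric_type(short_name: str) -> str:
    """Return the full metric type for a short name such as
    'instance/disk/read_ops_count'."""
    return COMPUTE_PREFIX + short_name.lstrip("/")


def write_counterpart(read_metric: str) -> str:
    """Derive the write metric name from a read metric name.

    Hypothetical helper: it relies only on the read/write naming
    pattern visible in the metric list above.
    """
    return read_metric.replace("read_", "write_")


print(full_metric_type("instance/disk/read_ops_count"))
print(write_counterpart("instance/disk/max_read_bytes_count"))
```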

Graph observed disk performance

The Cloud Monitoring > Metrics Explorer page lets you graph multiple persistent disk performance metrics on the same chart.

The following instructions focus on metrics for read requests, but you can follow the same procedure for write requests. See persistent disk performance metrics for the analogous metric names.

Average IOPS and throughput rates

Use the Disk read operations metric to graph average IOPS.

  1. In the Cloud Console, go to the Cloud Monitoring > Metrics Explorer page.

  2. In the Resource types list, select GCE VM Instance (gce_instance).

  3. In the Metrics list, select Disk read operations (instance/disk/read_ops_count).

  4. Under Filter:

    1. Click Add a filter.
    2. Select project_id from the drop-down list.
    3. In the Value field, enter your project ID.
    4. Click Apply.
    5. Click Add a filter.
    6. Select device_name from the drop-down list.
    7. In the Value field, enter the name of your persistent disk.
    8. Click Apply.
    9. In the Aggregator drop-down list, select None.
  5. Click Show advanced options.

  6. In the Advanced aggregation pane, click the Aligner drop-down list. Select rate so that the data points display the IOPS rate (operations per second).

  7. Set the alignment period.
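The console steps above correspond roughly to a filter plus an aggregation setting in a Cloud Monitoring time series query. The sketch below only builds those values as plain Python data; it does not call any API. The helper names are hypothetical, the dictionary keys mirror the alignment period, per-series aligner, and cross-series reducer settings, and the sketch assumes that device_name is exposed as a metric label. The project ID is not part of the filter here because API requests are usually scoped to a project separately.

```python
def build_disk_filter(metric_short_name: str, device_name: str) -> str:
    """Build a filter string for one disk metric on one disk.

    Assumption: device_name is a metric label on instance/disk metrics.
    """
    return (
        f'metric.type = "compute.googleapis.com/{metric_short_name}" '
        f'AND resource.type = "gce_instance" '
        f'AND metric.labels.device_name = "{device_name}"'
    )


def build_aggregation(aligner: str, period_minutes: int) -> dict:
    """Describe the aligner and alignment period chosen in the console.

    'REDUCE_NONE' plays the role of selecting None as the Aggregator.
    """
    return {
        "alignment_period": {"seconds": period_minutes * 60},
        "per_series_aligner": aligner,
        "cross_series_reducer": "REDUCE_NONE",
    }


# Average IOPS for a disk named "my-disk" (placeholder name), one-minute period.
print(build_disk_filter("instance/disk/read_ops_count", "my-disk"))
print(build_aggregation("ALIGN_RATE", 1))
```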

Use the Disk read bytes metric to graph average throughput rates.

  1. Click Add Metric.
  2. In the Resource types list, select GCE VM Instance (gce_instance).
  3. In the Metrics list, select Disk read bytes (instance/disk/read_bytes_count).
  4. Under Filter, select your project ID and persistent disk device name.
  5. In the Aggregator drop-down list, select None.
  6. Click Show advanced options.
  7. In the Advanced aggregation pane, click the Aligner drop-down list. Select rate so that the data points display the throughput rate (bytes per second).
  8. Set the alignment period.
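The rate aligner turns the count accumulated during each alignment period into a per-second value by dividing by the length of the period. The arithmetic can be sketched as follows; the sample counts are made-up numbers, not measurements.

```python
def rate_per_second(delta_count: float, period_seconds: int) -> float:
    """Convert a count accumulated over one alignment period
    into a per-second rate."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return delta_count / period_seconds


# 120,000 read operations counted in one minute -> average IOPS.
iops = rate_per_second(120_000, 60)

# 3,000,000,000 bytes read in one minute -> average throughput in MB/s.
throughput_mb_per_s = rate_per_second(3_000_000_000, 60) / 1_000_000

print(iops, throughput_mb_per_s)
```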

Peak IOPS and throughput rates

Use the Peak disk read ops metric to graph the maximum per-second read operations, sampled every minute.

  1. Click Add Metric.
  2. In the Resource types list, select GCE VM Instance (gce_instance).
  3. In the Metrics list, select Peak disk read ops (instance/disk/max_read_ops_count).
  4. Under Filter, select your project ID and persistent disk device name.
  5. In the Aggregator drop-down list, select None.
  6. Click Show advanced options.
  7. In the Advanced aggregation pane, click the Aligner list and select max.
  8. Set the alignment period.

Use the Peak disk read bytes metric to graph the maximum per-second bytes read, sampled every minute.

  1. Click Add Metric.
  2. In the Resource types list, select GCE VM Instance (gce_instance).
  3. In the Metrics list, select Peak disk read bytes (instance/disk/max_read_bytes_count).
  4. Under Filter, select your project ID and persistent disk device name.
  5. In the Aggregator drop-down list, select None.
  6. Click Show advanced options.
  7. In the Advanced aggregation pane, click the Aligner list and select max.
  8. Set the alignment period.
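With the max aligner, each charted point is the largest per-minute peak sample that falls inside the alignment period. The sketch below illustrates this with hypothetical one-minute peak samples; align_max is an illustrative helper, not part of any Google Cloud library.

```python
def align_max(samples: list[float], samples_per_period: int) -> list[float]:
    """Collapse consecutive one-minute peak samples into one value per
    alignment period by keeping the maximum of each window."""
    if samples_per_period < 1:
        raise ValueError("samples_per_period must be at least 1")
    return [
        max(samples[i:i + samples_per_period])
        for i in range(0, len(samples), samples_per_period)
    ]


# Six one-minute peak IOPS samples, aligned to three-minute periods.
one_minute_peaks = [100, 900, 300, 200, 50, 700]
print(align_max(one_minute_peaks, 3))
```

A longer alignment period therefore never lowers a visible peak, but it does hide when within the period the peak occurred.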

Throttling rates

Use the Throttled read operations metric to graph the average rate of throttled operations.

  1. Click Add Metric.
  2. In the Resource types list, select GCE VM Instance (gce_instance).
  3. In the Metrics list, select Throttled read operations (instance/disk/throttled_read_ops_count).
  4. Under Filter, select your project ID and persistent disk device name.
  5. In the Aggregator drop-down list, select None.
  6. Click Show advanced options.
  7. In the Advanced aggregation pane, click the Aligner list and select rate so that the data points display the throttled IOPS rate (throttled operations per second).
  8. Set the alignment period.

Use the Throttled read bytes metric to graph the average rate of throttled bytes.

  1. Click Add Metric.
  2. In the Resource types list, select GCE VM Instance (gce_instance).
  3. In the Metrics list, select Throttled read bytes (instance/disk/throttled_read_bytes_count).
  4. Under Filter, select your project ID and persistent disk device name.
  5. In the Aggregator drop-down list, select None.
  6. Click Show advanced options.
  7. In the Advanced aggregation pane, click the Aligner list and select rate so that the data points display the throughput rate (throttled bytes per second).
  8. Set the alignment period.

Throttling smooths out bursty I/O (input/output) operations. With throttling, bursty I/O operations can be spread over a period of time such that the performance limits of your disk can be met but not exceeded at any given instant.

If your workload has a bursty I/O usage pattern, expect to see bursts in throttled bytes corresponding to bursts in read or written bytes. Similarly, expect to see bursts in throttled operations corresponding to bursts in read/write operations.
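To build intuition for how a burst is spread over time, consider the toy model below. It is a simplified illustration only and is not a description of how Compute Engine implements throttling or computes the throttled metrics. It assumes a fixed per-second limit and defers any excess operations to later seconds; the function name and the numbers are hypothetical.

```python
def simulate_throttling(demand: list[int], limit: int) -> tuple[list[int], list[int]]:
    """Toy model: serve at most `limit` operations per second and carry
    the remainder over to the next second.

    Returns (served, deferred), where deferred[i] is the number of
    operations still waiting at the end of second i.
    """
    served, deferred = [], []
    backlog = 0
    for requested in demand:
        pending = backlog + requested
        done = min(pending, limit)
        backlog = pending - done
        served.append(done)
        deferred.append(backlog)
    return served, deferred


# A single burst of 250 operations against a limit of 100 per second.
served, deferred = simulate_throttling([0, 250, 0, 0, 0], limit=100)
print(served)    # [0, 100, 100, 50, 0]
print(deferred)  # [0, 150, 50, 0, 0]
```

In this model the burst is served over three seconds, the limit is met but never exceeded, and the deferred operations rise and fall together with the burst.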

Databases are a common example of bursty workloads. Databases tend to have short microbursts of I/O operations, which lead to temporary increases in queue depth. Higher queue depth can result in higher latency because more outstanding I/O operation requests are waiting in the queue.
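The relationship between queue depth and latency can be approximated with Little's law, which states that the average number of outstanding requests equals the completion rate multiplied by the average time each request spends in the system. The sketch below applies that general queueing result; the queue depths and IOPS figures are illustrative assumptions, not measured values for any disk type.

```python
def average_latency_ms(queue_depth: float, iops: float) -> float:
    """Estimate average I/O latency in milliseconds using Little's law:
    latency = outstanding requests / completion rate."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return queue_depth * 1000 / iops


# At a steady 16,000 IOPS, doubling the queue depth doubles the latency.
print(average_latency_ms(32, 16_000))  # 2.0
print(average_latency_ms(64, 16_000))  # 4.0
```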

If your workload has a uniform I/O usage pattern and you are continuously reaching the performance limits of your disk, you can expect to see uniform levels of throttled bytes and operations.

The throttling metrics include a throttle_reason label that indicates whether throttling is due to limits based on the disk size or limits based on the number of vCPUs on the VM instance. Consider the following steps to increase performance, especially for latency-sensitive workloads such as databases:

Set the alignment period

The alignment period is the time window over which each data point is computed, for example as a maximum or an average. To set it, click Show advanced options and, under Alignment Period, enter the period in whole minutes. The alignment period must be one minute or longer.

The alignment period is displayed in your chart.

[Figure: chart showing a one-minute alignment period]
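If you set the alignment period from a script, the same constraints apply: whole minutes, and at least one minute. The following is a minimal sketch of a validation helper under those constraints; the function name is hypothetical.

```python
def alignment_period_seconds(minutes: int) -> int:
    """Validate an alignment period given in whole minutes and
    return it in seconds."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(minutes, bool) or not isinstance(minutes, int):
        raise ValueError("alignment period must be a whole number of minutes")
    if minutes < 1:
        raise ValueError("alignment period must be one minute or longer")
    return minutes * 60


print(alignment_period_seconds(1))
print(alignment_period_seconds(5))
```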

Compare average performance, peak performance, and throttling rates

Consider the following example. Five bursts of read requests were issued to a 3,400 GB SSD persistent disk. The durations of the five bursts were 60 seconds, 30 seconds, 1 second, 500 milliseconds, and 100 milliseconds. The five spikes correspond to the five bursts, from left to right:

[Figure: chart of the five burst tests]

For the 60-second burst, the Peak disk read ops metric shows that the disk reached the expected performance limit of 100,000 IOPS. Some operations were throttled to smooth out the burst of requests. However, because it captures an average, the Disk read operations metric does not show that the expected performance limit of 100,000 IOPS was reached in that minute.

For the bursts lasting one second or less, the burst duration is so short relative to the metrics' sampling period that the metrics fail to capture true peak performance rates.
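The effect of averaging on short bursts can be checked with simple arithmetic. The sketch below assumes, hypothetically, that each burst ran at the full 100,000 IOPS, that the disk was otherwise idle, and that the burst fell entirely within a single one-minute alignment period. It covers only the averaged Disk read operations metric, not the peak metrics.

```python
def average_rate(burst_rate: float, burst_seconds: float,
                 period_seconds: float = 60) -> float:
    """Average rate reported over one alignment period for a single
    burst, assuming no other I/O during that period."""
    active = min(burst_seconds, period_seconds)
    return burst_rate * active / period_seconds


# Bursts at an assumed 100,000 IOPS, averaged over one minute.
for seconds in (30, 1, 0.5, 0.1):
    print(seconds, round(average_rate(100_000, seconds), 2))
```

Under these assumptions, a 30-second burst averages to 50,000 IOPS and a 1-second burst to roughly 1,667 IOPS, which is why the average metric alone can make a saturated disk look lightly loaded.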