Fleet resource utilization metrics


This page dives deeper into the fleet resource utilization metrics described in Use the Anthos overview. These metrics describe how effectively your clusters are utilizing the physically available resources you pay for or resources that you allocate on on-premises hardware. You can use this information to understand resource utilization effectiveness at scale, on a fleet level, and to either optimize cluster size and resource allocation across clusters or optimize how application teams request and reserve resources.

This page explains how these metrics are calculated and provides some tips for using them to optimize resource usage.

Understand resource utilization metrics

The following metrics are provided in the Anthos overview, calculated using information from Cloud Monitoring on your fleet clusters.

CPU metrics

  • Total CPU utilization: The average, over a given time period, of the point-in-time ratio of used to allocatable CPU across all clusters that belong to a fleet.
    • Used: The amount of CPU used by all containers across all clusters that belong to a fleet. Calculated from the container/cpu/core_usage_time metric.
    • Allocatable: The amount of CPU allocated to all nodes across all clusters that belong to a fleet. Calculated from the node/cpu/allocatable_cores metric.
  • CPU utilization by fleet: Shows the relationship between the following CPU metrics:
    • Used: The amount of CPU used by all containers across all clusters that belong to a fleet. Calculated from the container/cpu/core_usage_time metric.
    • Requested: The amount of CPU requested by all containers across all clusters that belong to a fleet. Calculated from the container/cpu/request_cores metric.
    • Allocatable: The amount of CPU allocated to all nodes across all clusters that belong to a fleet. Calculated from the node/cpu/allocatable_cores metric.
  • Top CPU utilization by cluster: A list of clusters sorted by each cluster's average, over a given time period, of the point-in-time ratio of used to allocatable CPU.
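
The averaging described above can be sketched as follows. This is a minimal illustration, not the console's actual implementation: the function name mean_utilization and the sample values are hypothetical, and the sketch assumes that used and allocatable values have already been summed across the fleet, aligned to the same timestamps, and expressed in cores.

```python
from statistics import mean


def mean_utilization(used, allocatable):
    """Average the point-in-time ratios of used to allocatable resources.

    Both arguments are equal-length sequences of fleet-wide totals,
    one value per point in time (hypothetical, pre-aligned samples).
    """
    if len(used) != len(allocatable):
        raise ValueError("series must have the same number of points")
    # Skip points where nothing was allocatable to avoid dividing by zero.
    ratios = [u / a for u, a in zip(used, allocatable) if a > 0]
    if not ratios:
        raise ValueError("no points with allocatable resources")
    return mean(ratios)


# Hypothetical fleet-wide CPU samples, in cores.
used_cores = [12.0, 18.0, 30.0, 24.0]
allocatable_cores = [48.0, 48.0, 60.0, 60.0]

total_cpu_utilization = mean_utilization(used_cores, allocatable_cores)
print(f"Total CPU utilization: {total_cpu_utilization:.1%}")
```

Note that averaging the per-point ratios is not the same as dividing total used by total allocatable over the period. With these sample values the first approach gives about 38.1% and the second about 38.9%.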

Memory metrics

  • Total memory utilization: The average, over a given time period, of the point-in-time ratio of used to allocatable memory across all clusters that belong to a fleet.
    • Used: The amount of non-evictable memory used by all containers across all clusters that belong to a fleet. Calculated from the container/memory/used_bytes metric.
    • Allocatable: The amount of memory allocated to all nodes across all clusters that belong to a fleet. This metric is shown on the Clusters page. Calculated from the node/memory/allocatable_bytes metric.
  • Memory utilization by fleet: Shows the relationship between the following memory metrics:
    • Used: The amount of non-evictable memory used by all containers across all clusters that belong to a fleet. Calculated from the container/memory/used_bytes metric.
    • Requested: The amount of memory requested by all containers across all clusters that belong to a fleet. Calculated from the container/memory/request_bytes metric.
    • Allocatable: The amount of memory allocated to all nodes across all clusters that belong to a fleet. Calculated from the node/memory/allocatable_bytes metric.
  • Top memory utilization by cluster: A list of clusters sorted by each cluster's average, over a given time period, of the point-in-time ratio of used to allocatable memory.
    • Used: The amount of non-evictable memory used by all containers in a cluster. Calculated from the container/memory/used_bytes metric.
    • Allocatable: The amount of memory allocated to all nodes in a cluster. This metric is shown on the Clusters page. Calculated from the node/memory/allocatable_bytes metric.
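
As a rough sketch of how a per-cluster ranking like this could be computed, the following example averages each cluster's point-in-time ratios and sorts the result. The cluster names, sample values, and the function name rank_clusters_by_utilization are all hypothetical, and the sketch assumes that used and allocatable bytes have already been summed per cluster and aligned in time.

```python
from statistics import mean


def rank_clusters_by_utilization(samples):
    """Return (cluster, mean utilization) pairs, highest utilization first.

    `samples` maps a cluster name to a list of (used, allocatable) pairs,
    one pair per point in time.
    """
    ranked = []
    for cluster, points in samples.items():
        # Ignore points with no allocatable resources to avoid dividing by zero.
        ratios = [used / alloc for used, alloc in points if alloc > 0]
        if ratios:
            ranked.append((cluster, mean(ratios)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


GIB = 1024 ** 3

# Hypothetical per-cluster memory samples, in bytes.
memory_samples = {
    "cluster-a": [(8 * GIB, 32 * GIB), (16 * GIB, 32 * GIB)],
    "cluster-b": [(24 * GIB, 32 * GIB), (28 * GIB, 32 * GIB)],
    "cluster-c": [(4 * GIB, 64 * GIB), (4 * GIB, 64 * GIB)],
}

top_memory = rank_clusters_by_utilization(memory_samples)
for cluster, utilization in top_memory:
    print(f"{cluster}: {utilization:.1%}")
```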

Disk metrics

  • Total disk utilization: The average, over a given time period, of the point-in-time ratio of used to allocatable disk resources across all clusters that belong to a fleet.
  • Disk utilization by fleet: Shows the relationship between the following storage metrics:
  • Top disk utilization by cluster: A list of clusters sorted by each cluster's average, over a given time period, of the point-in-time ratio of used to allocatable disk resources.

Use resource utilization metrics

The following tips can help you use the metrics in the console to identify and address problems:

  • If your fleet's Total CPU/Memory/Disk utilization indicates unexpectedly high or low utilization over the last seven days, check the corresponding CPU/Memory/Disk utilization by fleet chart to determine whether the unexpected utilization is constant or caused by usage spikes.
  • If Top CPU/Memory/Disk utilization by cluster shows individual clusters that behave differently from the rest, investigate those clusters more closely and consider resizing them if possible.
  • CPU/Memory/Disk utilization by fleet lets you observe the ratio between used and requested resources. A large gap between the two might mean that application teams are requesting and reserving more resources than they need.
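
The checks in these tips can also be approximated outside the console. The sketch below is one possible approach under stated assumptions, not a documented procedure: the function name summarize_usage, the spike_factor threshold of 1.5, and the sample values are hypothetical, and the input series are assumed to be fleet-wide totals aligned to the same timestamps.

```python
from statistics import mean


def summarize_usage(used, requested, allocatable, spike_factor=1.5):
    """Summarize one resource for a fleet over a time period.

    All three series are equal-length lists of fleet-wide totals.
    `spike_factor` is an arbitrary illustrative threshold, not a
    value taken from the product.
    """
    if not (len(used) == len(requested) == len(allocatable)):
        raise ValueError("series must have the same number of points")
    if min(allocatable) <= 0 or sum(requested) <= 0:
        raise ValueError("allocatable and requested totals must be positive")
    utilization = [u / a for u, a in zip(used, allocatable)]
    avg = mean(utilization)
    peak = max(utilization)
    return {
        "mean_utilization": avg,
        "peak_utilization": peak,
        # A peak far above the mean suggests spikes rather than constant load.
        "spiky": peak > spike_factor * avg,
        # Values well below 1.0 suggest more is requested than is used.
        "used_to_requested": sum(used) / sum(requested),
    }


# Hypothetical fleet-wide CPU samples, in cores.
used = [10.0, 11.0, 10.0, 30.0, 10.0]
requested = [40.0] * 5
allocatable = [64.0] * 5

summary = summarize_usage(used, requested, allocatable)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these sample values, one point in time pushes the peak well above the mean, so the series is flagged as spiky, and used CPU is only about a third of requested CPU, which is the pattern that points to over-requesting.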