This documentation is for the most recent version of Anthos clusters on Azure, released in November 2021. See the Release notes for more information.

Cloud monitoring

This topic describes how Anthos clusters on Azure integrates with Cloud Monitoring and how to view your metrics.

Before you begin

  1. Configure the Google Cloud CLI and enable the required APIs in your Google Cloud project.

  2. Authorize Cloud Logging / Cloud Monitoring to set up permissions for Google Cloud's operations suite.

Overview

Anthos clusters on Azure has built-in integration with Cloud Monitoring for system metrics of nodes, pods, and containers. This allows you to easily see the resource consumption of workloads in the cluster, build dashboards, and configure alerts.

Anthos clusters on Azure installs the metrics collector gke-metrics-agent in your cluster. This agent is based on the OpenTelemetry Collector and runs on every node in the cluster. It samples metrics every minute and uploads the measurements to Cloud Monitoring.

Once metrics for your cluster have been uploaded, they reside in your Google Cloud project. You can aggregate data across all of your clusters, build custom dashboards, explore a single cluster's data, view line charts, set up alerts, and more.

Using the Metrics Explorer

To use Metrics Explorer to view the metrics for a monitored resource, follow these steps:

  1. In the Google Cloud console, go to the Metrics Explorer page within Monitoring.
  2. Select the Configuration tab.
  3. Expand the Select a metric menu, enter Kubernetes Container in the filter bar, and then use the submenus to select a specific resource type and metric:
    1. In the Active resources menu, select Kubernetes Container.
    2. In the Active metric categories menu, select Container.
    3. In the Active metrics menu, select CPU usage time.
    4. Click Apply.
  4. Optional: To configure how the data is viewed, add filters and use the Group By, Aggregator, and chart-type menus. For example, you can group by resource or metric labels. For more information, see Select metrics when using Metrics Explorer.
  5. Optional: Change the graph settings:
    • For quota and other metrics that report one sample per day, set the time frame to at least one week and set the plot type to Stacked bar chart.
    • For distribution-valued metrics, set the plot type to Heatmap chart.
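
The same selection can also be written as a Cloud Monitoring filter string, the form that API clients typically use when listing time series. The following is a minimal sketch, not an official client. It assumes that the console entry CPU usage time corresponds to the metric type kubernetes.io/container/cpu/core_usage_time listed later on this page, and that k8s_container resources carry a cluster_name label. The helper name build_filter and the cluster name my-cluster are hypothetical.

```python
def build_filter(metric_type, resource_type, resource_labels=None):
    """Build a Cloud Monitoring filter string for one metric and resource type."""
    clauses = [
        f'metric.type = "{metric_type}"',
        f'resource.type = "{resource_type}"',
    ]
    # Optional label restrictions, sorted so the output is deterministic.
    for key, value in sorted((resource_labels or {}).items()):
        clauses.append(f'resource.labels.{key} = "{value}"')
    return " AND ".join(clauses)


# Assumption: "my-cluster" is a placeholder cluster name, not a real one.
cpu_filter = build_filter(
    "kubernetes.io/container/cpu/core_usage_time",
    "k8s_container",
    {"cluster_name": "my-cluster"},
)
print(cpu_filter)
```

Adding or removing entries in resource_labels narrows or widens the selection, much like adding filters in the Metrics Explorer configuration.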

What metrics are collected

Metrics are collected using one of the following three monitored resource types. Each type corresponds to the Kubernetes object that the measurement is made for:

  • k8s_container
  • k8s_node
  • k8s_pod

For instance, measurements about a Pod use the monitored resource type k8s_pod. These metrics therefore include the labels pod_name and namespace_name, which identify a particular Pod.
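
To illustrate how these labels pick out one Pod, here is a small sketch that selects time series by monitored resource type and label values. The sample data is invented for illustration, the dictionary layout is a simplification rather than the Cloud Monitoring API response format, and the node_name label on the k8s_node entry is an assumption.

```python
# Invented sample data: each entry mimics the monitored resource of a time series.
SERIES = [
    {"resource": {"type": "k8s_pod",
                  "labels": {"namespace_name": "default", "pod_name": "web-0"}}},
    {"resource": {"type": "k8s_pod",
                  "labels": {"namespace_name": "default", "pod_name": "web-1"}}},
    {"resource": {"type": "k8s_pod",
                  "labels": {"namespace_name": "staging", "pod_name": "web-0"}}},
    # Assumption: node resources are identified by a node_name label.
    {"resource": {"type": "k8s_node",
                  "labels": {"node_name": "node-a"}}},
]


def select_series(series, resource_type, **labels):
    """Return the series whose resource has the given type and label values."""
    return [
        s for s in series
        if s["resource"]["type"] == resource_type
        and all(s["resource"]["labels"].get(k) == v for k, v in labels.items())
    ]


matches = select_series(SERIES, "k8s_pod", namespace_name="default", pod_name="web-0")
print(len(matches))  # prints 1
```

Note that pod_name alone is not enough here: two of the sample Pods share the name web-0 and differ only in namespace_name.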

A different set of metric types is used for each monitored resource type. To learn more about these metric types, see GKE system metrics.

By default, Anthos clusters on Azure collects the following metrics:

k8s_container

  • kubernetes.io/container/cpu/limit_utilization
  • kubernetes.io/container/cpu/request_utilization
  • kubernetes.io/container/cpu/core_usage_time
  • kubernetes.io/container/memory/limit_utilization
  • kubernetes.io/container/memory/used_bytes
  • kubernetes.io/container/restart_count
  • kubernetes.io/container/ephemeral_storage/limit_bytes
  • kubernetes.io/container/ephemeral_storage/request_bytes
  • kubernetes.io/container/ephemeral_storage/used_bytes
  • kubernetes.io/container/cpu/limit_cores
  • kubernetes.io/container/memory/limit_bytes
  • kubernetes.io/container/memory/request_bytes
  • kubernetes.io/container/memory/request_utilization
  • kubernetes.io/container/memory/page_fault_count
  • kubernetes.io/container/cpu/request_cores
  • kubernetes.io/container/uptime

k8s_node

  • kubernetes.io/node/cpu/allocatable_utilization
  • kubernetes.io/node/cpu/core_usage_time
  • kubernetes.io/node/memory/allocatable_utilization
  • kubernetes.io/node/memory/used_bytes
  • kubernetes.io/node/cpu/total_cores
  • kubernetes.io/node/cpu/allocatable_cores
  • kubernetes.io/node/ephemeral_storage/allocatable_bytes
  • kubernetes.io/node/memory/allocatable_bytes
  • kubernetes.io/node_daemon/cpu/core_usage_time
  • kubernetes.io/node/ephemeral_storage/used_bytes
  • kubernetes.io/node/ephemeral_storage/inodes_free
  • kubernetes.io/node_daemon/memory/used_bytes
  • kubernetes.io/node/pid_limit
  • kubernetes.io/node/pid_used
  • kubernetes.io/node/ephemeral_storage/total_bytes
  • kubernetes.io/node/ephemeral_storage/inodes_total
  • kubernetes.io/node/memory/total_bytes

k8s_pod

  • kubernetes.io/pod/network/received_bytes_count
  • kubernetes.io/pod/network/sent_bytes_count
  • kubernetes.io/pod/volume/total_bytes
  • kubernetes.io/pod/volume/used_bytes
  • kubernetes.io/pod/volume/utilization
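
The metric type names above follow a path-like pattern: a kubernetes.io prefix, the object kind, a subsystem, and the measurement name. As a rough sketch of that structure (the helper name group_by_subsystem is hypothetical, and the pattern is inferred from the names on this page rather than from a formal specification), the k8s_pod metric types can be grouped by subsystem:

```python
from collections import defaultdict

# The k8s_pod metric types listed on this page.
POD_METRIC_TYPES = [
    "kubernetes.io/pod/network/received_bytes_count",
    "kubernetes.io/pod/network/sent_bytes_count",
    "kubernetes.io/pod/volume/total_bytes",
    "kubernetes.io/pod/volume/used_bytes",
    "kubernetes.io/pod/volume/utilization",
]


def group_by_subsystem(metric_types):
    """Group measurement names by the subsystem segment of each metric type."""
    groups = defaultdict(list)
    for metric_type in metric_types:
        # Split "kubernetes.io/<kind>/<subsystem>/<name>" into its parts.
        _domain, _kind, rest = metric_type.split("/", 2)
        subsystem, _, name = rest.partition("/")
        groups[subsystem].append(name)
    return dict(groups)


groups = group_by_subsystem(POD_METRIC_TYPES)
for subsystem, names in groups.items():
    print(subsystem, names)
```

A few metric types on this page, such as kubernetes.io/container/restart_count, have no subsystem segment, so this sketch would need adjusting before it is applied to the full list.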

What's next?