Observing your GKE clusters

This page describes how to access the Kubernetes Engine Monitoring and Legacy Logging and Monitoring dashboards, and how to use the Kubernetes Engine Monitoring dashboard.

Accessing the monitoring dashboard

  1. From the Cloud Console, go to Monitoring:

    If your Google Cloud project is already associated with a Workspace, then the Cloud Monitoring home page is displayed. Otherwise, a Workspace is created automatically. This process usually requires no interaction from you, but it takes a few moments to complete. In some cases, the Add your project to a Workspace dialog is displayed instead; the simplest action is then to create a new Workspace.

  2. Select Dashboards:

    • If your clusters use Kubernetes Engine Monitoring, select the dashboard named Kubernetes Engine New.

    • If your clusters use Legacy Logging and Monitoring, select the dashboard named Kubernetes Engine.

    If you don't see any clusters or if you don't see all the resources in your clusters, refer to Troubleshooting your GKE dashboard.
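
The choice in step 2 can be sketched in code. This is a minimal illustration, not part of the product: it assumes you have already read the cluster's `monitoringService` setting (for example, from the output of `gcloud container clusters describe`), and the function name `dashboard_for` is hypothetical.

```python
# Values of a GKE cluster's monitoringService field. How the value is
# obtained is outside this sketch and is left to the caller.
KUBERNETES_ENGINE_MONITORING = "monitoring.googleapis.com/kubernetes"
LEGACY_LOGGING_AND_MONITORING = "monitoring.googleapis.com"


def dashboard_for(monitoring_service: str) -> str:
    """Return the dashboard name to select for a monitoringService value.

    Hypothetical helper: it only mirrors the two bullets in step 2.
    """
    if monitoring_service == KUBERNETES_ENGINE_MONITORING:
        return "Kubernetes Engine New"
    if monitoring_service == LEGACY_LOGGING_AND_MONITORING:
        return "Kubernetes Engine"
    # Any other value (for example "none") has no GKE dashboard here.
    raise ValueError(f"no GKE dashboard for {monitoring_service!r}")


if __name__ == "__main__":
    print(dashboard_for("monitoring.googleapis.com/kubernetes"))
```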

Kubernetes Engine Monitoring dashboard interface

The Kubernetes Engine Monitoring dashboard is divided into three parts:

[Screenshot: the Kubernetes Engine Monitoring dashboard in tabular view.]

  1. The dashboard toolbar controls the time window for observations and provides dashboard settings and filters.

  2. The timeline event selector lets you select a specific time and display summaries of alerts. For detailed information, go to the Timeline events section.

  3. The details section lets you choose how your cluster information is presented. The Viewing tabs section describes your choices.

Viewing tabs

The Kubernetes Engine Monitoring dashboard viewing tabs let you organize your cluster information by different hierarchies:

  • Infrastructure: Aggregates resources by Cluster, then Node, then Pod, and lastly by Container.

  • Workloads: Aggregates resources by Cluster, then Namespace, then Workload, then Pod, and lastly by Container.

  • Services: Aggregates resources by Cluster, then Namespace, then Service, then Pod, and lastly by Container.

[Screenshot: the Kubernetes Engine Monitoring viewing tabs.]
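
To make the three hierarchies concrete, the following sketch nests the same container records in the order each tab uses. It is only an illustration of the grouping described above, not how the dashboard is implemented; the record fields, the sample names, and the `build_tree` helper are all hypothetical.

```python
# Aggregation order for each viewing tab, taken from the list above.
HIERARCHIES = {
    "Infrastructure": ["cluster", "node", "pod", "container"],
    "Workloads": ["cluster", "namespace", "workload", "pod", "container"],
    "Services": ["cluster", "namespace", "service", "pod", "container"],
}


def build_tree(containers, tab):
    """Nest container records into dictionaries, one level per hierarchy step."""
    tree = {}
    for record in containers:
        node = tree
        for level in HIERARCHIES[tab]:
            # Create the child on first sight, then descend into it.
            node = node.setdefault(record[level], {})
    return tree


# Hypothetical sample data: two Pods of one workload on different nodes.
CONTAINERS = [
    {"cluster": "c1", "node": "n1", "namespace": "default", "workload": "web",
     "service": "web-svc", "pod": "web-1", "container": "nginx"},
    {"cluster": "c1", "node": "n2", "namespace": "default", "workload": "web",
     "service": "web-svc", "pod": "web-2", "container": "nginx"},
]

if __name__ == "__main__":
    for tab in HIERARCHIES:
        print(tab, build_tree(CONTAINERS, tab))
```

In this sample, the Infrastructure view separates the two Pods under their nodes, while the Workloads and Services views place both Pods under a single parent.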

The table is sorted to show resources with open incidents first. To view subcomponents of a resource, click expand for that resource. The following screenshot shows an expanded hierarchy of Kubernetes resources:

[Screenshot: an expanded hierarchy of Kubernetes resources.]

Each resource name is preceded by a red or green indicator. A red indicator means that the resource, or a subcomponent of the resource, has an open incident. A green indicator means that there are no open incidents. To see the alerting details, metrics, and logs for a resource, click its row. For more details, go to the Viewing alerts, metrics, logs, and details section.
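
The indicator and sort rules can be expressed as a short sketch. It assumes a simple in-memory model in which each resource knows its own open-incident count and its subcomponents; the `Resource` class and both function names are hypothetical and do not reflect the dashboard's internals.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Hypothetical model of one row in the dashboard table."""
    name: str
    open_incidents: int = 0
    children: List["Resource"] = field(default_factory=list)


def indicator(resource: Resource) -> str:
    """Red if the resource or any subcomponent has an open incident."""
    if resource.open_incidents > 0:
        return "red"
    if any(indicator(child) == "red" for child in resource.children):
        return "red"
    return "green"


def display_order(resources: List[Resource]) -> List[Resource]:
    """Show red rows first; sorted() is stable, so ties keep their order."""
    return sorted(resources, key=lambda r: indicator(r) != "red")


if __name__ == "__main__":
    failing_pod = Resource("pod-a", open_incidents=1)
    nodes = [Resource("node-2"), Resource("node-1", children=[failing_pod])]
    for node in display_order(nodes):
        print(node.name, indicator(node))
```

Here `node-1` has no incidents of its own, but it is shown as red and listed first because one of its Pods has an open incident.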

Column definitions

The Kubernetes Engine Monitoring dashboard displays data in columns based on the selected time range:

  • Name: The label you assigned to the Kubernetes resource.
  • Resource Type: The possible values are Cluster, Container, Namespace, Node, Pod, and Workload.
  • Ready: The number of running Pods aggregated at the specified entity. A checkmark indicates that the entity has at least one Pod ready and running. This Ready indicator is not the same as Pod status in the GKE console: Ready indicates only that the Pod is ready to serve traffic, while Pod status displays other statuses, such as Pending, Running, and Crashlooping.
  • Incidents: The number of alerting violations.
  • CPU Utilization: CPU usage as a percentage of the requested CPU resources.
  • Memory Utilization: Memory usage as a percentage of the requested memory.
  • Total Memory Usage: The amount of memory allocated.
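
As a worked example of the two utilization columns, the sketch below computes usage as a percentage of the requested amount. The function name and sample figures are hypothetical, and the exact aggregation the dashboard applies over the selected time range is not specified here.

```python
def utilization_percent(used: float, requested: float) -> float:
    """Return usage as a percentage of the requested amount."""
    if requested <= 0:
        # Without a positive request there is nothing to compare against.
        raise ValueError("requested amount must be positive")
    return 100.0 * used / requested


if __name__ == "__main__":
    # A Pod that uses 0.25 CPU cores against a request of 0.5 cores.
    print(utilization_percent(0.25, 0.5))  # 50.0
    # A Pod that uses 384 MiB of memory against a request of 512 MiB.
    print(utilization_percent(384, 512))   # 75.0
```

Because the comparison is against the request, a value above 100 is possible whenever usage exceeds what was requested.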

Viewing alerts, metrics, logs, and details

The Kubernetes Engine Monitoring dashboard displays a summary line for each Kubernetes resource by default. Each resource with a subcomponent is listed with an expand button, and all resources are listed with a red or green indicator. A red indicator means that the resource, or a subcomponent of the resource, has an open incident. A green indicator means that there are no open incidents. From the summary line, you can do the following:

  • To view subcomponents of a resource, click expand for that resource.
  • To open a pane that displays a summary of incidents, system metrics, logs, and details for a resource, click the resource's row. The information that is displayed depends on the resource type. For example, when you click a row for a cluster, you won't see metrics or log information. However, this information is displayed when you click a row for a Pod.

    In the following example, there are no open incidents on the node:

    [Screenshot: the alerting details pane for a Kubernetes node.]

    To go to the Kubernetes page in the Cloud Console, click Manage.

Timeline events

You can also access the alerting details panel from the Kubernetes Engine Monitoring dashboard timeline event selector. A timeline of incidents gives you a view of alerting violations that happened within the selected time range. If you place your pointer over a red area in the timeline, event cards appear:

[Screenshot: event cards in the timeline view of a Kubernetes alert.]

Each event card provides detailed information about one incident displayed in the timeline. To view alerting details for an event, click its event card.
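
The idea of showing only incidents that fall within the selected time range can be sketched as an interval-overlap check. This is an assumption about how such a filter might work, not a description of the dashboard's implementation; the `Incident` class, its fields, and `incidents_in_range` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Incident:
    """Hypothetical incident record; closed is None while it is still open."""
    name: str
    opened: datetime
    closed: Optional[datetime] = None


def incidents_in_range(incidents: List[Incident],
                       start: datetime, end: datetime) -> List[Incident]:
    """Keep incidents whose open interval overlaps [start, end]."""
    return [
        incident for incident in incidents
        if incident.opened <= end
        and (incident.closed is None or incident.closed >= start)
    ]


if __name__ == "__main__":
    window_start = datetime(2020, 1, 1, 12, 0)
    window_end = datetime(2020, 1, 1, 13, 0)
    sample = [
        Incident("still-open", datetime(2020, 1, 1, 10, 0)),
        Incident("already-closed", datetime(2020, 1, 1, 10, 0),
                 datetime(2020, 1, 1, 11, 0)),
    ]
    for incident in incidents_in_range(sample, window_start, window_end):
        print(incident.name)
```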

Troubleshooting

For troubleshooting information, refer to Troubleshooting your GKE dashboard.