Observing Your Kubernetes Clusters

Stackdriver lets you explore monitoring and logging information in your Google Kubernetes Engine clusters and application containers using a single dashboard.

Getting started

  1. From the GCP Console, go to the Stackdriver Monitoring home page by selecting Stackdriver > Monitoring. You can click the following button to go there:

    Go to the Stackdriver Monitoring console

  2. Select the Workspace containing your Google Kubernetes Engine cluster:

    • In most cases, the Workspace is the Google Cloud Platform project containing your Google Kubernetes Engine cluster.
    • You might be prompted to create a Workspace, or you may not see your GCP project in the list of accounts. In these cases, you should create a new Workspace using your GCP project. For more information, see Creating a Stackdriver Account.
    • To monitor clusters from multiple projects in the same dashboard, you must create a Workspace that is different from your GCP project(s). For more information, see Monitoring multiple projects.
  3. Navigate to the Kubernetes monitoring console:

    1. If you're using Legacy Stackdriver, select Resources > Kubernetes Engine.

    2. If you're using Stackdriver Kubernetes Engine Monitoring, select Resources > Kubernetes Engine NEW.

      You'll only see these menu items if you have clusters using Stackdriver.

    Go to the Stackdriver Kubernetes Monitoring Console

    This console shows you only those clusters that use Stackdriver Kubernetes Monitoring. If you don't see any clusters or you don't see all the resources in your clusters, see the Troubleshooting section on this page.

Stackdriver Kubernetes Engine Monitoring dashboard interface

The Stackdriver Kubernetes Engine Monitoring dashboard is divided into several parts, as indicated by the red numbers in the screenshot below:

Kubernetes Tabular View

  1. The dashboard toolbar provides dashboard settings, filtering, and control over the timeline shown underneath it.

  2. The timeline event selector lets you hover over the timeline to reveal summaries of alerting violations. See the Timeline events section below.

  3. The details section lets you choose one of three viewing tabs: Infrastructure, Workloads, and Services. These viewing tabs are discussed in the Viewing tabs section below.

Viewing tabs

The dashboard provides multiple viewing tabs, which organize your cluster information in different ways. The possible viewing tabs are:

  • Infrastructure. Aggregates Kubernetes resources by this hierarchy: Cluster > Node > Pod > Container.

  • Workloads. Aggregates Kubernetes resources by this hierarchy: Cluster > Namespace > Workload > Pod > Container.

  • Services. Aggregates Kubernetes resources by this hierarchy: Cluster > Namespace > Service > Pod > Container.
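To illustrate how the three tabs regroup the same resources, the following Python sketch nests flat container records according to each hierarchy. The record fields, the sample values, and the group_by_hierarchy helper are hypothetical and are not part of any Stackdriver API.

```python
# Hypothetical illustration: the three viewing tabs as grouping hierarchies.
HIERARCHIES = {
    "Infrastructure": ("cluster", "node", "pod", "container"),
    "Workloads": ("cluster", "namespace", "workload", "pod", "container"),
    "Services": ("cluster", "namespace", "service", "pod", "container"),
}


def group_by_hierarchy(records, tab):
    """Nest flat container records according to the hierarchy of one tab."""
    levels = HIERARCHIES[tab]
    tree = {}
    for record in records:
        node = tree
        for level in levels:
            # Descend one level, creating the branch on first use.
            node = node.setdefault(record[level], {})
    return tree


# Made-up sample data: two containers of one workload on different nodes.
records = [
    {"cluster": "c1", "node": "n1", "namespace": "default", "workload": "web",
     "service": "frontend", "pod": "web-1", "container": "nginx"},
    {"cluster": "c1", "node": "n2", "namespace": "default", "workload": "web",
     "service": "frontend", "pod": "web-2", "container": "nginx"},
]

infrastructure = group_by_hierarchy(records, "Infrastructure")
workloads = group_by_hierarchy(records, "Workloads")
```

In this sample, the Infrastructure grouping separates the two pods by node, while the Workloads grouping places both pods under the same workload.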

You can select your viewing mode from the tabs above the details section:

Kubernetes Event Details

The table is sorted to show Kubernetes resources with open incidents first. You can click the expander arrow (▸) in front of each Kubernetes resource to look at any subcomponents of the resource. The following screenshot shows an expanded hierarchy of Kubernetes resources:

Kubernetes Event Details

Each resource name is preceded by an indicator. A red indicator means that incidents have occurred in that resource or in resources lower in the hierarchy. To see the alerting details, click Name. For more details, see the Alerting details section below.
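The ordering and indicator behavior described above can be modeled roughly as follows. This is a minimal sketch, not the dashboard's implementation: the Resource class and both helper functions are hypothetical, and it assumes that a resource counts as affected when it or any resource below it has an open incident.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Hypothetical stand-in for one row of the details table."""
    name: str
    open_incidents: int = 0
    children: List["Resource"] = field(default_factory=list)


def has_incidents(resource):
    """True if the resource or anything lower in its hierarchy has an incident."""
    return resource.open_incidents > 0 or any(
        has_incidents(child) for child in resource.children
    )


def sort_for_display(resources):
    """Put affected resources first; unaffected ones keep their relative order."""
    # sorted() is stable, and False sorts before True.
    return sorted(resources, key=lambda r: not has_incidents(r))


healthy = Resource("cluster-a")
affected = Resource("cluster-b", children=[Resource("node-1", open_incidents=2)])
ordered = sort_for_display([healthy, affected])
```

Here cluster-b sorts ahead of cluster-a because one of its nodes has open incidents, even though the cluster itself has none.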

Column definitions

Following are explanations of the columns that appear in the three tabs. The displayed values are based on the selected time range:

  • Name: The label you assigned to the Kubernetes resource.
  • Resource Type: The possible values are Cluster, Container, Namespace, Node, Pod, and Workspace.
  • Ready: The number of node instances available.
  • Incidents: The number of alerting violations.
  • CPU Utilization: The percent utilization compared to the requested CPU resources.
  • Memory Utilization: The percent utilization of requested memory.
  • Total Memory Usage: The amount of memory allocated.
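The two utilization columns are ratios of usage to requested resources. The following Python sketch shows the arithmetic. The function name and sample numbers are made up, and returning None when no request is set is an assumption, because this page does not say how the dashboard handles that case.

```python
def utilization_percent(used, requested):
    """Return usage as a percentage of the requested amount.

    Returns None when no request is set (an assumption for this sketch).
    """
    if requested <= 0:
        return None
    return 100.0 * used / requested


# A container that requested 0.5 CPU cores and is using 0.25 cores.
cpu = utilization_percent(used=0.25, requested=0.5)

# A container that requested 256 MiB of memory and is using 192 MiB.
mem = utilization_percent(used=192 * 1024**2, requested=256 * 1024**2)
```

Because the denominator is the request rather than the node capacity, a value above 100 percent is possible when a container uses more than it requested.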

Alerting details

The Kubernetes Monitoring dashboard displays a summary line for each Kubernetes resource by default. To see the details for a resource, click the expander arrow (▸) in front of the Kubernetes resource.

If you click the red or green button in front of an entry, a panel with alerting details appears:

Kubernetes Event Details

This details view aggregates incidents, system metrics, and logs within one view.

Timeline events

You can also access the alerting details panel from the timeline event selector at the top of the dashboard. A timeline of incidents gives you a view of alerting violations that happened within the selected time range. If you hover over red areas in the timeline, event cards appear:

Kubernetes Timeline View

Event cards provide more information on each incident displayed in the timeline. If you click on an individual event card, you see the alerting details for the incident in a new panel.
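To illustrate how the selected time range determines which incidents appear in the timeline, the following Python sketch keeps the incidents that overlap a range. The incident fields and the incidents_in_range helper are hypothetical, and treating a still-open incident as lasting until the end of the range is an assumption.

```python
from datetime import datetime, timedelta


def incidents_in_range(incidents, start, end):
    """Return the incidents whose duration overlaps the range [start, end]."""
    selected = []
    for incident in incidents:
        # An incident without a close time is still open (assumed ongoing).
        closed = incident["closed"] or end
        if incident["opened"] <= end and closed >= start:
            selected.append(incident)
    return selected


now = datetime(2019, 1, 1, 12, 0)
incidents = [
    {"name": "high-cpu", "opened": now - timedelta(hours=3),
     "closed": now - timedelta(hours=2)},
    {"name": "pod-restarts", "opened": now - timedelta(minutes=30),
     "closed": None},
]

# Select the last hour, similar to choosing a range in the Time menu.
last_hour = incidents_in_range(incidents, now - timedelta(hours=1), now)
```

With a one-hour range, only the pod-restarts incident is selected. Widening the range to four hours would also include high-cpu.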

Bubble chart

The Kubernetes Monitoring dashboard provides a bubble visualization that allows you to explore trends and patterns that appear in your metrics. It also provides at-a-glance health information about the nodes in your cluster.

Example Bubble Chart

Keep in mind the following information when viewing the chart:

  • Each bubble, or plot, represents a node. The size of the plot represents the number of pods in the node.

  • A gray plot indicates a healthy node; a red plot indicates a node with an open incident.

  • For the beta release, you can select CPU Usage and Memory Usage for the axes of the chart. You can also select GPU Usage if your nodes are using GPUs.
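The chart encoding described in this list can be summarized as a mapping from node data to plot attributes. The following Python sketch is illustrative only: the Node class, its fields, and the to_bubble helper are hypothetical and do not reflect how the dashboard stores or renders data.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node summary used to draw one bubble."""
    name: str
    cpu_usage: float
    memory_usage: float
    pod_count: int
    open_incidents: int


def to_bubble(node):
    """Map one node to the attributes of its plot in the bubble chart."""
    return {
        "label": node.name,
        "x": node.cpu_usage,       # axis choice: CPU Usage
        "y": node.memory_usage,    # axis choice: Memory Usage
        "size": node.pod_count,    # plot size reflects the number of pods
        "color": "red" if node.open_incidents > 0 else "gray",
    }


nodes = [
    Node("node-1", cpu_usage=0.42, memory_usage=0.63, pod_count=12, open_incidents=0),
    Node("node-2", cpu_usage=0.91, memory_usage=0.80, pod_count=30, open_incidents=1),
]
bubbles = [to_bubble(node) for node in nodes]
```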


Troubleshooting

If you don't see any Kubernetes resources in your dashboard, then check the following:

  • Is the correct GCP project selected at the top of the page? If not, use the drop-down menu at the top of the page to select a project. You must select the project whose data you want to see.

  • Does your project have any activity? If you just created your cluster, wait a few minutes for it to populate with data. See Installing Stackdriver Support for details.

  • Is the time range too narrow? You can use the Time menu in the dashboard toolbar at the top of the page to select other time ranges or define a Custom range.

  • Do you have the proper permissions to view the dashboard? If you see either of the following permission-denied error messages when viewing a service's deployment details or a GCP project's metrics, you need to update your Cloud Identity and Access Management role to include roles/monitoring.viewer or roles/viewer:

    • You do not have sufficient permissions to view this page
    • You don't have permissions to perform the action on the selected resources

    For more details, go to Predefined roles.

  • Does your cluster's service account have permission to write data into Stackdriver? If you see high error rates on your API dashboard, then your service account might be missing the following roles:

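    • Monitoring Metric Writer (roles/monitoring.metricWriter)
    • Logs Writer (roles/logging.logWriter)
    • Stackdriver Resource Metadata Writer (roles/stackdriver.resourceMetadata.writer)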
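
As a rough aid for the two permission checks in this list, the following Python sketch inspects a project's IAM policy after it has been exported as JSON with a top-level bindings list. The member names and helper functions are hypothetical, and the writer role IDs are assumed to correspond to the three role names listed above. The sketch only looks for direct bindings of these roles. It does not account for group membership or for broader roles that might grant the same permissions.

```python
VIEWER_ROLES = {"roles/monitoring.viewer", "roles/viewer"}

# Assumed role IDs for the three writer roles named in this section.
WRITER_ROLES = {
    "roles/monitoring.metricWriter",
    "roles/logging.logWriter",
    "roles/stackdriver.resourceMetadata.writer",
}


def roles_for(policy, member):
    """Collect the roles bound directly to a member in an IAM policy dict."""
    return {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }


def can_view_dashboard(policy, member):
    """True if the member holds at least one of the viewer roles."""
    return bool(roles_for(policy, member) & VIEWER_ROLES)


def missing_writer_roles(policy, service_account):
    """Return the writer roles that the service account does not hold."""
    return WRITER_ROLES - roles_for(policy, service_account)


# Made-up policy for illustration only.
sa = "serviceAccount:cluster-sa@example-project.iam.gserviceaccount.com"
policy = {
    "bindings": [
        {"role": "roles/monitoring.viewer", "members": ["user:alex@example.com"]},
        {"role": "roles/monitoring.metricWriter", "members": [sa]},
        {"role": "roles/logging.logWriter", "members": [sa]},
    ]
}

missing = missing_writer_roles(policy, sa)
```

In this sample, the service account lacks only the resource metadata writer role, and alex@example.com can view the dashboard.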