Stackdriver now lets you explore monitoring and logging information in your Kubernetes Engine clusters and application containers using a single dashboard.
Go to the Stackdriver home page, Stackdriver > Monitoring, in the GCP Console:
Select the Stackdriver account containing your Kubernetes Engine cluster:
- In most cases, the Stackdriver account is the GCP project containing your Kubernetes Engine cluster.
- You might be prompted to create a Stackdriver account, or you might not see your GCP project in the list of accounts. In that case, you should create a new Stackdriver account using your GCP project. For more information, see Creating a Stackdriver Account.
- To monitor clusters from multiple projects in the same dashboards, you must create a Stackdriver account that is different from your GCP project(s). For more information, see Monitoring multiple projects.
In Stackdriver Monitoring, go to Resources > Kubernetes Engine v2. A list of the clusters available in your Stackdriver account appears.
If you don't see any resources, see the Troubleshooting section below.
Kubernetes Monitoring dashboard interface
The Kubernetes Monitoring dashboard is divided into several parts, as indicated by the red numbers in the screenshot below:
The dashboard toolbar provides dashboard settings, filtering, and control over the timeline shown underneath it.
The timeline event selector lets you hover over the timeline to reveal summaries of alerting violations. See the Timeline events section below.
The details section lets you choose one of three viewing tabs: Infrastructure, Workloads, and Services. These viewing tabs are discussed in the Viewing tabs section below.
Viewing tabs
The dashboard provides multiple viewing tabs, which organize your cluster information in different ways. The possible viewing tabs are:
Infrastructure. Aggregates Kubernetes resources by this hierarchy: Cluster > Node > Pod > Container.
Workloads. Aggregates Kubernetes resources by this hierarchy: Cluster > Namespace > Workload > Pod > Container.
Services. Aggregates Kubernetes resources by this hierarchy: Cluster > Namespace > Service > Pod > Container.
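To cross-check the Infrastructure hierarchy from the command line, the following minimal sketch lists each pod together with its node and containers. The function name list_pods_by_node and the column headers are illustrative assumptions, not part of Stackdriver. The sketch relies only on the standard kubectl custom-columns output format.

```shell
# Hypothetical helper: print one row per pod, showing the node it runs on
# and the containers it holds (Node > Pod > Container).
# Usage: list_pods_by_node NAMESPACE
list_pods_by_node() {
  namespace="$1"
  kubectl get pods --namespace="$namespace" \
    --output='custom-columns=NODE:.spec.nodeName,POD:.metadata.name,CONTAINERS:.spec.containers[*].name'
}
```

After defining the function, you might call it as list_pods_by_node [NAMESPACE_ID] and compare the rows with what the Infrastructure tab shows for the same cluster.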
You can select your viewing mode from the tabs above the details section:
The table is sorted to show Kubernetes resources with open incidents first. You can click the expander arrow (▸) in front of each Kubernetes resource to look at any subcomponents of the resource. The following screenshot shows an expanded hierarchy of Kubernetes resources:
Each resource name is preceded by an indicator which, if it is red, indicates that incidents have occurred in that resource or in resources lower in the hierarchy. To see the alerting details, click Name. For more details, see the Alerting details section below.
The following list explains the columns that appear in the three tabs. The displayed values are based on the selected time range:
- Name: The label you assigned to the Kubernetes resource.
- Resource Type: The possible values are Cluster, Container, Namespace, Node, Pod, and Workload.
- Ready: The number of node instances available.
- Incidents: The number of alerting violations.
- CPU Utilization: The percent utilization compared to the requested CPU resources.
- Memory Utilization: The percent utilization of requested memory.
- Total Memory Usage: The amount of memory allocated.
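As a worked example of the utilization columns, the sketch below computes usage as a percentage of the requested amount. It assumes the dashboard value is simply usage divided by request, multiplied by 100. The exact aggregation Stackdriver applies over the selected time range is not described here, and the function name percent_of_request is hypothetical.

```shell
# Hypothetical helper: usage as a percentage of the requested amount.
# Both arguments must use the same unit (for example, millicores or MiB).
# Usage: percent_of_request USAGE REQUEST
percent_of_request() {
  awk -v usage="$1" -v request="$2" 'BEGIN {
    # Without a positive request there is no baseline to compare against.
    if (request <= 0) { print "n/a"; exit }
    printf "%.1f\n", (usage / request) * 100
  }'
}
```

For instance, percent_of_request 250 500 describes a container that uses 250 millicores against a request of 500 millicores, which is 50.0 percent.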
It is a known issue in this beta release that certain containers, pods, and workloads might be missing from the tab displays.
If a container does not have limits and requests set for CPU and memory, then that container and its parent pods and workloads might not be shown in the tab displays. You can check for the existence of pods with kubectl commands such as the following:
kubectl get pods --namespace=[NAMESPACE_ID]
To correct this problem, add limits to the container configuration.
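The following sketch shows one way to inspect and then set these values with kubectl. It assumes the container is managed by a Deployment. The function names and the CPU and memory figures are placeholders for illustration only, so choose values that suit your workload.

```shell
# Hypothetical helper: show the CPU requests and limits of every pod in a
# namespace. Unset values typically appear as <none>.
# Usage: show_container_resources NAMESPACE
show_container_resources() {
  kubectl get pods --namespace="$1" \
    --output='custom-columns=POD:.metadata.name,CPU_REQUESTS:.spec.containers[*].resources.requests.cpu,CPU_LIMITS:.spec.containers[*].resources.limits.cpu'
}

# Hypothetical helper: set requests and limits on one container of a
# Deployment. The resource values below are placeholders.
# Usage: set_container_resources NAMESPACE DEPLOYMENT CONTAINER
set_container_resources() {
  kubectl set resources deployment "$2" \
    --namespace="$1" \
    --containers="$3" \
    --requests=cpu=100m,memory=128Mi \
    --limits=cpu=250m,memory=256Mi
}
```

Changing the resources of a Deployment updates its pod template, so expect its pods to be replaced through a rollout.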
Alerting details
The Kubernetes Monitoring dashboard displays a summary line for each Kubernetes resource by default. To see the details for a resource, click the expander arrow (▸) in front of the Kubernetes resource.
If you click the red or green indicator in front of an entry, a panel with alerting details appears:
This details view aggregates incidents, system metrics, and logs within one view.
Timeline events
You can also access the alerting details panel from the timeline event selector at the top of the dashboard. A timeline of incidents gives you a view of alerting violations that happened within the selected time range. If you hover over red areas in the timeline, event cards appear:
Event cards provide more information on each incident displayed in the timeline. If you click on an individual event card, you see the alerting details for the incident in a new panel.
Troubleshooting
If you don't see any Kubernetes resources in your dashboard, then check the following:
Is the correct GCP project selected at the top of the page? If not, use the drop-down menu at the top of the page to select a project. You must select the project whose data you want to see.
Does your project have any activity? If you just created your cluster, wait a few minutes for it to populate with data. See Installing Stackdriver Support for details.
Is the time range too narrow? You can use the Time menu in the dashboard toolbar at the top of the page to select other time ranges or define a Custom range.
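If the dashboard stays empty, it can also help to confirm which logging and monitoring services the cluster is configured to use. The sketch below is one way to print them with gcloud. The function name show_cluster_monitoring is hypothetical, and the sketch assumes a zonal cluster. Compare the output with the values described in Installing Stackdriver Support.

```shell
# Hypothetical helper: print the logging and monitoring services that a
# cluster is configured with.
# Usage: show_cluster_monitoring CLUSTER_NAME ZONE
show_cluster_monitoring() {
  gcloud container clusters describe "$1" --zone="$2" \
    --format='value(loggingService,monitoringService)'
}
```

Run the function with the same GCP project selected in gcloud as the one selected at the top of the dashboard page.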