Use the GKE Enterprise overview

The GKE Enterprise overview in the Google Cloud console provides a "big picture" overview of your entire fleet. It provides a fleet-level view of your resource utilization that you can use to help optimize spending, application design, and resource allocation, including CPU, memory, and disk utilization aggregated by fleet and by cluster. It also shows fleet-wide Policy Controller compliance, helping you identify areas where you can improve security, and the synchronization status of your Config Sync packages.

This page assumes that you are familiar with resource management in Kubernetes. If you need to learn more, see Resource management for Pods and containers in the Kubernetes documentation.

The GKE Enterprise overview in the Google Cloud console is available for fleet users who have enabled the entire GKE Enterprise platform. If you've enabled GKE Enterprise, you can also view the fleet overview information in the GKE Enterprise overview.

View the overview

To view the overview:

Select a time filter

By default, the GKE Enterprise overview shows resource utilization over the past one hour. To change this time period, use the time filter options:

  • Select the period over which you want to view the average resource utilization of the fleet containers. Choose one of the predefined options, or select Custom to specify a custom time period.

View clusters and total resource utilization

The overview at the top of the page provides an at-a-glance view of your clusters and the total CPU/memory/disk utilization over the time period you have chosen. Resource utilization metrics are generated using system Cloud Monitoring data from your fleet's clusters. If you see Missing data from... at the top of the page, see Enable system Cloud Monitoring for fleet clusters below.

View cluster status

In the Clusters in this Fleet section, you can see how many clusters are in your fleet, with warnings or errors displayed if there are any issues with their connectivity to the fleet: for example, if you have deleted a cluster without unregistering it first, or if you need to log in to a cluster outside Google Cloud to see its details.

  • If an error or warning is displayed, click the notification to see the problem cluster or clusters and fix the issue.
  • Click View all clusters to see your fleet's full cluster list.

View total resource utilization

The Total CPU/memory/disk utilization sections show the average usage of all your fleet containers' actual CPU, memory, and disk resources relative to allocatable resources across cluster nodes in this fleet over the time period you have chosen. Allocatable on a Kubernetes node is defined as the amount of resources that can be used by regular Pods on that node.
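As a rough illustration of the ratio described above, the following sketch divides total usage by total allocatable resources across clusters. The cluster names and figures are hypothetical, and the exact aggregation that the console performs may differ (see Fleet resource utilization metrics for the authoritative definition).

```python
def utilization(used: float, allocatable: float) -> float:
    """Return used / allocatable as a fraction, or 0.0 if nothing is allocatable."""
    if allocatable <= 0:
        return 0.0
    return used / allocatable


# Hypothetical per-cluster CPU figures in cores, averaged over the chosen period.
clusters = {
    "cluster-a": {"used": 6.0, "allocatable": 16.0},
    "cluster-b": {"used": 2.0, "allocatable": 16.0},
}

# Aggregate across the fleet before dividing, so larger clusters weigh more.
fleet_used = sum(c["used"] for c in clusters.values())
fleet_allocatable = sum(c["allocatable"] for c in clusters.values())
fleet_cpu_utilization = utilization(fleet_used, fleet_allocatable)

print(f"Fleet CPU utilization: {fleet_cpu_utilization:.1%}")  # Fleet CPU utilization: 25.0%
```

Summing before dividing is one possible choice; averaging the per-cluster ratios instead would give small clusters the same weight as large ones.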

This view gives you a quick overview of your fleet's resource utilization and available resources, and can indicate possible issues to investigate further with more detailed metrics: for example, if total CPU utilization is very low, you can use the "by cluster" metrics below to identify clusters that could be resized.

View detailed resource utilization

This section provides a detailed view of how your fleet is using its cloud or on-premises resources, including resource utilization by fleet, and top and low resource utilization by cluster. This can help you see, for example, where you have potentially underutilized or overutilized clusters that you might want to resize. You can read about how these metrics are calculated in more detail in Fleet resource utilization metrics.

View resource utilization over time

CPU/memory/disk utilization by fleet lets you dig deeper into how your fleet uses resources over time, and also lets you consider requested resources from your clusters in addition to allocatable resources and actual usage. Each panel shows a graph of your fleet-aggregated CPU, memory, or disk usage over the time period you have chosen, with the following information displayed as separate lines:

  • Allocatable: The amount of the resource that is allocatable across your fleet cluster nodes
  • Requested: The amount of the resource that containers across your fleet have requested
  • Used: The actual amount of the resource that your containers used

To see details for a given point on the graph, scroll across the graph to the time that you are interested in (for example, a visible spike in actual usage on the graph). The allocatable, requested, and actual resource usage information for that time is displayed.

To toggle the display of one or more of the lines in the chart, click the relevant metric or metrics below the graph.
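One way to reason about the three lines on each graph is to compare them at a single point in time. The sketch below is illustrative only: the `Sample` type, the `summarize` helper, and the numbers are assumptions, not values or names taken from the console.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One point on a fleet graph (hypothetical values, for example CPU cores)."""
    allocatable: float
    requested: float
    used: float


def summarize(sample: Sample) -> dict:
    """Derive simple gaps between the allocatable, requested, and used lines."""
    return {
        # Capacity that no container has requested.
        "unrequested": sample.allocatable - sample.requested,
        # Capacity that containers requested but did not actually use.
        "requested_but_unused": sample.requested - sample.used,
        # Share of requested capacity that was actually used.
        "used_over_requested": (
            sample.used / sample.requested if sample.requested else 0.0
        ),
    }


point = Sample(allocatable=32.0, requested=20.0, used=8.0)
summary = summarize(point)
print(summary)
```

A large gap between requested and used can suggest that container requests are set higher than workloads need, while a large gap between allocatable and requested can suggest spare node capacity.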

View top resource utilization by cluster

The next row shows your fleet's Top CPU/memory/disk utilization by cluster, letting you quickly see which specific clusters are the biggest users of their allocatable resources. Each panel lists your top clusters in order of utilization (highest first). For each cluster, you can see both a graph of its usage of the resource and an average of its resource usage relative to its allocatable resources over the chosen time period. This view can help you, for example, to see clusters that are overutilized. Clusters that don't have enough resources available might not be able to schedule Pods.

Click the name of a cluster that you're interested in to see more details. In the cluster overview, you can drill down further by clicking View more details in GKE to see additional node, workload, and service details in the GKE dashboards.

Click View all clusters by CPU/memory/disk utilization to view a sorted list of all clusters in your fleet.

View low resource utilization by cluster

The final resource utilization row shows your fleet's Low CPU/memory/disk utilization by cluster, so that you can quickly see which clusters are underutilized. The clusters using the least resources appear at the top of each panel, with a graph of their usage, and an average of the resource usage relative to their allocatable resources over the chosen time period.

Click the name of a cluster that you're interested in to see more details about the cluster. Click View all clusters by CPU/memory/disk utilization to view a sorted list of all clusters in your fleet.
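The top and low utilization panels can be thought of as the same ranking sorted in opposite directions. The following is a minimal sketch of that idea, assuming hypothetical cluster names and memory figures; `rank_by_utilization` is an illustrative helper, not part of any Google Cloud API.

```python
def rank_by_utilization(clusters, n=3, lowest=False):
    """Return up to n (name, utilization) pairs, highest first unless lowest=True.

    clusters maps a cluster name to a (used, allocatable) tuple.
    Clusters with no allocatable resources are skipped.
    """
    scored = [
        (name, used / allocatable)
        for name, (used, allocatable) in clusters.items()
        if allocatable > 0
    ]
    return sorted(scored, key=lambda item: item[1], reverse=not lowest)[:n]


# Hypothetical memory figures in GiB: (average used, allocatable).
fleet = {
    "prod-east": (48.0, 64.0),
    "prod-west": (20.0, 64.0),
    "staging": (4.0, 32.0),
    "dev": (2.0, 32.0),
}

top = rank_by_utilization(fleet, n=2)
low = rank_by_utilization(fleet, n=2, lowest=True)
print("Top:", top)  # Top: [('prod-east', 0.75), ('prod-west', 0.3125)]
print("Low:", low)  # Low: [('dev', 0.0625), ('staging', 0.125)]
```

In this example, prod-east would be a candidate to check for scheduling pressure, and dev a candidate for resizing.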

View Policy Controller coverage

Policy Controller enables the enforcement of fully programmable policies for your clusters. These policies act as "guardrails" and prevent any changes to the configuration of the Kubernetes API from violating your organization's security, operational, or compliance controls.

The Policy status section shows you how many clusters have Policy Controller enabled.

Click View Policy to view the Policy Controller dashboard. If you haven't installed Policy Controller on a cluster, click Enable Policy.

You can learn more about Policy Controller in its documentation.

View Config Sync package health

Config Sync is a GitOps service that lets cluster operators and platform administrators deploy packages from a source of truth. A package contains all of the configurations that are stored in each source that you sync your cluster from. The source might be a Git repository, a directory in a Git repository, an OCI image, or a Helm repository. Because you can sync your cluster from multiple sources, you might have multiple packages per cluster.

The Config Status section shows you the following information:

  • The total number of packages in your fleet
  • The synchronization status of the packages in your fleet

Click View Config overview to view the Config Sync dashboard. If you haven't installed Config Sync on a cluster, click Enable Config Sync.

You can learn more about Config Sync in its documentation.

Enable system Cloud Monitoring for fleet clusters

As mentioned above, the metrics in the dashboard are generated using Cloud Monitoring data for cluster components (such as workloads in the kube-system and gke-connect namespaces). Because of this, Cloud Monitoring must be enabled for all system, control plane, and kube state metrics components of your fleet member clusters.

Most GKE clusters have Cloud Logging and Cloud Monitoring enabled by default, but you still need to manually enable Cloud Monitoring for all cluster components. Attached clusters always require you to set up Cloud Monitoring manually.

If any of your fleet's cluster components do not have Cloud Monitoring enabled, a panel is displayed at the top of the page showing the number of clusters with missing data. To enable Cloud Monitoring for components on these clusters, complete the following steps:

  1. In the Missing data... panel, click View clusters to see the clusters that are not sending data to the Google Cloud console.

  2. For each cluster in the list, see the following guide for your cluster type to enable Cloud Monitoring:

Enable monitoring for cross-project registered clusters

To gather and view metrics across multiple Google Cloud projects, Cloud Monitoring lets you create multi-project metrics scopes. When you register a GKE cluster from a different project to your fleet host project, a new metrics scope is automatically created that includes both projects (if it doesn't already exist). This lets you see utilization data from the cluster in the overview.

What's next