GKE cluster architecture


This page introduces the architecture of a Google Kubernetes Engine (GKE) cluster. Your containerized Kubernetes workloads all run in a GKE cluster.

A GKE cluster consists of a control plane and worker machines called nodes. The control plane and nodes make up the Kubernetes cluster orchestration system. In Autopilot mode, GKE manages the entire underlying infrastructure of clusters, including the control plane, nodes, and all system components. If you use GKE Standard mode, GKE manages the control plane and system components, and you manage the nodes.

The following diagram shows the architecture of a GKE cluster:

Diagram: GKE cluster architecture. The control plane is GKE-managed and runs the API server, resource controllers, scheduler, and cluster storage. Nodes are GKE-managed in Autopilot mode and user-managed in Standard mode. User Pods run containers on the nodes, and other Google Cloud services are available to integrate with GKE.

Control plane

The control plane runs processes such as the Kubernetes API server, scheduler, and core resource controllers. GKE manages the control plane lifecycle from cluster creation to deletion. This includes upgrades to the Kubernetes version running on the control plane, which GKE performs automatically, or manually at your request if you prefer to upgrade earlier than the automatic schedule.
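For example, one way to request a control plane upgrade before the automatic schedule is the gcloud CLI. The following command is a sketch; CLUSTER_NAME and VERSION are placeholders for your cluster name and an available control plane version:

gcloud container clusters upgrade CLUSTER_NAME --master --cluster-version VERSION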

Control plane and the Kubernetes API

The control plane is the unified endpoint for your cluster. You interact with the control plane through Kubernetes API calls. The control plane runs the Kubernetes API server process (kube-apiserver) to handle API requests. You can make Kubernetes API calls in the following ways:

  • Direct calls: HTTP/gRPC
  • Indirect calls: Kubernetes command-line clients such as kubectl, or the Google Cloud console.
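For example, the following two commands reach the same API server endpoint. The first is an indirect call through kubectl; the second uses kubectl only as an authenticated HTTP client for the raw API path (shown here for the kube-system namespace):

kubectl get pods --namespace kube-system
kubectl get --raw /api/v1/namespaces/kube-system/pods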

The API server process is the hub for all communication for the cluster. All internal cluster components such as nodes, system processes, and application controllers act as clients of the API server.

Your API requests tell Kubernetes what your desired state is for the objects in your cluster. Kubernetes attempts to constantly maintain that state. Kubernetes lets you configure objects in the API either imperatively or declaratively.
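For example, the following commands create a Deployment imperatively and declaratively. The name hello-app and the manifest file name are illustrative placeholders, and the image shown is the public GKE sample image; the declarative form assumes the manifest describes an equivalent Deployment:

kubectl create deployment hello-app --image=us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
kubectl apply -f hello-app-deployment.yaml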

To learn more, refer to the Kubernetes documentation on object management.

Control plane and node interaction

The control plane manages what runs on all of the cluster's nodes. The control plane schedules workloads and manages the workloads' lifecycle, scaling, and upgrades. The control plane also manages network and storage resources for those workloads. The control plane and nodes communicate with each other using Kubernetes APIs.
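You can observe one side of this interaction through the API: each node reports its status to the control plane, which stores it in a Node object that you can read with kubectl:

kubectl get nodes --output wide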

Control plane interactions with Artifact Registry

When you create or update a cluster, GKE pulls container images for the Kubernetes system software running on the control plane and nodes from the pkg.dev Artifact Registry or the gcr.io Container Registry. An outage affecting these registries might cause the following actions to fail:

  • New cluster creation
  • Cluster version upgrades

Existing workloads might be disrupted even if you don't take any action, depending on the specific nature and duration of the outage.

If the outage affecting the pkg.dev Artifact Registry or the gcr.io Container Registry is regional, Google might redirect requests to a zone or region that isn't affected by the outage.

To check the status of Google Cloud services, go to the Google Cloud status dashboard.

Nodes

Nodes are the worker machines that run your containerized applications and other workloads. The individual machines are Compute Engine virtual machines (VMs) that GKE creates. The control plane manages and receives updates on each node's self-reported status.

A node runs the services necessary to support the containers that make up your cluster's workloads. These include the container runtime and the Kubernetes node agent (kubelet), which communicates with the control plane and is responsible for starting and running containers scheduled on the node.

GKE also runs a number of system containers as per-node agents, deployed as DaemonSets, which provide functionality such as log collection and intra-cluster network connectivity.
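To see the system agents that GKE deploys as DaemonSets, you can list them in the kube-system namespace; the exact set varies by cluster mode, enabled features, and GKE version:

kubectl get daemonsets --namespace kube-system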

Node management varies based on the cluster mode of operation, as follows:

Lifecycle

  • Autopilot mode: Fully managed by GKE.
  • Standard mode: Shared between GKE and you. GKE manages some node lifecycle operations, and you manage the rest.

Visibility

  • Autopilot mode: View nodes using kubectl. The underlying Compute Engine virtual machines are not visible or accessible in the gcloud CLI or the Google Cloud console.
  • Standard mode: View nodes using kubectl, the gcloud CLI, and the Google Cloud console. View and access the underlying Compute Engine VMs.

Connectivity

  • Autopilot mode: No direct connection to the underlying VMs.
  • Standard mode: Connect to the underlying VMs using SSH.

Node operating system (OS)

  • Autopilot mode: Managed by GKE. All nodes use Container-Optimized OS with containerd (cos_containerd).
  • Standard mode: Choose an operating system for your nodes.

Machine hardware selection

  • Autopilot mode: Request compute classes in your Pods based on use case. GKE manages machine configuration, scheduling, quantity, and lifecycle.
  • Standard mode: Choose and configure Compute Engine machine types when creating node pools. Configure settings for sizing, scaling, quantity, scheduling, and location based on need.

Node allocatable resources for GKE Standard

Some of a node's resources are required to run the GKE and Kubernetes node components necessary to make that node function as part of your cluster. For this reason, you might notice a disparity between your node's total resources (as specified in the machine type documentation) and the node's allocatable resources in GKE.

Because larger machine types tend to run more containers (and by extension, more Pods), the amount of resources that GKE reserves for Kubernetes components scales upward for larger machines. Windows Server nodes also require more resources than a typical Linux node. The nodes need the extra resources to account for running the Windows OS and for the Windows Server components that can't run in containers.

You can request resources for your Pods or limit their resource usage. To learn how to request or limit resource usage for Pods, refer to Managing Resources for Containers.
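For example, one way to set requests and limits without editing a manifest is the kubectl set resources command. The Deployment name my-app and the values shown are illustrative placeholders:

kubectl set resources deployment my-app --requests=cpu=250m,memory=256Mi --limits=cpu=500m,memory=512Mi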

To inspect the node-allocatable resources available in a cluster, run the following command, replacing NODE_NAME with the name of the node you want to inspect:

kubectl describe node NODE_NAME | grep Allocatable -B 7 -A 6

The returned output contains Capacity and Allocatable fields with measurements for ephemeral storage, memory, and CPU.
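If you only want the allocatable values, a JSONPath query is a lighter-weight alternative; NODE_NAME is again a placeholder for the node name:

kubectl get node NODE_NAME --output jsonpath='{.status.allocatable}'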

Eviction threshold

To determine how much memory is available for Pods, you must also consider the eviction threshold. GKE reserves an additional 100 MiB of memory on each node for kubelet eviction.

Allocatable memory and CPU resources

Allocatable resources are calculated in the following way:

ALLOCATABLE = CAPACITY - RESERVED - EVICTION-THRESHOLD

For memory resources, GKE reserves the following:

  • 255 MiB of memory for machines with less than 1 GiB of memory
  • 25% of the first 4 GiB of memory
  • 20% of the next 4 GiB of memory (up to 8 GiB)
  • 10% of the next 8 GiB of memory (up to 16 GiB)
  • 6% of the next 112 GiB of memory (up to 128 GiB)
  • 2% of any memory above 128 GiB

For CPU resources, GKE reserves the following:

  • 6% of the first core
  • 1% of the next core (up to 2 cores)
  • 0.5% of the next 2 cores (up to 4 cores)
  • 0.25% of any cores above 4 cores
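As an illustrative calculation based on the preceding reservations, consider a node with 4 vCPUs and 16 GiB of memory; the exact values that GKE reports can differ slightly by machine type and GKE version:

  • Reserved memory: 25% * 4 GiB + 20% * 4 GiB + 10% * 8 GiB = 2.6 GiB
  • Reserved CPU: 6% * 1 core + 1% * 1 core + 0.5% * 2 cores = 80 millicores (80m)
  • Allocatable memory ≈ 16 GiB - 2.6 GiB - 100 MiB (eviction threshold) ≈ 13.3 GiB
  • Allocatable CPU ≈ 4000m - 80m = 3920m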

Allocatable local ephemeral storage resources

Beginning in GKE version 1.10, you can manage your local ephemeral storage resources as you do your CPU and memory resources. To learn how to make your Pods specify ephemeral storage requests and limits and to see how they are acted on, see Local ephemeral storage in the Kubernetes documentation.

GKE typically configures its nodes with a single file system and periodic scanning. Ephemeral storage can also be backed by local SSDs. In either case, a portion of the file system is reserved for kubelet use. The remaining portion, called allocatable local ephemeral storage, is available for use as ephemeral storage resources.

The amount of the file system reserved for kubelet and other system components is given by:

EVICTION-THRESHOLD + SYSTEM-RESERVATION

Ephemeral storage backed by boot disk

By default, ephemeral storage is backed by the node boot disk. In this case, the eviction threshold and system reservation size are given by the following formulas:

EVICTION-THRESHOLD = 10% * BOOT-DISK-CAPACITY

SYSTEM-RESERVATION = Min(50% * BOOT-DISK-CAPACITY, 6 GiB + 35% * BOOT-DISK-CAPACITY, 100 GiB)
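For example, for a node with a 100 GiB boot disk, the formulas give approximately the following; the value the node actually reports might differ slightly:

EVICTION-THRESHOLD = 10% * 100 GiB = 10 GiB
SYSTEM-RESERVATION = Min(50 GiB, 6 GiB + 35 GiB, 100 GiB) = 41 GiB
ALLOCATABLE ≈ 100 GiB - 10 GiB - 41 GiB = 49 GiB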

For an approximate representation of the amount of allocatable ephemeral storage available as boot disk capacity increases, see the following graph:

Graph: allocatable ephemeral storage grows approximately linearly with boot disk capacity.

Ephemeral storage backed by local SSDs

The system reserved space depends on the number of local SSDs:

  • 1 local SSD: SYSTEM-RESERVATION of 50 GiB
  • 2 local SSDs: SYSTEM-RESERVATION of 75 GiB
  • 3 or more local SSDs: SYSTEM-RESERVATION of 100 GiB

The eviction threshold is calculated similarly to ephemeral storage backed by the boot disk:

EVICTION-THRESHOLD = 10% * NUM-LOCAL-SSDS * 375 GB

The capacity of each local SSD is 375 GB. To learn more, see the Compute Engine documentation on Adding Local SSDs.
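For example, for a node with two local SSDs, and treating GB and GiB as approximately equal for this estimate:

TOTAL CAPACITY ≈ 2 * 375 GB = 750 GB
EVICTION-THRESHOLD = 10% * 2 * 375 GB = 75 GB
SYSTEM-RESERVATION = 75 GiB
ALLOCATABLE ≈ 750 GB - 75 GB - 75 GiB ≈ 600 GB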