Run CPU-intensive workloads with optimal performance


This page shows you how to optimize CPU-intensive workloads for performance by telling Google Kubernetes Engine (GKE) to place each Pod on its own node, with full access to all of the node's resources. To use this Pod placement model, request the Performance compute class in your Autopilot workloads.

Benefits of the Performance compute class

Dedicated nodes per Pod are ideal when you run large-scale CPU-intensive workloads that might need access to capabilities of the underlying virtual machine (VM). Examples include CPU-intensive AI/ML training workloads and high performance computing (HPC) batch workloads.

Pods on these dedicated nodes have the following benefits:

  • Predictable performance: Your Pod can access all of the node's resources at any time.
  • Burstable workloads: If you don't set resource limits in your manifests, your Performance class Pods can burst into all of the unused capacity on the node with minimal risk of Kubernetes node-pressure eviction.

How Performance class Pods work

You deploy a Pod that has the following characteristics:

  • Selects the Performance class and a Compute Engine machine series
  • Specifies resource requests and, ideally, doesn't specify resource limits

GKE does the following:

  • Ensures that the deployed Pod requests at least the minimum resources for the compute class
  • Calculates the total resource requests of the deployed Pod and any DaemonSets in the cluster
  • Provisions a node that's backed by the selected machine series
  • Modifies the Pod manifest with a combination of node selectors and tolerations to ensure that the Pod runs on its own node
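To illustrate the last step, a Performance class Pod after Autopilot's mutation might look like the following sketch. The nodeSelector keys are the ones that you set in the manifest examples in this document; the toleration shown is hypothetical, because Autopilot manages the actual taint and toleration pair on the dedicated node:

```yaml
# Sketch of a Performance class Pod spec after GKE's mutation.
# The toleration key below is hypothetical and for illustration only.
apiVersion: v1
kind: Pod
metadata:
  name: performance-pod
spec:
  nodeSelector:
    cloud.google.com/compute-class: Performance
    cloud.google.com/machine-family: c3
  tolerations:
  - key: "dedicated-node-example"   # hypothetical; Autopilot sets the real key
    operator: "Exists"
    effect: "NoSchedule"
  containers:
  - name: my-container
    image: "registry.k8s.io/pause"
    resources:
      requests:
        cpu: 20
        memory: "100Gi"
```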

Compatibility with other GKE features

You can use Performance class Pods with other GKE capabilities and features, such as Spot Pods, extended run time Pods, and workload separation. Note the following considerations:

  • Spot Pods and extended run time Pods are mutually exclusive.
  • GKE doesn't enforce higher minimum resource requests for Performance class Pods that use workload separation.

Pricing

Your Pod can use the entire underlying VM and any attached hardware at any time, and you're billed for this hardware by Compute Engine, with a premium for Autopilot node management and scalability. For details, see GKE pricing.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
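The setup steps above can be run from the gcloud CLI. This sketch assumes that the gcloud CLI is installed and authenticated, and that your default project is already set:

```shell
# Enable the Google Kubernetes Engine API in the current project
gcloud services enable container.googleapis.com

# Update installed gcloud CLI components to the latest version
gcloud components update
```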

Connect to your cluster

Use the Google Cloud CLI to connect to your Autopilot cluster:

gcloud container clusters get-credentials CLUSTER_NAME \
    --location=LOCATION

Replace the following:

  • CLUSTER_NAME: the name of your cluster.
  • LOCATION: the Compute Engine location of the cluster.
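After fetching credentials, you can confirm that kubectl points at your cluster, for example:

```shell
# Show the kubeconfig context that get-credentials configured
kubectl config current-context

# Confirm connectivity by listing the cluster's nodes
kubectl get nodes
```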

Deploy a Performance class Pod

  1. Save the following manifest as perf-class-pod.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: performance-pod
    spec:
      nodeSelector:
        cloud.google.com/compute-class: Performance
        cloud.google.com/machine-family: MACHINE_SERIES
      containers:
      - name: my-container
        image: "registry.k8s.io/pause"
        resources:
          requests:
            cpu: 20
            memory: "100Gi"
    

    Replace MACHINE_SERIES with the Compute Engine machine series for your Pod, like c3. For supported values, see Supported machine series in this document.

  2. Deploy the Pod:

    kubectl apply -f perf-class-pod.yaml
    

Use Local SSDs in Performance class Pods

Performance class Pods can use Local SSDs for ephemeral storage if you select a machine series that includes a Local SSD. GKE considers ephemeral storage requests when provisioning a node for the Performance class Pod.

  1. Save the following manifest as perf-class-ssd-pod.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: performance-pod
    spec:
      nodeSelector:
        cloud.google.com/compute-class: Performance
        cloud.google.com/machine-family: MACHINE_SERIES
        cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
      containers:
      - name: my-container
        image: "registry.k8s.io/pause"
        resources:
          requests:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
    

    Replace MACHINE_SERIES with a supported machine series that also supports Local SSDs. If your specified machine series doesn't support Local SSDs, the deployment fails with an error.

  2. Deploy the Pod:

    kubectl apply -f perf-class-ssd-pod.yaml
    

Supported machine series

The Performance compute class supports the following machine series:

  • C3 machine series (c3)
  • C3D machine series (c3d)
  • H3 machine series (h3)
  • C2 machine series (c2)
  • C2D machine series (c2d)
  • T2D machine series (t2d)
  • T2A machine series (t2a)

To compare these machine series and their use cases, see Machine series comparison in the Compute Engine documentation.

How GKE selects a machine size

To select a machine size in the specified machine series, GKE calculates the total CPU, total memory, and total ephemeral storage requests of the Performance class Pod and any DaemonSets that will run on the new node. GKE rounds these values up to the nearest available Compute Engine machine type that supports all of these totals.

  • Example 1: Consider a Performance class Pod that selects the C3 machine series. The total resource requests including DaemonSets are as follows:

    • 70 vCPU
    • 200 GiB of memory

    GKE places the Pod on a node that's backed by the c3-standard-88 machine type, which has 88 vCPUs and 352 GiB of memory.

  • Example 2: Consider a Performance class Pod that selects the C3D machine series and Local SSDs for ephemeral storage. The total resource requests including DaemonSets are as follows:

    • 12 vCPU
    • 50 GiB of memory
    • 200 GiB of ephemeral storage

    GKE places the Pod on a node that uses the c3d-standard-16-lssd machine type, which has 16 vCPUs, 64 GiB of memory, and 365 GiB of Local SSD capacity.
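The rounding behavior in the examples above can be sketched as a small shell function. The machine catalog is a simplified subset of the C3 standard shapes (which provide 4 GiB of memory per vCPU), and `pick_machine` is a hypothetical helper for illustration only, not part of any CLI:

```shell
#!/bin/sh
# Pick the smallest c3-standard machine type whose vCPU and memory
# capacity cover the requested totals. Simplified assumption: standard
# shapes provide 4 GiB of memory per vCPU.
pick_machine() {
  cpu=$1   # total vCPU requested
  mem=$2   # total memory requested, in GiB
  for vcpus in 4 8 22 44 88 176; do
    if [ "$vcpus" -ge "$cpu" ] && [ $((vcpus * 4)) -ge "$mem" ]; then
      echo "c3-standard-$vcpus"
      return 0
    fi
  done
  echo "no c3-standard machine type fits" >&2
  return 1
}

pick_machine 70 200   # Example 1: prints c3-standard-88
```

Note that both dimensions must fit: a request of 20 vCPU and 100 GiB of memory skips c3-standard-22 (only 88 GiB of memory) and lands on c3-standard-44.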

What's next