This document shows you how to select specific compute classes to run workloads that have unique hardware requirements in your Google Kubernetes Engine (GKE) Autopilot clusters. Before reading this document, ensure that you're familiar with the concept of compute classes in GKE Autopilot.
Overview of Autopilot compute classes
Autopilot offers compute classes that are designed to run workloads that have specific hardware requirements. These compute classes are useful for workloads such as machine learning and AI tasks, or running real-time, high-traffic databases.
These compute classes are a subset of the Compute Engine machine series, and offer flexibility beyond the default Autopilot general-purpose compute class. For example, the Scale-Out class turns off simultaneous multi-threading so that each vCPU is one physical core.
Based on your individual Pod needs, you can configure your regular Autopilot Pods or your Spot Pods to request nodes backed by these compute classes. You can also request specific CPU architecture, such as Arm, in compute classes that support that architecture.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running `gcloud components update`.
- Ensure that you have a GKE Autopilot cluster running GKE version 1.24.1-gke.1400 or later.
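To confirm that your cluster meets the version requirement, you can check the control plane version with the gcloud CLI. This is a minimal sketch; `CLUSTER_NAME` and `REGION` are placeholders for your own values:

```shell
# Print the current GKE version of the cluster's control plane.
# Replace CLUSTER_NAME and REGION with your cluster's name and location.
gcloud container clusters describe CLUSTER_NAME \
    --region=REGION \
    --format="value(currentMasterVersion)"
```

Compare the printed version against 1.24.1-gke.1400; if your cluster is older, upgrade it before continuing.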
Request a compute class in your Autopilot Pod
To tell Autopilot to place your Pods on a specific compute class, specify the `cloud.google.com/compute-class` label in a `nodeSelector` or a node affinity rule, as in the following examples:
nodeSelector
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-app
  template:
    metadata:
      labels:
        app: hello-app
    spec:
      nodeSelector:
        cloud.google.com/compute-class: "COMPUTE_CLASS"
      containers:
      - name: hello-app
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        resources:
          requests:
            cpu: "2000m"
            memory: "2Gi"
```
Replace COMPUTE_CLASS with the name of the compute class based on your use case, such as `Scale-Out`. If you select `Accelerator`, you must also specify a compatible GPU. For instructions, see Deploy GPU workloads in Autopilot. If you select `Performance`, you must also select a Compute Engine machine series in the node selector. For instructions, see Run CPU-intensive workloads with optimal performance.
nodeAffinity
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-app
  template:
    metadata:
      labels:
        app: hello-app
    spec:
      terminationGracePeriodSeconds: 25
      containers:
      - name: hello-app
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        resources:
          requests:
            cpu: "2000m"
            memory: "2Gi"
            ephemeral-storage: "1Gi"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/compute-class
                operator: In
                values:
                - "COMPUTE_CLASS"
```
Replace COMPUTE_CLASS with the name of the compute class based on your use case, such as `Scale-Out`. If you select `Accelerator`, you must also specify a compatible GPU. For instructions, see Deploy GPU workloads in Autopilot. If you select `Performance`, you must also select a Compute Engine machine series in the node selector. For instructions, see Run CPU-intensive workloads with optimal performance.
You can also request specific compute classes for your Spot Pods.
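For example, a Spot Pod combines the Spot node selector with the compute class selector. The following sketch assumes the `Scale-Out` class and illustrative resource requests; substitute the class and requests that fit your workload:

```yaml
# Sketch: a Pod that requests Spot nodes backed by the Scale-Out
# compute class. The class name and resource requests are assumptions
# for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: spot-compute-class-example
spec:
  nodeSelector:
    cloud.google.com/gke-spot: "true"
    cloud.google.com/compute-class: "Scale-Out"
  containers:
  - name: hello-app
    image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
```

Both labels go in the same `nodeSelector`, so Autopilot provisions Spot nodes that also satisfy the compute class requirement.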
Specify resource requests
When you choose a compute class, make sure that you specify resource requests for your Pods based on the Minimum and maximum resource requests for your selected class. If your requests are less than the minimum, Autopilot automatically scales your requests up. However, if your requests are greater than the maximum, Autopilot does not deploy your Pods and displays an error message.
Choose a CPU architecture
Some compute classes support multiple CPU architectures. For example, the Scale-Out class supports both Arm and x86 architectures. If you don't request a specific architecture, Autopilot provisions nodes that have the default architecture of the specified compute class. If your Pods need to use a different architecture, request that architecture in your node selector or node affinity rule, alongside your compute class request. The compute class that you request must support the CPU architecture you specify.
For instructions, refer to Deploy Autopilot Pods on Arm architecture.
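As a brief illustration, you can pair the compute class label with the standard `kubernetes.io/arch` node label in the same `nodeSelector`. This sketch assumes the `Scale-Out` class, which supports Arm:

```yaml
# Sketch: request Arm nodes from the Scale-Out compute class by
# combining the compute class label with the kubernetes.io/arch label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: arm-hello-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: arm-hello-app
  template:
    metadata:
      labels:
        app: arm-hello-app
    spec:
      nodeSelector:
        cloud.google.com/compute-class: "Scale-Out"
        kubernetes.io/arch: "arm64"
      containers:
      - name: hello-app
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        resources:
          requests:
            cpu: "1000m"
            memory: "1Gi"
```

Note that the container image you deploy must also be built for the architecture you request.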
What's next
- Learn more about Autopilot cluster architecture.
- Learn about the lifecycle of Pods.
- Learn about the available Autopilot compute classes.
- Read about the default, minimum, and maximum resource requests for each platform.