Choose compute classes for Autopilot Pods

This document shows you how to select specific compute classes to run workloads that have unique hardware requirements in your Google Kubernetes Engine (GKE) Autopilot clusters. Before reading this document, ensure that you're familiar with the concept of compute classes in GKE Autopilot.

Overview of Autopilot compute classes

Autopilot offers compute classes that are designed to run workloads that have specific hardware requirements. These compute classes are useful for workloads such as machine learning and AI tasks, or real-time, high-traffic databases.

These compute classes are a subset of the Compute Engine machine series, and offer flexibility beyond the default Autopilot general-purpose compute class. For example, the Scale-Out class turns off simultaneous multi-threading so that each vCPU is one physical core.

Based on the needs of each Pod, you can configure your regular Autopilot Pods or your Spot Pods to request nodes backed by these compute classes. You can also request a specific CPU architecture, such as Arm, in compute classes that support that architecture.

Before you begin

Before you start, make sure you have performed the following tasks:

Enable the Google Kubernetes Engine API.

If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Note: For existing gcloud CLI installations, make sure to set the compute/region and compute/zone properties. By setting default locations, you can avoid errors in gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location.

Ensure that you have a GKE Autopilot cluster running GKE version 1.24.1-gke.1400 or later.

Request a compute class in your Autopilot Pod

To tell Autopilot to place your Pods on a specific compute class, specify the cloud.google.com/compute-class label in a nodeSelector or a node affinity rule, such as in the following examples:

nodeSelector

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hello-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: hello-app
      template:
        metadata:
          labels:
            app: hello-app
        spec:
          nodeSelector:
            cloud.google.com/compute-class: "COMPUTE_CLASS"
          containers:
          - name: hello-app
            image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
            resources:
              requests:
                cpu: "2000m"
                memory: "2Gi"

Replace COMPUTE_CLASS with the name of the compute class for your use case, such as Scale-Out.

If you select Accelerator, you must also specify a compatible GPU. For instructions, see Deploy GPU workloads in Autopilot.

If you select Performance, you can optionally select a Compute Engine machine series in the node selector. If you don't specify a machine series, GKE uses the C4 machine series, depending on regional availability. For instructions, see Run CPU-intensive workloads with optimal performance.
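
For the Performance case, the Pod template's node selector might look like the following fragment. This is a minimal sketch: it assumes that the machine series is requested through a cloud.google.com/machine-family node label, and MACHINE_SERIES is a placeholder. Confirm the exact label and the supported values in the linked instructions before you use it.

        spec:
          nodeSelector:
            # Compute class request, as in the example above.
            cloud.google.com/compute-class: "Performance"
            # Assumed label for choosing a machine series; optional.
            # MACHINE_SERIES is a placeholder, not a real value.
            cloud.google.com/machine-family: "MACHINE_SERIES"

If you omit the second label, the default machine series behavior described above applies.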

nodeAffinity

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hello-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: hello-app
      template:
        metadata:
          labels:
            app: hello-app
        spec:
          terminationGracePeriodSeconds: 25
          containers:
          - name: hello-app
            image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
            resources:
              requests:
                cpu: "2000m"
                memory: "2Gi"
                ephemeral-storage: "1Gi"
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: cloud.google.com/compute-class
                    operator: In
                    values:
                    - "COMPUTE_CLASS"

You can also request specific compute classes for your Spot Pods.
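
As a rough illustration, a Spot Pod that also requests a compute class could combine both selectors in the Pod template. This sketch assumes that Spot Pods are requested with the cloud.google.com/gke-spot node label; the rest of the manifest is the same as in the nodeSelector example.

        spec:
          nodeSelector:
            # Compute class request; replace the placeholder as described earlier.
            cloud.google.com/compute-class: "COMPUTE_CLASS"
            # Assumed label for requesting Spot Pods.
            cloud.google.com/gke-spot: "true"
          containers:
          - name: hello-app
            image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
            resources:
              requests:
                cpu: "2000m"
                memory: "2Gi"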

Specify resource requests

When you choose a compute class, make sure that the resource requests for your Pods fall within the range listed in Minimum and maximum resource requests for your selected class. If your requests are less than the minimum, Autopilot automatically scales your requests up to the minimum. However, if your requests are greater than the maximum, Autopilot does not deploy your Pods and displays an error message.

Choose a CPU architecture

Some compute classes support multiple CPU architectures. For example, the Scale-Out class supports both Arm and x86 architectures. If you don't request a specific architecture, Autopilot provisions nodes that have the default architecture of the specified compute class. If your Pods need to use a different architecture, request that architecture in your node selector or node affinity rule, alongside your compute class request. The compute class that you request must support the CPU architecture you specify.

For instructions, refer to Deploy Autopilot Pods on Arm architecture.
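
For example, a Pod template that requests Arm nodes in the Scale-Out class might pair the compute class label with an architecture label. This sketch assumes the standard Kubernetes kubernetes.io/arch node label with the value arm64; the container image that you deploy must also be built for that architecture.

        spec:
          nodeSelector:
            # Compute class that supports both Arm and x86.
            cloud.google.com/compute-class: "Scale-Out"
            # Assumed architecture selector (well-known Kubernetes node label).
            kubernetes.io/arch: "arm64"

The same two keys can instead be expressed as matchExpressions entries in a node affinity rule, following the pattern of the nodeAffinity example.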