You can use the Balanced
and Scale-Out
ComputeClasses in
Google Kubernetes Engine (GKE) Autopilot clusters to run workloads that
require extra compute capacity or specialized CPU configurations. This page is
intended for cluster administrators who want more flexible compute options than
the default Autopilot cluster configuration provides.
Overview of Balanced and Scale-Out ComputeClasses
By default, Pods in GKE Autopilot clusters run on a container-optimized compute platform. This platform is ideal for general-purpose workloads such as web servers and medium-intensity batch jobs. The container-optimized compute platform provides a reliable, scalable, cost-optimized hardware configuration that can handle the requirements of most workloads.
If you have workloads with unique hardware requirements, such as machine learning or AI tasks, real-time high-traffic databases, or specific CPU platforms and architectures, you can use ComputeClasses to provision that hardware.
In Autopilot clusters only, GKE provides the following curated ComputeClasses that let you run Pods that need more flexibility than the default container-optimized compute platform:
- Balanced: provides higher maximum CPU and memory capacity than the container-optimized compute platform.
- Scale-Out: disables simultaneous multi-threading (SMT) and is optimized for scaling out.
These ComputeClasses are available only in Autopilot clusters. As with the default container-optimized compute platform, Autopilot manages node sizing and resource allocation based on your running Pods.
Custom ComputeClasses for additional flexibility
If the Balanced or Scale-Out ComputeClasses in Autopilot clusters
don't meet your workload requirements, you can configure
your own ComputeClasses.
You deploy ComputeClass Kubernetes custom resources to your clusters with sets
of node attributes that GKE uses to configure new nodes in the
cluster. These custom ComputeClasses can, for example, let you deploy workloads
on the same hardware as the Balanced
or Scale-Out
ComputeClasses in any
GKE Autopilot or Standard cluster. For more
information, see
About Autopilot mode workloads in GKE Standard.
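As a sketch, a custom ComputeClass is a Kubernetes custom resource that lists node configurations in priority order. The following example is illustrative only: the machine families, core count, and fallback behavior shown here are assumptions, so check the ComputeClass custom resource reference for the full schema and supported fields.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: my-compute-class   # hypothetical name
spec:
  # GKE tries each priority rule in order when provisioning new nodes.
  priorities:
  - machineFamily: n2      # illustrative machine family
    minCores: 16           # illustrative minimum core count
  - machineFamily: n2d     # illustrative fallback machine family
  # Illustrative fallback behavior when no rule can be satisfied.
  whenUnsatisfiable: ScaleUpAnyway
```

Workloads then select this class with the same `cloud.google.com/compute-class` node selector that is described later on this page.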
Pricing
Pods that use the Balanced
or Scale-Out
ComputeClasses are billed by using separate SKUs for the CPU, memory, and
ephemeral storage that the Pods request.
For more information, see GKE pricing.
Balanced and Scale-Out technical details
This section describes the machine types and use cases for the Balanced and Scale-Out classes. If you don't request a ComputeClass in your Pods, Autopilot places the Pods on the container-optimized compute platform by default. You might sometimes see ek as the node machine series in your Autopilot nodes that use the container-optimized compute platform. EK machines are E2 machine types that are exclusive to Autopilot.
The following table provides a technical overview of the Balanced and Scale-Out ComputeClasses.

| ComputeClass | Description |
|---|---|
| Balanced | Provides more CPU and memory capacity than the container-optimized compute platform maximums. Provides additional CPU platforms and the ability to set minimum CPU platforms for Pods, such as Intel Ice Lake or later. |
| Scale-Out | Provides single-thread-per-core computing and horizontal scaling. |
ComputeClass selection in workloads
To use a ComputeClass for a GKE workload, you select the ComputeClass in the workload manifest by using a node selector for the cloud.google.com/compute-class label.
The following example Deployment manifest selects a ComputeClass:
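A minimal manifest, patterned on the architecture example later on this page with the architecture selector omitted, could look like this (the Deployment name and resource values are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-compute-class   # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-compute-class
  template:
    metadata:
      labels:
        app: nginx-compute-class
    spec:
      nodeSelector:
        # Selects the ComputeClass for all Pods in this Deployment.
        cloud.google.com/compute-class: COMPUTE_CLASS
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: 2000m
            memory: 2Gi
```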
Replace COMPUTE_CLASS with the name of a ComputeClass, such as Balanced or Scale-Out. You can select a maximum of one ComputeClass in a workload.
When you deploy the workload, GKE does the following:
- Automatically provisions nodes backed by the specified configuration to run your Pods.
- Automatically adds node labels and taints to the new nodes to prevent other Pods from scheduling on those nodes. The taints are unique to each ComputeClass. If you also select a CPU architecture, GKE adds a separate taint unique to that architecture.
- Automatically adds tolerations corresponding to the applied taints to your deployed Pods, which lets GKE place those Pods on the new nodes.
For example, if you request the Scale-Out
ComputeClass for a Pod:

- Autopilot adds a taint specific to Scale-Out for those nodes.
- Autopilot adds a toleration for that taint to the Scale-Out Pods.

Pods that don't request Scale-Out
won't get the toleration. As a result,
GKE won't schedule those Pods on the Scale-Out
nodes.
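Conceptually, the taint on the node and the injected toleration on the Pod form a matching pair. The exact key and value that GKE uses are managed for you, and the fragment below is an assumption for illustration; inspect a node's Taints field with kubectl describe node to see the real values on your cluster.

```yaml
# Illustrative only: the key, value, and effect are assumptions.
# Node spec fragment (the taint GKE applies to Scale-Out nodes):
spec:
  taints:
  - key: cloud.google.com/compute-class
    value: Scale-Out
    effect: NoSchedule
---
# Pod spec fragment (the toleration Autopilot injects into Scale-Out Pods):
spec:
  tolerations:
  - key: cloud.google.com/compute-class
    operator: Equal
    value: Scale-Out
    effect: NoSchedule
```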
If you don't explicitly request a ComputeClass in your workload specification, Autopilot schedules Pods on nodes that use the default container-optimized compute platform. Most general-purpose workloads can run with no issues on this platform.
How to request a CPU architecture
In some cases, your workloads might be built for a specific architecture, such as Arm. The Scale-Out ComputeClass supports multiple CPU architectures. You can request a specific architecture alongside your ComputeClass request by specifying a label in your node selector or node affinity rule, such as in the following example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-arm
spec:
replicas: 3
selector:
matchLabels:
app: nginx-arm
template:
metadata:
labels:
app: nginx-arm
spec:
nodeSelector:
cloud.google.com/compute-class: COMPUTE_CLASS
kubernetes.io/arch: ARCHITECTURE
containers:
- name: nginx-arm
image: nginx
resources:
requests:
cpu: 2000m
memory: 2Gi
Replace ARCHITECTURE with the CPU architecture that you want, such as arm64 or amd64. You can select a maximum of one architecture in your workload. The ComputeClass that you select must support your specified architecture.
If you don't explicitly request an architecture, Autopilot uses the default architecture of the ComputeClass.
Arm architecture on Autopilot
Autopilot supports requests for nodes that use the Arm CPU architecture. Arm nodes can offer better price-performance than comparable x86 nodes. For instructions to request Arm nodes, refer to Deploy Autopilot workloads on Arm architecture.
Ensure that you're using the correct images in your deployments. If your Pods use Arm images and you don't request Arm nodes, Autopilot schedules the Pods on x86 nodes and the Pods will crash. Similarly, if you accidentally use x86 images but request Arm nodes for the Pods, the Pods will crash.
Default, minimum, and maximum resource requests
When choosing a ComputeClass for your Autopilot workloads, make sure that you specify resource requests that meet the minimum and maximum requests for that ComputeClass. For information about the default requests, as well as the minimum and maximum requests for each ComputeClass, refer to Resource requests and limits in GKE Autopilot.
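For example, a Pod that targets the Balanced class might declare requests like the following. The CPU and memory values here are placeholders, not documented limits; the actual allowed ranges for each ComputeClass are listed on the resource requests page referenced above.

```yaml
# Illustrative Pod fragment: requests must fall within the
# minimum/maximum range documented for the chosen ComputeClass.
apiVersion: v1
kind: Pod
metadata:
  name: balanced-example   # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/compute-class: Balanced
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "4"        # placeholder value
        memory: 16Gi    # placeholder value
```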
What's next
- Learn how to select specific ComputeClasses in your Autopilot workloads.
- Read about the default, minimum, and maximum resource requests for each platform.