You can use ComputeClasses to run Google Kubernetes Engine (GKE) Autopilot workloads in your GKE Standard mode clusters. This page describes the methods that you can use to run your workloads in Autopilot mode and helps you to decide when to run a workload in a specific mode.
This information is intended for the following people:
- Cloud architects who want to optimize operational costs in organizations.
- Platform administrators who want to reduce the overhead of manual infrastructure management.
- Site reliability engineers (SREs) who want to shift infrastructure maintenance, upgrades, and scaling to Google Cloud when possible.
About GKE Autopilot
Autopilot is a mode of operation in GKE in which Google manages your node infrastructure, scaling, security, and pre-configured features. Autopilot mode is optimized for running most production workloads in an environment that applies recommended settings for security, reliability, performance, and scalability. To decide between Autopilot mode and Standard mode based on your requirements, see About GKE modes of operation.
You can use Autopilot mode in the following ways:
- Create a cluster that uses Autopilot mode: Google manages the entire cluster and applies best practices for automation, reliability, security, and costs.
- Run workloads in Autopilot mode in Standard clusters: you deploy Autopilot ComputeClasses and select them in workloads. Google manages the nodes that GKE creates for those specific workloads. You control the cluster and can run your own node pools alongside the nodes that GKE manages.
About Autopilot mode for ComputeClasses
A ComputeClass is a Kubernetes custom resource that defines a list of node configurations, like machine types or feature settings. You can select specific ComputeClasses in Kubernetes workload specifications. When a workload that selects a ComputeClass needs a new node, GKE attempts to provision the node with one of the configurations that the ComputeClass declares. GKE tries each configuration in the ComputeClass in order and falls back to the next configuration if node creation fails. For more information, see About custom ComputeClasses.
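For illustration, a workload selects a ComputeClass through the `cloud.google.com/compute-class` node selector label. The following sketch uses a hypothetical class named `cost-optimized`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-server
  template:
    metadata:
      labels:
        app: web-server
    spec:
      nodeSelector:
        # Selects the ComputeClass. When a new node is needed, GKE tries
        # the class's configurations in order and falls back to the next
        # one if node creation fails.
        cloud.google.com/compute-class: cost-optimized
      containers:
      - name: web
        image: nginx
```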
To run Autopilot workloads in your GKE Standard clusters, you enable Autopilot mode in a ComputeClass and select that ComputeClass in specific workloads. Google manages any new nodes that GKE provisions for these workloads, similar to how Google manages the nodes in Autopilot clusters. Most of the benefits and security features of Autopilot mode apply to those workloads and the host nodes.
Autopilot mode ComputeClasses give cluster administrators additional flexibility to choose the level of control over specific workloads and infrastructure in a cluster, such as in the following ways:
- You can let GKE fully manage specific workloads by running them in Autopilot mode.
- You retain full control over workloads and infrastructure that don't use Autopilot mode, such as manually created node pools.
- You can set an Autopilot ComputeClass as the default for your cluster or namespace, so that workloads run in Autopilot mode unless they explicitly request a different option.
These options let cluster administrators decide on the level and scope at which they use Autopilot.
Benefits of Autopilot ComputeClasses in Standard clusters
Running some of your workloads in Autopilot mode provides benefits like the following:
- Reduce infrastructure management costs: Google upgrades, maintains, configures, and fine-tunes specific nodes for you.
- Use the Autopilot pricing model: workloads that use an Autopilot ComputeClass are billed using the Autopilot pricing model. This pricing model includes per-Pod billing for workloads that don't request specific hardware. For more information, see the Pricing section.
- Improve scaling and security posture: Autopilot workloads get benefits like access to the container-optimized compute platform, improved default security constraints, and node autoscaling based on resource requests. The nodes for those workloads use features like node auto-upgrades and automatic repairs.
- Improve reliability: the GKE service-level agreement (SLA) includes a Pod uptime service-level objective (SLO) for Autopilot.
Autopilot clusters provide many of these same benefits, along with a more managed experience than Standard clusters that includes multiple security, networking, and resource management features. For more information, see Autopilot overview.
Hardware selection in Autopilot ComputeClasses
In Autopilot ComputeClasses, you can select specific hardware for your nodes (like GPUs or machine types), or you can let GKE place Pods on a general-purpose, container-optimized compute platform. The general-purpose option is recommended for most production workloads that don't require specific hardware to run well.
The following table describes these configuration options, how to choose one in a ComputeClass, and how this choice affects your billing model:
| Workload requirement | Recommended ComputeClass configuration | Billing model |
|---|---|---|
| General-purpose workloads | Use an Autopilot ComputeClass that has the `podFamily` priority rule. The built-in Autopilot ComputeClasses that are available for Standard clusters use the `podFamily` rule. | Pod-based billing model |
| Workloads that need specific hardware | Use a ComputeClass with any available hardware configuration rule, such as a rule that requests a specific machine family or accelerator. | Node-based billing model |
Configuration of Autopilot in ComputeClasses
You can use Autopilot mode in a Standard cluster by using a built-in Autopilot ComputeClass that GKE provides, or by enabling Autopilot in any custom ComputeClass that you create. The following sections describe each option.
Built-in Autopilot ComputeClasses
GKE configures specific Autopilot ComputeClasses for you. You can select these built-in Autopilot classes in any eligible cluster. The built-in Autopilot ComputeClasses in Standard clusters use the `podFamily` priority rule to run Pods on the container-optimized compute platform. For more information, see About built-in ComputeClasses in GKE.
Custom Autopilot ComputeClasses
You can enable Autopilot in any custom ComputeClass that you manage.
This option is useful if your workloads have specific hardware requirements.
The `autopilot` field in the ComputeClass custom resource lets you enable or disable Autopilot in a specific ComputeClass.
To enable Autopilot in an existing ComputeClass, you must delete it, update the configuration, and then recreate the ComputeClass in your cluster. Your changes apply to any new nodes that GKE creates for workloads that you deploy after you update the Autopilot ComputeClass.
For more information about enabling Autopilot in your custom compute classes, see Select specific hardware for your Autopilot Pods.
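As a sketch, a custom ComputeClass with Autopilot enabled might look like the following. The `autopilot` field is described above, but the exact nesting shown here (`enabled: true`), the class name, and the `machineFamily` priority rule are assumptions for illustration; check the ComputeClass custom resource reference for the exact schema:

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: gpu-batch-class   # hypothetical class name
spec:
  # Enables Autopilot mode for nodes that GKE creates for this class.
  # Exact field nesting is assumed; verify against the CRD reference.
  autopilot:
    enabled: true
  priorities:
  - machineFamily: n2   # illustrative hardware rule; triggers node-based billing
```

Remember the update constraint described above: to change an existing class, delete the ComputeClass, update the manifest, and then re-apply it to the cluster.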
Pricing
GKE Autopilot pricing applies to the nodes and workloads that GKE creates for an Autopilot ComputeClass. The following table describes the billing model that applies to different Autopilot ComputeClass configurations in your Standard mode clusters.
| Billing model | Applies to |
|---|---|
| Pod-based billing model | Autopilot ComputeClasses that use the `podFamily` priority rule instead of selecting specific machines or hardware. The built-in Autopilot ComputeClasses, which use the `podFamily` rule, use the Pod-based billing model. |
| Node-based billing model | Autopilot ComputeClasses that explicitly request specific node configurations, such as N2 instances or GPUs. |
Autopilot pricing applies only to the workloads and nodes that use an Autopilot ComputeClass. Your Standard mode cluster and any other node pools that you run continue to use GKE Standard mode pricing.
Pre-configured settings for nodes managed by Autopilot
Before you enable Autopilot mode in your ComputeClasses, know what to expect from the nodes that GKE creates to run the Autopilot workloads. Google configures specific features and security constraints in Autopilot nodes. As a result, workloads that deploy and function correctly in your Standard mode nodes might be rejected by Autopilot mode if they don't meet the security requirements of Autopilot.
The following table describes the feature configurations that override the corresponding settings in your Standard cluster. If a configuration isn't in this table, the Autopilot nodes use the Standard cluster setting. For example, Workload Identity Federation for GKE isn't in this table, which means that the Workload Identity Federation for GKE setting of the Standard cluster applies to the Autopilot nodes that GKE creates.
| Feature | Standard cluster-level setting | Autopilot-managed node setting |
|---|---|---|
| Node upgrades and maintenance | Configurable | Pre-configured |
| Autoscaling | Configurable: autoscaling profile | Pre-configured: `optimize-utilization` autoscaling profile |
| Networking | VPC-native or routes-based | Requires a VPC-native cluster |
| Security | Configurable | Pre-configured |
| Node operating system | Configurable | Pre-configured |
| Node boot disk | Configurable | Configurable |
| Node metadata | | |
Resource requests for Autopilot workloads
For Autopilot workloads to run efficiently, GKE enforces certain minimum and maximum values for CPU, memory, and ephemeral storage requests in your Pods. GKE also applies default requests to Pods that don't explicitly request one of these resources. The specific values for the minimum, maximum, and default resource requirements in GKE Autopilot workloads vary based on the type of hardware that your Pods use.
For ephemeral storage, the default value if you don't request ephemeral storage is the same for all ComputeClasses and hardware selections. For more information, see Default resource requests.
The following table provides links to the CPU and memory requirements for your Pod requests, depending on the type of hardware:
| Resource type | Minimum and maximum requests | Default requests |
|---|---|---|
| General-purpose Pods (`podFamily` priority rule) | See the "General-purpose" row in the Minimums and maximums for ComputeClasses table. | See the "General-purpose" row in the Default requests for ComputeClasses table. |
| GPUs and TPUs | Depends on the type and quantity of hardware accelerator. For more information, see Minimums and maximums for the Accelerator ComputeClass. | Depends on the type and quantity of hardware accelerator. For more information, see Default requests for accelerators. |
| Specific Compute Engine machine types and machine families | | For any Compute Engine machine type or machine family, the default requests in the "General-purpose" row in the Default requests for ComputeClasses table. |
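For example, a general-purpose Pod can set explicit requests so that GKE doesn't apply default values. The class name and resource values in this sketch are illustrative, and the requests must fall within the minimums and maximums linked above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
spec:
  nodeSelector:
    cloud.google.com/compute-class: general-class   # hypothetical class name
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        # Values must stay within the class's minimum/maximum ranges;
        # omitted requests would receive the documented defaults.
        cpu: "500m"
        memory: "2Gi"
        ephemeral-storage: "1Gi"
```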