About Autopilot mode workloads in GKE Standard


You can use ComputeClasses to run Google Kubernetes Engine (GKE) Autopilot workloads in your GKE Standard mode clusters. This page describes the methods that you can use to run your workloads in Autopilot mode and helps you to decide when to run a workload in a specific mode.

This information is intended for the following people:

  • Cloud architects who want to optimize operational costs in organizations.
  • Platform administrators who want to reduce the overhead of manual infrastructure management.
  • Site reliability engineers (SREs) who want to shift infrastructure maintenance, upgrades, and scaling to Google Cloud when possible.

You should already be familiar with Kubernetes and core GKE concepts, such as clusters, node pools, and ComputeClasses.

About GKE Autopilot

Autopilot is a mode of operation in GKE in which Google manages your node infrastructure, scaling, security, and pre-configured features. Autopilot mode is optimized for running most production workloads in an environment that applies recommended settings for security, reliability, performance, and scalability. To decide between Autopilot mode and Standard mode based on your requirements, see About GKE modes of operation.

You can use Autopilot mode in the following ways:

  • Create a cluster that uses Autopilot mode: Google manages the entire cluster and applies best practices for automation, reliability, security, and costs.
  • Run workloads in Autopilot mode in Standard clusters: you deploy Autopilot ComputeClasses and select them in workloads. Google manages the nodes that GKE creates for those specific workloads. You control the cluster and can run your own node pools alongside the nodes that GKE manages.

About Autopilot mode for ComputeClasses

A ComputeClass is a Kubernetes custom resource that defines a list of node configurations, like machine types or feature settings. You can select specific ComputeClasses in Kubernetes workload specifications. When a workload that selects a ComputeClass needs a new node, GKE attempts to provision the node with one of the configurations that the ComputeClass declares. GKE tries each configuration in the ComputeClass in order and falls back to the next configuration if node creation fails. For more information, see About custom ComputeClasses.
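As an illustration, the following manifest sketches a minimal custom ComputeClass with an ordered fallback list. The class name and the specific machine families are hypothetical placeholders; check the ComputeClass reference for the full schema.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  # Hypothetical name; choose your own.
  name: fallback-class
spec:
  # GKE tries each rule in order and falls back to the
  # next rule if node creation fails.
  priorities:
  - machineFamily: n2
  - machineFamily: n2d
  # Let GKE automatically create node pools for this class.
  nodePoolAutoCreation:
    enabled: true
```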

To run Autopilot workloads in your GKE Standard clusters, you enable Autopilot mode in a ComputeClass and select that ComputeClass in specific workloads. Google manages any new nodes that GKE provisions for these workloads, similar to how Google manages the nodes in Autopilot clusters. Most of the benefits and security features of Autopilot mode apply to those workloads and the host nodes.
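For example, a workload selects a ComputeClass through a node selector on the `cloud.google.com/compute-class` label. The Deployment name, image, and class name below are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server   # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-server
  template:
    metadata:
      labels:
        app: web-server
    spec:
      # Selects the ComputeClass. GKE provisions and manages
      # nodes for these Pods based on the class definition.
      nodeSelector:
        cloud.google.com/compute-class: my-autopilot-class
      containers:
      - name: server
        image: nginx
```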

Autopilot mode ComputeClasses give cluster administrators additional flexibility to choose the level of control over specific workloads and infrastructure in the cluster. For example:

  • You can let GKE fully manage specific workloads by running them in Autopilot mode.
  • You retain full control over workloads and infrastructure that don't use Autopilot mode, such as manually created node pools.
  • You can set an Autopilot ComputeClass as the default for your cluster or namespace, so that workloads run in Autopilot mode unless they explicitly request a different option.

These options let cluster administrators decide how broadly to adopt Autopilot in their clusters.

Benefits of Autopilot ComputeClasses in Standard clusters

Running some of your workloads in Autopilot mode provides benefits like the following:

  • Reduce infrastructure management costs: Google upgrades, maintains, configures, and fine-tunes specific nodes for you.
  • Use the Autopilot pricing model: workloads that use an Autopilot ComputeClass are billed by using the Autopilot pricing model. This model includes per-Pod billing for workloads that don't request specific hardware. For more information, see the Pricing section.
  • Improve scaling and security posture: Autopilot workloads get benefits like access to the container-optimized compute platform, improved default security constraints, and node autoscaling based on resource requests. The nodes for those workloads use features like node auto-upgrades and automatic repairs.
  • Improve reliability: the GKE service-level agreement (SLA) includes a Pod uptime service-level objective (SLO) for Autopilot.

Autopilot clusters provide many of these same benefits, along with a more managed experience than Standard clusters and multiple security, networking, and resource management features. For more information, see Autopilot overview.

Hardware selection in Autopilot ComputeClasses

In Autopilot ComputeClasses, you can select specific hardware for your nodes (like GPUs or machine types), or you can let GKE place Pods on a general-purpose, container-optimized compute platform. The general-purpose option is recommended for most production workloads that don't require specific hardware to run well.

The following table describes these configuration options, how to choose one in a ComputeClass, and how this choice affects your billing model:

Table 1. Hardware selection in Autopilot ComputeClasses

  • General-purpose workloads
    • Recommended ComputeClass configuration: use an Autopilot ComputeClass that has the podFamily priority rule to run workloads that don't require specific hardware on the Autopilot container-optimized compute platform. This platform works well for general-purpose workloads like the following:
      • Web servers
      • Event-driven jobs
      • Batch processing
      • CI/CD pipelines
      The built-in Autopilot ComputeClasses that are available for Standard clusters use the podFamily priority rule.
    • Billing model: Pod-based billing.
  • Workloads that need specific hardware
    • Recommended ComputeClass configuration: use a ComputeClass that includes any available hardware configuration rule, such as the machineFamily rule or the gpus rule.
    • Billing model: Node-based billing.

Configuration of Autopilot in ComputeClasses

You can use Autopilot mode in a Standard cluster by using a built-in Autopilot ComputeClass that GKE provides, or by enabling Autopilot in any custom ComputeClass that you create. The following sections describe each option.

Built-in Autopilot ComputeClasses

GKE configures specific Autopilot ComputeClasses for you. You can select these built-in Autopilot classes in any eligible cluster. The built-in Autopilot ComputeClasses in Standard clusters use the podFamily priority rule to run Pods on the container-optimized compute platform. For more information, see About built-in ComputeClasses in GKE.

Custom Autopilot ComputeClasses

You can enable Autopilot in any custom ComputeClass that you manage. This option is useful if your workloads have specific hardware requirements. The autopilot field in the ComputeClass custom resource lets you enable or disable Autopilot in a specific ComputeClass.
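As a sketch, the manifest below enables Autopilot in a custom ComputeClass that requests GPUs. The class name and accelerator values are hypothetical, and the exact shape of the autopilot field may differ in your GKE version; verify it against the ComputeClass reference.

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: gpu-class   # hypothetical name
spec:
  # Enables Autopilot mode for nodes that GKE creates for
  # workloads that select this class.
  autopilot:
    enabled: true
  priorities:
  - gpus:
      type: nvidia-l4   # illustrative accelerator type
      count: 1
  nodePoolAutoCreation:
    enabled: true
```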

To enable Autopilot in an existing ComputeClass, you must delete it, update the configuration, and then recreate the ComputeClass in your cluster. Your changes apply to any new nodes that GKE creates for workloads that you deploy after you update the Autopilot ComputeClass.

For more information about enabling Autopilot in your custom compute classes, see Select specific hardware for your Autopilot Pods.

Pricing

GKE Autopilot pricing applies to the nodes and workloads that GKE creates for an Autopilot ComputeClass. The following table describes the billing model that applies to different Autopilot ComputeClass configurations in your Standard mode clusters.

Table 3. Pricing for Autopilot ComputeClasses

  • Pod-based billing model: applies to Autopilot ComputeClasses that use the podFamily priority rule instead of selecting specific machines or hardware. The built-in Autopilot ComputeClasses, which use the podFamily rule, use the Pod-based billing model.
  • Node-based billing model: applies to Autopilot ComputeClasses that explicitly request specific node configurations, such as N2 instances or GPUs.

Autopilot pricing applies only to the workloads and nodes that use an Autopilot ComputeClass. Your Standard mode cluster and any other node pools that you run continue to use GKE Standard mode pricing.

Pre-configured settings for nodes managed by Autopilot

Before you enable Autopilot mode in your ComputeClasses, know what to expect from the nodes that GKE creates to run the Autopilot workloads. Google configures specific features and security constraints in Autopilot nodes. As a result, workloads that deploy and function correctly in your Standard mode nodes might be rejected by Autopilot mode if they don't meet the security requirements of Autopilot.

The following table describes the feature configurations that override the corresponding settings in your Standard cluster. If a configuration isn't in this table, the Autopilot nodes use the Standard cluster setting. For example, Workload Identity Federation for GKE isn't in this table, which means that the Workload Identity Federation for GKE setting of the Standard cluster applies to the Autopilot nodes that GKE creates.

Table 4. Pre-configured settings for Autopilot nodes

  • Node upgrades and maintenance
    • Standard cluster-level setting: configurable.
    • Autopilot-managed node setting: pre-configured:
      • Node auto-repair: enabled
      • Node auto-upgrade: enabled
      • Node upgrade strategy: surge upgrades with pre-configured parameters
  • Autoscaling
    • Standard cluster-level setting: configurable autoscaling profile.
    • Autopilot-managed node setting: pre-configured optimize-utilization autoscaling profile.
  • Networking
    • Standard cluster-level setting: VPC-native or routes-based.
    • Autopilot-managed node setting: requires a VPC-native cluster.
  • Security
    • Standard cluster-level setting: configurable.
    • Autopilot-managed node setting: pre-configured.
  • Node operating system
    • Standard cluster-level setting: configurable.
    • Autopilot-managed node setting: pre-configured.
  • Node boot disk
    • Standard cluster-level setting: configurable.
    • Autopilot-managed node setting: configurable:
      • Boot disk type: uses the value in the ComputeClass storage.bootDiskType field. If this field isn't set, GKE sets the boot disk type as follows:
        • If the ComputeClass uses podFamily rules, GKE uses a pd-balanced disk.
        • If the ComputeClass doesn't use podFamily rules, GKE uses the default boot disk type for the cluster.
      • Boot disk size: uses the value in the ComputeClass storage.bootDiskSize field.
  • Node metadata

Resource requests for Autopilot workloads

For Autopilot workloads to run efficiently, GKE enforces certain minimum and maximum values for CPU, memory, and ephemeral storage requests in your Pods. GKE also applies default requests to Pods that don't explicitly request one of these resources. The specific values for the minimum, maximum, and default resource requirements in GKE Autopilot workloads vary based on the type of hardware that your Pods use.

For ephemeral storage, the default value if you don't request ephemeral storage is the same for all ComputeClasses and hardware selections. For more information, see Default resource requests.
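For instance, a Pod can declare explicit requests so that GKE doesn't apply its defaults. The Pod name, image, class name, and request values below are arbitrary examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sample-app   # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/compute-class: my-autopilot-class  # hypothetical class
  containers:
  - name: app
    image: nginx
    resources:
      # Explicit requests; if omitted, GKE applies the default
      # requests for the selected hardware.
      requests:
        cpu: "500m"
        memory: "1Gi"
        ephemeral-storage: "1Gi"
```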

The following table provides links to the CPU and memory requirements for your Pod requests, depending on the type of hardware:

Table 5. Autopilot CPU and memory requirements

  • General-purpose Pods (podFamily priority rule)
    • Minimum and maximum requests: see the "General-purpose" row in the Minimums and maximums for ComputeClasses table.
    • Default requests: see the "General-purpose" row in the Default requests for ComputeClasses table.
  • GPUs and TPUs
    • Minimum and maximum requests: depend on the type and quantity of hardware accelerator. For more information, see Minimums and maximums for the Accelerator ComputeClass.
    • Default requests: depend on the type and quantity of hardware accelerator. For more information, see Default requests for accelerators.
  • Specific Compute Engine machine types and machine families
    • Minimum and maximum requests:
      • Minimum: no minimum values for CPU or memory.
      • Maximum: the resource capacity of the Compute Engine instance.
    • Default requests: the values in the "General-purpose" row in the Default requests for ComputeClasses table.

What's next