This page describes the Autopilot mode of operation in Google Kubernetes Engine (GKE) and provides you with resources that you can use to plan, set up, and manage your clusters.
What is Autopilot?
GKE Autopilot is a mode of operation in GKE in which Google manages your cluster configuration, including your nodes, scaling, security, and other preconfigured settings. Autopilot clusters are optimized to run most production workloads, and provision compute resources based on your Kubernetes manifests. The streamlined configuration follows GKE best practices and recommendations for cluster and workload setup, scalability, and security. For a list of built-in settings, refer to the Autopilot and Standard comparison table.
You only pay for the CPU, memory, and storage that your workloads request while running on GKE Autopilot.
You aren't billed for unused capacity on your nodes, because GKE manages the nodes. You also aren't charged for system Pods, operating system costs, or unscheduled workloads. For detailed pricing information, refer to Autopilot pricing.
Benefits
- Focus on your apps: Google manages the infrastructure, so you can focus on building and deploying your applications.
- Security: Clusters have a default hardened configuration, with many security settings enabled by default. GKE automatically applies security patches to your nodes when available, adhering to any maintenance schedules you configured.
- Pricing: The Autopilot pricing model simplifies billing forecasts and attribution because it's based on resources requested by your Pods.
- Node management: Google manages worker nodes, so you don't need to create new nodes to accommodate your workloads or configure automatic upgrades and repairs.
- Scaling: When your workloads experience high load and you add more Pods to accommodate the traffic, such as with Kubernetes Horizontal Pod Autoscaling, GKE automatically provisions new nodes for those Pods and expands the capacity of your cluster based on need.
- Scheduling: Autopilot manages Pod bin-packing for you, so you don't have to think about how many Pods are running on each node. You can further control Pod placement by using Kubernetes mechanisms such as affinity and Pod topology spread constraints.
- Resource management: If you deploy workloads without setting resource values such as CPU and memory, Autopilot automatically sets pre-configured default values and modifies your resource requests at the workload level.
- Networking: Autopilot enables some networking security features by default, such as ensuring that all Pod network traffic passes through your Virtual Private Cloud firewall rules, even if the traffic is going to other Pods in the cluster.
- Release management: All Autopilot clusters are enrolled in a GKE release channel, which ensures that your control plane and nodes run on the latest qualified versions in that channel.
- Managed flexibility: If your workloads have specific hardware or resource requirements, such as high CPU or memory, Autopilot offers pre-configured compute classes built for those workloads. You request the compute class in your deployment instead of needing to manually create new nodes that are backed by customized machine types and hardware. You can also select GPUs to accelerate workloads like batch or AI/ML applications.
- Reduced operational complexity: Autopilot reduces platform administration overhead by removing the need to continuously monitor nodes, scaling, and scheduling operations.
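As a sketch of the managed-flexibility model described above, a workload can request a pre-configured compute class through a node selector rather than configuring machine types directly. The Pod name and image in this example are placeholders, and it assumes the `Balanced` compute class is available in your cluster's GKE version:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-heavy-app                            # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/compute-class: "Balanced"   # request a pre-configured compute class
  containers:
  - name: app
    image: registry.example.com/app:latest       # placeholder image
    resources:
      requests:
        cpu: "4"
        memory: "16Gi"
```

Autopilot provisions nodes backed by hardware suited to the requested compute class; you never create or size those nodes yourself.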
Autopilot comes with an SLA that covers both the control plane and the compute capacity used by your Pods.
Plan your Autopilot clusters
Operating a platform effectively at scale requires planning and careful consideration. You must consider the scalability of your design, which is the ability of your clusters to grow while remaining within service-level objectives (SLOs). For detailed guidance for both platform administrators and developers, refer to the Guidelines for creating scalable clusters.
You should also consider the GKE quotas and limits, especially if you plan to run large clusters with potentially thousands of Pods.
Scale Autopilot workloads
In Autopilot, GKE automatically scales your nodes based on the number of Pods in your cluster. To automatically scale the number of Pods in your cluster, we recommend that you use a mechanism such as Kubernetes horizontal Pod autoscaling, which can scale Pods based on the built-in CPU and memory metrics, or custom metrics from Cloud Monitoring. To learn how to configure scaling based on various metrics, refer to Optimize Pod autoscaling based on metrics.
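A minimal sketch of the recommended setup: a HorizontalPodAutoscaler that scales a Deployment on the built-in CPU metric. The Deployment name `frontend` and the replica and utilization numbers are illustrative assumptions:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend           # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out when average CPU use exceeds 70%
```

As the HorizontalPodAutoscaler adds Pods, Autopilot provisions the nodes needed to run them.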
Allowable resource ranges
Autopilot lets you request CPU, memory, and ephemeral storage resources for your workloads. The allowed ranges depend on whether you want to run your Pods on the default general-purpose compute platform, or on a compute class. For information about the default container resource requests and the allowed resource ranges, refer to Resource requests in Autopilot.
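The values below sketch what such a request looks like on the general-purpose compute platform; the Pod name and image are placeholders, and the specific quantities are illustrative rather than recommended values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                          # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest   # placeholder image
    resources:
      requests:
        cpu: "500m"                # CPU, memory, and ephemeral storage are the
        memory: "2Gi"              # resources you can request in Autopilot
        ephemeral-storage: "1Gi"
```

If a request falls outside the allowed range for the chosen compute platform, Autopilot adjusts it; refer to Resource requests in Autopilot for the exact ranges and ratios.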
Node selectors and node affinity
Zonal affinity topologies are supported. Node affinity and node selectors are limited for use only with a restricted set of keys, such as kubernetes.io/arch. Not all values of os and arch are supported in Autopilot.
You can also use node selectors and node affinity for the following purposes:
- Configure workload separation.
- Automatically provision Spot Pods in clusters running GKE version 1.21.4 and later.
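A minimal sketch of the Spot Pods case: a Deployment that requests Spot capacity through the `cloud.google.com/gke-spot` node selector. The Deployment name, label, and image are placeholders, and this assumes a cluster running GKE 1.21.4 or later:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-workers            # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-workers
  template:
    metadata:
      labels:
        app: batch-workers
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"   # run these Pods on Spot capacity
      containers:
      - name: worker
        image: registry.example.com/worker:latest   # placeholder image
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
```

Autopilot provisions Spot capacity for the matching Pods automatically; the workload should tolerate preemption, since Spot capacity can be reclaimed.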
Pod affinity and anti-affinity
Although GKE manages your nodes for you in Autopilot, you retain the ability to schedule your Pods. Autopilot supports Pod affinity, so that you can co-locate Pods on a single node for network efficiency. For example, you can use Pod affinity to deploy frontend Pods on the same nodes as backend Pods. Pod affinity is limited for use only with a restricted set of topology keys.
Autopilot also supports anti-affinity, so that you can spread Pods across nodes to avoid single points of failure. For example, you can use Pod anti-affinity to prevent frontend Pods from co-locating with backend Pods.
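The anti-affinity case can be sketched as a Deployment whose replicas refuse to share a node with each other. The Deployment name, label, and image are placeholders; `kubernetes.io/hostname` is the standard per-node topology key, assumed here to be among the keys Autopilot allows:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                 # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: frontend          # repel Pods carrying this label
            topologyKey: kubernetes.io/hostname   # one such Pod per node
      containers:
      - name: app
        image: registry.example.com/frontend:latest   # placeholder image
```

Spreading replicas across nodes this way removes the single point of failure that co-location would create.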
Tolerations supported only for workload separation
Tolerations are supported only for workload separation. Taints are automatically added by node auto-provisioning as needed.
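A sketch of the workload-separation pattern: the Pod combines a node selector on a custom label with a matching toleration, and GKE provisions nodes that carry the corresponding label and taint. The `group=special` key-value pair, Pod name, and image are illustrative assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: isolated-app             # hypothetical name
spec:
  nodeSelector:
    group: special               # custom label; GKE labels the provisioned nodes
  tolerations:
  - key: group                   # matches the taint GKE adds to those nodes
    operator: Equal
    value: special
    effect: NoSchedule
  containers:
  - name: app
    image: registry.example.com/app:latest   # placeholder image
```

Pods without the toleration can't schedule onto the tainted nodes, which keeps the separated workload isolated from the rest of the cluster.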
If you want to learn more about Autopilot hardening measures and how to implement your specific security requirements, refer to Security measures in Autopilot.
For troubleshooting steps, refer to Troubleshooting Autopilot clusters.