Autoscaling a cluster

This page shows you how to autoscale your Standard Google Kubernetes Engine (GKE) clusters. To learn about how the cluster autoscaler works, refer to Cluster autoscaler.

With Autopilot clusters, you don't need to worry about provisioning nodes or managing node pools because node pools are automatically provisioned through node auto-provisioning, and are automatically scaled to meet the requirements of your workloads.

Using the cluster autoscaler

The following sections explain how to use cluster autoscaler.

Creating a cluster with autoscaling

You can create a cluster with autoscaling enabled using the Google Cloud CLI or the Google Cloud console.

gcloud

To create a cluster with autoscaling enabled, use the --enable-autoscaling flag and specify --min-nodes and --max-nodes:

gcloud container clusters create CLUSTER_NAME \
    --enable-autoscaling \
    --num-nodes NUM_NODES \
    --min-nodes MIN_NODES \
    --max-nodes MAX_NODES \
    --region=COMPUTE_REGION

Replace the following:

CLUSTER_NAME: the name of the cluster to create.
NUM_NODES: the number of nodes to create in each location.
MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
COMPUTE_REGION: the Compute Engine region for the new cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
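
Because the per-zone flags and the total flags are mutually exclusive, it can help to check an argument list before passing it to gcloud. The following is a minimal sketch of such a check; the function name check_autoscaling_flags is hypothetical and is not part of the gcloud CLI.

```shell
# Hypothetical pre-flight helper (not part of gcloud): returns 1 when
# per-zone and total autoscaling limit flags are mixed in one argument list.
check_autoscaling_flags() {
  per_zone=0
  total=0
  for arg in "$@"; do
    case "$arg" in
      --total-min-nodes*|--total-max-nodes*) total=1 ;;
      --min-nodes*|--max-nodes*) per_zone=1 ;;
    esac
  done
  if [ "$per_zone" -eq 1 ] && [ "$total" -eq 1 ]; then
    echo "error: use either --min-nodes/--max-nodes or --total-min-nodes/--total-max-nodes" >&2
    return 1
  fi
  return 0
}
```

For example, check_autoscaling_flags --min-nodes=1 --total-max-nodes=60 returns a non-zero status, while an argument list that uses only one pair of flags passes.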

Example: Creating a cluster with node autoscaling enabled and min and max nodes

The following command creates a cluster with 90 nodes, or 30 nodes in each of the 3 zones present in the region. Node autoscaling is enabled and resizes the number of nodes based on cluster load. The cluster autoscaler can reduce the size of the default node pool to a minimum of 15 nodes per zone or increase it to a maximum of 50 nodes per zone.

gcloud container clusters create my-cluster --enable-autoscaling \
    --num-nodes=30 \
    --min-nodes=15 --max-nodes=50 \
    --region=us-central1

Example: Creating a cluster with node autoscaling enabled and total nodes

The following command creates a cluster with 30 nodes, or 10 nodes in each of the 3 zones present in the region. Node autoscaling is enabled and resizes the number of nodes based on cluster load. In this example, the total size of the cluster can be between 10 and 60 nodes, regardless of spreading between zones.

gcloud container clusters create my-cluster --enable-autoscaling \
    --num-nodes 10 \
    --region us-central1 \
    --total-min-nodes 10  --total-max-nodes 60
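
The arithmetic behind the two examples above can be sketched in a few lines of shell. The helper name per_zone_to_total is hypothetical, and the sketch assumes a regional cluster that spans 3 zones, as in both examples.

```shell
# Hypothetical helper: converts per-zone values to cluster-wide totals.
# Arguments: the number of zones, then any number of per-zone values.
per_zone_to_total() {
  zones=$1
  shift
  out=""
  for value in "$@"; do
    # Append each total, separated by a single space.
    out="${out}${out:+ }$((zones * value))"
  done
  echo "$out"
}

# First example: --num-nodes=30 --min-nodes=15 --max-nodes=50 in 3 zones.
per_zone_to_total 3 30 15 50   # prints: 90 45 150

# Second example: --num-nodes 10 in 3 zones; the total limits (10 and 60)
# already apply to the whole node pool and are not multiplied.
per_zone_to_total 3 10         # prints: 30
```

With per-zone limits, the cluster-wide range follows from the number of zones. With total limits, the range is stated directly and the autoscaler is free to spread nodes between zones.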

Console

To create a new cluster in which the default node pool has autoscaling enabled:

Go to the Google Kubernetes Engine page in the Google Cloud console.
Click Create.
Configure your cluster as desired.
From the navigation pane, under Node Pools, click default-pool.
Select the Enable autoscaling checkbox.
Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.
Click Create.

Adding a node pool with autoscaling

You can create a node pool with autoscaling enabled using the gcloud CLI or the Google Cloud console.

gcloud

To add a node pool with autoscaling to an existing cluster, use the following command:

gcloud container node-pools create POOL_NAME \
    --cluster=CLUSTER_NAME \
    --enable-autoscaling \
    --min-nodes=MIN_NODES \
    --max-nodes=MAX_NODES \
    --region=COMPUTE_REGION

Replace the following:

POOL_NAME: the name of the desired node pool.
CLUSTER_NAME: the name of the cluster in which the node pool is created.
MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
COMPUTE_REGION: the Compute Engine region of the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.

Example: Adding a node pool with node autoscaling enabled

The following command creates a node pool with node autoscaling that scales the node pool to a maximum of 5 nodes and a minimum of 1 node:

gcloud container node-pools create my-node-pool \
    --cluster my-cluster \
    --enable-autoscaling \
    --min-nodes 1 --max-nodes 5 \
    --zone us-central1-c

Console

To add a node pool with autoscaling to an existing cluster:

Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
Click Add Node Pool.
Configure the node pool as desired.
Under Size, select the Enable autoscaling checkbox.
Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.
Click Create.

Enabling autoscaling for an existing node pool

You can enable autoscaling for an existing node pool using the gcloud CLI or the Google Cloud console.

gcloud

To enable autoscaling for an existing node pool, use the following command:

gcloud container clusters update CLUSTER_NAME \
    --enable-autoscaling \
    --node-pool=POOL_NAME \
    --min-nodes=MIN_NODES \
    --max-nodes=MAX_NODES \
    --region=COMPUTE_REGION

Replace the following:

CLUSTER_NAME: the name of the cluster to update.
POOL_NAME: the name of the desired node pool. If you have only one node pool, supply default-pool as the value.
MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
COMPUTE_REGION: the Compute Engine region of the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
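
As an illustration, the following sketch fills in the command template with sample values. The wrapper name print_enable_autoscaling and the values my-cluster, default-pool, and us-central1 are assumptions chosen for the example. The wrapper only prints the command so that it can be reviewed before you run it.

```shell
# Hypothetical wrapper: prints the update command instead of running it.
# Arguments: cluster, node pool, per-zone minimum, per-zone maximum, region.
print_enable_autoscaling() {
  printf 'gcloud container clusters update %s --enable-autoscaling --node-pool=%s --min-nodes=%s --max-nodes=%s --region=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Sample values (assumptions): a cluster named my-cluster with one node pool.
print_enable_autoscaling my-cluster default-pool 1 5 us-central1
```

Printing the command first is a deliberate choice here: it keeps the sketch free of side effects on a real cluster.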

Console

To enable autoscaling for an existing node pool:

Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
Click the Nodes tab.
Under Node Pools, click the name of the node pool you want to modify, then click Edit.
Under Size, select the Enable autoscaling checkbox.
Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.
Click Save.

Verifying that autoscaling for the existing node pool is enabled

You can verify that your cluster is using autoscaling with the Google Cloud CLI or the Google Cloud console.

gcloud

Describe the node pools in the cluster:

gcloud container node-pools describe POOL_NAME --cluster=CLUSTER_NAME | grep autoscaling -A 1

Replace the following:

POOL_NAME: the name of the node pool to describe.
CLUSTER_NAME: the name of the cluster.

If autoscaling is enabled, the output is similar to the following:

autoscaling:
  enabled: true
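
If you want to use this check in a script, the describe output can be tested programmatically. The following is a minimal sketch, assuming the YAML-style output shown above; the function name autoscaling_enabled is hypothetical. It reads the describe output on standard input and succeeds only when the autoscaling block contains enabled: true.

```shell
# Hypothetical helper: reads node pool describe output on stdin and
# exits with status 0 only if the autoscaling block has "enabled: true".
autoscaling_enabled() {
  awk '
    /^autoscaling:/      { in_block = 1; next }   # start of the block
    in_block && /^[^ ]/  { in_block = 0 }         # next top-level key ends it
    in_block && /^ +enabled: true$/ { found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Possible usage (requires gcloud and an existing node pool):
#   gcloud container node-pools describe POOL_NAME --cluster=CLUSTER_NAME \
#     | autoscaling_enabled && echo "autoscaling is enabled"
```

Unlike grep -A 1, this sketch doesn't depend on enabled being the first key under autoscaling.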

Console

Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to verify.
Click the Nodes tab.
Under Node Pools, verify the node pool's Autoscaling state.

Creating a node pool that prioritizes optimization of unused reservations

You can use the --location-policy=ANY flag when you create a node pool to instruct the cluster autoscaler to prioritize utilization of unused reservations:

gcloud container node-pools create POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location-policy=ANY

Replace the following:

POOL_NAME: the name of the new node pool that you choose.
CLUSTER_NAME: the name of the cluster.

Disabling autoscaling for an existing node pool

You can disable autoscaling for an existing node pool using the gcloud CLI or the Google Cloud console.

gcloud

To disable autoscaling for a specific node pool, use the --no-enable-autoscaling flag:

gcloud container clusters update CLUSTER_NAME \
    --no-enable-autoscaling \
    --node-pool=POOL_NAME \
    --region=COMPUTE_REGION

Replace the following:

CLUSTER_NAME: the name of the cluster to update.
POOL_NAME: the name of the desired node pool.
COMPUTE_REGION: the Compute Engine region of the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.

The node pool size is fixed at its current size, which you can update manually.

Console

To disable autoscaling for a specific node pool:

Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
Click the Nodes tab.
Under Node Pools, click the name of the node pool you want to modify, then click Edit.
Under Size, clear the Enable autoscaling checkbox.
Click Save.

Resizing a node pool

For clusters with autoscaling enabled, the cluster autoscaler automatically resizes node pools within the boundaries specified by either the minimum size (--min-nodes) and maximum size (--max-nodes) values or the minimum total size (--total-min-nodes) and maximum total size (--total-max-nodes). These flags are mutually exclusive. You cannot manually resize a node pool by changing these values.

If you want to manually resize a node pool in your cluster that has autoscaling enabled, perform the following:

Disable autoscaling on the node pool.
Manually resize the cluster.
Re-enable autoscaling and specify the minimum and maximum node pool size.
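
The three steps above can be sketched as a sequence of gcloud commands. The helper name print_manual_resize_steps is hypothetical, and the resize command (gcloud container clusters resize with --node-pool and --num-nodes) does not appear elsewhere on this page, so treat it as an assumption to confirm against the gcloud reference. The helper prints the commands for review and runs nothing.

```shell
# Hypothetical helper: prints the three commands in order; it runs nothing.
# Arguments: cluster, node pool, region, new size, per-zone min, per-zone max.
print_manual_resize_steps() {
  cluster=$1
  pool=$2
  region=$3
  size=$4
  min=$5
  max=$6
  # Step 1: disable autoscaling on the node pool.
  echo "gcloud container clusters update $cluster --no-enable-autoscaling --node-pool=$pool --region=$region"
  # Step 2: manually resize the node pool.
  echo "gcloud container clusters resize $cluster --node-pool=$pool --num-nodes=$size --region=$region"
  # Step 3: re-enable autoscaling with the minimum and maximum sizes.
  echo "gcloud container clusters update $cluster --enable-autoscaling --node-pool=$pool --min-nodes=$min --max-nodes=$max --region=$region"
}

# Sample values (assumptions):
print_manual_resize_steps my-cluster my-node-pool us-central1 3 1 5
```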

Preventing Pods scheduling on selected nodes

You can use startup or status taints to prevent Pods from being scheduled on selected nodes, depending on the use case.

This feature is available in GKE in version 1.28 and later.

Startup taints

Use startup taints when there is an operation that has to complete before any Pods can run on the node. For example, Pods shouldn't run until driver installation on the node finishes.

Cluster autoscaler treats nodes tainted with startup taints as unready, but takes them into account during scale-up logic, assuming that they will become ready shortly.

We recommend that you don't apply the startup taints to the nodes for an extended period of time. The cluster autoscaler might stop working if a substantial number of nodes are tainted with startup taints, which means that they are unready. In this case, GKE doesn't scale up the cluster because even new nodes don't become ready. Therefore, GKE treats this cluster as broken.

Startup taints are defined as all taints with the prefix startup-taint.cluster-autoscaler.kubernetes.io/

Status taints

Use status taints when GKE shouldn't use a given node to run Pods.

Cluster autoscaler treats nodes tainted with status taints as ready, but ignores them during scale up logic. Even though the tainted node is ready, no Pods should run. If more resources are needed by the Pods, GKE scales up the cluster and ignores the tainted nodes.

Status taints are defined as all taints with the prefix status-taint.cluster-autoscaler.kubernetes.io/

Ignore taints

Ignore taints are defined as all taints with the prefix ignore-taint.cluster-autoscaler.kubernetes.io/
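
Because each taint type is identified only by its key prefix, a taint key can be classified with a simple prefix match. The following is a minimal sketch; the function name classify_taint and the key suffix driver-install are hypothetical.

```shell
# Hypothetical helper: reports which cluster autoscaler taint type a key
# belongs to, based only on the prefixes listed above.
classify_taint() {
  case "$1" in
    startup-taint.cluster-autoscaler.kubernetes.io/*) echo startup ;;
    status-taint.cluster-autoscaler.kubernetes.io/*)  echo status ;;
    ignore-taint.cluster-autoscaler.kubernetes.io/*)  echo ignore ;;
    *) echo other ;;
  esac
}

# The suffix "driver-install" is an example value, not a predefined key.
classify_taint startup-taint.cluster-autoscaler.kubernetes.io/driver-install   # prints: startup

# A taint with such a key could be applied with kubectl, for example:
#   kubectl taint nodes NODE_NAME \
#     startup-taint.cluster-autoscaler.kubernetes.io/driver-install=true:NoSchedule
```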

Troubleshooting

Check if the issue you are running into is caused by one of the limitations for the cluster autoscaler. Otherwise, see the following troubleshooting information for the cluster autoscaler:

Cluster is not downscaling

After the cluster properly scales up and then attempts to scale down, underutilized nodes remain enabled and prevent the cluster from scaling down. This error occurs for one of the following reasons:

Restrictions can prevent a node from being deleted by the autoscaler. GKE might prevent a node's deletion if the node contains a Pod with any of these conditions:
- The Pod's affinity or anti-affinity rules prevent rescheduling.
- In GKE version 1.21 and earlier, the Pod has local storage.
- The Pod is not managed by a Controller such as a Deployment, StatefulSet, Job or ReplicaSet.
To resolve this issue, set up the cluster autoscaler scheduling and eviction rules on your Pods. For more information, see Pod scheduling and disruption.
System Pods are running on a node. To verify that your nodes are running kube-system Pods, perform the following steps:
1. Go to the Logs Explorer page in the Google Cloud console.
2. Click Query builder.
3. Use the following query to find the cluster autoscaler log records for your node pool:
```
  resource.labels.location="CLUSTER_LOCATION"
  resource.labels.cluster_name="CLUSTER_NAME"
  logName="projects/PROJECT_ID/logs/container.googleapis.com%2Fcluster-autoscaler-visibility"
  jsonPayload.noDecisionStatus.noScaleDown.nodes.node.mig.nodepool="NODE_POOL_NAME"
```
  Replace the following:
  - CLUSTER_LOCATION: the region your cluster is in.
  - CLUSTER_NAME: the name of your cluster.
  - PROJECT_ID: the ID of the project in which the cluster is created.
  - NODE_POOL_NAME: the name of your node pool.

  If there are kube-system Pods running on your node pool, the output includes the following:
```
"no.scale.down.node.pod.kube.system.unmovable"
```
To resolve this issue, do one of the following:
- Add a PodDisruptionBudget for the kube-system Pods. For more information about manually adding a PodDisruptionBudget for the kube-system Pods, see the Kubernetes cluster autoscaler FAQ.
- Use a combination of node pool taints and tolerations to separate kube-system Pods from your application Pods. For more information, see node auto-provisioning in GKE.
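
The Logs Explorer query shown earlier in this section can also be assembled as a single filter string for use from the command line. The following is a minimal sketch; the function name autoscaler_visibility_filter is hypothetical, and the conditions are joined with an explicit AND.

```shell
# Hypothetical helper: builds the log filter from the query in this section.
# Arguments: cluster location, cluster name, project ID, node pool name.
autoscaler_visibility_filter() {
  # %%2F prints a literal %2F, the URL-encoded slash in the log name.
  printf 'resource.labels.location="%s" AND resource.labels.cluster_name="%s" AND logName="projects/%s/logs/container.googleapis.com%%2Fcluster-autoscaler-visibility" AND jsonPayload.noDecisionStatus.noScaleDown.nodes.node.mig.nodepool="%s"\n' \
    "$1" "$2" "$3" "$4"
}

# Possible usage (requires gcloud; the values are sample assumptions):
#   gcloud logging read \
#     "$(autoscaler_visibility_filter us-central1 my-cluster my-project my-node-pool)" \
#     --limit=10
```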

Node pool size mismatching

The following issue can occur when you configure the node pool size:

An existing node pool is smaller than the minimum number of nodes that you specified for the cluster.

This behavior happens because the autoscaler uses the minimum number of nodes parameter only when it needs to determine whether to scale down. The following list describes the common causes of this behavior:

You specified a new minimum number of nodes that is higher than the existing number of nodes.
You manually scaled down the node pool or the underlying managed instance group to fewer nodes than the minimum number of nodes.
You deployed Spot VMs within the node pool, and the VMs were preempted.

To resolve this issue, manually increase the node pool size to at least the minimum number of nodes. For more information, see how to manually resize a cluster.

The following conditions also affect scale down events:

The Pod has local storage and the GKE control plane version is lower than 1.22. In GKE clusters with control plane version 1.22 or later, Pods with local storage no longer block scaling down.
The Pod has the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "false".
When scaling down, cluster autoscaler respects the Pod termination grace period, up to a maximum of 10 minutes. After 10 minutes, Pods are forcefully terminated.

For more troubleshooting steps during scale down events, refer to Cluster not scaling down.

For more information about the cluster autoscaler and preventing disruptions, see the Kubernetes cluster autoscaler FAQ.
