Autoscaling a cluster


This page shows you how to autoscale your Standard Google Kubernetes Engine (GKE) clusters. To learn about how the cluster autoscaler works, refer to Cluster autoscaler.

With Autopilot clusters, you don't need to provision nodes or manage node pools. Node pools are automatically provisioned through node auto-provisioning and are scaled automatically to meet the requirements of your workloads.

Using cluster autoscaler

The following sections explain how to use cluster autoscaler.

Creating a cluster with autoscaling

You can create a cluster with autoscaling enabled using the Google Cloud CLI or the Google Cloud console.

gcloud

To create a cluster with autoscaling enabled, use the --enable-autoscaling flag and specify --min-nodes and --max-nodes:

gcloud container clusters create CLUSTER_NAME \
    --enable-autoscaling \
    --num-nodes NUM_NODES \
    --min-nodes MIN_NODES \
    --max-nodes MAX_NODES \
    --region=COMPUTE_REGION

Replace the following:

  • CLUSTER_NAME: the name of the cluster to create.
  • NUM_NODES: the number of nodes to create in each location.
  • MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool.
  • MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool.
  • COMPUTE_REGION: the Compute Engine region for the new cluster. For zonal clusters, use --zone=COMPUTE_ZONE.

Example: Creating a cluster with node autoscaling enabled

The following command creates a cluster with 30 nodes and enables node autoscaling, which resizes the node pool based on cluster load. The cluster autoscaler can reduce the default node pool to a minimum of 15 nodes or increase it to a maximum of 50 nodes.

gcloud container clusters create my-cluster --enable-autoscaling \
    --num-nodes 30 \
    --min-nodes 15 --max-nodes 50 \
    --zone us-central1-c
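After the cluster is created, one way to confirm the autoscaling configuration is to describe the cluster with the gcloud CLI; the --format projection below is one possible way to filter the output down to the per-pool autoscaling settings:

```shell
# Show each node pool's name and autoscaling settings for the
# example cluster created above.
gcloud container clusters describe my-cluster \
    --zone us-central1-c \
    --format="yaml(nodePools[].name, nodePools[].autoscaling)"
```

The output should report enabled: true along with the configured minNodeCount and maxNodeCount for the default node pool.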

Console

To create a new cluster in which the default node pool has autoscaling enabled:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. Click Create.

  3. Configure your cluster as desired.

  4. From the navigation pane, under Node Pools, click default-pool.

  5. Select the Enable autoscaling checkbox.

  6. Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.

  7. Click Create.

Adding a node pool with autoscaling

You can create a node pool with autoscaling enabled using the gcloud CLI or the Google Cloud console.

gcloud

To add a node pool with autoscaling to an existing cluster, use the following command:

gcloud container node-pools create POOL_NAME \
    --cluster=CLUSTER_NAME \
    --enable-autoscaling \
    --min-nodes=MIN_NODES \
    --max-nodes=MAX_NODES \
    --region=COMPUTE_REGION

Replace the following:

  • POOL_NAME: the name of the desired node pool.
  • CLUSTER_NAME: the name of the cluster in which the node pool is created.
  • MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool.
  • MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool.
  • COMPUTE_REGION: the Compute Engine region for the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.

Example: Adding a node pool with node autoscaling enabled

The following command creates a node pool named my-node-pool with 3 nodes (the default), with node autoscaling based on cluster load that scales the node pool between a minimum of 1 node and a maximum of 5 nodes:

gcloud container node-pools create my-node-pool \
    --cluster my-cluster \
    --enable-autoscaling \
    --min-nodes 1 --max-nodes 5 \
    --zone us-central1-c
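To verify the new node pool's autoscaling bounds, you can describe the node pool directly; the --format expression here is one possible way to extract just the minimum and maximum node counts:

```shell
# Print the configured minimum and maximum node counts for the
# example node pool created above.
gcloud container node-pools describe my-node-pool \
    --cluster my-cluster \
    --zone us-central1-c \
    --format="value(autoscaling.minNodeCount, autoscaling.maxNodeCount)"
```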

Console

To add a node pool with autoscaling to an existing cluster:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the cluster you want to modify.

  3. Click Add Node Pool.

  4. Configure the node pool as desired.

  5. Under Size, select the Enable autoscaling checkbox.

  6. Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.

  7. Click Create.

Enabling autoscaling for an existing node pool

You can enable autoscaling for an existing node pool using the gcloud CLI or the Google Cloud console.

gcloud

To enable autoscaling for an existing node pool, use the --enable-autoscaling flag:

gcloud container clusters update CLUSTER_NAME \
    --enable-autoscaling \
    --node-pool=POOL_NAME \
    --min-nodes=MIN_NODES \
    --max-nodes=MAX_NODES \
    --region=COMPUTE_REGION

Replace the following:

  • CLUSTER_NAME: the name of the cluster to update.
  • POOL_NAME: the name of the desired node pool. If you have only one node pool, supply default-pool as the value.
  • MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool.
  • MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool.
  • COMPUTE_REGION: the Compute Engine region for the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.

Console

To enable autoscaling for an existing node pool:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the cluster you want to modify.

  3. Click the Nodes tab.

  4. Under Node Pools, click the name of the node pool you want to modify, then click Edit.

  5. Under Size, select the Enable autoscaling checkbox.

  6. Change the values of the Minimum number of nodes and Maximum number of nodes fields as desired.

  7. Click Save.

Disabling autoscaling for an existing node pool

You can disable autoscaling for an existing node pool using the gcloud CLI or the Google Cloud console.

gcloud

To disable autoscaling for a specific node pool, use the --no-enable-autoscaling flag:

gcloud container clusters update CLUSTER_NAME \
    --no-enable-autoscaling \
    --node-pool=POOL_NAME \
    --region=COMPUTE_REGION

Replace the following:

  • CLUSTER_NAME: the name of the cluster to update.
  • POOL_NAME: the name of the desired node pool.
  • COMPUTE_REGION: the Compute Engine region for the cluster. For zonal clusters, use --zone=COMPUTE_ZONE.

After you disable autoscaling, the node pool is fixed at its current size. You can still resize the node pool manually.

Console

To disable autoscaling for a specific node pool:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the cluster you want to modify.

  3. Click the Nodes tab.

  4. Under Node Pools, click the name of the node pool you want to modify, then click Edit.

  5. Under Size, clear the Enable autoscaling checkbox.

  6. Click Save.

Resizing a node pool

For clusters with autoscaling enabled, the cluster autoscaler automatically resizes node pools within the boundaries specified by the minimum size (--min-nodes) and maximum size (--max-nodes) values. While autoscaling is enabled, you cannot manually resize a node pool.

If you want to manually resize a node pool in your cluster that has autoscaling enabled, perform the following:

  1. Disable autoscaling on the node pool.
  2. Manually resize the node pool.
  3. Re-enable autoscaling and specify the minimum and maximum node pool size.
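The steps above can be sketched as a gcloud command sequence; the cluster name, node pool name, zone, and size values here are illustrative:

```shell
# 1. Disable autoscaling on the node pool.
gcloud container clusters update my-cluster \
    --no-enable-autoscaling \
    --node-pool my-node-pool \
    --zone us-central1-c

# 2. Manually resize the node pool to the desired node count.
gcloud container clusters resize my-cluster \
    --node-pool my-node-pool \
    --num-nodes 4 \
    --zone us-central1-c

# 3. Re-enable autoscaling with the minimum and maximum sizes.
gcloud container clusters update my-cluster \
    --enable-autoscaling \
    --node-pool my-node-pool \
    --min-nodes 1 --max-nodes 5 \
    --zone us-central1-c
```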

Troubleshooting

See the following troubleshooting information for cluster autoscaler:

  • You might be experiencing one of the limitations of cluster autoscaler.
  • If you have problems scaling down your cluster, see Pod scheduling and disruption. You might have to add a PodDisruptionBudget for the kube-system Pods. For more information about manually adding a PodDisruptionBudget for the kube-system Pods, see the Kubernetes cluster autoscaler FAQ.
  • When scaling down, cluster autoscaler respects scheduling and eviction rules set on Pods. These restrictions can prevent a node from being deleted by the autoscaler. A node's deletion could be prevented if it contains a Pod with any of these conditions:

    • The Pod's affinity or anti-affinity rules prevent rescheduling.
    • The Pod is not managed by a controller such as a Deployment, StatefulSet, Job, or ReplicaSet.
    • The Pod has local storage and the GKE control plane version is lower than 1.22. In GKE clusters with control plane version 1.22 or later, Pods with local storage no longer block scaling down.
    • The Pod has the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "false".

    For more troubleshooting steps during scale down events, refer to Cluster not scaling down.

  • When scaling down, cluster autoscaler respects the Pod termination grace period, up to a maximum of 10 minutes. After 10 minutes, Pods are forcefully terminated.

  • You may observe a node pool size smaller than the minimum number of nodes that you specified for the cluster. This behavior happens because the autoscaler uses the minimum node count only when it decides whether to scale down. Common causes of this behavior include the following:

    • The newly specified minimum number of nodes is higher than the current number of nodes.
    • The node pool or its underlying managed instance group was manually scaled down to fewer nodes than the minimum.
    • Spot VMs within the node pool were preempted.

If you intend to keep your node count at or above the minimum, increase the node pool size to at least the minimum number of nodes. For more information about cluster autoscaler and preventing disruptions, see the Kubernetes cluster autoscaler FAQ.
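As one example of the eviction rules described above, you can mark running Pods as safe to evict with the safe-to-evict annotation, which lets the autoscaler remove the node they run on during scale down. The label selector app=my-app below is illustrative:

```shell
# Allow the cluster autoscaler to evict these Pods when it
# considers their node for removal during scale down.
kubectl annotate pods -l app=my-app \
    "cluster-autoscaler.kubernetes.io/safe-to-evict=true" \
    --overwrite
```

Setting the annotation to "false" instead blocks scale down of the node that hosts the Pod. In practice, you would usually set this annotation in the Pod template of the controller (for example, a Deployment) so that new Pods carry it automatically.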

What's next