Cluster Autoscaler

This page explains Kubernetes Engine's cluster autoscaler feature.

Overview

Kubernetes Engine's cluster autoscaler automatically resizes clusters based on the demands of the workloads you want to run. With autoscaling enabled, Kubernetes Engine automatically adds a new node to your cluster if you've created new Pods and there isn't enough capacity to run them; conversely, if a node in your cluster is underutilized and its Pods can be run on other nodes, Kubernetes Engine can delete the node.

Cluster autoscaling allows you to pay only for resources that are needed at any given moment, and to automatically get additional resources when demand increases.

Keep in mind that when resources are deleted or moved in the course of autoscaling your cluster, your services can experience some disruption. For example, if your service consists of a controller with a single replica, that replica's Pod might be restarted on a different node if its current node is deleted. Before enabling autoscaling, ensure that your services can tolerate potential disruption.

How cluster autoscaler works

Cluster autoscaler works on a per-node pool basis. For each node pool, the autoscaler periodically checks whether there are any Pods that are not being scheduled and are waiting for a node with available resources. If such Pods exist, and the autoscaler determines that resizing a node pool would allow the waiting Pods to be scheduled, then the autoscaler expands that node pool.

Cluster autoscaler also measures the usage of each node against the node pool's total demand for capacity. If a node has had no new Pods scheduled on it for a set period of time, and all Pods running on that node can be scheduled onto other nodes in the pool, the autoscaler moves the Pods and deletes the node.

Note that cluster autoscaler works based on Pod resource requests, that is, how many resources your Pods have requested. Cluster autoscaler does not take into account the resources your Pods are actively using. Essentially, cluster autoscaler trusts that the Pod resource requests you've provided are accurate and schedules Pods on nodes based on that assumption.

If your Pods have requested too few resources (or haven't changed the defaults, which might be insufficient) and your nodes are experiencing shortages, cluster autoscaler does not correct the situation. You can help ensure cluster autoscaler works as accurately as possible by making explicit resource requests for all of your workloads.
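
For example, a Deployment manifest with explicit CPU and memory requests might look like the following sketch. The workload name and request values are illustrative; choose values that match your application's actual needs:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app            # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-app
  template:
    metadata:
      labels:
        app: hello-app
    spec:
      containers:
      - name: hello-app
        image: gcr.io/google-samples/hello-app:1.0
        resources:
          requests:
            cpu: 100m        # cluster autoscaler sizes the pool based on these requests
            memory: 128Mi
```

Because every container declares its requests, the autoscaler can accurately decide whether pending Pods fit on existing nodes or whether the node pool must grow.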

Operating criteria

Cluster autoscaler makes the following assumptions when resizing a node pool:

  • Cluster autoscaler assumes that all replicated Pods can be restarted on some other node, possibly causing a brief disruption. If your services are not disruption-tolerant, using autoscaling is not recommended.
  • Cluster autoscaler assumes that users or administrators are not manually managing nodes; it may override any manual node management operations you perform.
  • Cluster autoscaler assumes that all nodes in a single node pool have the same set of labels.
  • Cluster autoscaler considers the relative cost of each instance type in the node pool and attempts to expand the least expensive possible node pool.
  • Cluster autoscaler does not track labels manually added after initial cluster or node pool creation. Nodes created by cluster autoscaler are assigned labels specified with --node-labels at the time of node pool creation.
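
If your services need to remain partially available while nodes are deleted, you can define a PodDisruptionBudget for your workload; cluster autoscaler respects these budgets when evicting Pods during scale-down. A minimal sketch, assuming a workload labeled app: hello-app (the name and label are illustrative):

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: hello-app-pdb
spec:
  minAvailable: 2            # keep at least 2 Pods running during evictions
  selector:
    matchLabels:
      app: hello-app
```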

Balancing across zones

If your cluster contains multiple node pools with the same instance type, cluster autoscaler will attempt to keep those node pools' sizes balanced when scaling up. This can help prevent an uneven distribution of node pool sizes when you have node pools in multiple zones.

If you have a node pool that you want to exempt from pool size balancing, you can do so by giving that node pool a custom label.

For more information on how cluster autoscaler makes balancing decisions, see the Kubernetes documentation's FAQ for autoscaling.

Minimum and maximum node pool size

You can specify the minimum and maximum size for each node pool in your cluster, and cluster autoscaler makes rescaling decisions within these boundaries. If the current cluster size is lower than the specified minimum or greater than the specified maximum when you enable autoscaling, the autoscaler waits to take effect until a new node is needed or a node can be safely deleted.

Limitations

Cluster autoscaler has the following limitations:

  • Cluster autoscaler supports up to 1000 nodes running 30 Pods each. See the Scalability report for more details on scalability guarantees.
  • When scaling down, cluster autoscaler honors a graceful termination period of up to 10 minutes per Pod. A Pod is always killed after a maximum of 10 minutes, even if it is configured with a longer grace period.
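
To illustrate the second limitation, consider a Pod that requests a 15-minute grace period. When cluster autoscaler removes the Pod's node, the Pod is still terminated after at most 10 minutes (the names here are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-shutdown        # illustrative name
spec:
  terminationGracePeriodSeconds: 900   # 15 minutes requested by the Pod...
  containers:
  - name: app
    image: gcr.io/google-samples/hello-app:1.0
# ...but cluster autoscaler still kills the Pod after 10 minutes during scale-down
```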

Additional Information

You can find more information about cluster autoscaler in the Autoscaling FAQ in the open-source Kubernetes project.

Using cluster autoscaler

The following sections explain how to use cluster autoscaler.

Creating a cluster with autoscaling

Console

To create a new cluster in which the default node pool has autoscaling enabled, perform the following steps:

  1. Visit the Kubernetes Engine menu in the GCP Console.
  2. Click Create cluster.
  3. Configure the cluster as desired, then click More at the bottom of the menu.
  4. From the Autoscaling drop-down menu, click On.
  5. Change the values of the Minimum size and Maximum size fields as desired.
  6. Click Create.

gcloud

The following command creates a cluster of size 30, with node autoscaling based on cluster load that scales the default node pool to a maximum of 50 nodes and a minimum of 15 nodes:

gcloud container clusters create [CLUSTER-NAME] --num-nodes=30 \
--enable-autoscaling --min-nodes=15 --max-nodes=50 [--zone=[ZONE] --project=[PROJECT-ID]]

In this command:

  • --enable-autoscaling indicates that autoscaling is enabled.
  • --min-nodes specifies the minimum number of nodes for the default node pool.
  • --max-nodes specifies the maximum number of nodes for the default node pool.
  • --zone specifies the compute zone in which the autoscaler should create new nodes.

Adding a node pool with autoscaling

Console

To add a node pool with autoscaling to an existing cluster, perform the following steps:

  1. Visit the Kubernetes Engine menu in the GCP Console.
  2. Click the desired cluster, then click Edit.
  3. From the Node Pools menu at the bottom of the page, click Add node pool.
  4. Configure the node pool as desired. Then, from the Autoscaling drop-down menu, select On.
  5. Click Save.

gcloud

The following command creates a node pool of size 3 (default), with node autoscaling based on cluster load that scales the node pool to a maximum of 5 nodes and a minimum of 1 node:

gcloud container node-pools create [POOL-NAME] --cluster=[CLUSTER-NAME] \
--enable-autoscaling --min-nodes=1 --max-nodes=5 [--zone=[ZONE] --project=[PROJECT-ID]]

In this command:

  • --cluster indicates the cluster in which the node pool is created.
  • --enable-autoscaling indicates that autoscaling is enabled.
  • --min-nodes specifies the minimum number of nodes for the node pool.
  • --max-nodes specifies the maximum number of nodes for the node pool.
  • --zone specifies the compute zone in which the autoscaler should create new nodes.

Enabling autoscaling for an existing node pool

Console

To enable autoscaling for a specific node pool, perform the following steps:

  1. Visit the Kubernetes Engine menu in the GCP Console.
  2. Click the desired cluster, then click Edit.
  3. From the Node Pools menu at the bottom of the page, select the desired node pool by clicking the adjacent edit button.
  4. From the Autoscaling drop-down menu, click On.
  5. Change the values of the Minimum size and Maximum size fields as desired.
  6. Click Save.

gcloud

To enable autoscaling for an existing node pool, run the following command:

gcloud container clusters update [CLUSTER-NAME] --enable-autoscaling \
--min-nodes=1 --max-nodes=10 --zone=[ZONE] --node-pool=default-pool

In this command:

  • --enable-autoscaling indicates that autoscaling is enabled.
  • --node-pool specifies the node pool for which autoscaling is enabled.
  • --min-nodes specifies the minimum number of nodes for the node pool.
  • --max-nodes specifies the maximum number of nodes for the node pool.
  • --zone specifies the compute zone in which the autoscaler should create new nodes.

Disabling autoscaling for an existing node pool

Console

To disable autoscaling for a specific node pool, perform the following steps:

  1. Visit the Kubernetes Engine menu in the GCP Console.
  2. Click the name of the desired cluster.
  3. Click Edit.
  4. From the Node Pools menu at the bottom of the page, select the desired node pool by clicking the adjacent edit button.
  5. From the Autoscaling drop-down menu, click Off.
  6. Click Save.

gcloud

To disable autoscaling for a specific node pool, run the following command:

gcloud container clusters update [CLUSTER-NAME] --no-enable-autoscaling \
--node-pool=[POOL-NAME] [--zone=[ZONE] --project=[PROJECT-ID]]

In this command:

  • --no-enable-autoscaling disables autoscaling for the specified node pool.
  • --node-pool specifies the node pool for which autoscaling is disabled.
  • --zone specifies the compute zone of the cluster.

After autoscaling is disabled, the node pool is fixed at its current size, which you can update manually.
