This documentation is for the most recent version of Anthos clusters on Azure, released on August 29. See the Release notes for more information.

Update a node pool

This topic explains how you can update your node pools. You can update your node pools for the following reasons:

  • To upgrade your node pool version
  • To change the number of nodes in your node pool
  • To change your node pool annotations (updatable only through the API)

You can also change additional parameters on your node pools not listed above. For a complete list of parameters you can update, see the gcloud container azure node-pools update and the projects.locations.azureNodePools.patch documentation.

Update process

This section describes the process Anthos clusters on Azure follows to update a node pool. The process differs depending on the extent of the changes the node pool requires.

Configuration-only update

If Anthos clusters on Azure can update a node pool without restarting or recreating any resources, it will make those changes. For example, updating your node pool's annotations will not restart any instances.

Rolling update

When a change to a node pool requires restarting existing virtual machines, for example when updating the Kubernetes version, Anthos clusters on Azure performs the following steps:

  1. Modify the node pool's virtual machine scale set with new configuration.
  2. Choose one node's underlying instance to update.
  3. Cordon and drain the node. At this point, new Pods can't be scheduled on the target node. Existing Pod objects on the target node are rescheduled onto other nodes. Pods that can't be rescheduled onto any other existing node stay in the Pending phase until they can be scheduled.
  4. Update the instance to take the latest configuration from its virtual machine scale set.
  5. Reimage and reboot the instance.
  6. Wait until all the nodes in this node pool become healthy.
  7. If all nodes in this node pool are healthy, select another node until all nodes are updated. If any node is unhealthy, Anthos clusters on Azure places the node pool into a DEGRADED state. For more information, see Failed updates.
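
To follow a rolling update while it runs, one option is to count how many nodes already report the new kubelet version. The following is a minimal sketch, not part of Anthos clusters on Azure: the function name rollout_progress is hypothetical, it assumes kubectl is configured for the cluster being updated, and it counts every node in the cluster rather than only the nodes of one node pool.

```shell
# Sketch: report how many nodes already run the target kubelet version.
# Assumption: kubectl points at the cluster that owns the node pool.

# rollout_progress TARGET_VERSION
# Prints "UPDATED/TOTAL", where UPDATED is the number of nodes whose
# kubelet version equals TARGET_VERSION exactly.
rollout_progress() {
  local target="$1" versions total updated
  # One kubelet version per line, one line per node.
  versions="$(kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.nodeInfo.kubeletVersion}{"\n"}{end}')"
  # grep -c exits non-zero when the count is 0, so ignore its exit status.
  total="$(printf '%s\n' "$versions" | grep -c . || true)"
  updated="$(printf '%s\n' "$versions" | grep -c -x -F -- "$target" || true)"
  echo "${updated}/${total}"
}
```

Calling rollout_progress TARGET_VERSION prints a value such as 2/3 when two of three nodes report that version. TARGET_VERSION must match the kubelet version string exactly as kubectl reports it.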

Protect workloads during a node pool rolling update

During a node pool rolling update, Anthos clusters on Azure honors PodDisruptionBudget configuration for up to one hour after a node starts to drain. After one hour, Anthos clusters on Azure deletes any remaining Pods on the node.

During a rolling update, Anthos clusters on Azure makes a best-effort attempt, for up to two hours, to gracefully shut down any node that is to be restarted or removed. After two hours, if there are any remaining Pod objects on the node, Anthos clusters on Azure deletes the node and reimages the underlying virtual machine instance.
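
Because PodDisruptionBudget configuration is honored during the first hour of a drain, defining one for each critical workload limits how many of its Pods are evicted at the same time. The following is a minimal sketch, assuming a workload whose Pods carry an app label. The function name pdb_manifest and all example values are hypothetical.

```shell
# Sketch: print a PodDisruptionBudget manifest for a labeled workload.
# Assumption: the workload's Pods are selected by the label app=APP.

# pdb_manifest NAME APP MIN_AVAILABLE
pdb_manifest() {
  local name="$1" app="$2" min_available="$3"
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${name}
spec:
  minAvailable: ${min_available}
  selector:
    matchLabels:
      app: ${app}
EOF
}
```

To apply the manifest, pipe the output to kubectl, for example: pdb_manifest web-pdb web 2 | kubectl apply -f -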

Resizing node pools

Anthos clusters on Azure node pools have cluster autoscaler enabled by default. The cluster autoscaler automatically resizes the node pool based on the demands of your workloads. For more information on the cluster autoscaler, see Cluster autoscaler.

When you change the maximum and minimum number of nodes in the node pool, Anthos clusters on Azure takes different actions depending on the new configuration and the node pool's current number of nodes. These actions include the following:

  • If the node pool's current node count is already within the new range, Anthos clusters on Azure doesn't change the number of nodes in the pool.

  • If the new minimum number of nodes is higher than the node pool's current node count, Anthos clusters on Azure adds more nodes until the node pool reaches the new minimum size.

  • If the new maximum number of nodes is less than the node pool's current node count, Anthos clusters on Azure reduces the size of the node pool by performing the following actions:

    1. Apply the new autoscaling configuration to the virtual machine scale set in the node pool
    2. Select a node to remove
    3. Cordon and drain the node
    4. Delete the underlying virtual machine instance
    5. Wait until the deleted virtual machine is fully gone
    6. Perform a health check on the whole node pool
    7. Repeat until the number of nodes reaches the desired number
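
The three cases above can be sketched as a small decision function. This is an illustration only, not the service's implementation: the function name resize_action is hypothetical, and the sketch assumes a scale-down stops once the node count equals the new maximum.

```shell
# Sketch: predict which resize action a new autoscaling range triggers.
# Assumption: a scale-down ends when the node count equals the new maximum.

# resize_action CURRENT_NODES NEW_MIN NEW_MAX
resize_action() {
  local current="$1" min="$2" max="$3"
  if [ "$current" -lt "$min" ]; then
    # Below the new minimum: nodes are added up to the minimum.
    echo "scale-up to $min"
  elif [ "$current" -gt "$max" ]; then
    # Above the new maximum: nodes are removed one at a time.
    echo "scale-down to $max"
  else
    # Already inside the new range: the node count is left alone.
    echo "no change"
  fi
}
```

For example, resize_action 6 1 4 prints "scale-down to 4", and resize_action 3 1 5 prints "no change".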

How Anthos clusters on Azure protects workloads during node pool resizing

During node pool resizing, Anthos clusters on Azure honors PodDisruptionBudget configuration for up to one hour after a node starts to drain. After one hour, Anthos clusters on Azure deletes any remaining Pod objects on the node.

During node pool resizing, Anthos clusters on Azure performs a graceful shutdown of any nodes to be restarted or removed and waits for up to two hours. After two hours, if there are any remaining Pod objects on the node, Anthos clusters on Azure deletes the underlying virtual machine instance.

Check for a failed update status

If Anthos clusters on Azure performs a health check after an update and the health check fails, the node pool is marked as DEGRADED. You can find status information on your node pool with the following Google Cloud CLI command:

gcloud container azure node-pools describe NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --location GOOGLE_CLOUD_LOCATION

Replace the following:

  • NODE_POOL_NAME: the name of your node pool
  • CLUSTER_NAME: the name of your cluster
  • GOOGLE_CLOUD_LOCATION: the Google Cloud region that manages your cluster

The output includes information about the status and configuration of your node pool.
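
In a script, it can be convenient to extract only the state instead of reading the full output. The following is a minimal sketch that uses the general gcloud --format flag. It assumes the describe output exposes the status in a field named state. The function names node_pool_state and node_pool_is_degraded are hypothetical.

```shell
# Sketch: read a node pool's state and test it for DEGRADED.
# Assumption: the describe output has a top-level field named "state".

# node_pool_state NODE_POOL_NAME CLUSTER_NAME GOOGLE_CLOUD_LOCATION
node_pool_state() {
  gcloud container azure node-pools describe "$1" \
    --cluster "$2" \
    --location "$3" \
    --format="value(state)"
}

# node_pool_is_degraded NODE_POOL_NAME CLUSTER_NAME GOOGLE_CLOUD_LOCATION
# Succeeds (exit status 0) only when the state is DEGRADED.
node_pool_is_degraded() {
  [ "$(node_pool_state "$@")" = "DEGRADED" ]
}
```

A script could then run node_pool_is_degraded NODE_POOL_NAME CLUSTER_NAME GOOGLE_CLOUD_LOCATION after an update and raise an alert when the command succeeds.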

Prerequisites

To update a node pool, you must have the gkemulticloud.googleapis.com/azureNodePools.update Identity and Access Management permission.

Update a node pool

You can update a node pool with the Google Cloud CLI. To update a node pool, run:

gcloud container azure node-pools update NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --location GOOGLE_CLOUD_LOCATION \
    --node-version NODE_POOL_VERSION \
    --min-nodes MIN_NODES \
    --max-nodes MAX_NODES

Replace the following:

  • NODE_POOL_NAME: the name of the node pool to update
  • CLUSTER_NAME: the name of the cluster that contains the node pool
  • GOOGLE_CLOUD_LOCATION: the supported Google Cloud region that manages your cluster—for example, us-west1
  • NODE_POOL_VERSION: the new supported node pool version
  • MIN_NODES: the new minimum number of nodes the node pool can contain. Must be 0 or greater.
  • MAX_NODES: the new maximum number of nodes the node pool can contain. Must be at least 1 and no less than the value of MIN_NODES.
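
The constraints on MIN_NODES and MAX_NODES can be checked before the update is submitted. The following is a minimal sketch of a resize-only update. It assumes the command accepts --min-nodes and --max-nodes without --node-version, and the function names validate_node_range and resize_node_pool are hypothetical.

```shell
# Sketch: validate the autoscaling range, then resize the node pool.
# Assumption: the update command accepts the size flags on their own.

# validate_node_range MIN_NODES MAX_NODES
# Succeeds when MIN >= 0, MAX >= 1, and MAX >= MIN.
validate_node_range() {
  local min="$1" max="$2"
  [ "$min" -ge 0 ] && [ "$max" -ge 1 ] && [ "$max" -ge "$min" ]
}

# resize_node_pool NODE_POOL_NAME CLUSTER_NAME GOOGLE_CLOUD_LOCATION MIN MAX
resize_node_pool() {
  local pool="$1" cluster="$2" location="$3" min="$4" max="$5"
  if ! validate_node_range "$min" "$max"; then
    echo "invalid range: min=$min max=$max" >&2
    return 1
  fi
  gcloud container azure node-pools update "$pool" \
    --cluster "$cluster" \
    --location "$location" \
    --min-nodes "$min" \
    --max-nodes "$max"
}
```

Validating locally only catches range mistakes early. The service still applies its own checks when it receives the request.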

What's next