This documentation is for the current version of Anthos clusters on AWS, released in November 2021. See the Release notes for more information. For documentation on the previous generation of Anthos clusters on AWS, see Previous generation.

Update a node pool

This topic explains how you can update your node pools. You can update your node pools for the following reasons:

  • To upgrade your node pool version
  • To change the number of nodes in your node pool
  • To change your node pool annotations (updatable only through the API)
  • To update your node pool's security groups
  • To update your node pool's encryption keys

You can also change additional parameters on your node pools not listed above. For a complete list of parameters you can update, see the gcloud container aws node-pools update and the projects.locations.awsNodePools.patch documentation.

Update process

This section describes the process Anthos clusters on AWS follows to update a node pool. The process differs depending on the extent of the changes required.

Configuration-only update

If Anthos clusters on AWS can update a node pool without restarting or recreating any resources, it will make those changes. For example, updating your node pool's annotations will not restart any instances.

Rolling update

When a change to a node pool requires restarting existing virtual machines (for example, when updating the Kubernetes version), Anthos clusters on AWS performs the following steps:

  1. Create a new launch template for the node pool with the latest configuration.
  2. Update the node pool's Auto Scaling group to use the new launch template.
  3. Choose one node pool instance to update.
  4. Cordon and drain the instance. At this point, new Pod objects can't be scheduled on the target node, and the control plane reschedules existing Pod objects onto other nodes. Pod objects that can't be rescheduled stay in the Pending phase until they can be scheduled.
  5. Delete the instance. AWS recreates the instance and the instance boots with the new configuration.
  6. Perform health checks on the instance.
  7. If the health checks succeed, select another instance and repeat steps 4 through 6 until all instances are restarted or recreated. If a health check fails, Anthos clusters on AWS places the node pool into a DEGRADED state. For more information, see Check for a failed update status.
  8. Delete the original launch template.
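
The ordering of these steps can be illustrated with a small shell simulation. This is a sketch only: it prints each step rather than calling AWS or Google Cloud. The function names `rolling_update` and `is_healthy`, the `UNHEALTHY` variable, and the choice to stop at the first failed health check are hypothetical, introduced here for illustration.

```shell
# Illustrative simulation of the rolling update order. It does not contact
# AWS or Google Cloud; every name below is hypothetical.

# An instance counts as healthy unless its name appears in the
# space-separated UNHEALTHY variable.
is_healthy() {
  case " ${UNHEALTHY:-} " in
    *" $1 "*) return 1 ;;
    *) return 0 ;;
  esac
}

# rolling_update INSTANCE...
# Prints the steps in order, one instance at a time.
rolling_update() {
  local instance
  echo "create new launch template"
  echo "update Auto Scaling group to use new launch template"
  for instance in "$@"; do
    echo "cordon and drain ${instance}"
    echo "delete ${instance}; AWS recreates it with the new configuration"
    if ! is_healthy "${instance}"; then
      # Assumption: the sketch stops at the first failed health check.
      echo "state: DEGRADED"
      return 1
    fi
  done
  echo "delete original launch template"
  echo "update complete"
}
```

Calling `rolling_update node-a node-b` prints the steps for each instance in turn. Setting `UNHEALTHY="node-b"` before the call shows the simulated node pool ending in the DEGRADED state.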

Protect workloads during a node pool rolling update

During a node pool rolling update, Anthos clusters on AWS honors PodDisruptionBudget configuration for up to one hour after a node starts to drain. After one hour, Anthos clusters on AWS deletes any remaining Pod objects on the node.

During a rolling update, Anthos clusters on AWS gracefully shuts down any nodes to be restarted or removed, on a best-effort basis, for up to two hours. After two hours, if there are any remaining Pod objects on the node, Anthos clusters on AWS deletes the underlying virtual machine instance.

Resizing node pools

Anthos clusters on AWS node pools have cluster autoscaler enabled by default. The cluster autoscaler automatically resizes the node pool based on the demands of your workloads. For more information on the cluster autoscaler, see Cluster autoscaler.

Anthos clusters on AWS might disable the cluster autoscaler if the resize request reduces the size of the node pool. When the operation completes, Anthos clusters on AWS re-enables the cluster autoscaler.

When you change the maximum and minimum number of nodes in the node pool, Anthos clusters on AWS takes different actions depending on the new configuration and the node pool's current number of nodes. These actions include the following:

  • If the node pool's current node count is already within the new range, Anthos clusters on AWS doesn't change the number of nodes in the pool.

  • If the new minimum number of nodes is higher than the node pool's current node count, Anthos clusters on AWS adds more nodes until the node pool reaches the new minimum size.

  • If the new maximum number of nodes is less than the node pool's current node count, Anthos clusters on AWS reduces the size of the node pool by performing the following actions:

    1. Disable the cluster autoscaler
    2. Update the minimum size of the node pool's Auto Scaling group to the desired minimum size
    3. Select a node for scale-in (removal from the node pool)
    4. Cordon and drain the selected node
    5. Terminate the node
    6. Repeat until the number of nodes reaches the desired maximum size
    7. Update the maximum size of the node pool's Auto Scaling group to the desired maximum size
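
The three cases above reduce to a comparison of the current node count against the new bounds. A minimal sketch, assuming a hypothetical helper `resize_action` that only reports which case applies and doesn't contact any service:

```shell
# resize_action CURRENT_NODES NEW_MIN NEW_MAX
# Hypothetical helper: reports which of the cases described above applies.
resize_action() {
  local current="$1" new_min="$2" new_max="$3"
  if [ "${current}" -lt "${new_min}" ]; then
    # Below the new minimum: nodes are added up to the minimum.
    echo "scale-out to ${new_min}"
  elif [ "${current}" -gt "${new_max}" ]; then
    # Above the new maximum: nodes are drained and removed down to the maximum.
    echo "scale-in to ${new_max}"
  else
    # Already within the new range: the node count is left as it is.
    echo "no-change"
  fi
}
```

For example, `resize_action 8 1 5` reports a scale-in to 5 nodes, which corresponds to the cordon, drain, and terminate sequence listed above.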

How Anthos clusters on AWS protects workloads during node pool resizing

During node pool resizing, Anthos clusters on AWS honors PodDisruptionBudget configuration for up to one hour after a node starts to drain. After one hour, Anthos clusters on AWS deletes any remaining Pod objects on the node.

During node pool resizing, Anthos clusters on AWS performs a graceful shutdown of any nodes to be restarted or removed and waits for up to two hours. After two hours, if there are any remaining Pod objects on the node, Anthos clusters on AWS deletes the underlying virtual machine instance.

Check for a failed update status

If Anthos clusters on AWS performs a health check after an update and the health check fails, the node pool is marked as DEGRADED. You can find status information on your node pool with the following Google Cloud CLI command:

gcloud container aws node-pools describe NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --location GOOGLE_CLOUD_LOCATION

Replace the following:

  • NODE_POOL_NAME: the name of your node pool
  • CLUSTER_NAME: the name of your cluster
  • GOOGLE_CLOUD_LOCATION: the Google Cloud region that manages your cluster

The output includes information about the status and configuration of your node pool.
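
To check only for the DEGRADED state, for example from a script, you could filter the output down to the state. The sketch below assumes that the describe output exposes the state in a top-level `state` field, as the AwsNodePool API resource does, and uses the standard gcloud `--format` flag. The function names `node_pool_state` and `check_node_pool` are hypothetical.

```shell
# node_pool_state NODE_POOL_NAME CLUSTER_NAME GOOGLE_CLOUD_LOCATION
# Prints only the node pool's state (assumes a top-level "state" field).
node_pool_state() {
  local node_pool="$1" cluster="$2" location="$3"
  gcloud container aws node-pools describe "${node_pool}" \
      --cluster "${cluster}" \
      --location "${location}" \
      --format "value(state)"
}

# check_node_pool NODE_POOL_NAME CLUSTER_NAME GOOGLE_CLOUD_LOCATION
# Returns non-zero when the node pool reports DEGRADED.
check_node_pool() {
  local state
  state="$(node_pool_state "$@")"
  if [ "${state}" = "DEGRADED" ]; then
    echo "node pool $1 is DEGRADED" >&2
    return 1
  fi
  echo "node pool $1 state: ${state}"
}
```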

Prerequisites

To update a node pool, you must have the gkemulticloud.googleapis.com/awsNodePools.update Identity and Access Management permission. To update node pool tags, your API service agent role must have the following AWS IAM permissions:

  • autoscaling:CreateOrUpdateTags
  • autoscaling:DeleteTags
  • ec2:CreateTags
  • ec2:DeleteTags
  • ec2:DescribeLaunchTemplates

Dynamic updates

For clusters running Kubernetes version 1.25 or later, Anthos clusters on AWS can update node pool security groups without restarting the node. This technique is known as dynamic updating and is the recommended approach for clusters at Kubernetes v1.25 or later. For clusters running earlier versions of Kubernetes, security groups can be updated only through rolling updates, which restart each node as it is updated.

To perform dynamic updates to your node pool's security groups, your API service agent role must have the following AWS IAM permissions:

  • ec2:ModifyInstanceAttribute
  • ec2:DescribeInstances

These permissions are automatically assigned to the API service agent role if you choose the default API service agent role when you create your cluster.

Update a node pool

You can update a node pool with the Google Cloud CLI. To update a node pool, run:

gcloud container aws node-pools update NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --location GOOGLE_CLOUD_LOCATION \
    --node-version NODE_POOL_VERSION \
    --min-nodes MIN_NODES \
    --max-nodes MAX_NODES

Replace the following:

  • NODE_POOL_NAME: the name of the node pool to update
  • CLUSTER_NAME: the name of the cluster that the node pool belongs to
  • GOOGLE_CLOUD_LOCATION: the supported Google Cloud region that manages your cluster—for example, us-west1
  • NODE_POOL_VERSION: the new supported node pool version
  • MIN_NODES: the new minimum number of nodes the node pool can contain. Must be 0 or greater.
  • MAX_NODES: the new maximum number of nodes the node pool can contain. Must be at least 1 and no less than the value of MIN_NODES.
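
The constraints on MIN_NODES and MAX_NODES can be checked locally before you run the update. A minimal sketch, assuming a hypothetical helper `validate_node_counts` that isn't part of the Google Cloud CLI:

```shell
# validate_node_counts MIN_NODES MAX_NODES
# Hypothetical helper: returns non-zero and prints a message to stderr
# when the values break the constraints described above.
validate_node_counts() {
  local min_nodes="$1" max_nodes="$2"
  if [ "${min_nodes}" -lt 0 ]; then
    echo "MIN_NODES must be 0 or greater" >&2
    return 1
  fi
  if [ "${max_nodes}" -lt 1 ]; then
    echo "MAX_NODES must be at least 1" >&2
    return 1
  fi
  if [ "${max_nodes}" -lt "${min_nodes}" ]; then
    echo "MAX_NODES must not be less than MIN_NODES" >&2
    return 1
  fi
}
```

You could call it as `validate_node_counts MIN_NODES MAX_NODES` and run the update command only when it returns success.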

Cancelling an update operation

To cancel an ongoing node pool update operation, run the following command:

gcloud container aws operations cancel OPERATION_NAME

Replace OPERATION_NAME with the name of the update operation.

Cancelling an in-progress node pool update operation doesn't revert node updates that have already completed. This can leave the node pool partially updated.

Update node pool security groups

You can replace the additional security groups attached to your node pool with the Google Cloud CLI. To update the security groups, run:

gcloud container aws node-pools update NODE_POOL_NAME \
    --security-group-ids=SECURITY_GROUP_IDS

Replace the following:

  • NODE_POOL_NAME: the name of the node pool to update
  • SECURITY_GROUP_IDS: a comma-separated list of security group IDs to attach to the node pool, replacing any additional security groups currently attached

Removing node pool security groups

You can remove all the non-default security groups attached to your node pool with the Google Cloud CLI. To remove them, run:

gcloud container aws node-pools update NODE_POOL_NAME \
    --clear-security-group-ids

Replace NODE_POOL_NAME with the name of the node pool to update.

Update encryption keys

For information on how to update your node pool's KMS encryption keys, see Key rotation.

What's next