Node pool upgrade strategies


This page discusses the node pool upgrade strategies you can use with your Google Kubernetes Engine (GKE) clusters:

  • Surge upgrades: Nodes are upgraded in a rolling window. You can control how many nodes can be upgraded at once and how disruptive upgrades are to the workloads.
  • Blue-green upgrades: Existing nodes are kept available for rolling back while the workloads are validated on the new node configuration.

By choosing an upgrade strategy, you can pick the process with the right balance of speed, workload disruption, risk mitigation, and cost optimization. To learn more about which node pool upgrade strategy is right for your environment, see Choose surge upgrades and Choose blue-green upgrades.

With both strategies, you can configure upgrade settings to optimize the process based on your environment's needs. To learn more, see Configure your chosen upgrade strategy.

Surge upgrades

Surge upgrades are the default upgrade strategy. With surge upgrades, GKE upgrades nodes using a rolling method, in an undefined order. In a node pool spread across multiple zones, upgrades take place one zone at a time; within a zone, nodes are upgraded in an undefined order.

Surge upgrades are best for applications that can handle incremental, non-disruptive changes. With this strategy, nodes are upgraded in a rolling window. By adjusting the settings, you can change how many nodes are upgraded at once and how disruptive the upgrades can be, finding the optimal balance of speed and disruption for your environment's needs.

With surge upgrades, you can use the max-surge-upgrade and max-unavailable-upgrade settings to control the number of nodes GKE can upgrade at a time and control how disruptive upgrades are to your workloads.
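As a sketch of how these settings are applied (POOL_NAME and CLUSTER_NAME are placeholders for your own resources), they can be set on an existing node pool with gcloud:

```shell
# Sketch: configure surge upgrade parameters on an existing node pool.
# POOL_NAME and CLUSTER_NAME are placeholders.
gcloud container node-pools update POOL_NAME \
    --cluster=CLUSTER_NAME \
    --max-surge-upgrade=1 \
    --max-unavailable-upgrade=0
```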

Choose surge upgrades for your environment

If cost optimization is important to you and your workloads can tolerate being shut down in less than 60 minutes, we recommend choosing surge upgrades for your node pools.

Surge upgrades are optimal when:

  • You want to optimize for the speed of upgrades.
  • Your workloads are more tolerant of disruptions, and graceful termination of up to 60 minutes is acceptable.
  • You want to control costs by minimizing the creation of new nodes.

When GKE uses surge upgrades

If enabled, GKE uses surge upgrades for configuration changes that require recreating the nodes, such as node version upgrades and image type changes.

Other changes, including updates to the node labels and taints of existing node pools, do not use surge upgrades because they do not require recreating the nodes.

Understand surge upgrade settings

You can change how many nodes GKE attempts to upgrade at once by changing the surge upgrade parameters on a node pool. Surge upgrades reduce disruption to your workloads during cluster maintenance and let you control the number of nodes upgraded in parallel. Surge upgrades also work with the cluster autoscaler to prevent changes to nodes that are being upgraded.

Surge upgrade behavior is determined by two settings:

max-surge-upgrade

The number of additional nodes that can be added to the node pool during an upgrade. Increasing max-surge-upgrade raises the number of nodes that can be upgraded simultaneously. Default is 1. Can be set to 0 or greater.

max-unavailable-upgrade

The number of nodes that can be simultaneously unavailable during an upgrade. Default is 0. Increasing max-unavailable-upgrade raises the number of nodes that can be upgraded in parallel.

The number of nodes upgraded simultaneously is the sum of max-surge-upgrade and max-unavailable-upgrade. The maximum number of nodes upgraded simultaneously is limited to 20.

For example, suppose a 5-node pool is created with max-surge-upgrade set to 2 and max-unavailable-upgrade set to 1. During a node pool upgrade, GKE creates two upgraded nodes. GKE brings down at most three existing nodes (the sum of max-surge-upgrade and max-unavailable-upgrade) after the upgraded nodes are ready, and makes at most one node unavailable (max-unavailable-upgrade) at a time. During the upgrade process, the node pool includes between four and seven nodes.
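The arithmetic in this example can be sketched as follows (the values are the ones from the example above):

```shell
# Sketch of the example arithmetic: a 5-node pool with
# max-surge-upgrade=2 and max-unavailable-upgrade=1.
MAX_SURGE=2
MAX_UNAVAILABLE=1
NODES=5

# Nodes that can be upgraded in parallel: the sum of the two settings.
PARALLEL=$((MAX_SURGE + MAX_UNAVAILABLE))

# Node pool size range during the upgrade: at least
# NODES - MAX_UNAVAILABLE nodes, at most NODES + MAX_SURGE nodes.
MIN_NODES=$((NODES - MAX_UNAVAILABLE))
MAX_NODES=$((NODES + MAX_SURGE))

echo "parallel=$PARALLEL min=$MIN_NODES max=$MAX_NODES"
```

This reproduces the "between four and seven nodes" range described above.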

You can configure surge upgrade parameters for node pools. To learn more and try out surge upgrades, complete the tutorial Use surge upgrades to decrease disruptions from GKE node upgrades.

Tune surge upgrade settings for speed and reliability

The following table describes three different upgrade settings as examples to help you understand different configurations:

| Description | Configuration |
| --- | --- |
| Balanced (default), slower but least disruptive | maxSurge=1, maxUnavailable=0 |
| Fast, no surge resources, most disruptive | maxSurge=0, maxUnavailable=20 |
| Fast, most surge resources, and less disruptive | maxSurge=20, maxUnavailable=0 |

Balanced (Default)

The simplest way to take advantage of surge upgrades is to configure maxSurge=1, maxUnavailable=0. This means that only one surge node can be added to the node pool during an upgrade, so only one node is upgraded at a time. This setting is superior to the earlier upgrade configuration (maxSurge=0, maxUnavailable=1) because it speeds up Pod restarts during upgrades while progressing conservatively.

Fast and no surge resources

If your workload isn't sensitive to disruption, like most batch jobs, you can emphasize speed by using maxSurge=0 maxUnavailable=20. This configuration does not bring up additional surge nodes and allows 20 nodes to be upgraded at the same time.

Fast and less disruptive

If your workload is sensitive to disruption, you can increase the speed of the upgrade by using maxSurge=20 maxUnavailable=0, provided that you have already set up PodDisruptionBudgets (PDBs) and are not using externalTrafficPolicy: Local, which does not work with parallel node drains. This configuration upgrades 20 nodes in parallel while the PDBs limit the number of Pods that can be drained at a given time. Although PDB configurations vary, if you create a PDB with maxUnavailable: 1 for one or more workloads running on the node pool, then only one Pod of those workloads can be evicted at a time, limiting the parallelism of the entire upgrade.
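A PDB of the kind described above might look like the following sketch; the name my-app-pdb and the app label my-app are placeholders for your own workload:

```shell
# Sketch: a PDB that allows at most one Pod of the selected workload
# to be evicted at a time. Names and labels are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
EOF
```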

Control an in-progress surge upgrade

With surge upgrades, while an upgrade is in progress you can use commands to exercise some control over it. For more control over the upgrade process, we recommend using blue-green upgrades.

Cancel a surge upgrade

You can cancel an in-progress surge upgrade at any time. For example, you might cancel an upgrade to temporarily pause it.

When you cancel an upgrade:

  • Nodes that have started the upgrade complete it.
  • Nodes that have not started the upgrade do not upgrade.
  • Nodes that have already successfully completed the upgrade are unaffected and are not rolled back.

This means that the node pool might end up in a state where nodes are running two different versions. If automatic upgrades are enabled for the node pool, the node pool can be scheduled for auto-upgrade again, which would upgrade the remaining nodes in the node pool running the older version.

Learn how to cancel a node pool upgrade.

If you cancel a surge upgrade, this does not automatically roll back the upgrade. Once you cancel an upgrade, you can either resume or roll back.
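As a sketch of the commands involved (OPERATION_ID, POOL_NAME, and CLUSTER_NAME are placeholders), canceling and then rolling back might look like:

```shell
# Sketch: find the in-progress upgrade operation and cancel it.
# OPERATION_ID, POOL_NAME, and CLUSTER_NAME are placeholders.
gcloud container operations list
gcloud container operations cancel OPERATION_ID

# Roll back the partially upgraded node pool to its previous state:
gcloud container node-pools rollback POOL_NAME --cluster=CLUSTER_NAME
```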

Resume a surge upgrade

If a node pool upgrade was canceled and left partially upgraded, you can resume the upgrade to complete the process. This upgrades any remaining nodes that were not upgraded in the original operation. Learn how to resume a node pool upgrade.

Roll back a surge upgrade

If a node pool is left partially upgraded, you can roll back the node pool to revert it to its previous state. You cannot roll back node pools once they have been successfully upgraded. Nodes that have not started an upgrade are unaffected. Learn how to roll back a node pool upgrade.

If you want to downgrade a node pool back to its previous version after the upgrade is already complete, see Downgrading node pools.

Blue-green upgrades

Blue-green upgrades are an alternative to the default surge upgrade strategy. With blue-green upgrades, GKE first creates a new set of node resources (the "green" nodes) with the new node configuration before evicting any workloads from the original resources (the "blue" nodes). The blue resources remain available for rolling back until the soak time has elapsed. You can adjust the pace of the upgrade and the soak time based on your environment's needs.

With blue-green upgrades, you have more control over the upgrade process. You can roll back an in-progress upgrade, if necessary, as the original environment is maintained during the upgrade. This upgrade strategy, however, is also more resource intensive. As the original environment is replicated, the node pool uses double the number of resources during the upgrade.
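As a sketch (placeholder names; the durations shown are examples, not recommendations), an existing node pool can be switched to the blue-green strategy with gcloud:

```shell
# Sketch: enable blue-green upgrades on an existing node pool.
# POOL_NAME and CLUSTER_NAME are placeholders; durations are examples.
gcloud container node-pools update POOL_NAME \
    --cluster=CLUSTER_NAME \
    --enable-blue-green-upgrade \
    --node-pool-soak-duration=3600s \
    --standard-rollout-policy=batch-node-count=1,batch-soak-duration=60s
```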

Choose blue-green upgrades for your environment

If you have highly available production workloads that you must be able to roll back quickly in case a workload does not tolerate the upgrade, and a temporary cost increase is acceptable, we recommend choosing blue-green upgrades for your node pools.

Blue-green upgrades are optimal when:

  • You want a gradual rollout where risk mitigation is most important, and graceful termination of more than 60 minutes is needed.
  • Your workloads are less tolerant of disruptions.
  • A temporary cost increase due to higher resource usage is acceptable.

When GKE uses blue-green upgrades

For GKE nodes, there are different types of configuration changes that require the nodes to be recreated. If you enable the blue-green strategy for a node pool, it will be used for version changes (generally referred to here as "upgrades") and image type changes.

Surge upgrades are used for any other changes that require the nodes to be recreated. To learn more, see When GKE uses surge upgrades.

The phases of blue-green upgrades

With blue-green upgrades, you can customize and control the upgrade process. This section explains the phases of the process. You can use upgrade settings to tune how the phases work, and commands to control the upgrade while it runs.

How cluster autoscaler works with blue-green upgrades

During the phases of a blue-green upgrade, the original "blue" pool does not scale up or down. When the new "green" pool is created, it can only scale up until the Soak node pool phase, when it can scale up or down. If an upgrade is rolled back, the original "blue" pool might scale up during this process if additional capacity is needed.

Phase 1: Create green pool

In this phase, a new set of managed instance groups (MIGs)—known as the "green" pool—are created for each zone under the target pool with the new node configuration (new version or image type).

GKE checks quota before provisioning the new green resources.

In this phase, cluster autoscaler stops scaling the original MIGs—known as the blue pool—up or down. The green pool can only scale up in this phase.

In this phase, you can cancel the upgrade if necessary. Once you've canceled it, you can either resume it or roll it back. At this phase, rolling back deletes the green pool.

Phase 2: Cordon blue pool

In this phase, all the original nodes in the blue pool (existing MIGs) will be cordoned (marked as unschedulable). Existing workloads will keep running, but new workloads will not be scheduled on the existing nodes.

In this phase, you can cancel the upgrade if necessary. Once you've canceled it, you can either resume it or roll it back. At this phase, rolling back un-cordons the blue pool and deletes the green pool.

Phase 3: Drain blue pool

In this phase, the original nodes in the blue pool (the existing MIGs) are drained in batches. When Kubernetes drains a node, eviction requests are sent to all the Pods running on the node, and the Pods are rescheduled. Pods that have PodDisruptionBudget violations or a long terminationGracePeriodSeconds during draining are deleted in the Delete blue pool phase, when the node is deleted. You can use BATCH_SOAK_DURATION and NODE_POOL_SOAK_DURATION, described in this section and the next, to extend the period before Pods are deleted.

You can control the size of the batches with either of the following settings:

  • BATCH_NODE_COUNT: the absolute number of nodes to drain in a batch. If it is set to zero, this phase will be skipped.
  • BATCH_PERCENT: the percentage of nodes to drain in a batch. Must be in the range of [0.0, 1.0]. If it is set to zero, this phase will be skipped.

Additionally, you can control the duration between each batch drain with BATCH_SOAK_DURATION. This duration is defined in seconds, with a default of zero seconds.
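The effect of the batch settings can be sketched with some rough arithmetic. This is a lower bound that ignores the time the drains themselves take, and the values are illustrative, not recommendations:

```shell
# Sketch: rough lower bound on the drain phase duration, ignoring the
# time spent actually draining each batch. Values are illustrative.
NODES=10
BATCH_NODE_COUNT=2
BATCH_SOAK_DURATION=60   # seconds of soak per batch

# Number of drain batches, rounded up.
BATCHES=$(( (NODES + BATCH_NODE_COUNT - 1) / BATCH_NODE_COUNT ))

# Soak time accumulated across batches.
TOTAL_SOAK=$(( BATCHES * BATCH_SOAK_DURATION ))

echo "batches=$BATCHES soak_seconds=$TOTAL_SOAK"
```

With these example values, draining proceeds in 5 batches with 300 seconds of accumulated batch soak time.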

In this phase, you can still cancel the upgrade if necessary. Once you've canceled it, you can either resume it or roll it back. At this phase, rolling back stops the draining of the blue pool and un-cordons it. Workloads can then be rescheduled on the blue pool (though this is not guaranteed), and the green pool is deleted.

Phase 4: Soak node pool

Use this phase to verify your workloads' health after the blue pool nodes have been drained.

The soak time is set with NODE_POOL_SOAK_DURATION, in seconds. By default, it is set to one hour (3600 seconds). The maximum length of the soak time is 7 days (604,800 seconds).

In this phase, you can complete the upgrade, skipping any remaining soak time; this immediately begins the process of removing the blue pool nodes. You can also still cancel the upgrade if necessary. Once you've canceled it, you can either resume it or roll it back.

In this phase, cluster autoscaler can scale the green pool up or down as normal.

Phase 5: Delete blue pool

After the soak time expires, the blue pool nodes are removed from the target pool. This phase cannot be paused. At the completion of this phase, the node pool contains only new nodes with the updated configuration (version or image type).

Control an in-progress blue-green upgrade

With blue-green upgrades, while an upgrade is in progress you can use commands to exercise control over it. This gives you a high level of control over the process in case you determine, for instance, that your workloads need to be rolled back to the old node configuration.

Cancel a blue-green upgrade

When you cancel a blue-green upgrade, you pause the upgrade in its current phase. You can cancel at any phase except the Delete blue pool phase. When you pause an upgrade, the node pool is left in an intermediate state based on the phase in which the cancel request was issued.

Learn how to cancel a node pool upgrade.

Once an upgrade is canceled, you can choose one of two paths forward: resume or roll back.

Resume a blue-green upgrade

If you have determined the upgrade is safe to move forward, you can resume it.

If you resume, the upgrade process will continue at the intermediate phase it was paused. To learn how to resume a node pool upgrade, see Resume a node pool upgrade.

Roll back a blue-green upgrade

If you have determined that the upgrade should not move forward and you want to bring the node pool back to its original state, you can roll back. To learn how, see Roll back a node pool upgrade.

With the roll back workflow, the process reverses itself to bring the node pool back to its original state. The blue pool will be un-cordoned so that workloads may be rescheduled on it. During this process, cluster autoscaler may scale up the blue pool as needed. The green pool will be drained and deleted.

If you want to downgrade a node pool back to its previous version after the upgrade is already complete, see Downgrading node pools.

Complete a blue-green upgrade

During the Soak phase, you can complete an upgrade if you have determined that the workload does not need further validation on the new node configuration and the old nodes can be removed. Completing an upgrade skips the rest of the Soak phase and proceeds to the Delete blue pool phase.

To learn more about how to use the complete command, see Complete a blue-green node pool upgrade.
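As a sketch (POOL_NAME and CLUSTER_NAME are placeholders), completing the upgrade from the Soak phase looks like:

```shell
# Sketch: finish the Soak phase early and proceed to deleting the
# blue pool. POOL_NAME and CLUSTER_NAME are placeholders.
gcloud container node-pools complete-upgrade POOL_NAME \
    --cluster=CLUSTER_NAME
```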

What's next