Auto-upgrading nodes


This page shows you how to configure node auto-upgrades in Google Kubernetes Engine (GKE).

Overview

Node auto-upgrades help you keep the nodes in your cluster up to date with the cluster control plane version when your control plane is updated on your behalf. When you create a new cluster or node pool with the Google Cloud console or the gcloud command, node auto-upgrade is enabled by default.

To learn more, see the documentation about cluster and node upgrades.

Node auto-upgrades provide several benefits:

  • Lower management overhead: You don't have to manually track and update your nodes when the control plane is upgraded on your behalf.
  • Better security: Sometimes new binaries are released to fix a security issue. With auto-upgrades, GKE automatically ensures that security updates are applied and kept up to date.
  • Ease of use: Provides a simple way to keep your nodes up to date with the latest Kubernetes features.

Node pools with auto-upgrades enabled are scheduled for upgrades when they meet the selection criteria (announced in the release notes). Rollouts are phased across multiple weeks to ensure cluster and fleet stability. When the upgrade is performed, nodes are drained and re-created to match the current control plane version. Modifications on the boot disk of a node VM do not persist across node re-creations. To preserve modifications across node re-creation, use a DaemonSet.
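
As a rough illustration of the DaemonSet approach, the following sketch prints a manifest that re-applies a node-level setting each time a node is created. Everything in it is hypothetical: the name node-tweaks, the container images, and the vm.max_map_count value were chosen only for illustration and are not taken from GKE documentation.

```shell
# Print a DaemonSet manifest that re-applies a kernel setting on every node.
# All names, images, and values below are illustrative assumptions.
node_tweak_manifest() {
  cat <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-tweaks
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-tweaks
  template:
    metadata:
      labels:
        app: node-tweaks
    spec:
      initContainers:
      - name: apply-sysctl
        image: busybox:1.36
        securityContext:
          privileged: true
        command: ["sh", "-c", "sysctl -w vm.max_map_count=262144"]
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
EOF
}
```

You could apply the manifest with node_tweak_manifest | kubectl apply -f -. Because a DaemonSet schedules one Pod on each node, the init container runs again on every node that an upgrade re-creates.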

Node auto-upgrade is not available for Alpha clusters. If you are using a cluster with Windows Server node pools, review Upgrading Windows Server node pools before enabling node auto-upgrade.

Check the state of auto-upgrade for an existing node pool

You can check whether auto-upgrade is enabled or disabled for a node pool using the Google Cloud console or the gcloud command.

gcloud

To check the state of auto-upgrade for a node pool, run the following command:

gcloud container node-pools describe NODE_POOL_NAME \
  --cluster CLUSTER_NAME \
  --zone COMPUTE_ZONE

Replace the following:

  • NODE_POOL_NAME: the name of the node pool.
  • CLUSTER_NAME: the name of the cluster that contains the node pool.
  • COMPUTE_ZONE: the compute zone for the cluster.
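
If you only need the auto-upgrade value, for example in a script, a helper like the following sketch might be useful. The function name autoupgrade_state is hypothetical, and the sketch assumes that the setting is exposed as the management.autoUpgrade field and prints True when enabled and nothing when disabled.

```shell
# Print "enabled" or "disabled" for a node pool's auto-upgrade setting.
# Usage: autoupgrade_state NODE_POOL_NAME CLUSTER_NAME COMPUTE_ZONE
autoupgrade_state() {
  # Assumption: the field prints "True" when set and is empty otherwise.
  state=$(gcloud container node-pools describe "$1" \
    --cluster "$2" \
    --zone "$3" \
    --format="value(management.autoUpgrade)") || return 1
  case "$state" in
    True|true) echo "enabled" ;;
    *) echo "disabled" ;;
  esac
}
```

The helper returns a non-zero status if the describe command itself fails, so a missing node pool is not reported as disabled.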

Console

To check the state of auto-upgrade for a node pool, perform the following:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

  2. In the cluster list, click the name of the cluster you want to view.

  3. Click the Nodes tab.

  4. Under Node Pools, click the name of the node pool you want to view.

  5. On the Node pool details page, under Management, view the value of the Auto-upgrade field.

Enable node auto-upgrades for an existing node pool

When you create a new cluster with the Google Cloud console or the gcloud command, node auto-upgrade is enabled by default.

You can enable node auto-upgrade if it is currently disabled.

gcloud

To enable auto-upgrades for an existing node pool, run the following command:

gcloud container node-pools update NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --zone COMPUTE_ZONE \
    --enable-autoupgrade

Replace the following:

  • NODE_POOL_NAME: the name of the node pool.
  • CLUSTER_NAME: the name of the cluster that contains the node pool.
  • COMPUTE_ZONE: the compute zone for the cluster.
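
To enable auto-upgrade on every node pool in a cluster, one possible approach is to loop over the node pool names. This is a minimal sketch, not an official procedure; the function name enable_autoupgrade_all is hypothetical, and it assumes a zonal cluster.

```shell
# Enable auto-upgrade on every node pool of one cluster.
# Usage: enable_autoupgrade_all CLUSTER_NAME COMPUTE_ZONE
enable_autoupgrade_all() {
  cluster=$1
  zone=$2
  # List node pool names, one per line.
  pools=$(gcloud container node-pools list \
    --cluster "$cluster" \
    --zone "$zone" \
    --format="value(name)") || return 1
  for pool in $pools; do
    echo "Enabling auto-upgrade on node pool: $pool"
    gcloud container node-pools update "$pool" \
      --cluster "$cluster" \
      --zone "$zone" \
      --enable-autoupgrade || return 1
  done
}
```

The loop stops at the first failed update so that an error is not hidden by later successes.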

Console

To enable auto-upgrades for an existing node pool, perform the following steps:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

  2. In the cluster list, click the name of the cluster you want to modify.

  3. Click the Nodes tab.

  4. Under Node Pools, click the name of the node pool you want to modify.

  5. On the Node pool details page, click Edit.

  6. Under Management, select the Enable auto-upgrade checkbox.

  7. Click Save.

For more control over when nodes can be auto-upgraded, consider configuring maintenance windows and exclusions.

Check the status of node upgrades

You can check the status of an upgrade using gcloud container operations.

View a list of every running and completed operation in the cluster:

gcloud container operations list

Each operation is assigned an operation ID and an operation type as well as start and end times, target cluster, and status. The list appears similar to the following example:

NAME                              TYPE                ZONE           TARGET              STATUS_MESSAGE  STATUS  START_TIME                      END_TIME
operation-1505407677851-8039e369  CREATE_CLUSTER      us-west1-a     my-cluster                          DONE    20xx-xx-xxT16:47:57.851933021Z  20xx-xx-xxT16:50:52.898305883Z
operation-1505500805136-e7c64af4  UPGRADE_CLUSTER     us-west1-a     my-cluster                          DONE    20xx-xx-xxT18:40:05.136739989Z  20xx-xx-xxT18:41:09.321483832Z
operation-1505500913918-5802c989  DELETE_CLUSTER      us-west1-a     my-cluster                          DONE    20xx-xx-xxT18:41:53.918825764Z  20xx-xx-xxT18:43:48.639506814Z

To get more information about a specific operation, specify the operation ID as shown in the following command:

gcloud container operations describe OPERATION_ID

For example:

gcloud container operations describe operation-1507325726639-981f0ed6
endTime: '20xx-xx-xxT21:40:05.324124385Z'
name: operation-1507325726639-981f0ed6
operationType: UPGRADE_CLUSTER
selfLink: https://container.googleapis.com/v1/projects/.../zones/us-central1-a/operations/operation-1507325726639-981f0ed6
startTime: '20xx-xx-xxT21:35:26.639453776Z'
status: DONE
targetLink: https://container.googleapis.com/v1/projects/.../zones/us-central1-a/clusters/...
zone: us-central1-a
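
To follow only node upgrades rather than every operation, the list can be filtered and each matching operation waited on. The sketch below assumes that node upgrades are reported with the operation type UPGRADE_NODES and the status RUNNING; the function names are hypothetical.

```shell
# Print the IDs of node upgrade operations that are still running in a zone.
# Usage: running_node_upgrades COMPUTE_ZONE
running_node_upgrades() {
  # Assumption: node upgrades use operationType UPGRADE_NODES.
  gcloud container operations list \
    --zone "$1" \
    --filter="operationType=UPGRADE_NODES AND status=RUNNING" \
    --format="value(name)"
}

# Block until each running node upgrade in the zone finishes.
# Usage: wait_for_node_upgrades COMPUTE_ZONE
wait_for_node_upgrades() {
  for op in $(running_node_upgrades "$1"); do
    echo "Waiting for $op"
    gcloud container operations wait "$op" --zone "$1" || return 1
  done
}
```

For example, wait_for_node_upgrades us-west1-a returns once no listed operation is still in progress, which can be handy before running post-upgrade checks.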

Check node pool upgrade settings

To see which upgrade strategy a node pool uses, run the gcloud container node-pools describe command. For blue-green upgrades, the command also returns the current phase of the upgrade.

Run the following command:

gcloud container node-pools describe NODE_POOL_NAME \
    --cluster=CLUSTER_NAME

Replace the following:

  • NODE_POOL_NAME: the name of the node pool to describe.
  • CLUSTER_NAME: the name of the cluster of the node pool to describe.

This command outputs the current upgrade settings. The following example shows the output for a node pool that uses the blue-green upgrade strategy:

upgradeSettings:
  blueGreenSettings:
    nodePoolSoakDuration: 1800s
    standardRolloutPolicy:
      batchNodeCount: 1
      batchSoakDuration: 10s
  strategy: BLUE_GREEN

While a blue-green upgrade is in progress, the output also includes details about the upgrade and its current intermediate phase. The following example shows what this might look like:

updateInfo:
  blueGreenInfo:
    blueInstanceGroupUrls:
    - https://www.googleapis.com/compute/v1/projects/{PROJECT_ID}/zones/{LOCATION}/instanceGroupManagers/{BLUE_INSTANCE_GROUP_NAME}
    bluePoolDeletionStartTime: {BLUE_POOL_DELETION_TIME}
    greenInstanceGroupUrls:
    - https://www.googleapis.com/compute/v1/projects/{PROJECT_ID}/zones/{LOCATION}/instanceGroupManagers/{GREEN_INSTANCE_GROUP_NAME} 
    greenPoolVersion: {GREEN_POOL_VERSION}
    phase: DRAINING_BLUE_POOL
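
When you only need the strategy and phase, you can extract those two fields instead of reading the full output. This sketch uses the field paths shown in the example output above; the function name upgrade_progress is hypothetical.

```shell
# Print the upgrade strategy and, if present, the blue-green phase of a node pool.
# Usage: upgrade_progress NODE_POOL_NAME CLUSTER_NAME
upgrade_progress() {
  strategy=$(gcloud container node-pools describe "$1" --cluster="$2" \
    --format="value(upgradeSettings.strategy)") || return 1
  phase=$(gcloud container node-pools describe "$1" --cluster="$2" \
    --format="value(updateInfo.blueGreenInfo.phase)") || return 1
  # Fall back to placeholders when a field is absent from the output.
  echo "strategy=${strategy:-unknown} phase=${phase:-none}"
}
```

A phase of none simply means the describe output contained no blue-green phase, for example because no blue-green upgrade is in progress.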

Disable node auto-upgrades

Although not recommended, you can disable node auto-upgrade for an existing node pool if the underlying cluster isn't enrolled in a release channel.

Considerations before disabling node auto-upgrades

If you disable node auto-upgrades for a node pool, GKE does not update the version of the nodes. Opting out of node auto-upgrades does not block GKE from upgrading your cluster's control plane.

Disabling prevents version updates, but not all maintenance tasks

Disabling node auto-upgrades only prevents GKE from updating the version of the nodes, but does not prevent GKE from initiating other maintenance tasks. For example, even with node auto-upgrades disabled, triggering IP address rotation, enabling network policy, or PSC migration on a cluster recreates all nodes at the same version as the control plane, regardless of the version selected for the node pool. To control the timing of maintenance, use Maintenance windows and exclusions.

Disabling makes you responsible for control plane and node compatibility

If you disable node auto-upgrade, you are responsible for ensuring that the cluster's nodes run a version compatible with the cluster's version, and that the version adheres to the Kubernetes version and version skew support policy. Starting with GKE version 1.19, GKE upgrades nodes that are running an unsupported version after the version has reached end of life to ensure cluster health and alignment with the open source version skew policy. Nodes running unsupported versions might not be upgraded immediately upon version end of life, and actual timing can vary at Google's discretion.
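
If you take on this responsibility, a small check of the minor-version gap between control plane and nodes can help. The following is a sketch only: the function names are hypothetical, it assumes both versions share the same major version, and the allowed gap (MAX_SKEW) must come from the version skew support policy rather than from this example.

```shell
# Extract the minor version from a GKE version string such as 1.23.1-gke.100.
minor_of() {
  echo "$1" | cut -d. -f2
}

# Print how many minor versions the nodes are behind the control plane.
# Usage: minor_skew CONTROL_PLANE_VERSION NODE_VERSION
minor_skew() {
  echo $(( $(minor_of "$1") - $(minor_of "$2") ))
}

# Succeed if the nodes are at most MAX_SKEW minor versions behind.
# Usage: skew_ok CONTROL_PLANE_VERSION NODE_VERSION MAX_SKEW
skew_ok() {
  [ "$(minor_skew "$1" "$2")" -le "$3" ]
}
```

For example, minor_skew 1.25.3-gke.800 1.23.1-gke.100 prints 2. The two version strings can likely be read with gcloud container clusters describe using --format="value(currentMasterVersion)" and gcloud container node-pools describe using --format="value(version)", assuming those field names.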

Disabling does not stop ongoing operations

Disabling node auto-upgrades does not stop or cancel any ongoing upgrades to nodes in node pools. To cancel or stop ongoing upgrades, follow Canceling a node upgrade. Canceling a node upgrade can be helpful when workloads are failing on upgraded nodes and you want to prevent further disruption.

If the upgrade has completed for the entire node pool, the upgrade cannot be rolled back or canceled. To downgrade the node pool, see Downgrading node pools.

Disable node auto-upgrades for an existing node pool

gcloud

To disable auto-upgrades for an existing node pool, run the following command:

gcloud container node-pools update NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --zone COMPUTE_ZONE \
    --no-enable-autoupgrade

Console

To disable auto-upgrades for an existing node pool, perform the following steps:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

  2. In the cluster list, click the name of the cluster you want to modify.

  3. Click the Nodes tab.

  4. Under Node Pools, click the name of the node pool you want to modify.

  5. On the Node pool details page, click Edit.

  6. Under Management, clear the Enable auto-upgrade checkbox.

  7. Click Save.

Create a cluster or node pool with node auto-upgrades enabled

gcloud

To create a cluster with auto-upgrades enabled for the default node pool, specify the --enable-autoupgrade flag in the gcloud container clusters create command:

gcloud container clusters create CLUSTER_NAME \
    --zone COMPUTE_ZONE \
    --enable-autoupgrade

To create a node pool with auto-upgrade enabled, specify the --enable-autoupgrade flag in the gcloud container node-pools create command:

gcloud container node-pools create NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --zone COMPUTE_ZONE \
    --enable-autoupgrade

Console

Clusters and node pools created with the Google Cloud console have auto-upgrades enabled by default. Visit Creating a cluster or Adding and managing node pools for instructions to create clusters and node pools.

You can disable auto-upgrades for new node pools. From the cluster creation page, click the name of the node pool you want to modify, then clear Enable auto-upgrade.

Receive upgrade notifications

GKE publishes upgrade notifications to Pub/Sub, providing you with a channel to receive information from GKE about your clusters.

For more information, see Receiving cluster upgrade notifications.
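
As a sketch of what enabling notifications might look like, the following helper points a cluster at an existing Pub/Sub topic. The function name is hypothetical, the topic is assumed to exist already, and the --notification-config syntax should be checked against Receiving cluster upgrade notifications before use.

```shell
# Send a cluster's upgrade notifications to an existing Pub/Sub topic.
# Usage: enable_upgrade_notifications CLUSTER_NAME COMPUTE_ZONE PROJECT_ID TOPIC_NAME
enable_upgrade_notifications() {
  # Assumption: the topic projects/PROJECT_ID/topics/TOPIC_NAME already exists.
  gcloud container clusters update "$1" \
    --zone "$2" \
    --notification-config="pubsub=ENABLED,pubsub-topic=projects/$3/topics/$4"
}
```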

Change surge upgrade parameters

To learn more about changing surge upgrade parameters, see Configure surge upgrades.

Exercise control during a node pool upgrade

During automatic upgrades and manually initiated node pool upgrades, you can take the following actions.

Cancel a node pool upgrade

You can cancel an upgrade at any time. To learn more about what happens when you cancel a surge upgrade, see Cancel a surge upgrade. To learn more about what happens when you cancel a blue-green upgrade, see Cancel a blue-green upgrade.

  1. Get the upgrade's operation ID:

    gcloud container operations list
    
  2. Cancel the upgrade:

    gcloud container operations cancel OPERATION_ID
    

Refer to the gcloud container operations cancel documentation.
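
The two steps above can be combined in a script. This sketch assumes that node upgrades appear with the operation type UPGRADE_NODES and the status RUNNING, and that the cluster is zonal; the function name cancel_node_upgrades is hypothetical.

```shell
# Cancel every node upgrade operation that is still running in a zone.
# Usage: cancel_node_upgrades COMPUTE_ZONE
cancel_node_upgrades() {
  # Step 1: collect the IDs of running node upgrade operations.
  ops=$(gcloud container operations list \
    --zone "$1" \
    --filter="operationType=UPGRADE_NODES AND status=RUNNING" \
    --format="value(name)") || return 1
  if [ -z "$ops" ]; then
    echo "No running node upgrade found in $1"
    return 0
  fi
  # Step 2: cancel each operation that was found.
  for op in $ops; do
    echo "Canceling $op"
    gcloud container operations cancel "$op" --zone "$1" || return 1
  done
}
```

Because the filter matches every running node upgrade in the zone, review the list first if the zone hosts more than one cluster.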

Resume a node pool upgrade

You can resume an upgrade by manually initiating the upgrade again, specifying the target version from the original upgrade.

For example, if you canceled an ongoing upgrade to version 1.23.1-gke.100, you could resume the canceled upgrade by starting the same upgrade again on the node pool, targeting version 1.23.1-gke.100.

To learn more about what happens when you resume an upgrade, see Resume a surge upgrade and blue-green upgrade.

To resume an upgrade, use the following command:

    gcloud container clusters upgrade CLUSTER_NAME \
      --node-pool=NODE_POOL_NAME \
      --cluster-version VERSION

Replace the following:

  • NODE_POOL_NAME: the name of the node pool for which you want to resume the node pool upgrade.
  • CLUSTER_NAME: the name of the cluster of the node pool for which you want to resume the upgrade.
  • VERSION: the target version of the canceled node pool upgrade.

For more information, refer to the gcloud container clusters upgrade documentation.

Roll back a node pool upgrade

You can roll back a node pool to downgrade the upgraded nodes to their original state from before the node pool upgrade started.

Use the rollback command if an in-progress upgrade was canceled, the upgrade failed, or the upgrade is incomplete because a maintenance window timed out. Alternatively, if you want to specify the version, follow the instructions to downgrade the node pool.

To learn more about what happens when you roll back a node pool upgrade, see Roll back a surge upgrade or Roll back a blue-green upgrade.

To roll back an upgrade, run the following command:

gcloud container node-pools rollback NODE_POOL_NAME \
  --cluster CLUSTER_NAME

Replace the following:

  • NODE_POOL_NAME: the name of the node pool for which to roll back the node pool upgrade.
  • CLUSTER_NAME: the name of the cluster of the node pool for which to roll back the upgrade.

Refer to the gcloud container node-pools rollback documentation.

Complete a node pool upgrade

If you are using the blue-green upgrade strategy, you can complete a node pool upgrade during the Soak phase, skipping the rest of the soak time.

To learn how completing a node pool upgrade works, see Complete a node pool upgrade.

To complete an upgrade when using the blue-green upgrade strategy, run the following command:

gcloud container node-pools complete-upgrade NODE_POOL_NAME \
  --cluster CLUSTER_NAME

Replace the following:

  • NODE_POOL_NAME: the name of the node pool for which you want to complete the upgrade.
  • CLUSTER_NAME: the name of the cluster of the node pool for which you want to complete the upgrade.

Refer to the gcloud container node-pools complete-upgrade documentation.
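
Because completing an upgrade only applies during the Soak phase, a script might check the phase first. This is a sketch under the assumption that the Soak phase is reported as NODE_POOL_SOAKING in updateInfo.blueGreenInfo.phase; the function name complete_if_soaking is hypothetical.

```shell
# Complete a blue-green upgrade only if the node pool is in the Soak phase.
# Usage: complete_if_soaking NODE_POOL_NAME CLUSTER_NAME
complete_if_soaking() {
  phase=$(gcloud container node-pools describe "$1" --cluster "$2" \
    --format="value(updateInfo.blueGreenInfo.phase)") || return 1
  # Assumption: NODE_POOL_SOAKING is the phase name reported during the soak.
  if [ "$phase" = "NODE_POOL_SOAKING" ]; then
    gcloud container node-pools complete-upgrade "$1" --cluster "$2"
  else
    echo "Not in the Soak phase (phase: ${phase:-none}); nothing to do"
    return 1
  fi
}
```

The non-zero return when the pool is in another phase lets a calling script decide whether to retry later.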

What's next