Best practices for GKE on VMware cluster upgrades

This document describes best practices and considerations for upgrading GKE on VMware clusters. You learn how to prepare for cluster upgrades and which best practices to follow before you upgrade. These best practices help reduce the risks associated with cluster upgrades.

If you have multiple environments such as test, development, and production, we recommend that you start with the least critical environment, such as test, and verify the upgrade functionality. After you verify that the upgrade was successful, move on to the next environment. Repeat this process until you upgrade your production environments. This approach lets you move from one environment to the next in increasing order of criticality, verifying at each step that the upgrade and your workloads all run correctly.

Upgrade checklist

To make the upgrade process as smooth as possible, review and complete the following checks before you start to upgrade your clusters:

Plan the upgrade

Upgrades can be disruptive. Before you start the upgrade, plan carefully to make sure that your environment and applications are ready and prepared.

Back up the user and admin cluster

Before you start an upgrade, back up your user and admin clusters.

A user cluster backup is a snapshot of the user cluster's etcd store. The etcd store contains all of the Kubernetes objects and custom objects required to manage cluster state. The snapshot contains the data required to recreate the cluster's components and workloads. For more information, see how to back up a user cluster.

With GKE on VMware version 1.8 and later, you can set up automatic backup with clusterBackup.datastore in the admin cluster configuration file. To enable this feature in an existing cluster, edit the admin cluster configuration file and add the clusterBackup.datastore field, then run gkectl update admin.
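For example, the relevant section of the admin cluster configuration file looks similar to the following sketch, where DATASTORE_NAME is a placeholder for the vSphere datastore that stores the backups:

    clusterBackup:
      datastore: DATASTORE_NAME

After you add the field to an existing cluster, apply the change with a command similar to the following (exact flags can vary by version):

    gkectl update admin --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config ADMIN_CLUSTER_CONFIG

Replace ADMIN_CLUSTER_KUBECONFIG with the path of your admin cluster kubeconfig file and ADMIN_CLUSTER_CONFIG with the path of your admin cluster configuration file.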

After clusterBackup.datastore is enabled, the admin cluster's etcd data is automatically backed up to the configured vSphere datastore. This backup process repeats every time there's a change to the admin cluster. When you start a cluster upgrade, a backup task runs before the cluster is upgraded.

If you have problems and need to restore an admin cluster from its backup, see Back up and restore an admin cluster with gkectl.

Review the use of PodDisruptionBudgets

In Kubernetes, PodDisruptionBudgets (PDBs) can help prevent unwanted application downtime or outages. PDBs tell the cluster to always keep a minimum number of Pods running while other Pods are evicted, such as during a node drain. This behavior is a useful way to provide for application availability.

  1. To check what PDBs are configured in your cluster, use the kubectl get pdb command:

    kubectl get pdb -A --kubeconfig KUBECONFIG
    

    Replace KUBECONFIG with the path of your kubeconfig file.

    The following example output shows PDBs named istio-ingress, istiod, and kube-dns:

    NAMESPACE     NAME            MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
    gke-system    istio-ingress   1               N/A               1                     16d
    gke-system    istiod          1               N/A               1                     16d
    kube-system   kube-dns        1               N/A               1                     16d
    

In the preceding output, each PDB specifies that at least one Pod must always be available. This availability becomes critical during upgrades, when nodes are drained.

Check for PDBs that can't be fulfilled. For example, you might set a minimum availability of 1 when the Deployment has only one replica. In this case, the drain operation is blocked, because evicting the only Pod would violate the PDB.
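The following manifests are a hypothetical illustration of this mismatch (all names and the container image are placeholders): the Deployment runs a single replica, and the PDB requires that one Pod always stay available, so a node drain can never evict the Pod.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: single-replica-app
    spec:
      replicas: 1                  # only one Pod exists
      selector:
        matchLabels:
          app: single-replica-app
      template:
        metadata:
          labels:
            app: single-replica-app
        spec:
          containers:
          - name: app
            image: nginx:1.25      # placeholder image
    ---
    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: single-replica-app-pdb
    spec:
      minAvailable: 1              # demands the only Pod stay available,
      selector:                    # so a drain can never evict it
        matchLabels:
          app: single-replica-app

To unblock a drain in this situation, either increase the Deployment's replica count or temporarily relax or delete the PDB.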

To make sure that the PDBs don't interfere with the upgrade procedure, check all PDBs on a given cluster before you start the upgrade. You might need to coordinate with the development teams and application owners to temporarily change or disable PDBs during a cluster upgrade.

GKE on VMware runs a preflight check during the upgrade process to warn about PDBs. However, you should also manually verify the PDBs to ensure a smooth upgrade experience. To learn more about PDBs, see Specifying a Disruption Budget for your Application.

Review the available IP addresses

The following IP address considerations apply during cluster upgrades:

  • The cluster upgrade process creates a new node, drains the old node, and then deletes it. We recommend that you always have N+1 IP addresses available for the admin or user cluster, where N is the number of nodes in the cluster.
  • When using static IP addresses, the required IP addresses must be listed in the IP block files.
  • If you use DHCP, make sure that new VMs can get additional IP leases in the desired subnet during an upgrade.
    • If you need to add IP addresses, update the IP block file, then run the gkectl update command. For more information, see Plan your IP addresses.
  • If you use static IP addresses and want to speed up the user cluster upgrade process, list enough IP addresses in your IP block file so that each node pool has an extra IP address available (see the sample IP block file after this list). Because VMs are added and removed on a per-node-pool basis, this approach speeds up the upgrade.
    • Although this approach is a good option to speed up user cluster upgrades, consider the available resources and the performance of your vSphere environment before you proceed.
  • If there is only one spare IP address for the entire user cluster, the upgrade can proceed only one VM at a time, even when multiple node pools are used.
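The following sketch shows an IP block file that lists one spare address beyond the node pool's current node count. The netmask, gateway, addresses, and hostnames are placeholder examples:

    blocks:
      - netmask: 255.255.252.0
        gateway: 172.16.23.254
        ips:
        - ip: 172.16.20.10
          hostname: user-node-1
        - ip: 172.16.20.11
          hostname: user-node-2
        - ip: 172.16.20.12
          hostname: user-node-3
        - ip: 172.16.20.13      # spare address for the upgrade's temporary VM
          hostname: user-node-4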

Check cluster utilization

Make sure that Pods can be evacuated when a node drains, and that there are enough resources in the cluster being upgraded to run the drained workloads. To check the current resource usage of the cluster, you can use custom dashboards in Cloud Operations Suite, or query the cluster directly by using commands such as kubectl top nodes, as shown in the following example.
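For example, the following commands show a point-in-time view of node and Pod resource usage (they rely on metrics being available in the cluster):

    kubectl top nodes --kubeconfig KUBECONFIG
    kubectl top pods -A --kubeconfig KUBECONFIG

Replace KUBECONFIG with the path of your kubeconfig file.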

Commands you run against the cluster show you a snapshot of the current cluster resource usage. Dashboards can provide a more detailed view of resources being consumed over time. This resource usage data can help indicate when an upgrade would cause the least disruption, such as during weekends or evenings, depending on the running workload and use cases.

The timing of the admin cluster upgrade might be less critical than for the user clusters, because an admin cluster upgrade usually doesn't introduce application downtime. However, it's still important to check for free resources in vSphere before you begin an admin cluster upgrade. Also, upgrading the admin cluster implies some risk, so we recommend that you upgrade it during less active usage periods, when management access to the cluster is less critical.

For more information, see what services are impacted during a cluster upgrade.

Check vSphere utilization

Check that there are enough resources on the underlying vSphere infrastructure. To check this resource usage, select a cluster in vCenter and review the Summary tab.

The Summary tab shows the overall memory, CPU, and storage consumption of the entire cluster. Because GKE on VMware upgrades demand additional resources, also check whether the cluster can handle these additional resource requests.

As a general rule, your vSphere cluster must be able to support the following additional resources:

  • +1 VM per admin cluster upgrade
  • +1 VM per node pool per user cluster upgrade

For example, if you upgrade a user cluster that has three node pools, where each node has 8 vCPUs and 32 GB of RAM, the upgrade procedure temporarily consumes the following additional resources:

  • 24 vCPUs
  • 96 GB of RAM
  • VM disk space + 96 GB of vSwap

The upgrade process creates VMs by using the vSphere clone operation. Cloning multiple VMs from a template can stress the underlying storage system with increased I/O operations. The upgrade can slow down severely if the underlying storage subsystem can't provide sufficient performance during the upgrade.

Although vSphere is designed for simultaneous resource usage and has mechanisms to provide resources even when overcommitted, we strongly recommend that you don't overcommit VM memory. Memory overcommitment can cause serious performance impacts that affect the entire cluster, because vSphere provides the "missing RAM" by swapping pages out to the datastore. This behavior can cause problems during a cluster upgrade and affect the performance of other VMs that run on the vSphere cluster.

If the available resources are already scarce, power down unneeded VMs to help satisfy these additional requirements and prevent a potential performance hit.

Diagnose cluster issues

To check the health of a cluster before an upgrade, run gkectl diagnose on the cluster. The command runs advanced checks, such as identifying nodes that aren't configured properly or Pods that are in a stuck state.
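For example, to diagnose a user cluster, you can run a command similar to the following (exact flags can vary by version):

    gkectl diagnose cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --cluster-name USER_CLUSTER_NAME

Replace ADMIN_CLUSTER_KUBECONFIG with the path of your admin cluster kubeconfig file and USER_CLUSTER_NAME with the name of the user cluster to diagnose.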

The gkectl upgrade command runs preflight checks and stops the upgrade process if those checks fail. It's best to proactively identify and fix these problems, rather than rely on the preflight checks, which exist to protect clusters from damage. Because the gkectl diagnose command runs more checks than the regular preflight checks, we recommend that you run it manually before an upgrade.

If the gkectl diagnose command shows a Cluster unhealthy warning, fix the issues before you attempt an upgrade.

For more information, see Diagnosing cluster issues.

Run the pre-upgrade tool

Run the standalone pre-upgrade tool to perform preflight checks before you upgrade the cluster.

Use Deployments to minimize application disruption

Because nodes must be drained during upgrades, cluster upgrades can lead to application disruptions. Draining a node means that all of its running Pods are shut down and restarted on the remaining nodes in the cluster.

If possible, your applications should use Deployments. With this approach, applications are designed to handle interruptions, and any impact on Deployments that have multiple replicas should be minimal. You can still upgrade your cluster if your applications don't use Deployments.

There are also rules for Deployments that make sure a set number of replicas always keeps running. These rules are known as PodDisruptionBudgets (PDBs). PDBs let you limit the disruption to a workload when its Pods must be rescheduled, such as during upgrades or maintenance on the cluster nodes, so it's important to check them before an upgrade.
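As a hypothetical illustration (the names are placeholders), the following PDB lets a node drain proceed for a Deployment that runs several replicas, while limiting how many of its Pods can be down at the same time:

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: web-pdb
    spec:
      maxUnavailable: 1       # a drain can evict at most one Pod at a time
      selector:
        matchLabels:
          app: web            # matches a Deployment with multiple replicas

With maxUnavailable: 1, node drains can proceed one Pod at a time instead of being blocked outright.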

Use a high availability load balancer pair

If you use Seesaw as the load balancer on a cluster, the load balancers are upgraded automatically when you upgrade the cluster. This upgrade can cause a service disruption. To reduce the impact of an upgrade or of a possible load balancer failure, you can use a high-availability (HA) pair. In this configuration, the system creates and configures two load balancer VMs so that a failover to the other peer can occur.

To increase the availability of services such as the Kubernetes API server, we recommend that you always use an HA pair in front of the admin cluster. To learn more about Seesaw and its HA configuration, see Bundled load balancing with Seesaw.
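In the cluster configuration file, the Seesaw HA pair is enabled through the load balancer section. The following sketch uses placeholder values; the field names follow the bundled Seesaw configuration format:

    loadBalancer:
      kind: Seesaw
      seesaw:
        ipBlockFilePath: "seesaw-ipblock.yaml"  # placeholder path
        vrid: 125                               # example virtual router ID
        masterIP: 172.16.20.21                  # example VIP for the HA pair
        cpus: 4
        memoryMB: 3072
        enableHA: true                          # creates two Seesaw VMs as an HA pair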

To prevent a service disruption during an upgrade with an HA pair, the cluster initiates a failover before it creates the new load balancer VM. If a user cluster only uses a single load balancer instance, a service disruption occurs until the upgrade for the load balancer is complete.

We recommend that you have an HA load balancer pair if the user cluster itself is also configured to be highly available. This best practices series assumes that an HA user cluster uses an HA load balancer pair.

If your cluster runs GKE on VMware version 1.11 or 1.12 and uses MetalLB as the bundled load balancer, no pre-upgrade setup is required. The load balancer is upgraded during the cluster upgrade process.

Upgrade sequence

Since version 1.7, in-place upgrades must always follow this sequence (the command sketch after the list shows the corresponding commands):

  1. Upgrade the admin workstation.
  2. Upgrade the user clusters, one at a time.

    If you decide not to upgrade all of your user clusters, you can't upgrade the admin cluster. If you upgrade all of your user clusters, you can then optionally upgrade the admin cluster.

  3. Upgrade the admin cluster as the last and optional step.
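In terms of commands, the sequence typically looks similar to the following sketch; the exact flags and file names are examples and vary by version and environment:

    # 1. Upgrade the admin workstation.
    gkeadm upgrade admin-workstation --config admin-ws-config.yaml --info-file admin-ws-info.yaml

    # 2. Upgrade each user cluster, one at a time.
    gkectl upgrade cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config USER_CLUSTER_CONFIG

    # 3. Optionally, upgrade the admin cluster last.
    gkectl upgrade admin --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config ADMIN_CLUSTER_CONFIG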

Differences between cluster types

There are two different types of cluster:

  • User cluster
  • Admin cluster

When a user cluster is created, it contains only worker nodes and no control plane nodes. The control plane nodes for all user clusters are created in the admin cluster. This separation of user workloads and control plane nodes lets GKE on VMware handle upgrades differently and more flexibly.

Different effects of user cluster versus admin cluster upgrades

The GKE on VMware upgrade procedure involves a node drain process that removes all Pods from a node. The process creates a new VM for each drained worker node and adds it to the cluster. The drained worker nodes are then removed from the vSphere inventory. During this process, any workload that runs on these nodes is stopped and restarted on other available nodes in the cluster.

Depending on the chosen workload architecture, this procedure might affect an application's availability. To avoid putting too much strain on the cluster's resources, GKE on VMware upgrades one node at a time.

User cluster disruption

The following table describes the impact of an in-place user cluster upgrade:

Function                     Admin cluster   Non-HA user cluster   HA user cluster
Kubernetes API access        Not affected    Not affected          Not affected
User workloads               N/A             Affected              Affected
PodDisruptionBudgets*        Not affected    Not affected          Not affected
Control-plane node           Not affected    Affected              Not affected
Pod autoscaler (VMware)      Not affected    Not affected          Not affected
Auto repair                  Not affected    Not affected          Not affected
Node autoscaling (VMware)    Not affected    Not affected          Not affected
Horizontal Pod autoscaling   Affected        Affected              Not affected

  • *: PDBs might cause the upgrade to fail or stop.
  • Affected: a service disruption is noticeable until the upgrade is finished.
  • Not affected: a service disruption might occur for a very short time, but is almost unnoticeable.

Upgrading the admin cluster doesn't disrupt user cluster workloads. The admin cluster contains the user cluster control plane nodes, which don't run any user workloads. During the upgrade, these control plane nodes are drained and then updated.

To improve availability and reduce disruption of production user clusters during upgrades, we recommend that you use three control plane nodes (high availability mode).

Admin cluster disruption

The following table describes the impact of an in-place admin cluster upgrade:

Function                     Admin cluster   Non-HA user cluster   HA user cluster
Kubernetes API access        Affected        Affected              Not affected
User workloads               N/A             Not affected          Not affected
PodDisruptionBudgets         Affected        Affected              Not affected
Control-plane node           Affected        Affected              Not affected
Pod autoscaler               Affected        Affected              Not affected
Auto repair                  Affected        Affected              Not affected
Node autoscaling             Affected        Affected              Not affected
Horizontal Pod autoscaling   Affected        Affected              Not affected

  • Affected: a service disruption is noticeable until the upgrade is finished.
  • Not affected: a service disruption might occur for a very short time, but is almost unnoticeable.

What's next