Creating and managing node pools

When you create a user cluster, you have to configure at least one node pool, which is a group of nodes that all have the same configuration. After the cluster is created, you can add new node pools, update node pool settings, and delete node pools.

How you create, update, and delete node pools depends on whether the cluster is managed by the Anthos On-Prem API. A user cluster is managed by the Anthos On-Prem API if one of the following is true:

  • You created the cluster in the Google Cloud console.
  • The cluster has been enrolled in the Anthos On-Prem API.

If the Anthos On-Prem API is managing a user cluster, you can use the Google Cloud console to manage node pools. If the user cluster isn't managed by the Anthos On-Prem API, use gkectl on the command line of your admin workstation to manage node pools.

Add a node pool

If you created the cluster in the Google Cloud console, you can use the Google Cloud console to add a node pool. However, you need to use the command line to configure the following node pool settings:

  • vsphere.datastore
  • vsphere.tags

Before you add another node pool, verify that enough IP addresses are available on the cluster.
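
As a rough check (not a substitute for the full verification procedure), you can count how many nodes are already consuming IP addresses and compare that number with the addresses reserved in your IP block file or DHCP scope. In the following sketch, USER_CLUSTER_KUBECONFIG is assumed to be the path of the kubeconfig file for your user cluster:

# Count the nodes that are currently consuming IP addresses in the user cluster.
kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get nodes --no-headers | wc -l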

Console

  1. In the Google Cloud console, go to the GKE Enterprise clusters page.

    Go to the GKE Enterprise clusters page

  2. Select the Google Cloud project that the user cluster is in.

  3. In the cluster list, click the name of the cluster, and then click View details in the Details panel.

  4. Click Add node pool.

  5. Configure the node pool:

    1. Enter the Node pool name.
    2. Enter the number of vCPUs for each node in the pool (minimum of 4 vCPUs per user cluster worker node).
    3. Enter the memory size in mebibytes (MiB) for each node in the pool (minimum of 8192 MiB per user cluster worker node; the value must be a multiple of 4).
    4. In the Nodes field, enter the number of nodes in the pool (minimum of 3).
    5. Select the OS image type: Ubuntu Containerd, Ubuntu, or COS.

    6. Enter the Boot disk size in gibibytes (GiB) (default is 40 GiB).

  6. In the Node pool metadata (optional) section, if you want to add Kubernetes labels and taints, do the following:

    1. Click + Add Kubernetes Labels. Enter the Key and Value for the label. Repeat as needed.
    2. Click + Add Taint. Enter the Key, Value, and Effect for the taint. Repeat as needed.
  7. Click Create.

  8. The Google Cloud console displays Cluster status: changes in progress. Click Show Details to view the Resource status condition and Status messages.

Command line

  1. In your user cluster configuration file, fill in the nodePools section.

    You must specify the following fields:

    • nodePools[i].name
    • nodePools[i].cpus
    • nodePools[i].memoryMB
    • nodePools[i].replicas

    The following fields are optional. If you don't include nodePools[i].bootDiskSizeGB or nodePools[i].osImageType, the default values are used.

    • nodePools[i].labels
    • nodePools[i].taints
    • nodePools[i].bootDiskSizeGB
    • nodePools[i].osImageType
    • nodePools[i].vsphere.datastore
    • nodePools[i].vsphere.tags
  2. Run the following command:

    gkectl update cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config USER_CLUSTER_CONFIG
    

    Replace the following:

    • ADMIN_CLUSTER_KUBECONFIG: the path of the kubeconfig file for your admin cluster.

    • USER_CLUSTER_CONFIG: the path of your user cluster configuration file.
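
  3. Optional: While gkectl update cluster runs, watch the new nodes register with the user cluster. This is a minimal sketch using standard kubectl; USER_CLUSTER_KUBECONFIG is assumed to be the path of the kubeconfig file for your user cluster:

    # Watch nodes as they are created and become Ready in the user cluster.
    kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get nodes --watch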

Example configuration

In the following example configuration, there are four node pools, each with different attributes:

  • pool-1: only the minimum required attributes are specified
  • pool-2: includes vsphere.datastore and vsphere.tags
  • pool-3: includes taints and labels
  • pool-4: includes osImageType and bootDiskSizeGB
nodePools:
- name: pool-1
  cpus: 4
  memoryMB: 8192
  replicas: 5
- name: pool-2
  cpus: 8
  memoryMB: 16384
  replicas: 3
  vsphere:
    datastore: my_datastore
    tags:
    - category: "purpose"
      name: "testing"
- name: pool-3
  cpus: 4
  memoryMB: 8192
  replicas: 5
  taints:
    - key: "example-key"
      effect: NoSchedule
  labels:
    environment: production
    app: nginx
- name: pool-4
  cpus: 4
  memoryMB: 8192
  replicas: 5
  osImageType: cos
  bootDiskSizeGB: 40
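
The labels and taint that you define on pool-3 are applied to that pool's nodes. For a workload to land on those nodes, it needs a matching nodeSelector and a toleration for the taint. The following Deployment is an illustrative sketch (the workload name and image are hypothetical; the label and taint values come from pool-3 above):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-on-pool-3   # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # Schedule only onto nodes that carry pool-3's labels.
      nodeSelector:
        environment: production
        app: nginx
      # Tolerate pool-3's NoSchedule taint (the taint has a key but no value).
      tolerations:
      - key: "example-key"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: nginx
        image: nginx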

Update a node pool

You can use the command line to update all fields in the nodePools section of your user cluster configuration file. Currently, the only node pool fields that you can update using the Google Cloud console are:

  • Number of replicas
  • Memory
  • Number of vCPUs

When you increase the number of replicas, GKE on VMware adds the required number of nodes to the user cluster, and when you decrease the number of replicas, nodes are removed. Changing the number of replicas for a node pool doesn't interrupt workloads. Make sure that you have IP addresses available if you increase the number of replicas.
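
For example, to grow pool-1 from the earlier example configuration from 5 to 7 nodes, change only its replicas value (keep the definitions of your other node pools in the file, because removing a pool's definition deletes that pool), and then rerun the update:

nodePools:
- name: pool-1
  cpus: 4
  memoryMB: 8192
  replicas: 7    # was 5; two nodes are added without interrupting workloads
# The definitions of pool-2, pool-3, and pool-4 stay in the file unchanged.

gkectl update cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config USER_CLUSTER_CONFIG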

If you update any other node pool field, this triggers a rolling update on the cluster. In a rolling update, GKE on VMware creates a new node and then deletes an old node. This process is repeated until all the old nodes have been replaced with new nodes. This process doesn't cause downtime, but the cluster must have an extra IP address available to use during the update.

Suppose a node pool will have N nodes at the end of an update. Then you must have at least N + 1 IP addresses available for nodes in that pool. This means that if you are resizing a cluster by adding nodes to one or more pools, you must have at least one more IP address than the total number of nodes that will be in all of the cluster's node pools at the end of the resizing. For more information, see Verify that enough IP addresses are available.
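For example, if the cluster's only two pools will have 5 nodes and 3 nodes after the update, you need at least 5 + 3 + 1 = 9 IP addresses available for cluster nodes.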

To update a node pool on a user cluster:

Console

  1. In the Google Cloud console, go to the GKE Enterprise clusters page.

    Go to the GKE Enterprise clusters page

  2. Select the Google Cloud project that the user cluster is in.

  3. In the cluster list, click the name of the cluster, and then click View details in the Details panel.

  4. Click the Nodes tab.

  5. Click the name of the node pool that you want to modify.

  6. Click Edit next to the field that you want to modify, and click Done.

  7. Click the back arrow to return to the previous page.

  8. The Google Cloud console displays Cluster status: changes in progress. Click Show Details to view the Resource status condition and Status messages.

Command line

  1. Modify the values for the fields that you want to change in the nodePools section of the user cluster configuration file.

  2. Update the cluster:

    gkectl update cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config USER_CLUSTER_CONFIG
    

    Replace the following:

    • ADMIN_CLUSTER_KUBECONFIG: the path of the kubeconfig file for your admin cluster.

    • USER_CLUSTER_CONFIG: the path of your user cluster configuration file.

Update the osImageType used by a node pool

To change the osImageType used by a node pool, you must use the command line. Update the node pool's definition in your user cluster configuration file, as shown in the following example, and then run gkectl update cluster.

nodePools:
- name: np-1
  cpus: 4
  memoryMB: 8192
  replicas: 3
  osImageType: ubuntu_containerd
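
Changing osImageType triggers a rolling replacement of the pool's nodes, as described in Update a node pool, so make sure an extra IP address is available. Then run the update, replacing the placeholders with the same paths described earlier:

gkectl update cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config USER_CLUSTER_CONFIG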

Verify your changes

To verify that your node pools have been created or updated as intended, inspect the cluster nodes:

Console

  1. In the Google Cloud console, go to the GKE Enterprise clusters page.

    Go to the GKE Enterprise clusters page

  2. Select the Google Cloud project that the user cluster is in.

  3. In the cluster list, click the name of the cluster, and then click View details in the Details panel.

  4. Click the Nodes tab.

  5. Click the name of the node pool you want to view.

Command line

Run the following command:

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get nodes -o wide
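
To check which Kubernetes labels and taints ended up on a pool's nodes, you can also inspect the node objects directly. This is standard kubectl, and the exact set of system labels varies by version; NODE_NAME is a placeholder for one of the pool's nodes:

# List each node with all of its labels.
kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get nodes --show-labels

# Show the taints (and other details) for a single node.
kubectl --kubeconfig USER_CLUSTER_KUBECONFIG describe node NODE_NAME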

If you need to revert your changes, edit the cluster configuration file and run gkectl update cluster.

Delete a node pool

Although you can delete node pools, your user cluster must have at least one node pool. Deleting a node pool causes immediate removal of the pool's nodes regardless of whether those nodes are running workloads.
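
Because deletion doesn't wait for workloads to finish, consider cordoning and draining the pool's nodes first so that Pods are rescheduled onto other pools. The following is a minimal sketch using standard kubectl commands; run it for each NODE_NAME in the pool:

# Mark the node unschedulable so that no new Pods land on it.
kubectl --kubeconfig USER_CLUSTER_KUBECONFIG cordon NODE_NAME

# Evict the Pods that are running on the node so they are rescheduled elsewhere.
kubectl --kubeconfig USER_CLUSTER_KUBECONFIG drain NODE_NAME --ignore-daemonsets --delete-emptydir-data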

To delete a node pool from a user cluster:

Console

  1. Ensure that there are no workloads running on the affected nodes.

  2. In the Google Cloud console, go to the GKE Enterprise clusters page.

    Go to the GKE Enterprise clusters page

  3. Select the Google Cloud project that the user cluster is in.

  4. In the cluster list, click the name of the cluster, and then click View details in the Details panel.

  5. Click the Nodes tab.

  6. Click the name of the node pool that you want to delete.

  7. Click Delete.

  8. Click the back arrow to return to the previous page.

  9. The Google Cloud console displays Cluster status: changes in progress. Click Show Details to view the Resource status condition and Status messages.

Command line

  1. Ensure that there are no workloads running on the affected nodes.

  2. Remove the node pool's definition from the nodePools section of the user cluster configuration file.

  3. Update the cluster:

    gkectl update cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config USER_CLUSTER_CONFIG
    

    Replace the following:

    • ADMIN_CLUSTER_KUBECONFIG: the path of the kubeconfig file for your admin cluster.

    • USER_CLUSTER_CONFIG: the path of your user cluster configuration file.

Troubleshooting

  • In general, the gkectl update cluster command provides specifics when it fails. If the command succeeded and you don't see the nodes, you can troubleshoot with the Diagnosing cluster issues guide.

  • Node pool creation or updates can fail when cluster resources are insufficient, for example, when there aren't enough available IP addresses. See the Resizing a user cluster topic for details about verifying that IP addresses are available.

  • You can also review the general Troubleshooting guide.

  • The operation won't proceed past Creating node MachineDeployment(s) in user cluster….

    It can take a while to create or update the node pools in your user cluster. However, if the wait time is extremely long and you suspect that something might have failed, you can run the following commands:

    1. Run kubectl get nodes to obtain the state of your nodes.
    2. For any nodes that are not ready, run kubectl describe node NODE_NAME to obtain details.
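
    For example, run the commands against the user cluster, where USER_CLUSTER_KUBECONFIG is the path of the kubeconfig file for your user cluster and NODE_NAME is a node that isn't ready:

    # Check the state of all nodes in the user cluster.
    kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get nodes

    # Inspect the events and conditions for a node that isn't ready.
    kubectl --kubeconfig USER_CLUSTER_KUBECONFIG describe node NODE_NAME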