Update clusters

After you create a cluster with bmctl, you can change some aspects of the cluster's configuration by performing the following sequence of actions:

  1. Change the values of certain fields in the cluster's configuration file, which by default is located here: bmctl-workspace/CLUSTER-NAME/CLUSTER-NAME.yaml.

  2. Update the cluster by running the bmctl update command.

In this way, you can, for example, add or remove nodes in a cluster or replace nodes in a cluster. This document describes how to perform these and other updates to a cluster.

It's important to note, however, that many aspects of your cluster configuration are immutable and can't be updated after you create your cluster. For a comprehensive list of mutable and immutable fields, see the Cluster configuration field reference. The field reference is a sortable table. Click the column headings to change the sort order. Click a field name to view its description.

Add or remove nodes in a cluster

A node pool is a group of nodes within a cluster that have the same configuration. Keep in mind that a node always belongs to a node pool. To add a new node to a cluster, you need to add it to a particular node pool. Removing a node from a node pool amounts to removing the node from the cluster altogether.

There are three kinds of node pools in GDCV for Bare Metal: control plane, load balancer, and worker node pools.

You add or remove a node from a node pool by adding or removing the IP address of the node in a specific section of the cluster configuration file. The following list shows which section to edit for a given node pool:

  • Worker node pool: add or remove the IP address of the node in the spec.nodes section of the NodePool spec.
  • Control plane node pool: add or remove the IP address of the node in the spec.controlPlane.nodePoolSpec.nodes section of the Cluster spec.
  • Load balancer node pool: add or remove the IP address of the node in the spec.loadBalancer.nodePoolSpec.nodes section of the Cluster spec.

Example: how to remove a worker node

Here's a sample cluster configuration file that shows the specifications of two worker nodes:

---
apiVersion: baremetal.cluster.gke.io/v1
kind: NodePool
metadata:
  name: nodepool1
  namespace: cluster-cluster1
spec:
  clusterName: cluster1
  nodes:
  - address: 172.18.0.5
  - address: 172.18.0.6

To remove a node:

  1. (Optional) If the node that you're removing is running critical pods, first put the node into maintenance mode.

    You can monitor the node draining process for worker nodes by viewing the status.nodesDrained and status.nodesDraining fields on the NodePool resource.

  2. Edit the cluster configuration file to delete the IP address entry for the node.

  3. Update the cluster:

    bmctl update cluster -c CLUSTER_NAME \
        --kubeconfig=ADMIN_KUBECONFIG
    

    Replace the following:

    • CLUSTER_NAME: the name of the cluster you want to update.
    • ADMIN_KUBECONFIG: the path to the admin cluster kubeconfig file.

    After the bmctl update command has executed successfully, it takes some time to complete the machine-preflight and machine-init jobs. You can view the status of nodes and their respective node pools by running the commands described in the Verify your updates section of this document.

Forcing the removal of a node

If the bmctl update command is unable to remove a node, you may have to force its removal from the cluster. For details, see Force-removing broken nodes.

Replace HA control plane nodes

You can replace high availability (HA) control plane nodes in admin, user, standalone, and hybrid clusters.

You replace a node in a cluster by performing the following steps:

  1. Remove the node's IP address from the cluster configuration file.
  2. Update the cluster.
  3. Check the status of nodes in the cluster.
  4. Add a new node's IP address to the same cluster configuration file.
  5. Update the cluster.

The rest of this section goes through an example.

Here's a sample cluster configuration file that shows three control plane nodes in a user cluster:

---
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
 name: user-cluster
 namespace: cluster-user-cluster
spec:
  controlPlane:
   nodePoolSpec:
     nodes:
     - address: 10.200.0.11
     - address: 10.200.0.12
     - address: 10.200.0.13

To replace the last node listed in the spec.controlPlane.nodePoolSpec.nodes section, perform the following steps:

  1. Remove the node by deleting its IP address entry in the cluster configuration file. After making this change, the cluster configuration file should look something like this:

    ---
    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
     name: user-cluster
     namespace: cluster-user-cluster
    spec:
      controlPlane:
       nodePoolSpec:
         nodes:
         - address: 10.200.0.11
         - address: 10.200.0.12
    
  2. Update the cluster by running the following command:

    bmctl update cluster -c CLUSTER_NAME \
        --kubeconfig=KUBECONFIG
    

    Make the following changes:

    • Replace CLUSTER_NAME with the name of the cluster you want to update.
    • If the cluster is a self-managing cluster (such as admin or standalone cluster), replace KUBECONFIG with the path to the cluster's kubeconfig file. If the cluster is a user cluster, as it is in this example, replace KUBECONFIG with the path to the admin cluster's kubeconfig file.
  3. After the bmctl update command has executed successfully, it takes some time to complete the machine-preflight and machine-init jobs. You can view the status of nodes and their respective node pools by running the commands described in the Verify your updates section of this document. Once the node pool and nodes are in a ready state, you can proceed to the next step.

  4. Add a new control plane node to the node pool by adding the IP address of the new control plane node to the spec.controlPlane.nodePoolSpec.nodes section of the cluster configuration file. After making this change, the cluster configuration file should look something like this:

    ---
    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
     name: user-cluster
     namespace: cluster-user-cluster
    spec:
      controlPlane:
       nodePoolSpec:
         nodes:
         - address: 10.200.0.11
         - address: 10.200.0.12
         - address: 10.200.0.14
    
  5. Update the cluster by running the following command:

    bmctl update cluster -c CLUSTER_NAME \
        --kubeconfig=KUBECONFIG
    

    Make the following changes:

    • Replace CLUSTER_NAME with the name of the cluster you want to update.
    • If the cluster is a self-managing cluster (such as admin or standalone cluster), replace KUBECONFIG with the path to the cluster's kubeconfig file. If the cluster is a user cluster, as it is in this example, replace KUBECONFIG with the path to the admin cluster's kubeconfig file.

Verify your updates

You can view the status of nodes and their respective node pools with the kubectl get command.

For example, the following command shows the status of the node pools in the cluster namespace cluster-my-cluster:

kubectl -n cluster-my-cluster get nodepools.baremetal.cluster.gke.io

The system returns results similar to the following:

NAME                    READY   RECONCILING   STALLED   UNDERMAINTENANCE   UNKNOWN
cluster-my-cluster      3       0             0         0                  0
cluster-my-cluster-lb   2       0             0         0                  0
np1                     3       0             0         0                  0

Reconciling=1 means that the reconciliation step is still in progress. You should wait until the status changes to Reconciling=0.

You can also check status of nodes in the a cluster by running following command:

kubectl get nodes --kubeconfig=KUBECONFIG

If you need more information about how to diagnose your clusters, see Create snapshots for diagnosing clusters.

Features you can change with an update

Besides adding, removing or replacing nodes, you can use the bmctl update command to modify certain mutable field values, Custom Resources (CRs), and annotations in the cluster configuration file.

To update a cluster resource, edit the cluster configuration file and use bmctl update to apply your changes.

The following sections outline some common examples for updating an existing cluster by changing either a field value, CR or annotation.

loadBalancer.addressPools

The addressPools section contains fields for specifying load-balancing pools for bundled load balancers. You can add more load-balancing address pools at any time, but you can't remove or modify any existing address pools.

addressPools:
- name: pool1
  addresses:
  - 192.168.1.0-192.168.1.4
  - 192.168.1.240/28
- name: pool2
  addresses:
  - 192.168.1.224/28

Prevent inadvertent cluster deletion

If you add the baremetal.cluster.gke.io/prevent-deletion: "true" annotation to your cluster configuration file, you are prevented from deleting the cluster. For example, running kubectl delete cluster or bmctl reset cluster produce an error.

apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: ci-10c3c6f4d9c698e
  namespace: cluster-ci-10c3c6f4d9c698e
  annotations:
    baremetal.cluster.gke.io/prevent-deletion: "true"
spec:
  clusterNetwork:

bypassPreflightCheck

The default value of the bypassPreflightCheck field is false. If you set this field to true in the cluster configuration file, the internal preflight checks are ignored you apply resources to existing clusters.

apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: cluster1
  namespace: cluster-cluster1
  annotations:
    baremetal.cluster.gke.io/private-mode: "true"
spec:
  bypassPreflightCheck: true

loginUser

You can set the loginUser field under the node access configuration. This field supports passwordless sudo capability for machine login.

apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: cluster1
  namespace: cluster-cluster1
  annotations:
    baremetal.cluster.gke.io/private-mode: "true"
spec:
  nodeAccess:
    loginUser: abm

NetworkGatewayGroup

The NetworkGatewayGroup custom resource is used to provide floating IP addresses for advanced networking features, such as the egress NAT gateway or the bundled load-balancing feature with BGP. To use the NetworkGatewayGroup custom resource and related networking features, you must set clusterNetwork.advancedNetworking to true when you create your clusters.

apiVersion: networking.gke.io/v1
kind: NetworkGatewayGroup
  name: default
  namespace: cluster-bm
spec:
  floatingIPs:
  - 10.0.1.100
  - 10.0.2.100

BGPLoadBalancer

When you configure bundled load balancers with BGP, the data plane load balancing uses, by default, the same external peers that were specified for control plane peering. Alternatively, you can configure the data plane load balancing separately, using the BGPLoadBalancer custom resource (and the BGPPeer custom resource). For more information, see Configure bundled load balancers with BGP.

apiVersion: networking.gke.io/v1
kind: BGPLoadBalancer
metadata:
  name: default
  namespace: cluster-bm
spec:
  peerSelector:
    cluster.baremetal.gke.io/default-peer: "true"

BGPPeer

When you configure bundled load balancers with BGP, the data plane load balancing uses, by default, the same external peers that were specified for control plane peering. Alternatively, you can configure the data plane load balancing separately, using the BGPPeer custom resource (and the BGPLoadBalancer custom resource). For more information, see Configure bundled load balancers with BGP.

apiVersion: networking.gke.io/v1
kind: BGPPeer
metadata:
  name: bgppeer1
  namespace: cluster-bm
  labels:
    cluster.baremetal.gke.io/default-peer: "true"
spec:
  localASN: 65001
  peerASN: 65002
  peerIP: 10.0.3.254
  sessions: 2

NetworkAttachmentDefinition

You can use the bmctl update command to modify NetworkAttachmentDefinition custom resources that correspond to the network.

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: gke-network-1
  namespace: cluster-my-cluster
spec:
  config: '{
  "type": "ipvlan",
  "master": "enp2342",
  "mode": "l2",
  "ipam": {
    "type": "whereabouts",
    "range": "172.120.0.0/24"

Increase service network range

To create more services than the initial limit, you can reduce the IPv4 service CIDR mask to increase the service network of your cluster. Reducing the mask (the value after "/") results in a bigger network range.

apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: cluster1
  namespace: cluster-cluster1
spec:
  ...
  clusterNetwork:
    services:
      cidrBlocks:
        - 172.26.0.0/14
  ...

You can only increase the range of the IPv4 service CIDR. The network range can't be reduced, which means the mask (the value after "/") can't be increased.

Configure kubelet image pull settings

The kubelet runs on each node of your cluster. The kubelet is responsible for monitoring containers on a node and making sure they are healthy. When needed, the kubelet queries and pulls images from the Container Registry.

Updating your kubelet configurations manually and keeping them synchronized across all of your cluster nodes can be challenging. To make matters worse, manual kubelet configuration changes on your nodes are lost when you upgrade your cluster.

To help make synchronized updates easier and persistent, GKE on Bare Metal lets you specify some kubelet settings for each of your cluster node pools: control plane nodes, load balancer nodes, and worker nodes. The settings apply for all nodes in a given pool and persist through cluster upgrades. The fields for these settings are mutable, so you can update them at any time, not just during cluster creation.

The following supported fields control Container Registry pull operations for kubelet:

  • registryBurst (default: 10)
  • registryPullQPS (default: 5)
  • serializeImagePulls (default: true)

For more information about each of the kubelet configuration fields, see the Cluster configuration field reference.

You can specify these fields in kubeletConfig sections of the Cluster spec and the NodePool spec for the following pools of nodes:

The following example shows the added fields with their default values in the cluster configuration file. Note that the annotation preview.baremetal.cluster.gke.io/custom-kubelet: "enable" is required.

apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: cluster1
  namespace: cluster-cluster1
  annotations:
    preview.baremetal.cluster.gke.io/custom-kubelet: "enable"
spec:
  ...
  controlPlane:
    nodePoolSpec:
      kubeletConfig:
        registryBurst: 10
        registryPullQPS: 5
        serializeImagePulls: true
  ...
  loadBalancer:
    nodePoolSpec:
      kubeletConfig:
        registryBurst: 10
        registryPullQPS: 5
        serializeImagePulls: true
  ...
apiVersion: baremetal.cluster.gke.io/v1
kind: NodePool
metadata:
  name: node-pool-new
  namespace: cluster-cluster1
spec:
  clusterName: cluster1
  ...
  kubeletConfig:
    registryBurst: 10
    registryPullQPS: 5
    serializeImagePulls: true

In each case, the setting applies to all nodes in the pool.

How to use

Here are some considerations for tuning image pulls:

  • Since images are pulled in series by default, an image pull that takes a long time can delay all other image pulls scheduled on a node. Delayed image pulls can block the upgrade process (especially when new GKE on Bare Metal images need to be deployed on a node). If you're affected by image pull delays, you can set serializeImagePulls to false to allow parallel image pulls.

  • If you're encountering image pull throttling errors, such as pull QPS exceeded, you may want to increase registryPullQPS and registryBurst to increase image pull throughput. These two fields adjust the pull rate and queue size and may help to address other throttling related issues. Negative values aren't allowed.

Use bmctl update to apply your changes

After you modify the config file, update the cluster by running the following bmctl update command:

bmctl update cluster -c CLUSTER_NAME --kubeconfig=KUBECONFIG

Make the following changes:

  • Replace CLUSTER_NAME with the name of the cluster you want to update.
  • If the cluster is a self-managing cluster (such as admin or standalone cluster), replace KUBECONFIG with the path to the cluster's kubeconfig file. If the cluster is a user cluster, replace KUBECONFIG with the path to the admin cluster's kubeconfig file.