When nodes in Google Distributed Cloud fail, which can happen because of issues with storage, network, or OS misconfiguration, you want to efficiently restore cluster health. After you restore the cluster health, you can troubleshoot the node failure. This document shows you how to recover from node failure scenarios by resetting a node, and forcefully removing the node if needed.
If you want to add or remove nodes from a cluster when a node hasn't failed, see Update clusters.
If you need additional assistance, reach out to Cloud Customer Care.

Reset nodes
When a node fails, you sometimes can't run reset commands on it because the node is unreachable. In that case, you might need to forcefully remove the node from the cluster.
When you cleanly reset a node and update the cluster, the following actions happen:
- The node resets, similar to `kubeadm reset`, and the machine reverts to the pre-installed state.
- The related references to the node are removed from the nodepool and cluster custom resources.
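After a clean reset, you can confirm the second action by checking that the node no longer appears in the cluster. The following sketch assumes `kubectl` access to the cluster; the `node_removed` helper and its arguments are hypothetical examples, not part of the product:

```shell
# Hypothetical helper: succeed if the named node no longer appears in the
# cluster, which indicates its references were removed during the reset.
node_removed() {
  node_name="$1"
  kubeconfig="$2"
  # `kubectl get nodes -o name` prints one "node/<name>" line per node.
  ! kubectl get nodes --kubeconfig "$kubeconfig" -o name \
      | grep -q "^node/${node_name}\$"
}
```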
In some of the following `bmctl` commands to reset nodes, the `--force` parameter indicates whether the reset commands (step 1) should be skipped. If you use the `--force` parameter, `bmctl` only performs the removal step (step 2) and doesn't run the reset commands.
Remove worker node
To remove a worker node from a cluster, complete the following steps:
Try to cleanly reset the node. After the node is reset, the node is removed from the cluster:
bmctl reset nodes \
--addresses COMMA_SEPARATED_IPS \
--cluster CLUSTER_NAME \
--kubeconfig ADMIN_KUBECONFIG
Replace the following:
- `COMMA_SEPARATED_IPS`: the IP addresses of the nodes to reset, such as `10.200.0.8,10.200.0.9`.
- `CLUSTER_NAME`: the name of the target cluster that contains the failed nodes.
- `ADMIN_KUBECONFIG`: the path to the admin cluster `kubeconfig` file.
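Because the `--addresses` flag takes a comma-separated list, one malformed entry fails the whole command. The following pre-flight sketch rejects entries that don't look like IPv4 addresses; the `valid_address_list` helper is a hypothetical example, not part of `bmctl`:

```shell
# Hypothetical pre-flight check: verify every entry in a comma-separated
# address list contains only IPv4-style characters before calling bmctl.
valid_address_list() {
  for ip in $(printf '%s' "$1" | tr ',' ' '); do
    case "$ip" in
      *[!0-9.]*) return 1 ;;  # reject hostnames, stray characters, typos
    esac
  done
  return 0
}
```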
If this command succeeds, you can now diagnose the node and fix any misconfigurations that caused the initial failure. Skip the remaining steps in this section.
If the previous step to reset the node fails, forcefully remove the node from the cluster. This forceful removal skips the previous step that runs the reset commands and only performs the step to remove the related references to the node from the nodepool and cluster custom resources:
bmctl reset nodes \
--addresses COMMA_SEPARATED_IPS \
--cluster CLUSTER_NAME \
--kubeconfig ADMIN_KUBECONFIG \
--force
You can now diagnose the node and fix any misconfigurations that caused the initial failure.
If you forcefully removed the node from the cluster in the previous step, run the `bmctl reset` command again to reset the nodes:

bmctl reset nodes \
--addresses COMMA_SEPARATED_IPS \
--cluster CLUSTER_NAME \
--kubeconfig ADMIN_KUBECONFIG
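Taken together, the steps above follow a try-clean-first pattern. The following minimal sketch assumes `bmctl` is on the `PATH` and the placeholder values are filled in; the `reset_with_fallback` wrapper itself is hypothetical, not a product command:

```shell
# Hypothetical wrapper: attempt a clean reset; on failure, force-remove the
# node references from the cluster resources, then reset the machine again.
reset_with_fallback() {
  addresses="$1"; cluster="$2"; kubeconfig="$3"

  # Step 1: clean reset (runs the reset commands on the node, then
  # removes its references from the cluster).
  if bmctl reset nodes --addresses "$addresses" \
      --cluster "$cluster" --kubeconfig "$kubeconfig"; then
    return 0
  fi

  # Step 2: node unreachable; skip the reset commands and only remove the
  # node references from the nodepool and cluster custom resources.
  bmctl reset nodes --addresses "$addresses" \
      --cluster "$cluster" --kubeconfig "$kubeconfig" --force

  # Step 3: once the machine is reachable again, reset it to the
  # pre-installed state.
  bmctl reset nodes --addresses "$addresses" \
      --cluster "$cluster" --kubeconfig "$kubeconfig"
}
```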
Remove single control plane node
The process is the same as for worker nodes. For control plane nodes, `bmctl` also cleans up the `etcd` membership.
The cluster stops being in a highly available (HA) state after you remove the failed node. To return to an HA state, add a healthy node to the cluster.
To remove a node from a cluster, complete the following steps:
Try to cleanly reset the node. After the node is reset, the node is removed from the cluster:
bmctl reset nodes \
--addresses COMMA_SEPARATED_IPS \
--cluster CLUSTER_NAME \
--kubeconfig ADMIN_KUBECONFIG
Replace the following values:
- `COMMA_SEPARATED_IPS`: the IP addresses of the nodes to reset, such as `10.200.0.8,10.200.0.9`.
- `CLUSTER_NAME`: the name of the target cluster that contains the failed nodes.
- `ADMIN_KUBECONFIG`: the path to the admin cluster `kubeconfig` file.
If this command succeeds, you can now diagnose the node and fix any misconfigurations that caused the initial failure. Skip the remaining steps in this section.
If the previous step to reset the node fails, you can forcefully remove the node from the cluster. This forceful removal skips the previous step that runs the reset commands, and only performs the step to remove the related references to the node from the nodepool and cluster custom resources:
bmctl reset nodes \
--addresses COMMA_SEPARATED_IPS \
--cluster CLUSTER_NAME \
--kubeconfig ADMIN_KUBECONFIG \
--force
You can now diagnose the node and fix any misconfigurations that caused the initial failure.
If you forcefully removed the node from the cluster in the previous step, run the `bmctl reset` command again to reset the nodes:

bmctl reset nodes \
--addresses COMMA_SEPARATED_IPS \
--cluster CLUSTER_NAME \
--kubeconfig ADMIN_KUBECONFIG
Reset a node when control plane is inaccessible
To revert machines to the pre-installed state when the cluster control plane is inaccessible, run the following command:
bmctl reset nodes \
--addresses NODE_IP_ADDRESSES \
--ssh-private-key-path SSH_PRIVATE_KEY_PATH \
--login-user LOGIN_USER \
--gcr-service-account-key GCR_SERVICE_ACCOUNT_KEY
Replace the following:
- `NODE_IP_ADDRESSES`: a comma-separated list of node IP addresses, one for each node that you're resetting.
- `SSH_PRIVATE_KEY_PATH`: the path of the SSH private key file.
- `LOGIN_USER`: the username used for passwordless SUDO access to the node machines. Unless you explicitly specify a non-root username for node access in the cluster configuration (`nodeAccess.loginUser`), `root` is used.
- `GCR_SERVICE_ACCOUNT_KEY`: the path of the Container Registry service account JSON key file.
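As a sketch of how these flags fit together, the following hypothetical helper assembles the command as a dry-run string, defaulting the login user to `root` when none is given, mirroring the `nodeAccess.loginUser` behavior described above. The `build_offline_reset_cmd` helper and all example values are illustrative only:

```shell
# Hypothetical dry-run builder: print the offline reset command that would
# run. Defaults --login-user to root, matching the documented behavior.
build_offline_reset_cmd() {
  addresses="$1"; key_path="$2"; sa_key="$3"; login_user="${4:-root}"
  printf 'bmctl reset nodes --addresses %s --ssh-private-key-path %s --login-user %s --gcr-service-account-key %s\n' \
    "$addresses" "$key_path" "$login_user" "$sa_key"
}
```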
This command doesn't remove references to the node from the nodepool and cluster custom resources. After you restore access to the cluster control plane, forcefully remove the node from the cluster if you want to retain the cluster.
Quorum lost in HA control plane
If too many control plane nodes in an HA cluster enter a failed state, the cluster loses quorum and becomes unavailable.
When you need to restore management clusters, don't provide the `kubeconfig` file in the reset commands. If you provide the `kubeconfig` file for a management cluster, it forces a new cluster to perform the reset operation. When you restore a user cluster, provide the path to the `kubeconfig` file.
To recover a cluster that has lost quorum, run the following command on a remaining healthy node:
bmctl restore --control-plane-node CONTROL_PLANE_NODE \
--cluster CLUSTER_NAME \
[--kubeconfig KUBECONFIG_FILE]
Replace the following:
- `CONTROL_PLANE_NODE`: the IP address of a healthy node that remains part of the cluster.
- `CLUSTER_NAME`: the name of the target cluster that contains the failed nodes.
- `KUBECONFIG_FILE`: if you're recovering a user cluster, the path to the user cluster `kubeconfig` file.
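The kubeconfig rule described above (omit the file for management clusters, provide it for user clusters) can be captured in a small sketch. The `build_restore_cmd` helper is hypothetical and only prints the command it would run:

```shell
# Hypothetical dry-run builder: print the restore command, appending
# --kubeconfig only when a user cluster kubeconfig path is supplied.
build_restore_cmd() {
  cp_node="$1"; cluster="$2"; user_kubeconfig="$3"  # empty for management clusters
  cmd="bmctl restore --control-plane-node $cp_node --cluster $cluster"
  if [ -n "$user_kubeconfig" ]; then
    cmd="$cmd --kubeconfig $user_kubeconfig"
  fi
  printf '%s\n' "$cmd"
}
```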
After you recover the failed nodes, run the `bmctl reset` command to reset them:

bmctl reset nodes \
--addresses COMMA_SEPARATED_IPS \
--cluster CLUSTER_NAME \
[--kubeconfig KUBECONFIG_FILE]
Replace the following:
- `COMMA_SEPARATED_IPS`: the IP addresses of the nodes to reset, such as `10.200.0.8,10.200.0.9`.
- `CLUSTER_NAME`: the name of the target cluster that contains the failed nodes.
- `KUBECONFIG_FILE`: the path to the admin cluster `kubeconfig` file.
If the failed nodes were part of the load balancer node pools, the recovered nodes might contend for the control plane virtual IP address and make the new cluster unstable. Run the reset commands against the failed nodes as soon as possible after you recover them.
This process only handles disaster recovery for a three-node HA control plane deployment; it doesn't support recovery for HA setups with five or more nodes.
What's next
- For more information about how to add or remove nodes from a cluster when there isn't a failure, and how to check node status, see Update clusters.
- If you need additional assistance, reach out to Cloud Customer Care.