Reset a failed node in Google Distributed Cloud

When nodes in Google Distributed Cloud fail, which can happen because of issues with storage, network, or OS misconfiguration, you want to efficiently restore cluster health. After you restore the cluster health, you can troubleshoot the node failure. This document shows you how to recover from node failure scenarios by resetting a node, and forcefully removing the node if needed.

If you want to add or remove nodes from a cluster when a node hasn't failed, see Update clusters.

Reset nodes

When a node fails, you sometimes can't run the reset commands on it because the node is unreachable. In that case, you might need to forcefully remove the node from the cluster.

When you cleanly reset a node and update the cluster, the following actions happen:

1. The node resets, similar to kubeadm reset, and the machine reverts to the pre-installed state.
2. The related references to the node are removed from the nodepool and cluster custom resources.

In some of the following bmctl commands to reset nodes, the --force parameter indicates whether the reset commands (step 1) should be skipped. If the --force parameter is used, bmctl only performs the removal step (step 2), and doesn't run the reset commands.

Remove worker node

To remove a worker node from a cluster, complete the following steps:

Try to cleanly reset the node. After the node is reset, the node is removed from the cluster:
```
bmctl reset nodes \
    --addresses COMMA_SEPARATED_IPS \
    --cluster CLUSTER_NAME \
    --kubeconfig ADMIN_KUBECONFIG
```
Replace the following:
- COMMA_SEPARATED_IPS: the IP addresses of the nodes to reset, such as 10.200.0.8,10.200.0.9.
- CLUSTER_NAME: the name of the target cluster that contains the failed nodes.
- ADMIN_KUBECONFIG: the path to the admin cluster kubeconfig file.
If this command succeeds, you can now diagnose the node and fix any misconfigurations that caused the initial failure. Skip the remaining steps in this section.
If the previous step to reset the node fails, forcefully remove the node from the cluster. This forceful removal skips the step that runs the reset commands and only removes the related references to the node from the nodepool and cluster custom resources:
```
bmctl reset nodes \
    --addresses COMMA_SEPARATED_IPS \
    --cluster CLUSTER_NAME \
    --kubeconfig ADMIN_KUBECONFIG \
    --force
```
You can now diagnose the node and fix any misconfigurations that caused the initial failure.

If you forcefully removed the node from the cluster in the previous step, run the bmctl reset command again to reset the nodes:

```
bmctl reset nodes \
    --addresses COMMA_SEPARATED_IPS \
    --cluster CLUSTER_NAME \
    --kubeconfig ADMIN_KUBECONFIG
```
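
The clean-reset-then-force sequence described in this section could be scripted roughly as follows. This is a minimal sketch, not an official tool: the function name reset_or_force_remove and the BMCTL override variable are hypothetical, introduced only for illustration. The bmctl flags are the ones shown in the preceding commands.

```shell
#!/usr/bin/env bash
# Sketch: try a clean reset first; fall back to --force only if it fails.
# BMCTL is a hypothetical override (for example, BMCTL=echo for a dry run);
# it is an assumption of this sketch, not a bmctl feature.
BMCTL="${BMCTL:-bmctl}"

reset_or_force_remove() {
  addresses="$1"
  cluster="$2"
  kubeconfig="$3"

  # Clean path: run the reset commands, then remove the node references.
  if "$BMCTL" reset nodes \
      --addresses "$addresses" \
      --cluster "$cluster" \
      --kubeconfig "$kubeconfig"; then
    echo "clean reset succeeded"
    return 0
  fi

  # Fallback: skip the reset commands and only remove the node references.
  echo "clean reset failed; forcing removal" >&2
  "$BMCTL" reset nodes \
    --addresses "$addresses" \
    --cluster "$cluster" \
    --kubeconfig "$kubeconfig" \
    --force
}
```

You might call it as `reset_or_force_remove 10.200.0.8,10.200.0.9 CLUSTER_NAME ADMIN_KUBECONFIG`. If the forced path runs, the machine itself hasn't been reset, so you still need to run the clean reset again after you repair the node.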

Remove single control plane node

The process is the same as for worker nodes. For control plane nodes, bmctl also cleans the etcd membership.

The cluster stops being in a highly available (HA) state after you remove the failed node. To return to an HA state, add a healthy node to the cluster.

To remove a node from a cluster, complete the following steps:

Try to cleanly reset the node. After the node is reset, the node is removed from the cluster:
```
bmctl reset nodes \
    --addresses COMMA_SEPARATED_IPS \
    --cluster CLUSTER_NAME \
    --kubeconfig ADMIN_KUBECONFIG
```
Replace the following values:
- COMMA_SEPARATED_IPS: the IP addresses of the nodes to reset, such as 10.200.0.8,10.200.0.9.
- CLUSTER_NAME: the name of the target cluster that contains the failed nodes.
- ADMIN_KUBECONFIG: the path to the admin cluster kubeconfig file.
If this command succeeds, you can now diagnose the node and fix any misconfigurations that caused the initial failure. Skip the remaining steps in this section.
If the previous step to reset the node fails, you can forcefully remove the node from the cluster. This forceful removal skips the step that runs the reset commands and only removes the related references to the node from the nodepool and cluster custom resources:
```
bmctl reset nodes \
  --addresses COMMA_SEPARATED_IPS \
  --cluster CLUSTER_NAME \
  --kubeconfig ADMIN_KUBECONFIG \
  --force
```
You can now diagnose the node and fix any misconfigurations that caused the initial failure.
If you forcefully removed the node from the cluster in the previous step, run the bmctl reset command again to reset the nodes:
```
bmctl reset nodes \
  --addresses COMMA_SEPARATED_IPS \
  --cluster CLUSTER_NAME \
  --kubeconfig ADMIN_KUBECONFIG
```
Reset a node when control plane is inaccessible

When the cluster control plane is inaccessible, you can run the following command to revert a machine to its pre-installed state:

```
bmctl reset nodes \
    --addresses NODE_IP_ADDRESSES \
    --ssh-private-key-path SSH_PRIVATE_KEY_PATH \
    --login-user LOGIN_USER \
    --gcr-service-account-key AR_SERVICE_ACCOUNT_KEY
```

Replace the following:

- NODE_IP_ADDRESSES: a comma-separated list of node IP addresses, one for each node that you're resetting.
- SSH_PRIVATE_KEY_PATH: the path of the SSH private key file.
- LOGIN_USER: the username used for passwordless sudo access to the node machines. Unless you explicitly specify a non-root username for node access in the cluster configuration (nodeAccess.loginUser), root is used.
- AR_SERVICE_ACCOUNT_KEY: the path of the Artifact Registry service account JSON key file.

This command doesn't remove references to the node from the nodepool and cluster custom resources. After restoring access to the cluster control plane, you should forcibly remove the node from the cluster if you want to retain the cluster.
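
If you track failed nodes in a text file, a small helper can build the comma-separated value that --addresses expects. The following is only a sketch: the function name join_addresses and the one-IP-per-line file format are assumptions for illustration, not part of bmctl.

```shell
# Sketch: turn a file with one IP address per line into a comma-separated list.
# The input file format (blank lines and # comments allowed) is an assumption.
join_addresses() {
  # Drop blank lines and comment lines, then join the rest with commas.
  grep -Ev '^[[:space:]]*(#|$)' "$1" | paste -sd, -
}
```

For example, you could pass the result to the reset command as `--addresses "$(join_addresses failed-nodes.txt)"`, where failed-nodes.txt is a hypothetical file name.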

Quorum lost in HA control plane

If too many control plane nodes in an HA cluster enter a failed state, the cluster loses quorum and becomes unavailable.
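
To make "too many" concrete, the following sketch computes the quorum size under the assumption that the control plane follows etcd's usual majority rule, where a cluster of n members needs floor(n/2) + 1 healthy members. The function names are hypothetical and used only for illustration.

```shell
# Sketch: majority quorum for an n-member etcd cluster is floor(n/2) + 1.
quorum_size() { echo $(( $1 / 2 + 1 )); }

# Number of member failures the cluster can absorb while keeping quorum.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

quorum_size 3          # prints 2
tolerated_failures 3   # prints 1
```

Under that assumption, a 3-node control plane keeps quorum with one failed node and loses it when a second node fails.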

When you need to restore management clusters, don't provide the kubeconfig file in the reset commands. If you provide the kubeconfig file for a management cluster, it forces a new cluster to perform the reset operation. When you restore a user cluster, provide the path to the kubeconfig file.

To recover a cluster that has lost quorum, run the following command on a remaining healthy node:
```
bmctl restore --control-plane-node CONTROL_PLANE_NODE \
    --cluster CLUSTER_NAME \
    [--kubeconfig KUBECONFIG_FILE]
```
Replace the following:
- CONTROL_PLANE_NODE: the IP address of a healthy node that remains as part of the cluster.
- CLUSTER_NAME: the name of the target cluster that contains the failed nodes.
- KUBECONFIG_FILE: if recovering a user cluster, the path to the user cluster kubeconfig file.
After you recover the failed nodes, run the bmctl reset command to reset the nodes:
```
bmctl reset nodes \
   --addresses COMMA_SEPARATED_IPS \
   --cluster CLUSTER_NAME \
   [--kubeconfig KUBECONFIG_FILE]
```
Replace the following:
- COMMA_SEPARATED_IPS: the IP addresses of the nodes to reset, such as 10.200.0.8,10.200.0.9.
- CLUSTER_NAME: the name of the target cluster that contains the failed nodes.
- KUBECONFIG_FILE: the path to the admin cluster kubeconfig file.
If the failed nodes were part of the load balancer node pools, after the nodes recover they might contend for the control plane virtual IP address and make the new cluster unstable. Run the reset commands against the failed nodes as soon as possible after you recover the nodes.

This process only handles the disaster recovery for a 3-node control plane HA deployment. This process doesn't support the recovery for HA setups with 5 nodes or more.
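
The rule about when to pass the kubeconfig file to the restore command can be captured in a small wrapper. This is a sketch under stated assumptions: the function name restore_quorum and the BMCTL override variable are hypothetical, and the only bmctl flags used are the ones shown in this section.

```shell
# Sketch: pass --kubeconfig only when restoring a user cluster.
# BMCTL is a hypothetical override (for example, BMCTL=echo for a dry run).
BMCTL="${BMCTL:-bmctl}"

restore_quorum() {
  node="$1"              # IP address of a remaining healthy control plane node
  cluster="$2"           # name of the cluster that lost quorum
  kubeconfig="${3:-}"    # leave empty for a management cluster

  if [ -n "$kubeconfig" ]; then
    # User cluster: provide the path to the kubeconfig file.
    "$BMCTL" restore --control-plane-node "$node" \
      --cluster "$cluster" \
      --kubeconfig "$kubeconfig"
  else
    # Management cluster: don't provide the kubeconfig file.
    "$BMCTL" restore --control-plane-node "$node" \
      --cluster "$cluster"
  fi
}
```

Setting `BMCTL=echo` before calling the function prints the command that would run without running it, which lets you review the arguments first.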

What's next

For more information about how to add or remove nodes from a cluster when there isn't a failure and to check node status, see Update clusters.

If you need additional assistance, reach out to Cloud Customer Care. You can also see Getting support for more information about support resources, including the following:

- Requirements for opening a support case.
- Tools to help you troubleshoot, such as your environment configuration, logs, and metrics.
- Supported components.