In an Anthos clusters on VMware (GKE on-prem) implementation, the control-plane VM for an admin cluster has two attached disks:
- The boot disk has the operating system for the VM.
- The data disk has credentials and the etcd database, which stores the state of the admin cluster. That is, the data disk stores all of the Kubernetes objects for the admin cluster.
This page shows how to recover when the control-plane VM is lost or the boot disk is compromised. For example:
- The boot disk becomes read-only because of journal log spam.
- The Docker overlay filesystem gets corrupted.
This page does not cover recovery of the data disk. For instructions on how to recover the data disk, see Restoring an admin cluster.
Repairing the control-plane VM
To repair the admin cluster's control-plane VM, run the following command:
gkectl repair admin-master --config ADMIN_CLUSTER_CONFIG --kubeconfig ADMIN_CLUSTER_KUBECONFIG
Replace the following:
- ADMIN_CLUSTER_CONFIG: the path of your admin cluster configuration file.
- ADMIN_CLUSTER_KUBECONFIG: the path of your admin cluster's kubeconfig file.
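As an illustration, the command above can be wrapped in a small shell function that first checks that both files are readable. This is a minimal sketch, not part of gkectl: the function name repair_admin_master and the DRY_RUN variable are hypothetical helpers introduced here, and only the gkectl repair admin-master invocation itself comes from this page.

```shell
# Hypothetical wrapper: verify both input files, then run the repair command.
# Usage: repair_admin_master ADMIN_CLUSTER_CONFIG ADMIN_CLUSTER_KUBECONFIG
repair_admin_master() {
  local config="$1" kubeconfig="$2" f

  # Fail early if either file is missing or unreadable.
  for f in "$config" "$kubeconfig"; do
    if [ ! -r "$f" ]; then
      echo "error: cannot read file: $f" >&2
      return 1
    fi
  done

  # DRY_RUN=1 (an assumption of this sketch) prints the command instead of running it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo gkectl repair admin-master --config "$config" --kubeconfig "$kubeconfig"
  else
    gkectl repair admin-master --config "$config" --kubeconfig "$kubeconfig"
  fi
}
```

For example, with placeholder file names, DRY_RUN=1 repair_admin_master admin-cluster.yaml kubeconfig prints the command that would run, which is a way to confirm the paths before starting the repair.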
The admin cluster's control-plane VM is cloned into a VM template, which has all the information needed to re-create the VM. The gkectl repair admin-master command uses the VM template to create a new VM. Then it attaches a new boot disk and the existing data disk.
If your cluster nodes get their addresses from a DHCP server, the new VM might have a different IP address from the original VM.
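One way to see which address the new VM received is to list node addresses with kubectl once the repair has finished. The sketch below assumes that the admin cluster's kubeconfig still reaches the API server after the repair. The function name list_node_ips is a hypothetical helper; the kubectl get nodes call and its JSONPath output option are standard kubectl features.

```shell
# Hypothetical helper: print each node's name and InternalIP, one node per line.
# Usage: list_node_ips ADMIN_CLUSTER_KUBECONFIG
list_node_ips() {
  local kubeconfig="$1"
  kubectl --kubeconfig "$kubeconfig" get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
}
```

Compare the control-plane node's address in this output with the address you had on record for the original VM.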