This page describes how to use bmctl to back up and restore clusters created with Google Distributed Cloud. These instructions apply to all cluster types supported by Google Distributed Cloud.
The bmctl backup and restore process does not include persistent volumes. Any volumes created by the local volume provisioner (LVP) are left unaltered.
Back up a cluster
The bmctl backup cluster command adds the cluster information from the etcd store and the PKI certificates for the specified cluster to a tar file. The etcd store is the Kubernetes backing store for all cluster data and contains all the Kubernetes objects and custom objects required to manage cluster state. The PKI certificates are used for authentication over TLS. This data is backed up from the cluster's control plane or, for a high-availability (HA) deployment, from one of the control planes.
The backup tar file contains sensitive credentials, including your service account keys and the SSH key. Store backup files in a secure location. To prevent unintended file exposure, the Google Distributed Cloud backup process uses in-memory files only.
Back up your clusters regularly to ensure your snapshot data is relatively current. Adjust the rate of backups to reflect the frequency of significant changes to your clusters.
The bmctl version you use to back up a cluster must match the version of the managing cluster.
To back up a cluster:
Ensure your cluster is operating properly, with working credentials and SSH connectivity to all nodes.
The intent of the backup process is to capture your cluster in a known good state, so that you can restore operation if a catastrophic failure occurs.
Use the following command to check your cluster:
bmctl check cluster -c CLUSTER_NAME --kubeconfig ADMIN_KUBECONFIG
Replace the following:
CLUSTER_NAME: the name of the cluster you plan to back up.
ADMIN_KUBECONFIG: the path of the kubeconfig file for the admin cluster.
Run the following command to ensure the target cluster is not in a reconciliation state:
kubectl describe cluster CLUSTER_NAME -n CLUSTER_NAMESPACE --kubeconfig ADMIN_KUBECONFIG
Replace the following:
CLUSTER_NAME: the name of the cluster to back up.
CLUSTER_NAMESPACE: the namespace for the cluster. By default, the cluster namespace for Google Distributed Cloud is the name of the cluster prefixed with cluster-. For example, if you name your cluster test, the namespace has a name like cluster-test.
ADMIN_KUBECONFIG: the path of the kubeconfig file for the admin cluster.
Check the Status section in the command output for Conditions of type Reconciling.
As shown in the following example, a status of False for these Conditions means the cluster is stable and ready to be backed up.
    ...
    Status:
      ...
      Cluster State: Running
      ...
      Control Plane Node Pool Status:
        ...
        Conditions:
          Last Transition Time: 2023-11-03T16:37:15Z
          Observed Generation: 1
          Reason: ReconciliationCompleted
          Status: False
          Type: Reconciling
    ...
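The check in this step can be scripted. The following is a minimal sketch, not part of bmctl or kubectl: cluster_is_stable is a hypothetical helper that reads kubectl describe output on standard input. It assumes that each condition lists its Status line before its Type line, as in the example output; if your output is laid out differently, the parsing would need to change.

```shell
#!/bin/sh
# Hypothetical helper (not a bmctl or kubectl feature).
# Reads `kubectl describe cluster` output on stdin and succeeds only if at
# least one condition of type Reconciling is present and every such
# condition has a status of False.
# Assumption: within a condition, "Status:" appears before "Type:".
cluster_is_stable() {
  awk '
    # Remember the most recent "Status: <value>" line. Section headers such
    # as a bare "Status:" have no value and are skipped by the NF test.
    $1 == "Status:" && NF == 2 { status = $2 }
    # A Reconciling condition is acceptable only if its status was False.
    $1 == "Type:" && $2 == "Reconciling" {
      seen = 1
      if (status != "False") unstable = 1
    }
    # Any "Type:" line ends the current condition, so forget its status.
    $1 == "Type:" { status = "" }
    END { if (seen && !unstable) exit 0; else exit 1 }
  '
}
```

One possible use is to pipe the describe command from this step into the helper, for example `kubectl describe cluster CLUSTER_NAME -n CLUSTER_NAMESPACE --kubeconfig ADMIN_KUBECONFIG | cluster_is_stable`, and proceed with the backup only when it succeeds.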
Run the following command to back up the cluster:
bmctl backup cluster -c CLUSTER_NAME --kubeconfig ADMIN_KUBECONFIG
Replace the following:
CLUSTER_NAME: the name of the cluster to back up.
ADMIN_KUBECONFIG: the path to the admin cluster kubeconfig file.
The backup tar file is saved to the workspace directory (bmctl-workspace, by default) on your admin workstation. The tar file is named CLUSTER_NAME_backup_TIMESTAMP.tar.gz, where CLUSTER_NAME is the name of the cluster being backed up and TIMESTAMP is the date and time the backup was made. For example, if the cluster name is testuser, the backup file has a name like testuser_backup_2006-01-02T150405Z0700.tar.gz.
To specify a different name and location for your backup file, use the --backup-file flag.
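As an illustration of the --backup-file flag, the following sketch checks a cluster and then writes the backup to a directory of your choice. The function names, the directory argument, and the timestamp format are assumptions made for this example; only the bmctl commands and flags come from the steps on this page.

```shell
#!/bin/sh
# Hypothetical helper: prints DIR/CLUSTER_backup_TIMESTAMP.tar.gz.
# The timestamp format is chosen for illustration and is not necessarily
# the format that bmctl uses for its default file names.
backup_file_path() {
  dir=$1 cluster=$2
  stamp=${3:-$(date -u +%Y-%m-%dT%H%M%SZ)}
  printf '%s/%s_backup_%s.tar.gz\n' "${dir%/}" "$cluster" "$stamp"
}

# Hypothetical wrapper: runs the health check first and backs up only if
# the check succeeds, so the backup captures a known good state.
run_backup() {
  cluster=$1 kubeconfig=$2 dir=$3
  bmctl check cluster -c "$cluster" --kubeconfig "$kubeconfig" || return 1
  bmctl backup cluster -c "$cluster" --kubeconfig "$kubeconfig" \
    --backup-file "$(backup_file_path "$dir" "$cluster")"
}
```

For example, `run_backup testuser ADMIN_KUBECONFIG /secure/backups` would write the archive under /secure/backups, a placeholder for a location with restricted access.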
The backup file expires after a year and the cluster restore process doesn't work with expired backup files.
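Because expired backups can't be restored, it can help to flag archives that are getting old. This page doesn't state how the expiry is determined, so the sketch below uses the file modification time only as a rough proxy. The function name, the file name pattern, and the 300-day default threshold are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: lists backup archives under DIR whose modification
# time is more than DAYS days in the past (default: 300).
# File age is an approximation; it is not how bmctl decides expiry.
stale_backups() {
  find "$1" -type f -name '*_backup_*.tar.gz' -mtime "+${2:-300}"
}
```

Running `stale_backups bmctl-workspace` prints one path per line for each archive that is due to be replaced by a fresh backup.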
Restore a cluster
Restoring a cluster from a backup is a last resort and should be used only when a cluster has failed catastrophically and cannot be returned to service any other way, for example, when the etcd data is corrupted or the etcd Pod is in a crash loop.
The backup tar file contains sensitive credentials, including your service account keys and the SSH key. To prevent unintended file exposure, the Google Distributed Cloud restore process uses in-memory files only.
The bmctl version you use to restore a cluster must match the version of the managing cluster.
To restore a cluster:
Ensure all node machines that were available for the cluster at the time of the backup are operating properly and reachable.
Ensure that SSH connectivity between nodes works with the SSH keys that were used at the time of the backup.
These SSH keys are reinstated as part of the restore process.
Ensure that the service account keys that were used at the time of the backup are still active.
These service account keys are reinstated for the restored cluster.
To restore an admin, hybrid, or standalone cluster, run the following command:
bmctl restore cluster -c CLUSTER_NAME --backup-file BACKUP_FILE
Replace the following:
CLUSTER_NAME: the name of the cluster you are restoring.
BACKUP_FILE: the path and name of the backup file you are using.
To restore a user cluster, run the following command:
bmctl restore cluster -c CLUSTER_NAME --backup-file BACKUP_FILE \
    --kubeconfig ADMIN_KUBECONFIG
Replace the following:
CLUSTER_NAME: the name of the cluster you are restoring.
BACKUP_FILE: the path and name of the backup file you are using.
ADMIN_KUBECONFIG: the path to the admin cluster kubeconfig file.
At the end of the restore process, a new kubeconfig file is generated for the restored cluster.
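The two restore commands differ only in the --kubeconfig flag, which is needed for user clusters. The following sketch shows one way to assemble the right command. restore_command is a hypothetical helper, and it only prints the command so that you can review it before running a restore.

```shell
#!/bin/sh
# Hypothetical helper: prints the bmctl restore command for a cluster.
# Pass the admin kubeconfig path as the third argument only when restoring
# a user cluster; omit it for admin, hybrid, and standalone clusters.
restore_command() {
  cluster=$1 backup=$2 admin_kubeconfig=${3:-}
  set -- bmctl restore cluster -c "$cluster" --backup-file "$backup"
  if [ -n "$admin_kubeconfig" ]; then
    set -- "$@" --kubeconfig "$admin_kubeconfig"
  fi
  # Joining with spaces is for display only; paths that contain spaces
  # would need quoting before the printed command is run.
  printf '%s\n' "$*"
}
```

Printing instead of executing is a deliberate choice here: a restore is a last resort, so a review step before running the command is a reasonable safeguard.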