This page shows how to back up the etcd data store for your Anthos clusters on AWS (GKE on AWS) installation so that you can recover from events that damage your cluster's etcd data.
Using a backup file to restore your etcd data is a last resort. We do not recommend restoring from a backup file unless the cluster is completely broken. Contact Google support for help in deciding the best course of action.
This procedure does not back up data from your workloads, including PersistentVolumes.
This backup cannot be used to restore a cluster from a different version of Anthos clusters on AWS.
Backing up a user cluster
A user cluster backup is a snapshot of the user cluster's etcd store. The etcd store contains all of the Kubernetes objects and custom objects that represent the cluster's state. The snapshot contains the data required to recreate the cluster's stateless workloads.
To create a snapshot of the etcd data store, perform the following steps:
Open a shell on the management service instance running etcd for your cluster.
Find the IP address of your cluster's management service instance.
export CLUSTER_ID=$(terraform output cluster_id)
export MANAGEMENT_IP=$(aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=$CLUSTER_ID-management-0" \
    --query "Reservations[*].Instances[*].PrivateIpAddress" \
    --output text)
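The lookup above returns an empty string (or more than one address) if the Name tag filter does not match exactly one instance. A minimal sketch of a guard follows; the helper name is_ipv4 is hypothetical and not part of the original procedure, and the check only confirms the dotted-quad shape, not that each octet is in range.

```shell
# Hypothetical helper (not part of the original procedure): succeed only if
# the argument is a single line that looks like a dotted-quad IPv4 address.
is_ipv4() {
  [ "$(printf '%s\n' "$1" | wc -l)" -eq 1 ] &&
    printf '%s\n' "$1" | grep -Eqx '([0-9]{1,3}\.){3}[0-9]{1,3}'
}

# Example guard before attempting the SSH connection.
if ! is_ipv4 "${MANAGEMENT_IP:-}"; then
  echo "MANAGEMENT_IP is empty or not a single IPv4 address" >&2
fi
```

If the guard prints its warning, re-check the CLUSTER_ID value and the instance's Name tag before continuing.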
Use the ssh tool to open a connection to the management service instance.
If you can reach the instance directly:
ssh -i ~/.ssh/anthos-gke ubuntu@$MANAGEMENT_IP
If you connect through a bastion host:
export BASTION_DNS=$(terraform output bastion_dns_name)
ssh -i ~/.ssh/anthos-gke -J ubuntu@$BASTION_DNS ubuntu@$MANAGEMENT_IP
Create a directory to store the etcd backup data.
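No command accompanies this step in the text. A minimal sketch, assuming the directory is the ./etcd-backups path that the later cp commands write to; the chmod is an added precaution, not something the original procedure specifies.

```shell
# Assumption: the directory name matches the ./etcd-backups path used by the
# cp commands in the later steps.
mkdir -p ./etcd-backups
# The directory will hold private keys, so restrict it to the current user.
chmod 700 ./etcd-backups
```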
Use the ps command-line tool to find the process ID of the etcd process on that instance.
ps -e | grep etcd
The output shows details of your etcd process. The first element is etcd's process ID. In the following steps, replace ETCD_PID with this process ID.
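Reading the process ID by eye works, but grep etcd can also match other processes such as etcdctl or the grep command itself. The following sketch extracts the PID of a process whose command name is exactly etcd; the helper name etcd_pid_from_ps is hypothetical, and it assumes the default ps -e column layout in which the PID is the first field and the command name is the last.

```shell
# Hypothetical helper: read `ps -e` style output on stdin and print the PID
# of the first process whose command name (last column) is exactly "etcd".
etcd_pid_from_ps() {
  awk '$NF == "etcd" { print $1; exit }'
}
```

On the management service instance it could be used as `ETCD_PID=$(ps -e | etcd_pid_from_ps)`, after which `$ETCD_PID` can stand in for the ETCD_PID placeholder in the following commands. If the variable is empty, no matching process was found.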
Create a script within the etcd container's filesystem to take a snapshot. This script runs etcdctl to connect to the etcd daemon and perform a snapshot to back up the etcd database.
cat << EOT > /tmp/etcdbackup.sh
# Extract a snapshot of the anthos-gke etcd state database
export ETCDCTL_API=3
etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/secrets/server-ca.crt \
  --cert=/secrets/server.crt \
  --key=/secrets/server.key \
  snapshot save /tmp/snapshot.db
EOT
chmod a+x /tmp/etcdbackup.sh
sudo mv /tmp/etcdbackup.sh /proc/ETCD_PID/root/tmp/etcdbackup.sh
Use the nsenter command to run the script within the etcd container to create the snapshot.
sudo nsenter --all --target ETCD_PID /tmp/etcdbackup.sh
Copy the snapshot file out of the etcd container.
sudo cp /proc/ETCD_PID/root/tmp/snapshot.db ./etcd-backups
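Before continuing, it may be worth confirming that the copied snapshot actually exists and is not empty. The sketch below is only a sanity check under that assumption; the helper name require_nonempty_file is hypothetical, and it does not validate the snapshot's contents.

```shell
# Hypothetical helper: fail unless the given file exists and is non-empty.
# This is only a sanity check; it does not inspect the snapshot's contents.
require_nonempty_file() {
  if [ ! -s "$1" ]; then
    echo "missing or empty file: $1" >&2
    return 1
  fi
}

# Example call on the management service instance:
#   require_nonempty_file ./etcd-backups/snapshot.db
```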
Copy all files in the /secrets directory of the etcd container to your backup directory. These files contain the certificates that encrypt and validate communication between etcd and other processes in the cluster. Together, the snapshot file and the certificate files are a full backup of your etcd cluster state.
sudo cp -r /proc/ETCD_PID/root/secrets ./etcd-backups
Use the tar tool to bundle the contents of the etcd-backups directory into a single tar file.
tar -cvf etcd-backup.tar etcd-backups
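Because a full backup needs both the snapshot and the certificate files, a quick look at the archive listing can catch a missing piece before the file leaves the instance. This is a rough sketch; the helper name backup_tar_looks_complete is hypothetical, and it only checks entry names, not file contents.

```shell
# Hypothetical helper: succeed only if the archive lists a snapshot.db file
# and at least one entry under a secrets/ directory.
backup_tar_looks_complete() {
  listing=$(tar -tf "$1") || return 1
  printf '%s\n' "$listing" | grep -q '/snapshot\.db$' || return 1
  printf '%s\n' "$listing" | grep -q '/secrets/.' || return 1
}

# Example call on the management service instance:
#   backup_tar_looks_complete etcd-backup.tar && echo "archive looks complete"
```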
Exit to your local machine and use the scp tool to copy the etcd-backup.tar file from the management service instance. This example uses the BASTION_DNS and MANAGEMENT_IP environment variables defined earlier.
scp -i ~/.ssh/anthos-gke -J ubuntu@$BASTION_DNS \
    ubuntu@$MANAGEMENT_IP:~/etcd-backup.tar .