This page describes how to schedule backups for Cassandra without Cloud Storage. With this method, backups are stored on a remote server that you specify instead of in a Cloud Storage bucket. Apigee uses SSH to communicate with the remote server.
You must schedule the backups as cron jobs. Once a backup schedule has been applied to your hybrid cluster, a Kubernetes backup job is periodically executed according to the schedule in the runtime plane. The job triggers a backup script on each Cassandra node in your hybrid cluster that collects all the data on the node, creates an archive (compressed) file of the data, and sends the archive to the server specified in your overrides.yaml file.
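To make the flow above concrete, the following is a minimal sketch of the general archive-and-send pattern. It is not the script that Apigee hybrid runs, whose internals are not shown on this page; the function names and the use of tar and scp are assumptions chosen purely for illustration.

```shell
# Illustrative only: NOT the backup script that Apigee hybrid runs.

# archive_directory SRC_DIR ARCHIVE_FILE
# Creates a gzip-compressed tar archive of everything under SRC_DIR.
archive_directory() {
  tar -czf "$2" -C "$1" .
}

# send_archive KEY_FILE ARCHIVE_FILE SERVER REMOTE_DIR
# Copies the archive to the backup server over SSH as the apigee user.
send_archive() {
  scp -i "$1" "$2" "apigee@$3:$4/"
}
```

The two steps are kept separate so that the archive can be inspected locally (for example with tar -tzf) before anything is sent over the network.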
The following steps include common examples for completing specific tasks, like creating an SSH key pair. Use the methods that are appropriate to your installation.
The procedure has the following parts:
- Set up the server and SSH
- Set the schedule and destination for backup
Set up the server and SSH
- Designate a Linux or Unix server for your backups. This server must be reachable using SSH from your Apigee hybrid runtime plane. It must have enough storage for your backups.
- Set up an SSH server on the server, or ensure that it has a secure SSH server configured.
- Create an SSH key pair and store the private key file in a path that is accessible from your hybrid runtime plane. You must use a blank password for your key pair or the backup will fail. For example:
ssh-keygen -t rsa -b 4096 -C exampleuser@example.com
Enter file in which to save the key (/Users/exampleuser/.ssh/id_rsa): $APIGEE_HOME/hybrid-files/certs/ssh_key
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in ssh_key
Your public key has been saved in ssh_key.pub
The key fingerprint is:
SHA256:DWKo334XMZcZYLOLrd/8HNpjTERPJJ0mc11UYmrPvSA exampleuser@example.com
The key's randomart image is:
+---[RSA 4096]----+
| +. ++X|
| . . o.=.*+|
| . o . . o==o |
| . . . =oo+o...|
| . S +E oo .|
| . . .. . o .|
| . . . . o.. |
| . ...o ++. |
| .. .. +o+. |
+----[SHA256]-----+
Where: exampleuser@example.com is a string. Any string that follows -C in the ssh-keygen command becomes a comment included in the newly created ssh key. The input string can be any string. When you use an account name in the form of exampleuser@example.com, you can quickly identify which account goes with the key.
- Create a user account on the backup server with the name apigee. Make sure the new apigee user has a home directory under /home.
- On the backup server, create an .ssh directory in the new /home/apigee directory.
- Copy the public key (ssh_key.pub in the previous example) into a file named authorized_keys in the new /home/apigee/.ssh directory. For example:
cd /home/apigee
mkdir .ssh
cd .ssh
vi authorized_keys
- On your backup server, create a backup directory within the /home/apigee/ directory. The backup directory can be any directory as long as the apigee user has access to it. For example:
cd /home/apigee
mkdir cassandra-backup
- Test the connection. You need to make sure that your Cassandra pods can connect to your backup server using SSH:
- Log in to the shell of your Cassandra pod. For example:
kubectl exec -it -n APIGEE_NAMESPACE APIGEE_CASSANDRA_DEFAULT_0 -- /bin/bash
Where APIGEE_CASSANDRA_DEFAULT_0 is the name of a Cassandra pod. Change this to the name of the pod you want to connect from.
- Connect by SSH to your backup server, using the private SSH key mounted in the Cassandra pod and the server IP address:
ssh -i /var/secrets/keys/key apigee@BACKUP_SERVER_IP
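The authorized_keys steps above can also be scripted rather than edited by hand in vi. The following is a minimal sketch, assuming a POSIX shell on the backup server; the function name install_public_key and its two arguments are hypothetical. It also sets restrictive permissions, because with OpenSSH's default StrictModes setting, sshd does not accept keys from files or directories that other users can write to.

```shell
# install_public_key HOME_DIR PUBLIC_KEY_FILE
# Hypothetical helper: appends PUBLIC_KEY_FILE to HOME_DIR/.ssh/authorized_keys
# and tightens permissions on the directory and the file.
install_public_key() {
  home_dir="$1"
  public_key_file="$2"
  ssh_dir="$home_dir/.ssh"

  mkdir -p "$ssh_dir"
  # Append rather than overwrite, so keys that are already authorized are kept.
  cat "$public_key_file" >> "$ssh_dir/authorized_keys"
  # Only the owner may read, write, or enter the .ssh directory.
  chmod 700 "$ssh_dir"
  # Only the owner may read or write the authorized_keys file.
  chmod 600 "$ssh_dir/authorized_keys"
}
```

For example, when run as the apigee user: install_public_key /home/apigee /tmp/ssh_key.pub (the /tmp location is an assumption about where you copied the public key). If you run it as root instead, the .ssh directory and its contents must afterwards be owned by the apigee user. Similarly, ssh-keygen accepts -N "" (empty passphrase) and -f (output file) if you prefer to create the key pair without interactive prompts.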
Set the schedule and destination for backup
You set the schedule and destination for backups in your overrides.yaml file.
- Add the following parameters to your overrides.yaml file:
Parameters:
cassandra:
  backup:
    enabled: true
    keyFile: "PATH_TO_PRIVATE_KEY_FILE"
    server: "BACKUP_SERVER_IP"
    storageDirectory: "/home/apigee/BACKUP_DIRECTORY"
    cloudProvider: "HYBRID" # required verbatim "HYBRID" (all caps)
    schedule: "SCHEDULE"
Example:
cassandra:
  backup:
    enabled: true
    keyFile: "private.key" # path relative to apigee-datastore path
    server: "34.56.78.90"
    storageDirectory: "/home/apigee/cassbackup"
    cloudProvider: "HYBRID"
    schedule: "0 2 * * *"
Where:
| Property | Description |
| --- | --- |
| backup:enabled | Backup is disabled by default. You must set this property to true. |
| backup:keyFile | PATH_TO_PRIVATE_KEY_FILE: The path on your local file system to the SSH private key file (named ssh_key in the step where you created the SSH key pair). This path must be relative to the apigee-datastore chart directory. |
| backup:server | BACKUP_SERVER_IP: The IP address of your backup server. |
| backup:storageDirectory | BACKUP_DIRECTORY: The name of the backup directory on your backup server. This must be a directory within /home/apigee (the backup directory is named cassandra-backup in the step where you created the backup directory). |
| backup:cloudProvider | GCP/HYBRID: For a Cloud Storage backup, set the property to GCP. For example, cloudProvider: "GCP". For a remote server backup, set the property to HYBRID. For example, cloudProvider: "HYBRID". |
| backup:schedule | SCHEDULE: The time when the backup starts, specified in standard crontab syntax. Times are in the local time zone of the Kubernetes cluster. Default: 0 2 * * * |
- Apply the backup configuration to the storage scope of your cluster:
helm upgrade datastore apigee-datastore/ \
  --namespace APIGEE_NAMESPACE \
  --atomic \
  -f OVERRIDES_FILE.yaml
Where OVERRIDES_FILE is the path to the overrides file you just edited.
- Verify the backup job. For example:
kubectl get cronjob -n APIGEE_NAMESPACE
NAME                      SCHEDULE     SUSPEND   ACTIVE   LAST SCHEDULE   AGE
apigee-cassandra-backup   33 * * * *   False     0        <none>          94s
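Because a malformed SCHEDULE value is easy to miss, it can help to sanity-check it before running helm upgrade. The following sketch, assuming a POSIX shell, only checks that the value has the five whitespace-separated fields of standard crontab syntax (minute, hour, day of month, month, day of week); it does not validate the range of any individual field. The function name has_five_cron_fields is hypothetical.

```shell
# has_five_cron_fields SCHEDULE
# Succeeds (exit status 0) only if SCHEDULE has exactly five fields.
# The body runs in a subshell (note the parentheses) so that "set -f"
# does not leak into the caller's shell.
has_five_cron_fields() (
  # Disable globbing so "*" is not expanded into file names.
  set -f
  # Split the schedule on whitespace into positional parameters.
  # shellcheck disable=SC2086
  set -- $1
  [ "$#" -eq 5 ]
)
```

For example, has_five_cron_fields "0 2 * * *" && echo "schedule looks well-formed" prints the message, while a value with a missing field, such as "0 2 * *", makes the function return a non-zero status.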