Cassandra CSI backup and restore

You can back up and restore your hybrid data using CSI (Container Storage Interface) snapshots. CSI backup triggers disk snapshots taken by the underlying storage system using the provided CSI driver. CSI backup does not need a Google Cloud Storage bucket or a remote server to store backup data.

CSI backup is recommended for hybrid instances hosted in Google Cloud, AWS, or Azure.

This page describes the steps to use hybrid CSI backup and restore. For an overview of hybrid backup and restore in general, see the Cassandra backup and restore overview.

Backup and restore limitations

Be aware of these limitations when using CSI backup and restore:

  • The CSI driver used by the configured storage class must support CSI snapshots. See this Kubernetes CSI driver list for driver information.
  • Only the Google Cloud, AWS, and Azure platforms are supported.
  • OpenShift Container Platform is not supported due to volume snapshot limitations.
  • Only cloud platforms are supported. On-prem platforms are not supported.
  • CSI backup data and non-CSI hybrid backup data are incompatible. Non-CSI backups cannot be used with CSI restore and CSI backups cannot be used with non-CSI restore.
  • CSI driver installation and functionality are the responsibility of the CSI driver vendor.
  • Users are responsible for ensuring adequate cluster resources are available for provisioning CSI snapshots.
  • Users are responsible for removing old snapshot data.

Set up CSI backups

To schedule hybrid backups using CSI, perform the following steps:

  1. If you have not previously set up hybrid backup:
    1. Run the following create-service-account command to create a Google Cloud service account (SA) with the standard roles/storage.objectAdmin role. This SA role allows you to write backup data to Cloud Storage. Execute the following command in the directory appropriate for your management tool:
      • Helm charts: $APIGEE_HELM_CHARTS_HOME/apigee-operator/etc/
      • apigeectl: HYBRID_BASE_DIRECTORY/hybrid-files/
      ./tools/create-service-account --env non-prod --dir ./service-accounts

      This command creates a single service account named apigee-non-prod for use in non-production environments and places the downloaded key file in the ./service-accounts directory.

      For more information about Google Cloud service accounts, see Creating and managing service accounts.

    2. The create-service-account command saves a JSON file containing the service account private key. The file is saved in the directory specified by the --dir option (./service-accounts in the command above). You will need the path to this file in the following steps.
  2. Open your overrides.yaml file. Set the following parameters, as shown in Example overrides files.

    1. Set the general parameters shown below in the backup block. If you've already set these parameters for the non-CSI hybrid backup solution, you can use the same parameters for your CSI snapshots. See the backup properties reference table for more information about each value.

      For backup:

      • enabled: Set to true to enable scheduled backups.
      • pullPolicy in image: Set to Always.
      • schedule: Provide a cron expression schedule.
    2. Set these parameters for CSI-specific backup:
      • Cassandra storage group values: The configured Cassandra storage class must support CSI snapshots for CSI backup and restore to work. To check if a storage class supports CSI snapshots, run the following command to get the available storage classes:
        kubectl get sc
        Look at the "Provisioner" output for each storage class. Provisioners that use CSI usually include ".csi." in their name, for example "pd.csi.storage.gke.io". Look for the provisioner name in this Kubernetes CSI driver list. If the "Other Features" column for the provisioner contains the word "SNAPSHOT", then a storage class that uses the provisioner supports CSI snapshots.

        Add these parameters in the storage group. Both values are required.

        • storageclass: A CSI snapshot-enabled storage class name.
        • capacity: The capacity of the disk.
      • Cloud provider type:

        Once the CSI snapshot capability has been verified, modify the overrides file to use CSI backup and restore:

        • cloudProvider: Set cloudProvider in backup and restore to CSI.
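
The provisioner check described in the storage class step can be scripted. The following is a minimal sketch and not part of the product tooling: csi_storage_classes is a hypothetical helper name, and the match on ".csi." is only the naming heuristic mentioned in that step, so any class it prints still needs to be confirmed against the CSI driver list for snapshot support.

```shell
# Hypothetical helper: reads "NAME PROVISIONER" pairs on stdin and prints the
# names of storage classes whose provisioner name contains ".csi.".
csi_storage_classes() {
  awk '$2 ~ /\.csi\./ { print $1 }'
}

# Example usage against a live cluster (requires kubectl access):
#   kubectl get sc --no-headers \
#     -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner \
#     | csi_storage_classes
```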

Example backup config

This section shows the backup-related portions of an example overrides.yaml file.
cassandra:
  hostNetwork: false
  replicaCount: 3
  storage:
    storageclass: standard-rwo
    capacity: 100Gi
  image:
    pullPolicy: Always

  backup:
    enabled: true
    image:
      pullPolicy: Always
    cloudProvider: "CSI"
    schedule: "0 * * 11 *"
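
The schedule value is a standard five-field cron expression. As a reading aid, the sketch below labels each field of an expression. describe_cron is a hypothetical helper written for this illustration; it only splits the fields and does not validate their ranges.

```shell
# Hypothetical helper: prints the five cron fields of "$1" with their meanings.
describe_cron() {
  (
    set -f       # stop the shell from expanding "*" as a filename glob
    set -- $1    # split the expression into positional parameters
    [ "$#" -eq 5 ] || { echo "expected 5 fields, got $#" >&2; exit 1; }
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
      "$1" "$2" "$3" "$4" "$5"
  )
}

describe_cron "0 * * 11 *"
```

Applied to the example value, this shows that the job runs at minute 0 of every hour, but only during month 11. Adjust the expression to match the backup frequency you need.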

Launch a manual backup

CSI backups are generated automatically according to the cron schedule set in the overrides.yaml file.

To initiate a manual CSI backup, use this command:

kubectl create job -n apigee --from=cronjob/apigee-cassandra-backup backup-pod-name
where backup-pod-name is the name given to the backup job; the pod that the job creates takes this name as its prefix.
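
Job names must be unique within a namespace, so reusing the name of an earlier manual run fails. One way to avoid collisions is to derive the name from the current time. This is a minimal sketch: manual_backup_job_name and the "manual-csi-backup-" prefix are hypothetical choices, not names required by hybrid.

```shell
# Hypothetical helper: builds a unique, lowercase job name from the current UTC time.
manual_backup_job_name() {
  printf 'manual-csi-backup-%s\n' "$(date -u +%Y%m%d%H%M%S)"
}

# Example usage (requires kubectl access to the cluster):
#   kubectl create job -n apigee \
#     --from=cronjob/apigee-cassandra-backup "$(manual_backup_job_name)"
```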

Verify backups

One way to verify a backup was successfully created is to check the volume snapshots on the Kubernetes cluster, using this command:

kubectl get volumesnapshot -n apigee

The output shows the current list of snapshots on the cluster. The CSI backup process creates a snapshot of each Cassandra disk. The number of snapshots generated by a backup should match the total number of Cassandra pods in the cluster.
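
The count comparison can be automated. The sketch below is an assumption-laden illustration, not product tooling: snapshots_ok is a hypothetical helper, and it relies on the readyToUse status field of the VolumeSnapshot resource. Because older snapshots remain on the cluster until you remove them, filter the list to a single backup before counting.

```shell
# Hypothetical helper: reads "NAME READY" pairs on stdin (one snapshot per line)
# and succeeds only if the line count equals $1 and every READY value is "true".
snapshots_ok() {
  awk -v want="$1" '
    $2 != "true" { bad++ }
    END { exit !(NR == want && bad == 0) }
  '
}

# Example usage (requires kubectl access). TIMESTAMP is a placeholder for the
# timestamp of one backup, and 3 is the replicaCount from the example overrides:
#   kubectl get volumesnapshot -n apigee --no-headers \
#     -o custom-columns=NAME:.metadata.name,READY:.status.readyToUse \
#     | grep -- "-TIMESTAMP-" | snapshots_ok 3 && echo "all snapshots ready"
```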

Restore a backup

Use this process to restore a previously generated CSI backup. For general information on restoring backups and an overview of the process, see the restore overview page.

To initiate a restore of a CSI backup, follow the instructions for the hybrid non-CSI single region restore, but use these values in the restore block in your overrides.yaml. See the backup properties reference table for more information about each value and the example restore configuration for an example.

  • enabled: Set to true to enable restore for the backup referenced by snapshotTimestamp.
  • snapshotTimestamp: Provide the timestamp of a previous CSI backup.
  • cloudProvider: Set to CSI.
  • pullPolicy in image: Set to Always.

To find the snapshotTimestamp value to restore, run this command to get the list of available snapshots:

kubectl get volumesnapshot -n apigee
In the returned list, the names of the snapshots contain the timestamp:
pvc-us-west2-b-20220803004907-47beff0e306d8861
In this example the timestamp is 20220803004907.
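
When many snapshots exist, it can help to list only the distinct timestamps. The following sketch assumes every snapshot name follows the pattern of the example above (a prefix, a 14-digit timestamp, then a lowercase hexadecimal suffix); snapshot_timestamps is a hypothetical helper name.

```shell
# Hypothetical helper: reads snapshot names on stdin and prints each distinct
# 14-digit timestamp, assuming the <prefix>-<timestamp>-<hex suffix> pattern.
snapshot_timestamps() {
  sed -nE 's/.*-([0-9]{14})-[0-9a-f]+$/\1/p' | sort -u
}

# Example usage (requires kubectl access):
#   kubectl get volumesnapshot -n apigee --no-headers \
#     -o custom-columns=NAME:.metadata.name | snapshot_timestamps
```

Names that do not match the assumed pattern are skipped rather than reported, so compare the result with the full snapshot list if a timestamp you expect is missing.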

Example restore config

This section shows the restore-related portions of an example overrides.yaml file.
cassandra:
  hostNetwork: false
  replicaCount: 3
  storage:
    storageclass: standard-rwo
    capacity: 100Gi
  image:
    pullPolicy: Always

  restore:
    enabled: true
    snapshotTimestamp: "20220908222130"
    cloudProvider: "CSI"
    image:
      pullPolicy: Always

Migrate to CSI backup and restore

If you have not previously used hybrid backup and restore, you can follow the instructions in Set up CSI backups to create new CSI backups and skip the steps in this section. These steps guide you through migrating from the non-CSI backup and restore solution to CSI backups.

  1. Generate a new backup using the currently configured non-CSI backup method.
  2. Change the backup configuration in the hybrid overrides.yaml file to use the CSI backup overrides as shown in the example backup config.
  3. Apply the changes in the overrides.yaml file:

    Helm

    helm upgrade datastore apigee-datastore/ \
      --namespace apigee \
      --atomic \
      -f OVERRIDES_FILE.yaml
    

    apigeectl

    $APIGEECTL_HOME/apigeectl apply -f OVERRIDES_FILE.yaml
  4. Verify the backup job:
    kubectl get cronjob -n apigee
  5. After a backup job completes, verify that snapshots have been created. The number of generated snapshots should equal the number of Cassandra nodes in the hybrid instance.
    kubectl get volumesnapshot -n apigee