Restoring in a single region

This page describes how to restore Cassandra in a single region.

In a single-region deployment, Apigee hybrid is deployed in a single data center or region. If you have multiple Apigee organizations in your deployment, the restore process restores data for all of the organizations. In a multi-organization setup, you cannot restore a specific organization.

Restoring a region from a backup

Choose the instructions below for the management tool you are using for Apigee hybrid:

Helm

  1. Update the Cassandra restore details in the overrides.yaml file:

    namespace: YOUR_RESTORE_NAMESPACE # Use the same namespace as in your original cluster.
    cassandra:
      hostNetwork: false
      ...
      restore:
        enabled: true
        serviceAccountPath: "SA_JSON_FILE_PATH"
        dbStorageBucket: "CLOUD_STORAGE_BUCKET_PATH"
        cloudProvider: "GCP"  # required verbatim "GCP" (all caps)
        snapshotTimestamp: "TIMESTAMP"
      ...
      backup:
        enabled: false
      ...
    

    Where:

    Property Description
    namespace

    YOUR_RESTORE_NAMESPACE

    Namespace for restore. Use the same namespace as in your original cluster.

    cassandra:hostNetwork

    hostNetwork is required and should always be set to false.

    restore:enabled

    Restore is disabled by default. You must set this property to true.

    restore:serviceAccountPath

    SA_JSON_FILE_PATH

    The path on your filesystem to the service account JSON file you created for the backup.

    restore:dbStorageBucket

    CLOUD_STORAGE_BUCKET_PATH

    The Cloud Storage bucket path where your backup data is stored in the following format: gs://BUCKET_NAME. The gs:// is required.

    restore:cloudProvider

    GCP

    The cloudProvider: "GCP" property is required.

    restore:snapshotTimestamp

    TIMESTAMP

    The timestamp of the backup snapshot to restore. To check what timestamps can be used, go to the dbStorageBucket and look at the files that are present in the bucket. Each file name contains a timestamp value. For example, backup_20210203213003_apigee-cassandra-default-0.tgz

    Where 20210203213003 is the snapshotTimestamp value you would use if you wanted to restore the backups created at that point in time.

    backup:enabled

    Set this property to false if it was previously set to true.

  2. If you are not starting from a clean cluster, follow the Decommission a hybrid region documentation for Helm to bring your existing hybrid installation into a clean state (you can leave Cert Manager installed). This leaves the cluster in the same state as if you had followed the Helm runtime setup guide up to the beginning of Step 11.

  3. Verify there are no pods remaining in the Apigee namespaces:

    kubectl get pods -n apigee
    kubectl get pods -n apigee-system
  4. If you are using CSI backup, make sure that you can see the volumesnapshots you want to use for the restoration process by running:

    kubectl get volumesnapshot -n apigee
              
  5. Install all hybrid components one by one as described in Step 11 of the installation guide. Note that the apigee-cassandra-restore pod is created when you run the command to install the datastore, but it enters the Running state only after you install the apigee-org component.
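
The snapshotTimestamp lookup described in step 1 can be scripted. The following is a minimal sketch, not part of the Apigee tooling: extract_timestamp and list_timestamps are hypothetical helper names, and the sketch assumes that backup files follow the backup_TIMESTAMP_PODNAME.tgz naming pattern shown in the example file name. The commented usage line further assumes that gsutil is installed and authorized to read the bucket.

```shell
# Hypothetical helper: print the timestamp embedded in one backup file name.
# Assumes names look like backup_<TIMESTAMP>_<pod-name>.tgz.
extract_timestamp() {
  name="$(basename "$1")"
  name="${name#backup_}"        # drop the leading "backup_"
  printf '%s\n' "${name%%_*}"   # keep everything before the next underscore
}

# Read object or file names on stdin and print each distinct timestamp once.
# Names that do not match the assumed backup pattern are ignored.
list_timestamps() {
  while IFS= read -r entry; do
    case "$(basename "$entry")" in
      backup_*_*.tgz) extract_timestamp "$entry" ;;
    esac
  done | sort -u
}

# Usage (assumes gsutil can read the bucket):
#   gsutil ls gs://BUCKET_NAME | list_timestamps
```

Because the timestamps in the examples are fixed-width digit strings, the sorted output lists older snapshots first; any printed value can be used as snapshotTimestamp.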

See Cassandra backup overview for more details on Cassandra backup and restore.
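
The empty-namespace check in step 3 of the Helm procedure can also be scripted. This is a rough sketch under stated assumptions: no_pods_listed is a hypothetical helper, and the usage lines rely on kubectl printing nothing to standard output when a namespace contains no pods and the --no-headers flag is used.

```shell
# Hypothetical helper: succeed only when the pod listing on stdin is empty.
no_pods_listed() {
  # grep -c prints the number of non-empty lines; it exits non-zero on zero
  # matches, so "|| true" keeps the count usable in that case.
  count="$(grep -c . || true)"
  [ "$count" -eq 0 ]
}

# Usage (namespaces taken from step 3):
#   for ns in apigee apigee-system; do
#     kubectl get pods -n "$ns" --no-headers 2>/dev/null | no_pods_listed \
#       || echo "Pods remain in namespace $ns"
#   done
```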

apigeectl

In your configuration, the Cassandra backup can reside either on Cloud Storage or on a remote server. In either case, perform the following steps to restore:

  1. Verify the hybrid version.
    apigeectl version
    Ensure that the version matches the version that created the backup files in storage.
  2. Confirm that the Kubernetes cluster you are restoring to does not have a prior Apigee hybrid installation. If you are restoring to an existing cluster, use the following commands to delete the existing Apigee hybrid installation:
    apigeectl delete -f overrides.yaml
    kubectl -n apigee get apigeedatastore,apigeeredis,apigeetelemetry,org,env,arc # The output should be empty.
    apigeectl delete --all -f overrides.yaml
  3. Open your overrides.yaml file and set the restore properties to the desired values:

    Cloud Storage

    Parameters

    namespace: YOUR_RESTORE_NAMESPACE # Use the same namespace as in your original cluster.
    cassandra:
      hostNetwork: false
      ...
      restore:
        enabled: true
        serviceAccountPath: "SA_JSON_FILE_PATH"
        dbStorageBucket: "CLOUD_STORAGE_BUCKET_PATH"
        cloudProvider: "GCP"  # required verbatim "GCP" (all caps)
        snapshotTimestamp: "TIMESTAMP"
      ...
      backup:
        enabled: false
        serviceAccountPath: "SA_JSON_FILE_PATH"
        dbStorageBucket: "CLOUD_STORAGE_BUCKET_PATH"
        cloudProvider: "GCP"  # required verbatim "GCP" (all caps)
        schedule: "SCHEDULE"
    

    Example

    namespace: apigee
    cassandra:
      hostNetwork: false
      ...
      restore:
        enabled: true
        serviceAccountPath: "/Users/myhome/.ssh/my_cassandra_backup.json"
        dbStorageBucket: "gs://myname-cassandra-backup"
        cloudProvider: "GCP"
        snapshotTimestamp: "20201001183903"
      ...
      backup:
        enabled: false
        serviceAccountPath: "/Users/myhome/.ssh/my_cassandra_backup.json"
        dbStorageBucket: "gs://myname-cassandra-backup"
        cloudProvider: "GCP"
        schedule: "0 2 * * *"
      ...

    Where:

    Property Description
    namespace

    YOUR_RESTORE_NAMESPACE

    Namespace for restore. Use the same namespace as in your original cluster.

    cassandra:hostNetwork

    hostNetwork is required and should always be set to false.

    restore:enabled

    Restore is disabled by default. You must set this property to true.

    restore:serviceAccountPath

    SA_JSON_FILE_PATH

    The path on your filesystem to the service account JSON file you created for the backup.

    restore:dbStorageBucket

    CLOUD_STORAGE_BUCKET_PATH

    The Cloud Storage bucket path where your backup data is stored in the following format: gs://BUCKET_NAME. The gs:// is required.

    restore:cloudProvider

    GCP

    The cloudProvider: "GCP" property is required.

    restore:snapshotTimestamp

    TIMESTAMP

    The timestamp of the backup snapshot to restore. To check what timestamps can be used, go to the dbStorageBucket and look at the files that are present in the bucket. Each file name contains a timestamp value. For example, backup_20210203213003_apigee-cassandra-default-0.tgz

    Where 20210203213003 is the snapshotTimestamp value you would use if you wanted to restore the backups created at that point in time.

    backup:enabled

    Set this property to false if it was previously set to true.

    backup:serviceAccountPath

    SA_JSON_FILE_PATH

    The path on your filesystem to the service account JSON file that was downloaded when you ran ./tools/create-service-account

    backup:dbStorageBucket

    CLOUD_STORAGE_BUCKET_PATH

    The Cloud Storage bucket path in this format: gs://BUCKET_NAME. The gs:// is required.

    backup:cloudProvider

    GCP

    The cloudProvider: "GCP" property is required.

    backup:schedule

    SCHEDULE

    The time when the backup starts, specified in standard crontab syntax. Default: 0 2 * * *

    Non-Cloud Storage

    Parameters

      namespace: YOUR_RESTORE_NAMESPACE # Use the same namespace as in your original cluster.
      cassandra:
        hostNetwork: false
        ...
        restore:
          enabled: true
          keyFile: "PATH_TO_PRIVATE_KEY_FILE"
          server: "BACKUP_SERVER_IP"
          storageDirectory: "/home/apigee/BACKUP_DIRECTORY"
          cloudProvider: "HYBRID"  # required verbatim "HYBRID" (all caps)
          snapshotTimestamp: "TIMESTAMP"
        ...
        backup:
          enabled: false
          keyFile: "PATH_TO_PRIVATE_KEY_FILE"
          server: "BACKUP_SERVER_IP"
          storageDirectory: "/home/apigee/BACKUP_DIRECTORY"
          cloudProvider: "HYBRID" # required verbatim "HYBRID" (all caps)
          schedule: "SCHEDULE"
      

    Example

      namespace: apigee
      cassandra:
        hostNetwork: false
        ...
        restore:
          enabled: true
          keyFile: "/Users/exampleuser/apigee-hybrid/hybrid-files/service-accounts/private.key"
          server: "34.56.78.90"
          storageDirectory: "/home/apigee/cassbackup"
          cloudProvider: "HYBRID"
          snapshotTimestamp: "20201001183903"
        ...
        backup:
          enabled: false
          keyFile: "/Users/exampleuser/apigee-hybrid/hybrid-files/service-accounts/private.key"
          server: "34.56.78.90"
          storageDirectory: "/home/apigee/cassbackup"
          cloudProvider: "HYBRID"
          schedule: "0 2 * * *"
        ...

    Where:

    Property Description
    namespace

    YOUR_RESTORE_NAMESPACE

    Namespace for restore. Use the same namespace as in your original cluster.

    cassandra:hostNetwork

    hostNetwork is required and should always be set to false.

    restore:enabled

    Restore is disabled by default. You must set this property to true.

    restore:keyFile

    PATH_TO_PRIVATE_KEY_FILE

    The path on your local file system to the SSH private key file (named ssh_key in the step where you created the SSH key pair).

    restore:server

    BACKUP_SERVER_IP

    The IP address of your backup server.

    restore:storageDirectory

    BACKUP_DIRECTORY

    The name of the backup directory on your backup server. This must be a directory within /home/apigee (the backup directory is named cassandra_backup in the step where you created the backup directory).

    restore:cloudProvider

    HYBRID

    The cloudProvider: "HYBRID" property is required.

    restore:snapshotTimestamp

    TIMESTAMP

    The timestamp of the backup snapshot to restore. To check what timestamps can be used, go to the storageDirectory on your backup server and look at the files that are present in the directory. Each file name contains a timestamp value. For example, backup_20210203213003_apigee-cassandra-default-0.tgz

    Where 20210203213003 is the snapshotTimestamp value you would use if you wanted to restore the backups created at that point in time.

    backup:enabled

    Set this property to false if it was previously set to true.

    backup:keyFile

    PATH_TO_PRIVATE_KEY_FILE

    The path on your local file system to the SSH private key file (named ssh_key in the step where you created the SSH key pair).

    backup:server

    BACKUP_SERVER_IP

    The IP address of your backup server.

    backup:storageDirectory

    BACKUP_DIRECTORY

    The name of the backup directory on your backup server. This must be a directory within /home/apigee (the backup directory is named cassandra_backup in the step where you created the backup directory).

    backup:cloudProvider

    HYBRID

    The cloudProvider: "HYBRID" property is required.

    backup:schedule

    SCHEDULE

    The time when the backup starts, specified in standard crontab syntax. Default: 0 2 * * *

  4. Create a new hybrid runtime deployment. This will create a new Cassandra cluster and begin restoring the backup data into the cluster:
    ${APIGEECTL_HOME}/apigeectl init -f overrides/overrides.yaml
    ${APIGEECTL_HOME}/apigeectl check-ready -f overrides/overrides.yaml
    ${APIGEECTL_HOME}/apigeectl apply -f overrides/overrides.yaml --restore
    ${APIGEECTL_HOME}/apigeectl check-ready -f overrides/overrides.yaml
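
Before running the commands in step 4, it can help to sanity-check the Cloud Storage restore values set in step 3. The sketch below is illustrative only: both function names are hypothetical, the bucket check encodes the stated requirement that the path start with gs://, and the timestamp check assumes a 14-digit value (apparently YYYYMMDDHHMMSS), which is inferred from the example timestamps rather than from a documented format.

```shell
# Hypothetical check: succeed when the value starts with the required gs://
# prefix and has at least one character after it.
is_valid_bucket_path() {
  case "$1" in
    gs://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical check: succeed when the value is exactly 14 digits.
# The 14-digit length is an assumption based on the examples in this page.
is_valid_snapshot_timestamp() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{14}$'
}

# Usage:
#   is_valid_bucket_path "gs://myname-cassandra-backup" || echo "bad dbStorageBucket"
#   is_valid_snapshot_timestamp "20201001183903" || echo "bad snapshotTimestamp"
```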

Verify the restoration job progress and confirm that apigeeds and all the other pods are up:

  1. Check apigeeds:
    kubectl get apigeeds -n apigee
  2. Check all other pods:
    kubectl get pods -n apigee
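
The apigeeds check above can be turned into a pass/fail test for use in a script. This is a sketch under assumptions, not an official readiness probe: all_datastores_running is a hypothetical helper, and it assumes that kubectl prints a header row followed by rows in NAME STATE AGE order, with a healthy datastore reporting the state running.

```shell
# Hypothetical helper: read `kubectl get apigeeds` output on stdin and succeed
# only if at least one datastore is listed and every listed datastore reports
# "running" in the second column (assumed to be STATE).
all_datastores_running() {
  awk 'NR > 1 { seen = 1; if ($2 != "running") bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}

# Usage:
#   kubectl get apigeeds -n apigee | all_datastores_running \
#     && echo "datastore is running"
```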

Upon successful completion of the restore and confirmation that the runtime components are healthy, we recommend configuring a backup on the cluster:

  1. Remove the restore configuration from the overrides-restore.yaml file.
  2. Add the backup configuration to the overrides-restore.yaml file.
  3. Apply the backup configuration with the following command:
    ./apigeectl apply -f ../overrides-restore.yaml