This page describes how to recover or restore Cassandra in multiple regions.
In a multi-region deployment, Apigee hybrid is deployed in multiple geographic locations across different datacenters. If one or more regions fail, but healthy regions remain, you can use a healthy region to recover failed Cassandra regions with the latest data.
In the event of a catastrophic failure of all hybrid regions, Cassandra can be restored. It is important to note that, if you have multiple Apigee organizations in your deployment, the restore process restores data for all the organizations. In a multi-organization setup, restoring only a specific organization is not supported.
This topic describes both approaches to salvaging failed region(s):
- Recover failed region(s) - Describes the steps to recover failed region(s) based on a healthy region.
- Restore failed region(s) - Describes the steps to restore failed region(s) from a backup. This approach is only required if all hybrid regions are impacted.
Recover failed region(s)
To recover failed region(s) from a healthy region, perform the following steps:
- Redirect the API traffic from the impacted region(s) to a healthy region. Plan capacity in the healthy region so that it can absorb the traffic diverted from the failed region(s).
- Decommission the impacted region. For each impacted region, follow the steps outlined in Decommission a hybrid region. Wait for decommissioning to complete before moving on to the next step.
- Restore the impacted region. To restore, create a new region, as described in Multi-region deployment on GKE, GKE on-prem, and AKS.
Restoring from a backup
The Cassandra backup can either reside on Cloud Storage or on a remote server based on your configuration. To restore Cassandra from a backup, perform the following steps:
- Delete the Apigee hybrid deployment from all regions:
apigeectl delete -f overrides.yaml
- Restore the desired region from a backup. For more information, see Restoring a region from a backup.
- Remove the references to the deleted region(s) and add references to the restored region(s) in the KeySpaces metadata.
- Get the region name by using the nodetool status command:
kubectl exec -n apigee -it apigee-cassandra-default-0 -- bash
nodetool -u APIGEE_JMX_USER -pw APIGEE_JMX_PASSWORD status |grep -i Datacenter
where:
- APIGEE_JMX_USER is the username for the Cassandra JMX operations user. Used to authenticate and communicate with the Cassandra JMX interface. See cassandra:auth:jmx:username.
- APIGEE_JMX_PASSWORD is the password for the Cassandra JMX operations user. See cassandra:auth:jmx:password.
- Update the KeySpaces replication.
- Create a client container and connect to the Cassandra cluster through the CQL interface.
- Get the list of user keyspaces from CQL interface:
cqlsh CASSANDRA_SEED_HOST -u APIGEE_DDL_USER -p APIGEE_DDL_PASSWORD --ssl -e "select keyspace_name from system_schema.keyspaces;"|grep -v system
where:
- CASSANDRA_SEED_HOST is the Cassandra multi-region seed host. For most multi-region installations, use the IP address of a host in your first region. See Configure Apigee hybrid for multi-region and cassandra:externalSeedHost.
- APIGEE_DDL_USER and APIGEE_DDL_PASSWORD are the admin username and password for the Cassandra Data Definition Language (DDL) user. The default values are "ddl_user" and "iloveapis123". See cassandra.auth.ddl.password in the Configuration properties reference and Command Line Options in the Apache Cassandra cqlsh documentation.
- For each keyspace, run the following command from the CQL interface to update the replication settings:
ALTER KEYSPACE KEYSPACE_NAME WITH replication = {'class': 'NetworkTopologyStrategy', 'REGION_NAME':3};
where:
- KEYSPACE_NAME is the name of the keyspace listed in the previous step's output.
- REGION_NAME is the region (datacenter) name obtained earlier from the nodetool status output.
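
When there are many keyspaces, running the ALTER KEYSPACE command by hand for each one is error-prone. The following is a minimal sketch, not part of the documented procedure, of two helper functions that work only on text: one extracts the datacenter name from saved nodetool status output, and the other turns the keyspace list printed by cqlsh into one ALTER KEYSPACE statement per non-system keyspace. The function names datacenter_names and alter_statements are hypothetical, and the sketch assumes the usual "Datacenter: NAME" line in nodetool status output and the default tabular cqlsh output (header, dashed separator, row-count footer).

```shell
#!/usr/bin/env bash

# Print the datacenter name(s) found in `nodetool status` output on stdin.
# Assumes lines of the form "Datacenter: <name>".
datacenter_names() {
  awk '/^Datacenter:/ { print $2 }'
}

# Read the keyspace listing from cqlsh on stdin and print one
# ALTER KEYSPACE statement per user keyspace for the region given as $1.
# Skips blank lines, the header, the dashed separator, the "(N rows)"
# footer, and any keyspace whose name starts with "system".
alter_statements() {
  local region="$1"
  awk -v region="$region" '
    { gsub(/^[ \t]+|[ \t]+$/, "") }   # trim leading and trailing whitespace
    $0 == "" || $0 ~ /^-+$/ || $0 ~ /^\(/ { next }
    $0 == "keyspace_name" || $0 ~ /^system/ { next }
    {
      # \047 is a single quote, written in octal to keep the awk program
      # inside one single-quoted shell string.
      printf "ALTER KEYSPACE %s WITH replication = {\047class\047: \047NetworkTopologyStrategy\047, \047%s\047:3};\n", $0, region
    }
  '
}

# Example (hypothetical file names): generate the statements for review.
#   REGION_NAME="$(datacenter_names < nodetool-status.txt | head -n 1)"
#   alter_statements "$REGION_NAME" < keyspaces.txt > alter-keyspaces.cql
```

Writing the statements to a file first lets you review them before anything touches the cluster. After review, the file could be applied with the same connection options as above plus the cqlsh -f option, for example: cqlsh CASSANDRA_SEED_HOST -u APIGEE_DDL_USER -p APIGEE_DDL_PASSWORD --ssl -f alter-keyspaces.cql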