This topic describes steps you must take to configure the Cassandra database component for an Apigee hybrid production installation.
Ensure high availability
Cassandra clusters need three availability zones to maintain availability in a production environment. If one zone goes down, the two remaining zones continue responding to requests while the failed zone comes back online. If two or more zones go down, Cassandra cannot respond to requests until at least two zones are online. Apigee recommends bringing zones back online within three hours to minimize the risk of missing data updates.
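The zone arithmetic above can be sketched as a majority check. This is a minimal illustration, assuming one replica per zone and majority-based (quorum) reads and writes; the function names quorum_zones and can_serve are hypothetical and are not part of any Apigee or Cassandra tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: smallest number of zones that forms a majority.
# Assumption: one replica per zone and quorum-style reads and writes.
quorum_zones() {
  local total_zones="$1"
  echo $(( total_zones / 2 + 1 ))
}

# Hypothetical helper: report whether enough zones are online to serve requests.
can_serve() {
  local total_zones="$1" online_zones="$2"
  if [ "$online_zones" -ge "$(quorum_zones "$total_zones")" ]; then
    echo "available"
  else
    echo "unavailable"
  fi
}

can_serve 3 2   # one zone down: prints "available"
can_serve 3 1   # two zones down: prints "unavailable"
```

With three zones the majority is two, which matches the requirement that at least two zones be online.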
Configure Cassandra storage settings
For a production installation of Apigee hybrid, Google recommends that you add the following storage and heap settings to your overrides file and apply them to the cluster:
cassandra:
  ...
  replicaCount: 3
  storage:
    storageclass: your-preferred-ssd-storage  # If not using default storage for your cluster
    capacity: 500Gi
  resources:
    requests:
      cpu: 7
      memory: 15Gi
  maxHeapSize: 8192M
  heapNewSize: 1200M
Apply the changes to Cassandra with the command for your installation tool:
Helm
helm upgrade datastore apigee-datastore/ \
  --namespace apigee \
  --atomic \
  -f OVERRIDES_FILE.yaml
apigeectl
$APIGEECTL_HOME/apigeectl init -f OVERRIDES_FILE.yaml
$APIGEECTL_HOME/apigeectl apply -f OVERRIDES_FILE.yaml --datastore
replicaCount
The value of replicaCount must be a multiple of 3. To determine your desired replicaCount value, consider the following:
- Estimate the traffic demands for your proxies.
- Load test and make reasonable predictions of your CPU utilization.
- You can specify different replicaCount values in different regions.
- You can expand the replicaCount in the future in your overrides file.
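The multiple-of-3 rule is easy to check before you edit the overrides file. The following is a minimal sketch; valid_replica_count is a hypothetical helper name, not an Apigee command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only when replicaCount is a positive multiple of 3.
valid_replica_count() {
  local count="$1"
  # Reject empty input, non-digits, zero, and values with leading zeros.
  case "$count" in
    ''|*[!0-9]*|0*) return 1 ;;
  esac
  [ $(( count % 3 )) -eq 0 ]
}

for n in 3 4 6; do
  if valid_replica_count "$n"; then
    echo "replicaCount $n: ok"
  else
    echo "replicaCount $n: must be a multiple of 3"
  fi
done
```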
storageclass
For production, Cassandra storage must be an SSD StorageClass. Set the value of storageclass if you are not using the default Kubernetes StorageClass for your cluster. You can check the default StorageClass with the following command:
kubectl get storageclass
Your output should look something like:
NAME                     PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
premium-rwo              pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   6d23h
standard                 kubernetes.io/gce-pd    Delete          Immediate              true                   6d23h
standard-rwo (default)   pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   6d23h
Follow the instructions in StorageClass configuration if you want to change the default Kubernetes StorageClass.
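If you want to pick the default class out of that listing in a script, one option is to filter on the "(default)" marker. The sketch below assumes the marker appears as the second whitespace-separated field, as in the sample output; default_storage_class is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get storageclass` output on stdin and
# print the name of the class marked "(default)".
default_storage_class() {
  awk '$2 == "(default)" { print $1 }'
}

# Sample rows copied from the output shown above. On a live cluster, pipe
# `kubectl get storageclass` into the function instead.
sample_output='NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
premium-rwo pd.csi.storage.gke.io Delete WaitForFirstConsumer true 6d23h
standard kubernetes.io/gce-pd Delete Immediate true 6d23h
standard-rwo (default) pd.csi.storage.gke.io Delete WaitForFirstConsumer true 6d23h'

printf '%s\n' "$sample_output" | default_storage_class   # prints "standard-rwo"
```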
To check the current storageclass setting, execute the following command on your cluster:
kubectl get pvc -n NAMESPACE cassandra-data-apigee-cassandra-default-0 -o=jsonpath="{['.spec.storageClassName', '.metadata.annotations.volume\.beta\.kubernetes\.io/storage-class']}"
capacity
For production installations, Google recommends a storage capacity of at least 500Gi (gibibytes). You can change the storage capacity in response to your cluster's storage needs. See the instructions in Expand Cassandra persistent volumes to change the storage capacity.
To check the current capacity setting, execute the following command on your cluster:
kubectl get pvc -n NAMESPACE cassandra-data-apigee-cassandra-default-0 -o=jsonpath='{.spec.resources.requests.storage}'
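The value returned by the command above can be compared against the 500Gi recommendation. This is a rough sketch that handles only whole-number Gi and Ti quantities; to_gib and meets_min_capacity are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a storage quantity with a Gi or Ti suffix to
# whole gibibytes. Other suffixes and fractional values are not handled.
to_gib() {
  local quantity="$1"
  case "$quantity" in
    *Ti) echo $(( ${quantity%Ti} * 1024 )) ;;
    *Gi) echo "${quantity%Gi}" ;;
    *)   echo "unsupported quantity: $quantity" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed when the capacity meets the 500Gi recommendation.
meets_min_capacity() {
  local gib
  gib="$(to_gib "$1")" || return 1
  [ "$gib" -ge 500 ]
}

meets_min_capacity 500Gi && echo "500Gi: ok"
meets_min_capacity 100Gi || echo "100Gi: below the recommended 500Gi"
meets_min_capacity 1Ti   && echo "1Ti: ok"
```

On a live cluster you could pass the output of the kubectl command as the argument, for example meets_min_capacity "$(kubectl get pvc ...)" with the full command shown above.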
cpu and memory
For production installations, Google recommends at least 7 CPUs and 15Gi (gibibytes) of memory per pod. When specifying cassandra.resources.requests.cpu and cassandra.resources.requests.memory, consider the traffic volume and the CPU and memory demands of your proxies.
To check the current cpu setting, execute the following command on your cluster:
kubectl get pods -n NAMESPACE apigee-cassandra-default-0 -o=jsonpath='{.spec.containers[].resources.requests.cpu}'
To check the current memory setting, execute the following command on your cluster:
kubectl get pods -n NAMESPACE apigee-cassandra-default-0 -o=jsonpath='{.spec.containers[].resources.requests.memory}'
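The values these two commands return can be checked against the recommended minimums. The sketch below assumes the CPU request is a whole number of cores ("7") or millicores ("7000m") and the memory request uses a Gi suffix; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a Kubernetes CPU quantity ("7" or "7000m") to
# millicores. Fractional values such as "0.5" are not handled in this sketch.
cpu_millicores() {
  local quantity="$1"
  case "$quantity" in
    *m) echo "${quantity%m}" ;;
    *)  echo $(( quantity * 1000 )) ;;
  esac
}

# Hypothetical helper: succeed when the CPU request is at least 7 CPUs.
meets_min_cpu() {
  [ "$(cpu_millicores "$1")" -ge 7000 ]
}

# Hypothetical helper: succeed when the memory request is at least 15Gi.
# Only whole-number Gi quantities are handled.
meets_min_memory() {
  local quantity="$1"
  case "$quantity" in
    *Gi) [ "${quantity%Gi}" -ge 15 ] ;;
    *)   return 1 ;;
  esac
}

meets_min_cpu 7        && echo "cpu 7: ok"
meets_min_cpu 7000m    && echo "cpu 7000m: ok"
meets_min_cpu 4        || echo "cpu 4: below the recommended 7"
meets_min_memory 15Gi  && echo "memory 15Gi: ok"
```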
maxHeapSize and heapNewSize
These properties set the maximum JVM heap allocated to Cassandra processes and the portion of that heap reserved for newly created objects, respectively. Both values are specified in megabytes. For production environments, Google recommends the following values:
maxHeapSize: 8192M
heapNewSize: 1200M
Consult your Kubernetes platform provider's documentation for optimal heap size values.
To check the current maxHeapSize setting, execute the following command on your cluster:
kubectl get sts -n NAMESPACE apigee-cassandra-default -o=jsonpath='{.spec.template.spec.containers[].env[?(@.name=="MAX_HEAP_SIZE")]}'
To check the current heapNewSize setting, execute the following command on your cluster:
kubectl get sts -n NAMESPACE apigee-cassandra-default -o=jsonpath='{.spec.template.spec.containers[].env[?(@.name=="HEAP_NEWSIZE")]}'
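Once you have both heap values, a quick sanity check is to confirm that heapNewSize is smaller than maxHeapSize. This sketch assumes both settings use the M suffix shown in the recommended values, and that heapNewSize is carved out of the maximum heap and so should be the smaller of the two; heap_mb and heap_sizes_consistent are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip the M suffix from a heap setting such as "8192M".
heap_mb() {
  local setting="$1"
  case "$setting" in
    *M) echo "${setting%M}" ;;
    *)  echo "expected a value ending in M: $setting" >&2; return 1 ;;
  esac
}

# Hypothetical check: succeed when heapNewSize is smaller than maxHeapSize.
# Arguments: maxHeapSize, then heapNewSize.
heap_sizes_consistent() {
  local max_mb new_mb
  max_mb="$(heap_mb "$1")" || return 1
  new_mb="$(heap_mb "$2")" || return 1
  [ "$new_mb" -lt "$max_mb" ]
}

heap_sizes_consistent 8192M 1200M && echo "heap settings: ok"
heap_sizes_consistent 1024M 2048M || echo "heapNewSize must be smaller than maxHeapSize"
```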
For more information on these property settings, see the Configuration property reference.
Use SSD storage for production deployments
For the Cassandra database, the hybrid runtime only supports dynamically created persistent volumes for storing data. Local solid-state drives (SSDs) are not supported.
If you do not currently have SSD configured for Cassandra, you must configure a StorageClass definition that is backed by a solid-state drive (SSD) and make it the default class. See StorageClass configuration for detailed steps.