Scale down Cassandra

Apigee hybrid employs a ring of Cassandra nodes as a StatefulSet. Cassandra provides persistent storage for certain Apigee entities on the runtime plane. For more information about Cassandra, see About the runtime plane.

Cassandra is a resource-intensive service and should not be deployed on a pod with any other hybrid services. Depending on the load, you might want to scale down the number of Cassandra nodes in the ring in your cluster.
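
To gauge the current load on the Cassandra pods before deciding to scale down, one option is the kubectl top command. This is only a suggestion; it assumes the Kubernetes Metrics Server is installed in your cluster, and it uses the same namespace placeholder and label as the rest of this topic:

# Show per-pod CPU and memory usage for the Cassandra pods
# (requires the Kubernetes Metrics Server in the cluster).
kubectl top pods -n yourNamespace -l app=apigee-cassandra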

The general process for scaling down a Cassandra ring is:

  1. Decommission one Cassandra node.
  2. Update the cassandra.replicaCount property in overrides.yaml.
  3. Apply the configuration update.
  4. Repeat these steps for each node you want to remove.
  5. Delete the persistent volume claim or volume, depending on your cluster configuration.

What you need to know

  • Perform this task on one node at a time before proceeding to the next node.
  • If any node other than the node to be decommissioned is unhealthy, do not proceed. Kubernetes will not be able to scale down the pods in the cluster.
  • Always scale down or up by a factor of three nodes (see the example after this list for a quick way to count the nodes currently in the ring).
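
For example, a quick way to count the Cassandra nodes currently in the ring (to confirm that the target count still follows this rule) is to count the running pods. This is just a convenience one-liner using the same namespace placeholder and label as the rest of this topic:

# Count the Cassandra pods currently in the ring.
kubectl get pods -n yourNamespace -l app=apigee-cassandra --no-headers | wc -l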

Prerequisites

Before you scale down the number of Cassandra nodes in the ring, verify that the cluster is healthy and that all of the nodes are up and running, as the following example shows:

kubectl get pods -n yourNamespace -l app=apigee-cassandra
NAME                 READY   STATUS    RESTARTS   AGE
apigee-cassandra-0   1/1     Running   0          2h
apigee-cassandra-1   1/1     Running   0          2h
apigee-cassandra-2   1/1     Running   0          2h
apigee-cassandra-3   1/1     Running   0          16m
apigee-cassandra-4   1/1     Running   0          14m
apigee-cassandra-5   1/1     Running   0          13m
kubectl -n yourNamespace exec -it apigee-cassandra-0 nodetool status
Datacenter: us-east1
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.16.2.6    690.17 KiB  256          48.8%             b02089d1-0521-42e1-bbed-900656a58b68  ra-1
UN  10.16.4.6    700.55 KiB  256          51.6%             dc6b7faf-6866-4044-9ac9-1269ebd85dab  ra-1
UN  10.16.11.11  144.36 KiB  256          48.3%             c7906366-6c98-4ff6-a4fd-17c596c33cf7  ra-1
UN  10.16.1.11   767.03 KiB  256          49.8%             ddf221aa-80aa-497d-b73f-67e576ff1a23  ra-1
UN  10.16.5.13   193.64 KiB  256          50.9%             2f01ac42-4b6a-4f9e-a4eb-4734c24def95  ra-1
UN  10.16.8.15   132.42 KiB  256          50.6%             a27f93af-f8a0-4c88-839f-2d653596efc2  ra-1
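
As a shortcut, you can also count how many nodes report the UN (Up/Normal) state. This is just a convenience pipeline built on the same nodetool status command shown above; it drops the -it flags because the output is piped rather than read interactively:

# Count the nodes that nodetool status reports as Up/Normal ("UN").
kubectl -n yourNamespace exec apigee-cassandra-0 -- nodetool status | grep -c '^UN'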

Decommission the Cassandra nodes

  1. Decommission the Cassandra nodes from the cluster using the nodetool command.
    kubectl -n yourNamespace exec -it nodeName nodetool decommission

    For example, this command decommissions apigee-cassandra-5, the node with the highest number value in the name:

    kubectl -n apigee exec -it apigee-cassandra-5 nodetool decommission
  2. Wait for the decommission to complete, and verify that the cluster now has one less node. For example:
    kubectl -n yourNamespace exec -it nodeName nodetool status
    Datacenter: us-east1
    ====================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
    UN  10.16.2.6   710.37 KiB  256          59.0%             b02089d1-0521-42e1-bbed-900656a58b68  ra-1
    UN  10.16.4.6   720.97 KiB  256          61.3%             dc6b7faf-6866-4044-9ac9-1269ebd85dab  ra-1
    UN  10.16.1.11  777.11 KiB  256          58.9%             ddf221aa-80aa-497d-b73f-67e576ff1a23  ra-1
    UN  10.16.5.13  209.23 KiB  256          62.2%             2f01ac42-4b6a-4f9e-a4eb-4734c24def95  ra-1
    UN  10.16.8.15  143.23 KiB  256          58.6%             a27f93af-f8a0-4c88-839f-2d653596efc2  ra-1
    
  3. Update or add the cassandra.replicaCount property in your overrides.yaml file. For example, if the current node count is 6, change it to 5:
    cassandra:
      replicaCount: 5 # (n-1)
  4. Apply the configuration change to your cluster. For example:
    ./apigeectl apply -v beta2 -c cassandra
    namespace/apigee unchanged
    secret/ssl-cassandra unchanged
    storageclass.storage.k8s.io/apigee-gcepd unchanged
    service/apigee-cassandra unchanged
    statefulset.apps/apigee-cassandra configured
  5. Verify that all of the remaining Cassandra nodes are running:
    kubectl get pods -n yourNamespace -l app=apigee-cassandra
    NAME                 READY   STATUS    RESTARTS   AGE
    apigee-cassandra-0   1/1     Running   0          3h
    apigee-cassandra-1   1/1     Running   0          3h
    apigee-cassandra-2   1/1     Running   0          2h
    apigee-cassandra-3   1/1     Running   0          25m
    apigee-cassandra-4   1/1     Running   0          24m
    
    
  6. Repeat Steps 1-5 for each node that you wish to decommission.
  7. When you are finished decommissioning nodes, verify that the cassandra.replicaCount value equals the number of nodes returned by the nodetool status command (a rough comparison sketch follows these steps).
    kubectl -n yourNamespace exec -it apigee-cassandra-0 nodetool status
    Datacenter: us-east1
    ====================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address     Load       Tokens       Owns (effective)  Host ID                               Rack
    UN  10.16.2.6   710.37 KiB  256          59.0%             b02089d1-0521-42e1-bbed-900656a58b68  ra-1
    UN  10.16.4.6   720.97 KiB  256          61.3%             dc6b7faf-6866-4044-9ac9-1269ebd85dab  ra-1
    UN  10.16.1.11  777.11 KiB  256          58.9%             ddf221aa-80aa-497d-b73f-67e576ff1a23  ra-1
    UN  10.16.5.13  209.23 KiB  256          62.2%             2f01ac42-4b6a-4f9e-a4eb-4734c24def95  ra-1
    UN  10.16.8.15  143.23 KiB  256          58.6%             a27f93af-f8a0-4c88-839f-2d653596efc2  ra-1
    
    
  8. After the Cassandra cluster is downsized, delete the PersistentVolumeClaim (pvc) for the decommissioned node to ensure that the next scale-up event does not reuse the same persistent volume and the data created earlier.
    kubectl get pvc -n yourNamespace
    NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    cassandra-data-apigee-cassandra-0   Bound    pvc-f9c2a5b9-818c-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   7h
    cassandra-data-apigee-cassandra-1   Bound    pvc-2956cb78-818d-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   7h
    cassandra-data-apigee-cassandra-2   Bound    pvc-79de5407-8190-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   7h
    cassandra-data-apigee-cassandra-3   Bound    pvc-d29ba265-81a2-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   5h
    cassandra-data-apigee-cassandra-4   Bound    pvc-0675a0ff-81a3-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   5h
    cassandra-data-apigee-cassandra-5   Bound    pvc-354afa95-81a3-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   5h
    kubectl -n yourNamespace delete pvc cassandra-data-apigee-cassandra-5
    persistentvolumeclaim "cassandra-data-apigee-cassandra-5" deleted
    kubectl get pvc -n yourNamespace
    NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    cassandra-data-apigee-cassandra-0   Bound    pvc-f9c2a5b9-818c-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   7h
    cassandra-data-apigee-cassandra-1   Bound    pvc-2956cb78-818d-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   7h
    cassandra-data-apigee-cassandra-2   Bound    pvc-79de5407-8190-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   7h
    cassandra-data-apigee-cassandra-3   Bound    pvc-d29ba265-81a2-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   5h
    cassandra-data-apigee-cassandra-4   Bound    pvc-0675a0ff-81a3-11e9-8862-42010a8e014a   100Gi      RWO            apigee-gcepd   5h
  9. If you are using an Anthos installation, also delete the persistent volume from the Anthos Kubernetes cluster.
    kubectl get pv -n yourNamespace
    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                      STORAGECLASS   REASON   AGE
    pvc-0675a0ff-81a3-11e9-8862-42010a8e014a   100Gi      RWO            Delete           Bound    apigee/cassandra-data-apigee-cassandra-4   apigee-gcepd            5h
    pvc-2956cb78-818d-11e9-8862-42010a8e014a   100Gi      RWO            Delete           Bound    apigee/cassandra-data-apigee-cassandra-1   apigee-gcepd            7h
    pvc-354afa95-81a3-11e9-8862-42010a8e014a   100Gi      RWO            Delete           Bound    apigee/cassandra-data-apigee-cassandra-5   apigee-gcepd            5h
    pvc-79de5407-8190-11e9-8862-42010a8e014a   100Gi      RWO            Delete           Bound    apigee/cassandra-data-apigee-cassandra-2   apigee-gcepd            7h
    pvc-d29ba265-81a2-11e9-8862-42010a8e014a   100Gi      RWO            Delete           Bound    apigee/cassandra-data-apigee-cassandra-3   apigee-gcepd            5h
    pvc-f9c2a5b9-818c-11e9-8862-42010a8e014a   100Gi      RWO            Delete           Bound    apigee/cassandra-data-apigee-cassandra-0   apigee-gcepd            7h
    kubectl -n yourNamespace delete pv pvc-354afa95-81a3-11e9-8862-42010a8e014a
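
As a final check for Step 7, the following rough sketch compares the replicaCount configured in overrides.yaml with the number of live nodes reported by nodetool status. It is only a sketch: it assumes overrides.yaml is in the current directory and contains a single replicaCount entry.

# Rough sketch: compare the configured replicaCount with the live node count.
# Assumes overrides.yaml is in the current directory with one replicaCount entry.
REPLICAS=$(awk '/replicaCount:/ {print $2; exit}' overrides.yaml)
NODES=$(kubectl -n yourNamespace exec apigee-cassandra-0 -- nodetool status | grep -c '^UN')
echo "replicaCount=${REPLICAS}, live Cassandra nodes=${NODES}"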