This topic describes minimum cluster configurations for Apigee hybrid. These minimum configurations apply to all of the supported Kubernetes platforms. The recommendations in this topic apply to non-production installations, such as trial or testing scenarios. Keep these recommendations in mind when performing the Apigee hybrid installation steps.
About node pools
A node pool is a group of nodes within a cluster that all have the same configuration. By default, hybrid assigns all pods to the default node pool; however, you can create dedicated node pools and assign hybrid components to them as a way of distributing resources.
Typically, you define dedicated node pools when you have pods with differing resource requirements. For example, the apigee-cassandra pods require persistent storage, while the other Apigee hybrid pods do not. For this reason, we recommend that you create a stateful node pool for Cassandra and a stateless node pool for the rest of the hybrid runtime services. See Configure dedicated node pools for details.
The following section lists configurations for both stateful and stateless node pools.
Minimum configurations
Use these minimum configurations when setting up your cluster:
Configuration | Stateful node pool | Stateless node pool |
---|---|---|
Purpose | A stateful node pool used for the Cassandra database. | A stateless node pool used by the runtime message processor. |
Label name | apigee-data | apigee-runtime |
Number of nodes | 1 per zone (3 per region) | 1 per zone (3 per region) |
CPU | 4 | 4 |
RAM (GB) | 15 | 15 |
Storage | dynamic | Managed with the ApigeeDeployment CRD |
Minimum disk IOPS | 2000 IOPS with SAN or directly attached storage. NFS is not recommended even if it can support the required IOPS. | 2000 IOPS with SAN or directly attached storage. NFS is not recommended even if it can support the required IOPS. |
Network bandwidth for each machine instance type | 1 Gbps | 1 Gbps |
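As an illustration only, the numeric minimums in the table can be expressed as a small pre-installation check. The sketch below is not part of Apigee hybrid: the `NodePoolSpec` type and the `check_minimums` function are hypothetical names, and the RAM value is assumed to be in GB.

```python
from dataclasses import dataclass

# Minimums copied from the table above. RAM is assumed to be in GB.
MIN_NODES_PER_REGION = 3
MIN_CPU = 4
MIN_RAM_GB = 15
MIN_DISK_IOPS = 2000
MIN_NETWORK_GBPS = 1.0


@dataclass
class NodePoolSpec:
    """Hypothetical description of one node pool; not an Apigee type."""
    label: str
    nodes_per_region: int
    cpu: int
    ram_gb: float
    disk_iops: int
    network_gbps: float


def check_minimums(pool: NodePoolSpec) -> list[str]:
    """Return human-readable shortfalls; an empty list means the pool passes."""
    checks = [
        ("nodes per region", pool.nodes_per_region, MIN_NODES_PER_REGION),
        ("CPU", pool.cpu, MIN_CPU),
        ("RAM (GB)", pool.ram_gb, MIN_RAM_GB),
        ("disk IOPS", pool.disk_iops, MIN_DISK_IOPS),
        ("network Gbps", pool.network_gbps, MIN_NETWORK_GBPS),
    ]
    return [
        f"{pool.label}: {name} is {actual}, minimum is {minimum}"
        for name, actual, minimum in checks
        if actual < minimum
    ]


if __name__ == "__main__":
    # Example values are made up for illustration.
    data_pool = NodePoolSpec("apigee-data", 3, 4, 16, 3000, 1.0)
    small_pool = NodePoolSpec("apigee-runtime", 3, 2, 8, 3000, 1.0)
    for problem in check_minimums(data_pool) + check_minimums(small_pool):
        print(problem)
```

A check like this only covers the numeric rows. The storage type (SAN or directly attached rather than NFS) still has to be verified separately.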
Cassandra network requirements
This section discusses network requirements and recommendations to follow when setting up Apigee hybrid.
Network bandwidth
Cassandra uses the Gossip protocol to exchange information about network topology with other nodes. Gossip traffic, combined with the distributed nature of Cassandra (read and write operations involve multiple nodes), results in substantial data transfer across the network.
Cassandra requires a minimum of 1 Gbps of network bandwidth for each machine instance. For example, on GKE, the minimum recommended machine type, e2-standard-4, has a minimum bandwidth of 1 Gbps. For production installations, higher bandwidth is recommended.
The maximum or 99th percentile latency for Cassandra should be below 100 milliseconds.
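To make the latency target concrete, the following sketch computes a 99th percentile from a list of latency samples and compares it with the 100 millisecond limit. It assumes you have already collected the samples by some means of your own. The function names are hypothetical, and the nearest-rank method is one common percentile definition chosen here for simplicity.

```python
import math

MAX_P99_LATENCY_MS = 100.0  # limit stated in the text above


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def latency_within_limit(samples_ms):
    """True if the 99th percentile latency is below the 100 ms limit."""
    return percentile_nearest_rank(samples_ms, 99) < MAX_P99_LATENCY_MS


if __name__ == "__main__":
    # Made-up samples: mostly fast requests with two slower outliers.
    samples = [10.0] * 98 + [50.0, 250.0]
    print(percentile_nearest_rank(samples, 99), latency_within_limit(samples))
```

Note that in this example the single 250 ms outlier does not affect the 99th percentile, although it would fail a check against the maximum.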
Secure network connectivity between regions
When installing hybrid in multiple regions, ensure that the connections between regions are secure:
- Use a virtual private network solution, such as Google Virtual Private Cloud (VPC), to secure connectivity between regions.
- Open a firewall to ensure that Cassandra nodes can connect between regions in non-overlapping subnets and can resolve those network IPs.
- Always use port 7001 for Cassandra. All other ports are local to the region. See also Secure ports usage.
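The non-overlapping subnet requirement above can be checked before installation. The sketch below uses Python's standard `ipaddress` module. The function name, the region names, and the CIDR ranges are illustrative assumptions, not values from an actual deployment.

```python
import ipaddress
from itertools import combinations


def find_overlapping_subnets(region_cidrs: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of region names whose subnets overlap."""
    networks = {
        region: ipaddress.ip_network(cidr)
        for region, cidr in region_cidrs.items()
    }
    return [
        (region_a, region_b)
        for (region_a, net_a), (region_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # Hypothetical regions and ranges, for illustration only.
    planned = {
        "region-a": "10.0.0.0/16",
        "region-b": "10.1.0.0/16",
        "region-c": "10.1.128.0/17",  # falls inside region-b's range
    }
    print(find_overlapping_subnets(planned))
```

An empty result means that no two regions share address space, so Cassandra nodes in one region can address nodes in another without ambiguity.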
Cassandra NTP requirements
Cassandra data synchronizes based on the timestamp of the system. Ensure that the time is synchronized across all pods and all regions within the Cassandra cluster. Time differences between nodes or regions cause data inconsistencies.
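To see why clock differences matter, consider a simplified model of timestamp-based conflict resolution, in which the write with the later timestamp wins. The sketch below is an illustration under that assumption, not Cassandra's actual implementation. The `Write`, `stamped`, and `resolve` names are hypothetical, and tie handling is deliberately simplified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp_ms: int  # taken from the writing node's local clock


def stamped(value: str, true_time_ms: int, clock_offset_ms: int) -> Write:
    """Create a write as seen by a node whose clock is off by clock_offset_ms."""
    return Write(value, true_time_ms + clock_offset_ms)


def resolve(a: Write, b: Write) -> Write:
    """Last-write-wins: keep the write with the larger timestamp (ties keep a)."""
    return a if a.timestamp_ms >= b.timestamp_ms else b


if __name__ == "__main__":
    # Node A's clock runs 500 ms fast; node B's clock is accurate.
    older = stamped("old", true_time_ms=1000, clock_offset_ms=500)
    newer = stamped("new", true_time_ms=1200, clock_offset_ms=0)
    # The write that really happened first wins because of the skewed clock.
    print(resolve(older, newer).value)
```

With synchronized clocks (an offset of 0 on both nodes), the same two writes resolve to the value that was actually written last.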
Scaling the configuration
If you need to scale your initial configuration based on additional capacity or throughput needs, see the following topics:
- Configuring Cassandra for production
- Scaling Cassandra pods
- Configuring dedicated node pools
- Scale and autoscale runtime services
- Multi-region deployments