However, dedicated test infrastructure can be expensive and difficult to maintain because it is not needed on a continuous basis. Moreover, dedicated test infrastructure is often a one-time capital expense with a fixed capacity, which makes it difficult to scale load testing beyond the initial investment and can limit experimentation. This can slow down development teams and result in applications that are not properly tested before production deployment.
Distributed load testing using cloud computing is an attractive option for a variety of test scenarios. Cloud platforms provide a high degree of infrastructure elasticity, making it easy to test applications and services with large numbers of simulated clients, each generating traffic patterned after users or devices. Additionally, the pricing model of cloud computing fits the elastic nature of load testing well.
Containers, which offer a lightweight alternative to running full virtual machine instances for applications, are well-suited for rapid scaling of simulated clients. They are an excellent abstraction for running test clients because they are simple to deploy, immediately available, and well-suited to singular tasks.
Google Cloud Platform is an excellent environment for distributed load testing using containers. The platform supports containers as first-class citizens via Google Kubernetes Engine, which is powered by the open source container-cluster manager, Kubernetes. Kubernetes Engine provides the ability to quickly provision container infrastructure and tools to manage deployed applications and resources.
This solution demonstrates how to use Kubernetes Engine to deploy a distributed load testing framework. The framework uses multiple containers to create load testing traffic for a simple REST-based API. Although this solution tests a simple web application, the same pattern can be used to create more complex load testing scenarios such as gaming or Internet-of-Things (IoT) applications. This solution discusses the general architecture of a container-based load testing framework. For a tutorial with step-by-step instructions on setting up a sample framework, see the Tutorial section at the end of this document.
System under test
In software testing terminology, the system under test is the system that your tests are designed to evaluate. In this solution, the system under test is a small web application deployed to Google App Engine. The application exposes basic REST-style endpoints to capture incoming HTTP POST requests (incoming data is not persisted). In a real-world scenario, web applications can be complex and include a multitude of additional components and services such as caching, messaging, and persistence. These complexities are outside the scope of this solution. For more information on building scalable web applications on Google Cloud Platform, please refer to the Building Scalable and Resilient Web Applications solution.
The source code for the sample application is available as part of the tutorial at the end of this document.
The example application is modeled after the backend service component found in many Internet-of-Things (IoT) deployments—devices first register with the service and then begin reporting metrics or sensor readings, while also periodically re-registering with the service.
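As a rough illustration of such a backend, the following sketch accepts HTTP POST requests on two paths and discards the payload after inspecting it. It is not the sample application from the tutorial: it uses only the Python standard library, and the /register path, the handle_post and make_server names, and the response format are assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# The source names /metrics; "/register" is an assumed stand-in for the
# registration endpoint.
KNOWN_PATHS = {"/register", "/metrics"}


def handle_post(path, body):
    """Validate a POST body and build a (status, reply) pair.

    Nothing is stored: the payload is inspected and then dropped.
    """
    if path not in KNOWN_PATHS:
        return 404, {"error": "unknown path"}
    try:
        payload = json.loads(body or b"{}")
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "invalid JSON"}
    if not isinstance(payload, dict):
        return 400, {"error": "expected a JSON object"}
    return 200, {"received_fields": sorted(payload)}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, reply = handle_post(self.path, self.rfile.read(length))
        data = json.dumps(reply).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=8080):
    """Create, but do not start, a local server for manual experiments."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

To try it locally, call make_server().serve_forever() and send POST requests to the two paths. Keeping the request logic in handle_post, separate from the HTTP plumbing, makes it easy to check without opening a socket.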
The following diagram shows a common backend service component interaction.
To model this interaction, you can use Locust, a distributed, Python-based load testing tool that is capable of distributing requests across multiple target paths. For example, Locust can distribute requests to target paths such as /metrics. There are many load generation software packages available, one of which might better suit your project's needs.
The workload is based on the interaction described above and is modeled as a set of Tasks in Locust. To approximate real-world clients, each Locust task is weighted. For example, registration happens once per thousand total client requests.
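The effect of task weighting can be sketched without Locust itself. The example below uses only the Python standard library to draw tasks in proportion to their weights. The task names and the 1:999 split are assumptions chosen to match "once per thousand total client requests", and this is not Locust's own API.

```python
import random
from collections import Counter

# Assumed task names and weights: one registration per 1,000 requests.
TASK_WEIGHTS = {"register": 1, "report_metrics": 999}


def pick_tasks(n, rng):
    """Draw n task names with probability proportional to their weights."""
    names = list(TASK_WEIGHTS)
    weights = [TASK_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


def expected_share(task):
    """Long-run fraction of requests that a task should account for."""
    return TASK_WEIGHTS[task] / sum(TASK_WEIGHTS.values())


rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(pick_tasks(100_000, rng))
print(counts)
print(expected_share("register"))
```

Over a long run the share of registration requests approaches the expected 0.1 percent, although any individual run varies around that value.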
From an architectural perspective, there are two main components involved in deploying this distributed load testing solution: the Locust container image and the container orchestration and management mechanism.
The Locust container image is a Docker image that contains the Locust software. The Dockerfile can be found in the associated GitHub repository (refer to the tutorial below). The Dockerfile uses a base Python image and includes scripts to start the Locust service and execute the tasks.
This solution uses Google Kubernetes Engine as the container orchestration and management mechanism. Kubernetes Engine, which is based on the open source framework Kubernetes, is the product of years of experience running, orchestrating, and managing container deployments all across Google. Container-based computing allows developers to focus on their applications, instead of on deployments and integrations into hosting environments. Containers also facilitate load testing portability so that containerized applications can run across multiple cloud environments. Kubernetes Engine and Kubernetes introduce several concepts specific to container orchestration and management.
A container cluster is a group of Compute Engine instances that provides the foundation for your entire application. The Kubernetes Engine and Kubernetes documentation refer to these instances as nodes. A cluster comprises a single master node and one or more worker nodes. The master and workers all run on Kubernetes, which is why container clusters are sometimes called Kubernetes clusters. For more information about clusters, see the Kubernetes Engine documentation.
A pod is a tightly-coupled group of containers that should be deployed together. Some pods contain only a single container. For example, in this solution, each of the Locust containers runs in its own pod. Often, however, pods contain multiple containers that work together in some way. For example, in this solution, Kubernetes uses a pod with three containers to provide DNS services. In one container, SkyDNS provides DNS server functionality. SkyDNS relies on a key-value store, named etcd, that resides in another container. In the pod's third container, kube2sky acts as a bridge between Kubernetes and SkyDNS.
A replication controller ensures that a specified number of pod "replicas" are running at any one time. If there are too many, the replication controller kills some pods. If there are too few, it starts more. This solution has three replication controllers: one ensures the existence of a single DNS server pod; another maintains a single Locust master pod; and a third keeps exactly 10 Locust worker pods running.
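The behavior described above amounts to comparing a desired count with the pods that are actually running. The following is a minimal sketch of that comparison, not Kubernetes code. The reconcile function and the pod names are hypothetical, and the choice of which surplus pods to stop is arbitrary here.

```python
def reconcile(desired, running):
    """Return (pods_to_start, pods_to_stop) so that `desired` pods remain.

    `running` is a list of pod names. Which surplus pods get stopped is an
    arbitrary choice in this sketch (the last ones in the list).
    """
    if desired < 0:
        raise ValueError("desired replica count must be non-negative")
    surplus = len(running) - desired
    if surplus > 0:
        # Too many replicas: stop the extras.
        return 0, list(running[-surplus:])
    # Too few, or exactly enough: start the shortfall (possibly zero).
    return -surplus, []


print(reconcile(10, ["worker-a", "worker-b"]))           # needs 8 more
print(reconcile(1, ["master-a", "master-b", "master-c"]))  # stops 2
```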
A particular pod can disappear for a variety of reasons, including node failure or intentional node disruption for updates or maintenance. This means that the IP address of a pod does not provide a reliable interface for that pod. A more reliable approach would use an abstract representation of that interface that never changes, even if the underlying pod disappears and is replaced by a new pod with a different IP address. A Kubernetes Engine service provides this type of abstract interface by defining a logical set of pods and a policy for accessing them. In this solution, there are several services that represent pods or sets of pods. For example, there is a service for the DNS server pod, another service for the Locust master pod, and a service that represents all 10 Locust worker pods.
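One way to picture a service is as a label selector that is re-evaluated against whichever pods currently exist. The sketch below illustrates that idea only. The pod records and IP addresses are made up, and the matches and endpoints helpers are hypothetical names, not part of any Kubernetes client library.

```python
def matches(labels, selector):
    """True when every selector key/value pair is present in the pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints(pods, selector):
    """IP addresses of the pods currently selected by a service."""
    return [pod["ip"] for pod in pods if matches(pod["labels"], selector)]


# Hypothetical pod records; the IP addresses are made up for illustration.
pods = [
    {"ip": "10.0.0.4", "labels": {"name": "locust", "role": "master"}},
    {"ip": "10.0.1.7", "labels": {"name": "locust", "role": "worker"}},
    {"ip": "10.0.2.9", "labels": {"name": "locust", "role": "worker"}},
]
master_selector = {"name": "locust", "role": "master"}
before = endpoints(pods, master_selector)

# Simulate the master pod being replaced by a new pod with a different IP.
pods[0] = {"ip": "10.0.3.2", "labels": {"name": "locust", "role": "master"}}
after = endpoints(pods, master_selector)
print(before, after)
```

The selector stays the same while the set of addresses behind it changes, which is why clients address the service rather than an individual pod.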
The following diagram shows the contents of the master and worker nodes.
Deploying system under test
The solution uses Google App Engine to run the system under test. To deploy the system under test, you need an active Google Cloud Platform account so that you can install and run the Google Cloud Platform SDK. With the SDK in place, you can deploy the sample web application with a single command. The source code for the web application is available as part of the tutorial at the end of this document.
Deploying load testing tasks
To deploy the load testing tasks, you first deploy a load testing master and then deploy a group of ten load testing workers. With this many load testing workers, you can create a substantial amount of traffic for testing purposes. Keep in mind, however, that generating excessive amounts of traffic to external systems can resemble a denial-of-service attack. Be sure to review the Google Cloud Platform Terms of Service and the Google Cloud Platform Acceptable Use Policy.
The load testing master
The first component of the deployment is the Locust master, which is the entry point for executing the load testing tasks described above. The Locust master is deployed as a replication controller with a single replica because only one master is needed. A replication controller is useful even when deploying a single pod because it replaces the pod if it fails, which keeps the master available.
The configuration for the replication controller specifies several elements, including the name of the controller, labels for organization (name: locust, role: master), and the ports that need to be exposed by the container (8089 for the web interface, and 5557 and 5558 for communicating with workers). This information is later used to configure the Locust workers controller. The following snippet contains the configuration for the ports:
...
ports:
  - name: locust-master-web
    containerPort: 8089
    protocol: TCP
  - name: locust-master-port-1
    containerPort: 5557
    protocol: TCP
  - name: locust-master-port-2
    containerPort: 5558
    protocol: TCP
Next, we deploy a Service to ensure that the exposed ports are accessible to
other pods via
hostname:port within the cluster, and referenceable via a
descriptive port name. The use of a service allows the Locust workers to easily
discover and reliably communicate with the master, even if the master fails and
is replaced with a new pod by the replication controller. The Locust master
service also includes a directive to create an external forwarding rule at the
cluster level, which provides the ability for external traffic to access the
cluster resources. Note that you must still create firewall rules to provide
complete access to target instances.
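A service definition along these lines might look like the following sketch, written as a Python dictionary and printed as JSON. It is an assumption-laden illustration rather than the tutorial's actual file: the service name locust-master is hypothetical, and type LoadBalancer is assumed to be the directive that requests the external forwarding rule. The selector and port numbers come from the master configuration described earlier.

```python
import json


def master_service(name="locust-master"):
    """Build a Service manifest for the Locust master (name is hypothetical)."""
    ports = [
        ("locust-master-web", 8089),     # web interface
        ("locust-master-port-1", 5557),  # worker communication
        ("locust-master-port-2", 5558),  # worker communication
    ]
    return {
        "kind": "Service",
        "apiVersion": "v1",
        "metadata": {
            "name": name,
            "labels": {"name": "locust", "role": "master"},
        },
        "spec": {
            # Assumed: asks the platform for an externally reachable address.
            "type": "LoadBalancer",
            # Routes traffic to pods carrying the master's labels.
            "selector": {"name": "locust", "role": "master"},
            "ports": [
                {"name": port_name, "port": number,
                 "targetPort": number, "protocol": "TCP"}
                for port_name, number in ports
            ],
        },
    }


print(json.dumps(master_service(), indent=2))
```

Because kubectl accepts manifests in JSON as well as YAML, the printed output could be saved to a file and passed to kubectl create -f.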
After you deploy the Locust master, you can access the web interface using the public IP address of the external forwarding rule. After you deploy the Locust workers, you can start the simulation and look at aggregate statistics through the Locust web interface.
The load testing workers
The next component of the deployment includes the Locust workers, which execute the load testing tasks described above. The Locust workers are deployed by a single replication controller that creates ten pods, which are spread out across the Kubernetes cluster. Each pod uses environment variables to control important configuration information such as the hostname of the system under test and the hostname of the Locust master. The configuration of the workers' replication controller can be found in the tutorial below. The configuration contains the name of the controller (locust-worker), labels for organization (name: locust, role: worker), and the previously described environment variables. The following snippet contains the configuration for the name, labels, and number of replicas:
kind: ReplicationController
apiVersion: v1
metadata:
  name: locust-worker
  labels:
    name: locust
    role: worker
spec:
  replicas: 10
  selector:
    name: locust
    role: worker
...
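Inside each worker container, a startup script can turn those environment variables into a Locust command line. The sketch below shows one way this might be done. The variable names TARGET_HOST and LOCUST_MASTER and the locustfile path are assumptions, and the flags shown are those of Locust 1.0 and later (earlier releases used --slave instead of --worker).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerConfig:
    target_host: str  # hostname or URL of the system under test
    master_host: str  # hostname of the Locust master service


def load_config(env=None):
    """Read worker settings from the environment (variable names assumed)."""
    env = os.environ if env is None else env
    missing = [key for key in ("TARGET_HOST", "LOCUST_MASTER") if not env.get(key)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return WorkerConfig(env["TARGET_HOST"], env["LOCUST_MASTER"])


def worker_command(cfg, locustfile="locustfile.py"):
    """Build the argument list for starting Locust in worker mode."""
    return [
        "locust",
        "-f", locustfile,
        "--worker",
        "--master-host=" + cfg.master_host,
        "--host=" + cfg.target_host,
    ]


example = load_config({"TARGET_HOST": "http://example.test",
                       "LOCUST_MASTER": "locust-master"})
print(" ".join(worker_command(example)))
```

Failing fast on a missing variable makes a misconfigured pod easy to spot in its logs, instead of leaving a worker that starts but never connects to the master.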
For the Locust workers, no additional service needs to be deployed because the worker pods themselves do not need to support any inbound communication—they connect directly to the Locust master pod.
The following diagram shows the relationship between the Locust master and the Locust workers.
After the replication controller deploys the Locust workers, you can return to the Locust master web interface and see that the number of slaves (the web interface's term for workers) corresponds to the number of deployed workers.
Executing load testing tasks
Starting the load testing
The Locust master web interface enables you to execute the load testing tasks against the system under test, as shown in the following image:
To begin, specify the total number of users to simulate and the rate at which users should be spawned. Next, click Start swarming to begin the simulation. As time progresses and users are spawned, you will see statistics begin to aggregate for simulation metrics, such as the number of requests and requests per second, as shown in the following image:
To stop the simulation, click Stop and the test will terminate. The complete results can be downloaded into a spreadsheet.
Scaling up the number of simulated users requires an increase in the number of Locust worker pods. As specified in the Locust worker controller, the replication controller deploys 10 Locust worker pods. Kubernetes lets you resize a replication controller without redeploying it, for example by using the kubectl command-line tool. The following command scales the pool of Locust worker pods to 20:
$ kubectl scale --replicas=20 replicationcontrollers locust-worker
After you issue the scale command, wait a few minutes for all of the pods to be deployed and started. After all the pods have started, return to the Locust master web interface and restart the load testing.
Resources and cost
This solution uses four Kubernetes Engine nodes, each one backed by a Compute
Engine VM standard instance of type
n1-standard-1. You can use the Google Cloud
Platform Pricing Calculator to get an estimate of the monthly cost of running
this container cluster. As discussed previously, you can customize the size of
the container cluster to scale it to your needs. The pricing calculator lets
you customize the cluster characteristics to get an estimate of what the
cost would be to scale up or down.
You've now seen how to use Kubernetes Engine to create a load testing framework for a simple web application. Kubernetes Engine allows you to specify the number of container nodes that provide the foundation for your load testing framework. Kubernetes Engine also allows you to organize your load testing workers into pods, and to declare how many pods you want Kubernetes Engine to keep running.
You can use this same pattern to create load testing frameworks for a variety of different scenarios and applications. For example, you can use this pattern to create load testing frameworks for messaging systems, data stream management systems, and database systems. You can create new Locust tasks or even switch to a different load testing framework.
Another way to extend the framework presented in this solution is to customize the metrics that are collected. For example, you might want to measure the requests per second, or monitor the response latency as load increases, or check the response failure rates and types of errors. There are several monitoring options available, including Google Cloud Monitoring.
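As a rough sketch of what such custom metrics involve, the following example summarizes a list of request samples into requests per second, failure rate, and latency percentiles. The Sample record, the nearest-rank percentile method, and the sample values are assumptions made for illustration. A real setup would take these values from the load testing tool or a monitoring service.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    latency_ms: float
    ok: bool  # False when the request failed


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples, duration_s):
    """Aggregate request samples collected over duration_s seconds."""
    if not samples or duration_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    latencies = [s.latency_ms for s in samples]
    failures = sum(1 for s in samples if not s.ok)
    return {
        "requests_per_second": len(samples) / duration_s,
        "failure_rate": failures / len(samples),
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
    }


# Ten made-up samples: latencies of 10..100 ms, with the slowest one failing.
samples = [Sample(latency_ms=10.0 * i, ok=(i != 10)) for i in range(1, 11)]
print(summarize(samples, duration_s=5.0))
```

Comparing these summaries across runs with different numbers of simulated users shows how latency and failure rate change as load increases.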
The complete contents of the tutorial, including instructions and source code, are available on GitHub at https://github.com/GoogleCloudPlatform/distributed-load-testing-using-kubernetes.