Scaling read-only workloads on Cloud SQL for PostgreSQL with HAProxy
Staff Database Engineer, Cloud SQL
Cloud SQL is Google Cloud’s fully managed relational database service for MySQL, PostgreSQL and SQL Server workloads. Cloud SQL enables customers to implement horizontal scaling of read-only workloads using read replicas. In this blog post, we will demonstrate a single-endpoint read scaling and load balancing solution using Cloud SQL for PostgreSQL and HAProxy, with automatic replica detection.
Key building blocks:
Cloud SQL for PostgreSQL database with one or more read replicas.
Google Compute Engine (GCE) instance running HAProxy and client programs.
Shell scripts to facilitate automatic HAProxy configuration updates when read replicas are added or removed from the Cloud SQL instance.
Architecture diagram of the test setup:
Figure 1: Cloud SQL with HAProxy for read-only load balancing (test setup)
This section discusses the key characteristics of the solution.
Load balancer mode of operation
We will configure the HAProxy load balancer to operate in TCP mode. It can be thought of as a “pass through” mode from a database connection perspective. The load balancer doesn’t store database credentials (except for the health check user), and it doesn’t pool or decrypt/re-encrypt database connections. A single client connection in HAProxy translates to a single client connection in Postgres.
This approach is suitable when the workload is constrained by the database’s processing capacity, and not by the number of client connections. You may require an additional connection pooling component (e.g. PgBouncer) if the number of clients becomes an issue, for example, when the database instances exhibit performance or stability issues due to the sheer number of simultaneous database connections.
Query vs. connection load balancing
Our HAProxy setup will use connection-level load balancing, that is:
All queries executed through a particular database connection will reach the same read replica.
A client must disconnect and reconnect in order to potentially reach a different replica.
HAProxy will be configured to use the “leastconn” connection routing algorithm instead of the default “roundrobin” algorithm. The default “roundrobin” algorithm allocates new connections quasi-randomly and without considering the number of existing connections on each replica. This can lead to workload imbalance due to some replicas receiving more connections than others. By contrast, the “leastconn” algorithm works by routing each new connection to the replica with the least number of existing connections, effectively trying to keep the number of connections equal on all replicas. This approach helps spread the query load more evenly among the replicas.
Note that in this context, “existing connections” are the client sessions handled by the load balancer itself. HAProxy has no knowledge of other connections that may have been opened directly against the replicas (bypassing the load balancer).
The solution provides load balancing for read-only connections, but not read/write connections. The read/write traffic separation must be handled by the application at a connection level. Read-only connections can connect to the HAProxy load balancer instance, whereas the connections that intend to execute read/write queries should connect directly to the Cloud SQL primary instance.
HAProxy will execute SQL-level health checks against each read replica in the Cloud SQL instance, and new client connections will be distributed among the healthy replicas. When a replica becomes unhealthy/unresponsive, connections that exist on that replica will not be automatically migrated or re-balanced onto the remaining replicas. Upon detecting a connection failure, the client should be able to reach one of the remaining healthy replicas simply by reconnecting to HAProxy.
Clients executing write queries should connect directly to the Cloud SQL primary instance. If those clients require high availability, using the Cloud SQL high availability (HA) configuration as described in Cloud SQL documentation.
In this demonstration, the HAProxy instance itself is not configured for high availability. Production applications should consider deploying the load balancer instances in a highly available configuration such as an instance group behind a Google Cloud load balancer. See Google Compute Engine documentation for details.
Example architecture diagram of a real-world deployment:
Figure 2: Cloud SQL with HAProxy for read-only load balancing (example production deployment)
Instance topology changes
Cloud SQL allows customers to add and remove read replicas quickly and easily. Correspondingly, this demonstration assumes that the load balancer configuration should not require manual intervention when replicas are added or removed. We will use a custom configuration script to describe instance topology, enumerate the available replicas, and reload HAProxy configuration automatically.
Notes and prerequisites
Instructions involving compute instances were tested on Google Compute Engine using Debian GNU/Linux 10 buster operating system image. Software installation steps and config/log file locations may vary on different systems.
Instructions involving Cloud SQL instances were tested using PostgreSQL 13, but should work with any modern PostgreSQL release.
We recommend that you launch GCE and Cloud SQL instances in the same region, VPC, and network to simplify the setup, and to allow instances to communicate with each other using private IP addresses. If you choose to use a more complex network layout (e.g. using multiple VPCs), make sure that the network and firewall configuration allows connections from the HAProxy instance to the Cloud SQL replicas using private IPs.
Setting up the Cloud SQL database
Launch a Cloud SQL for PostgreSQL instance with two read replicas. A single-zone setup with 1 vCPU, 3.75GB of memory, and 100GB of SSD storage will be sufficient for this demonstration. Make sure that the “Private IP” feature is enabled for all instances. You can disable the “Public IP” feature, we will not use it here.
Once launched, your initial setup will look similar to this:
Figure 3: Google Cloud Console showing a Cloud SQL instance with two read replicas
Use any of the supported methods to create Cloud SQL database users:
haproxy_healthcheckuser to handle the health check queries from HAProxy.
test_clientuser for client connections passing through HAProxy.
Make note of the user passwords, they will be required later.
You can confirm the successful creation of user accounts in the Google Cloud Console, under the “Users” section for the Cloud SQL primary instance. It should look like this:
Figure 4: Google Cloud Console showing the database roles used for HAProxy setup and testing
Note: The Cloud SQL database comes with a default user named
postgres. This is a privileged account with sufficient permissions to handle both the health checks and the client connections. You may use that account to simplify the setup. Production deployments should follow the least privilege rule and use a separate unprivileged user account for health checks.
Setting up the HAProxy instance
Launch a GCE instance using Debian GNU/Linux 10 operating system image. An e2-medium machine type (2 vCPUs, 4GB of memory) will be sufficient for this demonstration. Connect to the instance via SSH to install the required software and verify basic database connectivity as described below.
Install the Postgres client/server programs and other required utilities:
Don’t worry about the installation of the PostgreSQL server package on the load balancer instance. We only need the
pgbench tool included in the package, and we won’t use the server itself. The server might start automatically during installation, but you can safely stop it afterwards:
pg_ctlcluster 13 main stop
apt-get install haproxy
Note: This example uses HAProxy version 1.8, which is the default version provided by the OS package manager. This simplifies setup and is sufficient to demonstrate the solution. Production deployments should consider using the latest stable HAProxy version instead. If using Debian, the haproxy.debian.net website provides instructions for installing the desired HAProxy version.
Choose one of the Cloud SQL read replicas created earlier, and use the
psql tool to confirm that you can connect to the replica and run a
SELECT 1 query as the health check user:
The Google Cloud SDK comes pre-installed with the GCE’s default Debian OS image. You confirm that by running the
The command should return a few lines describing the installed software versions. An error message such as
gcloud: command not found indicates that the SDK is not installed. Follow the documentation to install it: Installing Cloud SDK.
Ensure that the SDK is initialized and authorized to call the
gcloud sql instances describe API against the Cloud SQL instance created earlier. Consult the Cloud SDK documentation if needed: Initializing Cloud SDK.
With the SDK installed and authorized, you should be able to describe your Cloud SQL instance like so:
gcloud sql instances describe
For example, this command describes the primary Cloud SQL instance created earlier, and uses the
jq JSON parsing tool to extract the list of read replicas from the
Our HAProxy configuration will consist of two components:
Static settings, which are HAProxy configuration directives as well as frontend/backend directives that are not subject to frequent changes.
Dynamic settings, which are the backend configuration directives that must be updated whenever there’s a change in read replica topology e.g. when replicas are added or removed from the Cloud SQL instance.
For example, the snippet below contains both static and dynamic settings. The directives highlighted in green are set by hand and we don’t expect to modify them often, if ever. Directives highlighted in blue require modification each time a read replica is added, removed, or modified in a way that changes the replica’s private IP address. Those settings are considered dynamic, and we don’t want to change them manually.
The HAProxy configuration will be managed as follows:
Static settings will be stored in regular text files.
Dynamic settings will be generated by a script, using replica information obtained via the
gcloud sql instances describeAPI.
Static and dynamic settings will be merged to produce a final configuration file. The HAProxy configuration will be reloaded in-place, without the need for a full shutdown and restart of the load balancer.
We will implement steps 2 and 3 through a shell script executed on a schedule using the
cron system tool.
Create the following files in HAProxy’s main directory (in our case, /var/lib/haproxy):
pgcheck.sh: shell script used by HAProxy to run a health check query against read replicas.
cfgupdate.sh: shell script responsible for describing instance topology, generating HAProxy backend settings, and reloading HAProxy with those settings.
config_global.cfg, config_backend.cfg: files containing static HAProxy configuration.
The initial contents for each file are included below.
After creating the files, make configuration adjustments:
In pgcheck.sh: provide the connection configuration for the health check user i.e. username, database, and password.
In cfgupdate.sh: provide the name of the primary Cloud SQL instance.
You shouldn’t need to modify static HAProxy settings to complete the demonstration, but production deployments may need to adjust health check timeout and retry settings to achieve the desired sensitivity. You can modify static settings in config_global.cfg and config_backend.cfg.
After providing the required configuration values in pgcheck.sh and cfgupdate.sh, execute the cfgupdate.sh script manually with a
dryrun argument. In a dry run mode, the script will attempt to generate HAProxy configuration into a temporary file, but it will not restart/reload HAProxy or replace the main HAProxy configuration file. This lets you test the script and review the configuration before it’s actually applied:
Any errors will be either printed out or written into the log file. For example, the error below indicates that the configured Cloud SQL instance name is incorrect e.g. it’s a replica instead of a primary, or a primary without replicas:
A successful script execution will look similar to this, with a few notices but no errors in the output or the log file:
You can now find the auto-generated
server directives at the bottom of the configuration file:
Finally, we’re ready to run the script in regular mode to replace and reload HAProxy configuration. Depending on whether HAProxy is already running, the output will state “Starting HAProxy” or “Reloading HAProxy”:
HAProxy should now be running and listening on the Postgres port (5432). HAProxy stats should also show a backend with replicas marked “UP”, indicating a successful SQL health check:
If HAProxy is not running or not listing any servers with status “UP”, review the log file (typically /var/log/haproxy.log) for error messages and troubleshooting information.
With HAProxy running and replicas up, proceed to the next section.
Verifying basic connectivity
Connect to the HAProxy instance via SSH.
psql tool to connect to the private IP of the load balancer instance. You can connect as the
test_client user created earlier. A successful connection will greet you with a Postgres command line prompt:
Now obtain HAProxy statistics. Notice how the
scur field (current sessions) now reports a value of “1” for one of the replicas. This is the replica your
psql client program is connected to via HAProxy:
Opening one connection is not particularly exciting, so let’s try a slightly different approach by opening 20 simultaneous connections in the background. Each connection executes a
pg_sleep function to keep the session active for 30 seconds:
Within 30 seconds of running the command, check HAProxy statistics again. This time, the session counters should show a total of 20 connections distributed between the two replicas:
The current sessions counter (
scur) should drop back to zero after the sleep functions expire and the client sessions are closed.
Testing load balancing with pgbench
Connect to the HAProxy instance via SSH.
Connect to the Cloud SQL primary instance and create the database objects needed for testing:
Populate the test table with a random data sample:
We can now use a query such as this to simulate read-only workload. The query is relatively CPU-intensive, which helps demonstrate the effectiveness of read scaling through HAProxy:
Put the test query in a SQL script that
pgbench will use:
pgbench for 5 minutes (300 seconds) using 100 concurrent sessions. Each session will perform three steps in a loop: connect to the database via HAProxy, execute the query, disconnect,
If you wish to run the benchmark with different settings, the
pgbench documentation contains detailed descriptions of the available parameters.
While the test is running, you can observe HAProxy statistics as described in previous sections. The number of current HAProxy sessions will fluctuate as the clients connect and disconnect, but the traffic should be divided more or less equally between the two replicas:
At the end of the run,
pgbench will produce a summary similar to the one below. In this example, we’ve achieved ~52 transactions (queries) per second using the two replicas:
To confirm that both replicas were utilized, open the Google Cloud Console and select the replicas:
Figure 5: Selecting Cloud SQL read replicas in the Google Cloud Console
Once you select the replicas, several monitoring charts will appear in the information panel on the right hand side. Consult the “CPU utilization” and “Active connections” charts to confirm that both replicas were equally utilized during the test. Example:
Figure 6: Information panel showing CPU utilization for the selected read replicas
Now let’s spin up a third replica and wait for it to become available:
Figure 7: Cloud SQL instance after adding the third read replica
Use the cfgupdate.sh script to update HAProxy configuration:
pgbench again and review the output. In this example, we’ve managed to reach ~74 queries per second with three replicas, compared to the earlier result of ~52 queries per second with two replicas:
The load distribution can be confirmed using Cloud SQL metrics and HAProxy statistics as described previously.
Enabling dynamic scaling with automatic HAProxy configuration updates
At this point, we have all the building blocks necessary to enable dynamic workload scaling using the HAProxy configuration script. We’ve already demonstrated that the script itself is capable of detecting changes in replica topology and reloading HAProxy configuration accordingly. All that's left is to execute the script automatically on a schedule, so that replica changes are reflected in HAProxy configuration without having to run the script manually.
To begin, insert the following line into the HAProxy instance’s crontab (/etc/crontab) to run the script automatically every minute:
You can check the timestamp in the script’s log file to confirm that the code is being executed at the top of each minute:
With the script running on schedule, you can continue to run tests using
pgbench. During your tests, remove one of the replicas e.g. haproxy-test-replica-3. As the replica is shut down, the HAProxy health check will start failing almost immediately, and the replica will be marked as “down”:
The replica will eventually disappear from the configuration script log as well:
You can add the third replica back to the setup, and HAProxy will pick it up automatically within a few minutes:
You can continue the experiment and run additional tests while adding and removing replicas. You should be able to confirm that the changes are reflected in HAProxy configuration within a few minutes.As an example, this is a record of the TPS (transactions per second) metric reported by
pgbenchduring another test. The test was run with three replicas, then two replicas, then three replicas again. Workload scaling happened automatically thanks to the configuration script.
Figure 8: Workload scaling using HAProxy with automatic configuration updates
Production deployments may consider implementing the following advanced features:
Template-based deployment of load balancer instances. When managing a larger fleet of load balancers, you can create a common machine image used to provision HAProxy instances instead of configuring each instance manually. Instance-specific settings (e.g. the name of the primary instance to use in HAProxy configuration script) can be provided using Google Compute Engine features such as startup scripts and custom VM metadata.
Advanced credential storage. The credentials of the health check user must be available on each HAProxy instance. You can use services such as the Secret Manager or Cloud SQL features such as IAM database authentication to avoid hard-coding user credentials inside the health check file.
Advanced replica detection logic. The configuration script provided with this demonstration covers basic detection of read replicas under a single Cloud SQL primary instance. You can introduce advanced detection features based on replica characteristics such as instance names or labels. For example, you can modify the script to use instance labels to include or exclude certain replicas from HAProxy configuration.
Support for multiple replica pools. The configuration script provided with this demonstration generates settings for a single HAProxy backend. You can extend the script to expose several HAProxy backends within a single HAProxy installation. By doing so, you can load balance traffic for multiple Cloud SQL databases with a single HAProxy instance.
Connection pooling. The solution demonstrated here doesn’t provide connection pooling, and thus doesn’t reduce the total number of connections handled by read replica instances. This is acceptable when the workload is constrained by the database’s processing capacity, and not by the number of client connections. If the number of clients becomes so large that the replicas begin experiencing issues due to process contention, you might require an additional connection pooling component (such as PgBouncer) working in concert with HAProxy. Connection pooling software and HAProxy can be run on the same instance for efficiency reasons.
Alternative solutions. HAProxy (optionally with PgBouncer) is one of the easiest ways to implement PostgreSQL load balancing (and connection pooling), but it’s not the only way to do so. Here are some of the examples of alternative solutions to consider:
Pgpool-II for both pooling and load balancing. Pgpool-II can be a bit more complex to set up and manage compared with HAProxy and PgBouncer, but it can be a powerful tool in the hands of an advanced user.
Pgpool-II (for load balancing) with PgBouncer (for pooling).
HAProxy (for load balancing) with connection pooling provided by client-side drivers.