Using load balancing for highly available applications

This tutorial explains how to use load balancing with a regional managed instance group to redirect traffic away from busy or unavailable VM instances, allowing you to provide high availability even during a zonal outage.

A regional managed instance group distributes an application on multiple instances across multiple zones. A global load balancer directs traffic across multiple zones via a single IP address, preventing users from being directed to a zone where your application is unavailable. By using both of these services to distribute your application across multiple zones, you can help ensure that your application is available even in extreme cases, like a zonal disruption.

Load balancers can be used to direct a variety of traffic types. This tutorial shows you how to create a global load balancer that directs external HTTP traffic, but much of the content of this tutorial is still relevant to other types of load balancers. To learn about other types of traffic that can be directed with a load balancer, see Types of Cloud Load Balancing.

This tutorial includes detailed steps for launching a web application on a regional managed instance group, configuring network access, creating a load balancer for directing traffic to the web application, and observing the load balancer by simulating a zonal outage. Depending on your experience with these features, this tutorial takes about 45 minutes to complete.

Objectives

  • Launch a demo web application on a regional managed instance group.
  • Configure a global load balancer that directs HTTP traffic across multiple zones.
  • Observe the effects of the load balancer by simulating a zonal outage.

Costs

This tutorial uses billable components of GCP including:

  • Compute Engine

Use the pricing calculator to generate a cost estimate based on your projected usage. New GCP users might be eligible for a free trial.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a GCP project.

    Go to the project selector page

  3. Make sure that billing is enabled for your Google Cloud Platform project. Learn how to enable billing.

Application architecture

The application includes the following Compute Engine components:

  • VPC network: a virtual network within GCP that can provide global connectivity via its own routes and firewall rules.
  • Firewall rule: a GCP firewall lets you allow or deny traffic to your instances.
  • Instance template: a template used to create each VM instance in the managed instance group.
  • Regional managed instance group: a group of VM instances running the same application across multiple zones.
  • Global static external IP address: a static IP address that is accessible on external networks and can be attached to a global resource.
  • Global load balancer: an application that directs traffic across multiple zones via a single (global) IP address; it is designed to prevent users from being directed to a zone where your application is unavailable.
  • Health check: a policy used by the load balancer to evaluate the responsiveness of the application on each VM instance.

Launching the web application

This tutorial uses a web application that is stored on GitHub. If you would like to learn more about how the application was implemented, see the Google Cloud Platform GitHub repository.

Launch the web application on every VM in an instance group by including a startup script in an instance template. Additionally, run the instance group in a dedicated VPC network to keep this tutorial's firewall rules from interfering with any existing resources running in your project.

Create a VPC network

Using a VPC network protects existing resources in your project from being affected by the resources that you will create for this tutorial. A VPC network is also required to restrict incoming traffic so that it must go through the load balancer.

Create a VPC network to encapsulate the firewall rules for the demo web application:

  1. Go to the VPC networks page in the GCP Console.
    Go to the VPC networks page
  2. Click Create VPC Network.
  3. Under Name, enter web-app-vpc.
  4. Set Subnet creation mode to Automatic.

  5. At the bottom of the page, click Create.

Wait until the VPC network has finished being created before continuing.
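
If you prefer the command line, the console steps above can be approximated with gcloud. The following is a minimal sketch, assuming the Cloud SDK is installed (as it is in Cloud Shell) and a default project is configured. The function name create_web_app_network is a hypothetical wrapper introduced here for illustration.

```shell
# Hypothetical helper (not part of the original tutorial).
# Creates the tutorial's VPC network with automatically created subnets.
# Assumes gcloud is installed and authenticated for the target project.
create_web_app_network() {
  gcloud compute networks create web-app-vpc \
    --subnet-mode=auto
}
```

Run create_web_app_network, then wait for the command to return before continuing.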

Create a firewall rule

After the VPC network is created, set up a firewall rule to allow HTTP traffic to the VPC network:

  1. Go to the Firewalls page in the GCP Console.
    Go to the Firewalls page
  2. Click Create firewall rule.
  3. Under Name, enter allow-web-app-http.
  4. Set Network to web-app-vpc.
  5. Under Targets, select All instances in the network.
  6. Set Source filter to IP ranges.
  7. Under Source IP ranges, enter 0.0.0.0/0 to allow access for all IP addresses.
  8. Under Ports and protocols, select Specified protocols and ports.
    Check tcp and enter 80 to allow access for HTTP traffic.
  9. Click Create.
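
The same rule can be sketched with gcloud. This assumes the web-app-vpc network already exists. Because no target tags are given, the rule applies to all instances in the network, which mirrors the console selection above. The function name is hypothetical.

```shell
# Hypothetical helper: allows inbound HTTP (tcp:80) from any address
# to every instance in the web-app-vpc network.
create_web_app_firewall_rule() {
  gcloud compute firewall-rules create allow-web-app-http \
    --network=web-app-vpc \
    --direction=INGRESS \
    --allow=tcp:80 \
    --source-ranges=0.0.0.0/0
}
```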

Create an instance template

Create a template that you will use to create a group of VM instances. Each instance created from the template launches a demo web application via a startup script.

  1. Go to the Instance templates page in the GCP Console.
    Go to the Instance templates page
  2. Click Create instance template.
  3. Under Name, enter load-balancing-web-app-template.
  4. Under Machine configuration, set the Machine type to f1-micro.
  5. Under Boot disk, set the Image to Debian GNU/Linux 9 (stretch).
  6. Under Firewall, select Allow HTTP traffic.
  7. Click Management, security, disks, networking, sole tenancy to reveal the advanced settings.
  8. Click the Management tab. Under Automation, enter the following Startup script.

    sudo apt-get update && sudo apt-get install git gunicorn3 python3-pip -y
    git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
    cd python-docs-samples/compute/managed-instances/demo
    sudo pip3 install -r requirements.txt
    sudo gunicorn3 --bind 0.0.0.0:80 app:app --daemon
    

    The script gets, installs, and launches the web application at instance startup.

  9. Click the Networking tab. Under Network, select web-app-vpc. This forces each instance created with this template to run on the previously created network.

  10. Click Create.

Wait until the template has finished being created before continuing.
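
A rough gcloud equivalent of the template is sketched below. Several details are assumptions rather than part of the original steps: the startup script from step 8 is expected in a local file (startup.sh is a hypothetical default name), the debian-9 image family in the debian-cloud project is assumed to correspond to the Debian 9 image and might no longer be offered, and the http-server tag is assumed to be the tag that the Allow HTTP traffic checkbox applies.

```shell
# Hypothetical helper: creates the instance template used by the instance group.
# $1: path to a local file holding the startup script (defaults to startup.sh).
create_web_app_template() {
  startup_script="${1:-startup.sh}"
  gcloud compute instance-templates create load-balancing-web-app-template \
    --machine-type=f1-micro \
    --image-family=debian-9 \
    --image-project=debian-cloud \
    --network=web-app-vpc \
    --tags=http-server \
    --metadata-from-file=startup-script="$startup_script"
}
```

For example, after saving the startup script to a file, run create_web_app_template startup.sh.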

Create a regional managed instance group

To run the web application, use the instance template to create a regional managed instance group:

  1. Go to the Instance groups page in the GCP Console.
    Go to the Instance groups page
  2. Click Create instance group.
  3. Under Name, enter load-balancing-web-app-group.
  4. Under Location, select Multiple zones.

  5. Under Region, select us-central1.

  6. Click the Configure zones drop-down menu to reveal Zones. Select the following zones:

    • us-central1-b
    • us-central1-c
    • us-central1-f
  7. Under Instance template, select load-balancing-web-app-template.

  8. Under Autoscaling, select Off.

  9. Set Number of instances to 6.

  10. Under Instance redistribution, select On.

  11. Under Autohealing and Health check, select No health check.

  12. Click Create. This redirects you back to the Instance groups page.

  13. To verify that your instances are running the demo web application correctly:

    1. From the Instance groups page, click load-balancing-web-app-group to see the instances in that group.
    2. Under External IP, click an IP address to connect to that instance. A new browser tab opens displaying the demo web application:

      Screenshot of the demo web application, which lists information about the instance and has action buttons.

      When you are done, close the browser tab for the demo web application.
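
As a command-line sketch of the same group, assuming the template above exists: six instances are requested across the three zones, and autoscaling and autohealing are simply left unconfigured. The function names are hypothetical.

```shell
# Hypothetical helper: creates the regional managed instance group
# with 6 instances spread over three us-central1 zones.
create_web_app_group() {
  gcloud compute instance-groups managed create load-balancing-web-app-group \
    --region=us-central1 \
    --zones=us-central1-b,us-central1-c,us-central1-f \
    --template=load-balancing-web-app-template \
    --size=6
}

# Hypothetical helper: lists the group's instances with their zones and status.
list_web_app_instances() {
  gcloud compute instance-groups managed list-instances load-balancing-web-app-group \
    --region=us-central1
}
```

Running list_web_app_instances after creation is one way to confirm that instances were placed in each selected zone.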

Configuring the load balancer

To use a load balancer to direct traffic to your web application, you must reserve an external IP address to receive all incoming traffic. Then, create a load balancer that accepts traffic from that IP address and redirects that traffic to the instance group.

Reserve a static IP address

Use a global static external IP address to provide the load balancer with a single point of entry for receiving all user traffic. Compute Engine preserves static IP addresses even if you change or delete any affiliated GCP resources. This allows the web application to always have the same entry point, even if other parts of the web application might change.

  1. Go to the External IP addresses page in the GCP Console.
    Go to the External IP addresses page
  2. Click Reserve static address.
  3. Under Name, enter web-app-ipv4.
  4. Set IP version to IPv4.
  5. Set Type to Global.
  6. Click Reserve.
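
The reservation can also be sketched with gcloud. The function names below are hypothetical; the second one prints only the reserved address, which can be handy later when testing the load balancer.

```shell
# Hypothetical helper: reserves a global static external IPv4 address.
reserve_web_app_ip() {
  gcloud compute addresses create web-app-ipv4 \
    --ip-version=IPV4 \
    --global
}

# Hypothetical helper: prints the reserved IP address and nothing else.
show_web_app_ip() {
  gcloud compute addresses describe web-app-ipv4 \
    --global \
    --format="get(address)"
}
```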

Create a load balancer

This section explains the steps required to create a global load balancer that directs HTTP traffic.

This load balancer uses a frontend to receive incoming traffic and a backend to distribute this traffic to healthy instances. Because the load balancer is made of multiple components, this task is divided into several parts:

  • Backend configuration
  • Frontend configuration
  • Review and finalize

Complete all the steps to create the load balancer.

  1. Go to the Create a load balancer page in the GCP Console.
    Go to the Create a load balancer page
  2. Under HTTP(S) Load Balancing, click Start configuration.
  3. Under Internet facing or internal only, select From Internet to my VMs. Then, click Continue.
  4. For the Name of the load balancer, enter web-app-load-balancer.

Backend configuration

  1. In the left panel of the New HTTP(S) load balancer page, click Backend configuration.
  2. Click Create or select backend services & backend buckets to open a drop-down menu. Click Backend services, then click Create a backend service.
  3. In the new window, for the Name of the backend service, enter web-app-backend.
  4. Set Instance group to load-balancing-web-app-group.
  5. Set Port numbers to 80. This allows HTTP traffic between the load balancer and the instance group.
  6. Under Balancing mode, select Utilization.
  7. Click Done to create the backend.
  8. Create the health check for the backend of the load balancer:

    1. Under Health check, select Create a health check (or Create another health check) from the drop-down menu. A new window opens.
    2. In the new window under Name, enter web-app-load-balancer-check.
    3. Set the Protocol to HTTP.
    4. Under Port, enter 80.
    5. For this tutorial, set the Request path to /health, which is a path that the demo web application is set up to respond to.
    6. Set the following Health criteria:

      1. Set Check interval to 3 seconds. This defines the amount of time from the start of one probe to the start of the next one.
      2. Set Timeout to 3 seconds. This defines the amount of time that GCP waits for a response to a probe. Its value must be less than or equal to the check interval.
      3. Set Healthy Threshold to 2 consecutive successes. This defines the number of sequential probes that must succeed in order for the instance to be considered healthy.
      4. Set Unhealthy Threshold to 2 consecutive failures. This defines the number of sequential probes that must fail in order for the instance to be considered unhealthy.
    7. Click Save and continue to create the health check.

  9. Click Create to create the backend service.
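
The backend configuration above can be approximated with gcloud as follows. This is a sketch under assumptions: the named port http is a label introduced here so that the backend service can refer to port 80 on the instance group, and the function name is hypothetical. The health check values match the health criteria listed above.

```shell
# Hypothetical helper: builds the backend side of the load balancer.
# Steps, in order:
#   1. Name port 80 "http" on the instance group.
#   2. Create the HTTP health check that probes /health.
#   3. Create the global backend service that uses the health check.
#   4. Attach the regional instance group as a backend.
create_web_app_backend() {
  gcloud compute instance-groups set-named-ports load-balancing-web-app-group \
    --region=us-central1 \
    --named-ports=http:80 &&
  gcloud compute health-checks create http web-app-load-balancer-check \
    --port=80 \
    --request-path=/health \
    --check-interval=3s \
    --timeout=3s \
    --healthy-threshold=2 \
    --unhealthy-threshold=2 &&
  gcloud compute backend-services create web-app-backend \
    --protocol=HTTP \
    --port-name=http \
    --health-checks=web-app-load-balancer-check \
    --global &&
  gcloud compute backend-services add-backend web-app-backend \
    --instance-group=load-balancing-web-app-group \
    --instance-group-region=us-central1 \
    --balancing-mode=UTILIZATION \
    --global
}
```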

Frontend configuration

  1. In the left panel of the New HTTP(S) load balancer page, click Frontend configuration.
  2. On the Frontend configuration page, under Name, enter web-app-ipv4-frontend.
  3. Set the Protocol to HTTP.
  4. Set the IP version to IPv4.
  5. Set the IP address to web-app-ipv4.
  6. Set the Port to 80.
  7. Click Done to create the frontend.
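
On the command line, the frontend of an HTTP load balancer is made of a URL map, a target HTTP proxy, and a global forwarding rule. The sketch below assumes the backend service and static IP address from the earlier steps exist. The URL map name web-app-load-balancer is assumed to match the load balancer name, and the function name is hypothetical.

```shell
# Hypothetical helper: builds the frontend side of the load balancer.
# The URL map sends all requests to the backend service, the proxy uses
# the URL map, and the forwarding rule binds the static IP on port 80.
create_web_app_frontend() {
  gcloud compute url-maps create web-app-load-balancer \
    --default-service=web-app-backend &&
  gcloud compute target-http-proxies create web-app-load-balancer-target-proxy \
    --url-map=web-app-load-balancer &&
  gcloud compute forwarding-rules create web-app-ipv4-frontend \
    --address=web-app-ipv4 \
    --target-http-proxy=web-app-load-balancer-target-proxy \
    --ports=80 \
    --global
}
```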

Review and finalize

  1. Verify your load balancing settings before creating the load balancer:

    1. In the left panel of the New HTTP(S) load balancer page, click Review and finalize.
    2. On the Review and finalize page, verify the following Backend settings:

      • The Backend service is web-app-backend.
      • The Endpoint protocol is HTTP.
      • The Health check is web-app-load-balancer-check.
      • The Instance group is load-balancing-web-app-group.
    3. On the same page, verify that Frontend uses an IP address with a Protocol of HTTP.

  2. In the left panel of the New HTTP(S) load balancer page, click Create to finish creating the load balancer.

You might need to wait a few minutes for the load balancer to finish being created.
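
One way to check whether the load balancer is ready is to ask the backend service for the health of its instances. The sketch below assumes the resource names used in this tutorial; both function names are hypothetical.

```shell
# Hypothetical helper: reports the health state of each backend instance.
show_backend_health() {
  gcloud compute backend-services get-health web-app-backend \
    --global
}

# Hypothetical helper: prints only the load balancer's external IP address.
get_load_balancer_ip() {
  gcloud compute forwarding-rules describe web-app-ipv4-frontend \
    --global \
    --format="get(IPAddress)"
}
```

When show_backend_health reports the instances as healthy, requests to the address printed by get_load_balancer_ip should reach the demo web application.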

Simulating a zonal outage

You can observe the functionality of the load balancer by simulating the widespread unavailability caused by a zonal outage. This simulation works by forcing all of the instances located in a specified zone to report an unhealthy status on the /health request path. When these instances report an unhealthy status, they fail the load balancing health check, prompting the load balancer to stop directing traffic to these instances.
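
How quickly the load balancer reacts depends on the health criteria configured earlier. As a back-of-the-envelope sketch, assuming an instance changes state after the threshold number of consecutive probes spaced one check interval apart, and ignoring any additional propagation delay inside the load balancer, the detection time is roughly the interval multiplied by the threshold. The function name estimate_seconds is hypothetical.

```shell
# Hypothetical helper: rough time, in seconds, for the health checker to
# change an instance's state.
# $1: check interval in seconds, $2: consecutive-probe threshold.
# Simplifying assumption: probes are exactly one interval apart.
estimate_seconds() {
  interval="$1"
  threshold="$2"
  echo $(( interval * threshold ))
}

# Tutorial settings: 3-second interval, threshold of 2.
estimate_seconds 3 2   # prints 6
```

With this tutorial's settings the estimate is about 6 seconds in either direction, although the delay you observe can be longer.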

  1. Monitor which zones the load balancer is directing traffic to.

    1. Open a terminal by using Cloud Shell from the GCP Console.

      Open Cloud Shell

      Cloud Shell opens on the bottom of the GCP Console. It can take a few seconds for the session to initialize.

    2. Save the static external IP address of your load balancer:

      1. Get the external IP address from the frontend forwarding rule of the load balancer by entering the following command in your terminal:

        gcloud compute forwarding-rules list | grep web-app-ipv4-frontend
        

        Copy the EXTERNAL_IP_ADDRESS from the output:

        web-app-ipv4-frontend    EXTERNAL_IP_ADDRESS    TCP    web-app-load-balancer-target-proxy
        
      2. Create a local bash variable:

        export LOAD_BALANCER_IP=EXTERNAL_IP_ADDRESS
        

        where EXTERNAL_IP_ADDRESS is the external IP address that you copied.

    3. To monitor which zones the load balancer is directing traffic to, run the following bash script:

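      # Poll the load balancer forever and print the zone that served each response.
      while true
      do
              # Fetch the demo page through the load balancer's IP address.
              BODY=$(curl -s "$LOAD_BALANCER_IP")
              # Extract the instance name suffix and the zone from the returned HTML.
              NAME=$(echo -n "$BODY" | grep "load-balancing-web-app-group" | perl -pe 's/.+?load-balancing-web-app-group-(.+?)<.+/\1/')
              ZONE=$(echo -n "$BODY" | grep "us-" | perl -pe 's/.+?(us-.+?)<.+/\1/')
      
              echo "$ZONE"
      done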
      

      This script continuously attempts to connect to the web application via the IP address for the frontend of the load balancer, and outputs which zone the web application is running from for each connection.

      The resulting output should include zones us-central1-b, us-central1-c, and us-central1-f:

      us-central1-f
      us-central1-b
      us-central1-c
      us-central1-f
      us-central1-f
      us-central1-c
      us-central1-f
      us-central1-c
      us-central1-c
      

      Keep this terminal open.

  2. While your monitor is running, begin simulating the zonal outage.

    1. In Cloud Shell, open a second terminal session by clicking the Add button.
    2. Create a local bash variable for the project ID:

      export PROJECT_ID=PROJECT_ID
      

      where PROJECT_ID is the project ID for your current project, which is displayed on each new line in the Cloud Shell:

      user@cloudshell:~ (PROJECT_ID)$
      
    3. Create a local bash variable for the zone that you want to disable. To simulate a failure of zone us-central1-f, use the following command:

      export DISABLE_ZONE=us-central1-f
      

      Then, run the following bash script. This script causes the demo web application instances in the disabled zone to output unhealthy responses to the load balancer health check. Unhealthy responses prompt the load balancer to direct traffic away from these instances.

      export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($DISABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
      for i in $MACHINES;
      do
        NAME=$(echo "$i" | cut -f1 -d,)
        IP=$(echo "$i" | cut -f2 -d,)
        echo "Simulating zonal failure for zone $DISABLE_ZONE, instance $NAME"
        curl -q -s "http://$IP/makeUnhealthy" >/dev/null --retry 2
      done
      

      After a short delay, the load balancer stops directing traffic to the unhealthy zone, so the output from the first terminal window stops listing zone us-central1-f:

      us-central1-c
      us-central1-c
      us-central1-c
      us-central1-b
      us-central1-b
      us-central1-c
      us-central1-b
      us-central1-c
      us-central1-c
      

      This indicates that the load balancer is directing traffic only to the healthy, responsive instances.

      Keep both terminals open.

    4. In the second terminal, create a local bash variable for the zone that you want to restore. To restore traffic to zone us-central1-f, use the following command:

      export ENABLE_ZONE=us-central1-f
      

      Then, run the following bash script. This script causes the demo web application instances in the enabled zone to output healthy responses to the load balancer health check. Healthy responses prompt the load balancer to begin distributing traffic back toward these instances.

      export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($ENABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
      for i in $MACHINES;
      do
        NAME=$(echo "$i" | cut -f1 -d,)
        IP=$(echo "$i" | cut -f2 -d,)
        echo "Simulating zonal restoration for zone $ENABLE_ZONE, instance $NAME"
        curl -q -s "http://$IP/makeHealthy" >/dev/null --retry 2
      done
      

      After a few minutes, the output from the first terminal window gradually lists zone us-central1-f again:

      us-central1-b
      us-central1-b
      us-central1-c
      us-central1-f
      us-central1-c
      us-central1-c
      us-central1-b
      us-central1-c
      us-central1-f
      

      This indicates that the load balancer is directing incoming traffic to all zones again.

      Close both terminals when you have finished.
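
If you repeat the simulation, a per-zone tally can make the shift in traffic easier to see than a scrolling list. The following sketch introduces a hypothetical helper, tally_zones, that counts zone names read from standard input. The sample input is the outage-period output shown earlier.

```shell
# Hypothetical helper: reads zone names (one per line) on stdin and
# prints "count zone" pairs, sorted by zone name.
tally_zones() {
  grep -v '^$' | sort | uniq -c | awk '{ print $1, $2 }'
}

# Sample input: the nine responses listed while us-central1-f was disabled.
printf '%s\n' \
  us-central1-c us-central1-c us-central1-c \
  us-central1-b us-central1-b us-central1-c \
  us-central1-b us-central1-c us-central1-c | tally_zones
# prints:
# 3 us-central1-b
# 6 us-central1-c
```

To tally live traffic instead, the monitoring loop's output could be captured to a file for a fixed period and then passed to tally_zones.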

(Optional) Restricting incoming traffic

When you created the regional managed instance group, you could access each instance directly through its external ephemeral IP address. However, now that you have prepared a load balancer and static external IP address, you might want to modify the network firewall so that incoming traffic must go through the load balancer.

If you want to restrict incoming traffic to the load balancer, modify the network firewall so that instances can no longer be reached directly through their ephemeral external IP addresses.

  1. Edit the firewall rule to restrict HTTP traffic so that the web application can only be accessed via the load balancer:

    1. Go to the Firewalls page in the GCP Console.
      Go to the Firewalls page
    2. Under Name, click allow-web-app-http.
    3. Click Edit.
    4. Modify the Source IP ranges to only allow health check probes:

      1. Delete 0.0.0.0/0.
      2. On the same line, enter 130.211.0.0/22 and press Tab.
      3. On the same line, enter 35.191.0.0/16 and press Tab.
    5. Click Save.

  2. Verify that you cannot connect to the web application via the ephemeral external IP address for a specific instance:

    1. Go to the Instance groups page in the GCP Console.
      Go to the Instance groups page
    2. Click load-balancing-web-app-group to see the instances in that group.
    3. Under External IP, click an IP address to connect to that instance. A new browser tab opens, but the web application does not open. (Eventually, the page will show a timeout error).

      When you are done, close the browser tab for the instance.

  3. Verify that you can connect to the web application via the load balancer:

    1. Go to the Load balancing page in the GCP Console.
      Go to the Load balancing page
    2. Under Name, click web-app-load-balancer to expand the load balancer you just created.
    3. To connect to the web application via the static external IP address, look under Frontend and IP:Port, and copy the IP address. Then, open a new browser tab and paste the IP address into the address bar. This should display the demo web application:

      Screenshot of the demo web application.

      Notice that, whenever you refresh the page, the load balancer connects to different instances in different zones. This happens because you are not connecting to an instance directly; you are connecting to the load balancer, which selects the instance you are redirected to.

      When you are done, close the browser tab for the demo web application.
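
The firewall change in step 1 can also be sketched with gcloud. The update below replaces the rule's source ranges with the two ranges entered above. The function name is hypothetical.

```shell
# Hypothetical helper: limits inbound HTTP on the tutorial's firewall rule
# to the two source ranges entered in the console steps above.
restrict_web_app_firewall() {
  gcloud compute firewall-rules update allow-web-app-http \
    --source-ranges=130.211.0.0/22,35.191.0.0/16
}
```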

Cleaning up

After you've finished the load balancing tutorial, you can clean up the resources that you created on GCP so they won't take up quota and you won't be billed for them in the future. The following sections describe how to delete or turn off these resources.

If you created a separate project for this tutorial, delete the entire project. Otherwise, if the project has resources that you want to keep, only delete the resources created in this tutorial.

Deleting the project

  1. In the GCP Console, go to the Manage resources page.

    Go to the Manage resources page

  2. In the project list, select the project you want to delete and click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting specific resources

Deleting the load balancer

  1. Go to the Load balancing page in the GCP Console.
    Go to the Load balancing page
  2. Click the checkbox next to web-app-load-balancer.
  3. Click Delete at the top of the page.
  4. In the new window, select all checkboxes. Then, click Delete load balancer and selected resources to confirm the deletion.

Deleting the static external IP address

  1. Go to the External IP addresses page in the GCP Console.
    Go to the External IP addresses page
  2. Click the checkbox next to web-app-ipv4.
  3. Click Release static address at the top of the page. In the new window, click Delete to confirm the deletion.

Deleting the instance group

  1. In the GCP Console, go to the Instance groups page.

    Go to the Instance groups page

  2. Click the checkbox for your load-balancing-web-app-group instance group.
  3. Click Delete to delete the instance group.

Deleting the instance template

  1. Go to the Instance templates page in the GCP Console.

    Go to the Instance templates page

  2. Click the checkbox next to load-balancing-web-app-template.

  3. Click Delete at the top of the page. In the new window, click Delete to confirm the deletion.

Deleting the VPC network

  1. Go to the VPC networks page in the GCP Console.

    Go to the VPC networks page

  2. Click web-app-vpc.

  3. Click Delete at the top of the page. In the new window, click Delete to confirm the deletion.
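
As an alternative to the console, the resources can be deleted with gcloud in reverse order of their dependencies. This is a sketch under assumptions: the URL map and target proxy names are assumed to be web-app-load-balancer and web-app-load-balancer-target-proxy, the function name is hypothetical, and the network can only be deleted once every firewall rule in it has been removed, so any additional rule created in web-app-vpc would need to be deleted first.

```shell
# Hypothetical helper: deletes the tutorial's resources, frontend first and
# network last. --quiet skips the confirmation prompts, and the && chain
# stops at the first command that fails.
delete_web_app_resources() {
  gcloud compute forwarding-rules delete web-app-ipv4-frontend --global --quiet &&
  gcloud compute target-http-proxies delete web-app-load-balancer-target-proxy --quiet &&
  gcloud compute url-maps delete web-app-load-balancer --quiet &&
  gcloud compute backend-services delete web-app-backend --global --quiet &&
  gcloud compute health-checks delete web-app-load-balancer-check --quiet &&
  gcloud compute addresses delete web-app-ipv4 --global --quiet &&
  gcloud compute instance-groups managed delete load-balancing-web-app-group \
    --region=us-central1 --quiet &&
  gcloud compute instance-templates delete load-balancing-web-app-template --quiet &&
  gcloud compute firewall-rules delete allow-web-app-http --quiet &&
  gcloud compute networks delete web-app-vpc --quiet
}
```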
