Globally Autoscaling a Web Service on Compute Engine

This tutorial shows how to set up a globally available web service with regional Compute Engine managed instance groups that automatically scale to meet capacity needs. You can use the techniques shown in this tutorial for implementing your own globally distributed and scalable project on Compute Engine.

Objectives

  • Deploy multiple regional Compute Engine managed instance groups with autoscaling enabled.
  • Create a cross-region load balancer.
  • Generate test traffic from different regions across the globe.
  • Use the Google Cloud Platform Console to visualize how the load balancer routes requests and how the instance groups autoscale to meet demand.

Costs

This tutorial uses billable components of GCP including:

  • Compute Engine

Before you begin

  1. Select or create a GCP project.

    Go to the Manage resources page

  2. Make sure that billing is enabled for your project.

    Learn how to enable billing

  3. Enable the Compute Engine API.

    Enable the API

Application architecture

The application includes the following Compute Engine components:

  1. Instance template: A template used to create each instance in the instance groups.
  2. Instance groups: Multiple instance groups that autoscale based on incoming traffic.
  3. Load balancer: An HTTP load balancer that distributes traffic among the instance groups.
  4. Instances: Multiple testing instances to generate test traffic from different parts of the globe.

System architecture diagram showing a load balancer with multiple regional instance groups

Set up the web service

Create the instance groups

Console

  1. Create a network for the instance groups.

    1. Go to the VPC networks page in the GCP Console.
    2. Click Create VPC Network.
    3. Set the Name to fortressnet.
    4. Set Subnet creation mode to Automatic.
    5. Click Create at the bottom of the page.
  2. Create a firewall rule for the network. This rule will allow all HTTP requests sent to your instances.

    1. Go to the Firewall rules page in the GCP Console.
    2. Click Create Firewall Rule.
    3. Set the Name to fortressnet-allow-http.
    4. For Network select fortressnet.
    5. For Targets select All instances in the network.
    6. Set Source IP ranges to 0.0.0.0/0.
    7. Under Protocols and ports select tcp and enter 80.
    8. Click Create.
  3. Create an instance template. Include a startup script that starts up a simple Apache web server on each instance.

    1. Go to the Instance templates page in the GCP Console.
    2. Click Create instance template.
    3. Set the Name to fort-template.
    4. For Machine type select micro (f1-micro).
    5. Click Management, security, disks, networking, sole tenancy to reveal advanced settings. You should see a number of tabs.
    6. Click the Networking tab.
    7. For Network select fortressnet.
    8. Click the Management tab.
    9. Under Automation enter the following Startup script:

      apt-get update && apt-get install -y apache2
      

    10. Click Create at the bottom of the page.

  4. Create multiple regional managed instance groups using the instance template. Configure autoscaling for each instance group.

    1. Go to the Instance groups page in the GCP Console.
    2. Click Create instance group.
    3. Set the Name to us-central1-pool.
    4. For Location select Multi-zone.
    5. For Region select us-central1.
    6. For Instance template select fort-template.
    7. For Autoscaling select On.
    8. For Autoscale based on select HTTP load balancing usage.
    9. Set Target load balancing usage to 80.
    10. Set Minimum number of instances to 1.
    11. Set Maximum number of instances to 5.
    12. Click Create.
    13. Repeat these steps to create two more instance groups with the following changes:
      • Create a group with Name as europe-west1-pool and Region as europe-west1.
      • Create a group with Name as asia-east1-pool and Region as asia-east1.
  5. (Optional) Verify the instances are healthy and serving HTTP traffic. Test the external IP address of one or more instances. You might need to wait a minute for the instances to finish the startup process.

    1. Go to the VM instances page in the GCP Console.
    2. Verify each running instance has a green check mark under the Name column.
    3. Copy an instance's External IP and paste it into a web browser.

    You should see the 'Apache2 Debian Default Page' web page.

    If it doesn't seem to work, try waiting a few moments.

gcloud

  1. Create a network for the instance groups.

    gcloud compute networks create fortressnet --subnet-mode auto
    

  2. Create a firewall rule for the network. This rule will allow all HTTP requests sent to your instances.

    gcloud compute firewall-rules create fortressnet-allow-http \
        --network fortressnet \
        --allow tcp:80
    

  3. Create an instance template. Include a startup script that starts up a simple Apache web server on each instance.

    gcloud compute instance-templates create fort-template \
        --machine-type f1-micro \
        --network fortressnet \
        --metadata startup-script='apt-get update && apt-get install -y apache2'
    

  4. Create multiple regional managed instance groups using the instance template. Configure autoscaling for each instance group.

    gcloud compute instance-groups managed create us-central1-pool \
        --region us-central1 \
        --template fort-template \
        --size 1
    gcloud compute instance-groups managed set-autoscaling us-central1-pool \
        --region us-central1 \
        --min-num-replicas 1 \
        --max-num-replicas 5 \
        --scale-based-on-load-balancing \
        --target-load-balancing-utilization .8
    

    gcloud compute instance-groups managed create europe-west1-pool \
        --region europe-west1 \
        --template fort-template \
        --size 1
    gcloud compute instance-groups managed set-autoscaling europe-west1-pool \
        --region europe-west1 \
        --min-num-replicas 1 \
        --max-num-replicas 5 \
        --scale-based-on-load-balancing \
        --target-load-balancing-utilization .8
    

    gcloud compute instance-groups managed create asia-east1-pool \
        --region asia-east1 \
        --template fort-template \
        --size 1
    gcloud compute instance-groups managed set-autoscaling asia-east1-pool \
        --region asia-east1 \
        --min-num-replicas 1 \
        --max-num-replicas 5 \
        --scale-based-on-load-balancing \
        --target-load-balancing-utilization .8
    

  5. (Optional) Verify the instances are healthy and serving HTTP traffic. Test the external IP address of one or more instances. You might need to wait a minute for the instances to finish the startup process.

    1. List your instances.

      gcloud compute instances list
      

    2. Verify under the STATUS column that the instances are RUNNING.

    3. Check an instance by querying its IP address under the EXTERNAL_IP column.
      curl http://[EXTERNAL_IP] | head
      

    You should see some HTML text, including the line <title>Apache2 Debian Default Page: It works</title>.

    If it doesn't seem to work, try waiting a few moments.
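
The three create and set-autoscaling pairs in step 4 differ only by region, so they can be generated by a loop. The following is a minimal sketch, not part of the original procedure. The function name create_pools and the RUN variable are hypothetical helpers introduced here: RUN defaults to echo, so the commands are only printed until you set RUN to an empty string. The gcloud flags are the same ones used in step 4.

```shell
# Dry run by default: RUN=echo prints each command instead of executing it.
# Set RUN="" (empty) to actually run the commands.
RUN="${RUN-echo}"

# create_pools REGION... : one autoscaled regional managed instance group
# per region, named REGION-pool, using the fort-template instance template.
create_pools() {
    for region in "$@"; do
        $RUN gcloud compute instance-groups managed create "${region}-pool" \
            --region "$region" \
            --template fort-template \
            --size 1
        $RUN gcloud compute instance-groups managed set-autoscaling "${region}-pool" \
            --region "$region" \
            --min-num-replicas 1 \
            --max-num-replicas 5 \
            --scale-based-on-load-balancing \
            --target-load-balancing-utilization .8
    done
}

# Example:
#   create_pools us-central1 europe-west1 asia-east1
```

Printing the commands first makes it easy to compare them with step 4 before anything is created.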

Configure the load balancer

The load balancer will distribute client requests among your multiple backends.

Console

Starting the load balancer configuration

  1. Go to the Load balancing page in the GCP Console.
  2. Click Create load balancer.
  3. Under HTTP(S) load balancing click Start configuration.
  4. Set the Name as fortressnet-balancer.

Backend configuration

  1. On the New HTTP(S) load balancer page, click Backend configuration.
  2. In the Create or select backend services & backend buckets pull-down menu, select Backend services, then Create a backend service. You should see the Create Backend Service dialog box.
  3. Set the Name of the backend service to fortressnet-backend-service.
  4. Under the New backend dialog box, set Instance group to asia-east1-pool.
  5. For Balancing mode select Rate.
  6. Set Maximum RPS to 100 RPS per instance.
  7. Click Done.
  8. Click Add backend.
  9. Under the New backend dialog box, set Instance group to europe-west1-pool.
  10. For Balancing mode select Rate.
  11. Set Maximum RPS to 100 RPS per instance.
  12. Click Done.
  13. Click Add backend.
  14. Under the New backend dialog box, set Instance group to us-central1-pool.
  15. For Balancing mode select Rate.
  16. Set Maximum RPS to 100 RPS per instance.
  17. Click Done.
  18. Under Health check, select Create a health check.
  19. Set the Name to http-basic-check.
  20. For Protocol select HTTP.
  21. Set Port to 80.
  22. Click Save and continue.
  23. Click Create.

Host and path rules

  1. On the left panel of the New HTTP(S) load balancer page, click Host and path rules.
    For this example, we don't need to configure any host or path rules since all traffic will go to the default rule. So, we can accept the pre-populated default values.

Frontend configuration

  1. On the left panel of the New HTTP(S) load balancer page, click Frontend configuration.
  2. Set Name to fortressnet-http-rule.
  3. For IP version select IPv4.
  4. For IP address select Create IP address.
  5. In the Reserve a new static IP dialog box, set Name to fortressnet-ip.
  6. Click Reserve and wait a few moments.
  7. Click Done at the bottom of the New Frontend IP and port dialog box.
  8. Click Add frontend IP and port.
  9. Set Name to fortressnet-http-ipv6-rule.
  10. For IP version select IPv6.
  11. For IP address select Create IP address.
  12. In the dialog box, set Name to fortressnet-ipv6.
  13. Click Reserve and wait a few moments.
  14. Click Done at the bottom of the New Frontend IP and port dialog box.

Review and finalize

  1. On the left panel of the New HTTP(S) load balancer page, click Review and finalize.
  2. Compare your settings to what you intended to create.
  3. If the settings are correct, click Create at the bottom of the left panel. You are returned to the Load Balancing screen. After the load balancer is created, a green check mark next to it indicates that it is running.

gcloud

Backend configuration

  1. Create a basic health check. This will check whether a load balancer backend is responding to HTTP requests.

    gcloud compute health-checks create http http-basic-check
    

  2. Create a global backend service. This backend service will receive HTTP traffic from the load balancer.

    gcloud compute backend-services create fortressnet-backend-service \
        --protocol HTTP \
        --health-checks http-basic-check \
        --global
    

  3. Add the instance groups as regional backends of the backend service. This configuration will distribute traffic among the backends based on a maximum number of requests per second (RPS) per instance.

    gcloud compute backend-services add-backend fortressnet-backend-service \
        --balancing-mode RATE \
        --max-rate-per-instance 100 \
        --instance-group us-central1-pool \
        --instance-group-region us-central1 \
        --global
    gcloud compute backend-services add-backend fortressnet-backend-service \
        --balancing-mode RATE \
        --max-rate-per-instance 100 \
        --instance-group europe-west1-pool \
        --instance-group-region europe-west1 \
        --global
    gcloud compute backend-services add-backend fortressnet-backend-service \
        --balancing-mode RATE \
        --max-rate-per-instance 100 \
        --instance-group asia-east1-pool \
        --instance-group-region asia-east1 \
        --global
    

Host and path rules

  1. Define a URL map. URL maps route different URLs to different backend services. Since we only have one backend service, we'll simply set that backend service as the default service for all URLs.

    gcloud compute url-maps create fortressnet-balancer \
        --default-service fortressnet-backend-service
    

  2. Create an HTTP proxy route. HTTP proxy routes accept HTTP requests and route them according to your URL map. In this case, it will send all requests to your single backend service.

    gcloud compute target-http-proxies create fortressnet-http-proxy \
        --url-map fortressnet-balancer
    

Frontend configuration

  1. Create two global static external IP addresses: one for IPV4 and one for IPV6. These will be the global external IP addresses of the load balancer.

    gcloud compute addresses create fortressnet-ip \
        --ip-version IPV4 \
        --global
    gcloud compute addresses create fortressnet-ipv6 \
        --ip-version IPV6 \
        --global
    

  2. Look up the external IP addresses of the load balancer.

    gcloud compute addresses list
    

  3. Create global forwarding rules for the external IP addresses. This will forward both IPV4 and IPV6 HTTP requests to your HTTP proxy.

    gcloud compute forwarding-rules create fortressnet-http-rule \
        --global \
        --target-http-proxy fortressnet-http-proxy \
        --ports 80 \
        --address [LOAD_BALANCER_IP_ADDRESS]
    
    gcloud compute forwarding-rules create fortressnet-http-ipv6-rule \
        --global \
        --target-http-proxy fortressnet-http-proxy \
        --ports 80 \
        --address [LOAD_BALANCER_IPV6_ADDRESS]
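
Instead of copying the addresses by hand from the list output, you can read each reserved address into a shell variable. This is a small sketch under the assumption that the addresses were reserved with the names used above (fortressnet-ip and fortressnet-ipv6). The function name lb_address is a hypothetical helper, not part of gcloud.

```shell
# lb_address NAME : print the IP of a global static external address.
lb_address() {
    gcloud compute addresses describe "$1" --global --format='value(address)'
}

# Example:
#   LOAD_BALANCER_IP_ADDRESS="$(lb_address fortressnet-ip)"
#   LOAD_BALANCER_IPV6_ADDRESS="$(lb_address fortressnet-ipv6)"
```

The resulting variables can stand in for the [LOAD_BALANCER_IP_ADDRESS] and [LOAD_BALANCER_IPV6_ADDRESS] placeholders.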
    

(Optional) Verify the load balancer is working. You may need to wait a minute or three.

Console

  1. Go to the Load balancing page in the GCP Console.
  2. Wait for fortressnet-balancer to have a green check mark under the Backends column.
  3. Click on fortressnet-balancer.
  4. Under Frontend copy the IPV4 address under the IP:Port column. (IPV4 addresses are of the form www.xxx.yyy.zzz. You don't need the trailing port number :nn.) If the Frontend section is missing, try waiting a few moments and then reloading the web page.
  5. Enter the IP address in a web browser.

You should see the 'Apache2 Debian Default Page' web page.

If you get an 'Error 404 (Not Found)' web page instead, try waiting a few more minutes.

gcloud

  1. Look up the external IP addresses of the load balancer.

    gcloud compute addresses list
    

  2. Query the IPV4 address. (IPV4 addresses are of the form www.xxx.yyy.zzz.)

    curl http://[LOAD_BALANCER_IP_ADDRESS] | head
    

You should see some HTML text, including the line <title>Apache2 Debian Default Page: It works</title>.

If you see <title>Error 404 (Not Found)!!1</title> instead, try waiting a few more minutes.
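
Rather than rerunning curl by hand while the load balancer warms up, you could poll until it answers with HTTP 200. The sketch below is one possible way to do that. The function name wait_for_http and its default of 30 attempts, 10 seconds apart, are assumptions chosen for illustration.

```shell
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it returns HTTP 200, or gives up after ATTEMPTS tries.
wait_for_http() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-10}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s: silent, -o /dev/null: discard the body,
        # -w '%{http_code}': print only the status code.
        code="$(curl -s -o /dev/null -w '%{http_code}' "$url")" || true
        if [ "$code" = "200" ]; then
            echo "ready after $i attempt(s)"
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "gave up after $attempts attempt(s), last status: $code" >&2
    return 1
}

# Example:
#   wait_for_http "http://[LOAD_BALANCER_IP_ADDRESS]"
```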

Best Practice: Create a stricter firewall rule that allows only traffic from the load balancer and the health check. Then delete the original firewall rule that allowed any HTTP request. This prevents individual instances from being accessible to outside clients.

Console

  1. Create a new firewall only allowing traffic from the load balancer and the health check.

    1. Go to the Firewall rules page in the GCP Console.
    2. Click Create Firewall Rule.
    3. Set the Name to fortressnet-allow-load-balancer.
    4. For Network select fortressnet.
    5. For Targets select All instances in the network.
    6. For Source IP ranges type 130.211.0.0/22 and press the Enter key, then type 35.191.0.0/16 and press Enter again.
    7. Under Protocols and ports select tcp and enter 80.
    8. Click Create.
  2. Delete the old allow-everything firewall.

    1. Select the checkbox next to fortressnet-allow-http.
    2. Click Delete at the top of the page.
    3. In the dialog box, click Delete.

gcloud

  1. Create a new firewall only allowing traffic from the load balancer and the health check.

    gcloud compute firewall-rules create fortressnet-allow-load-balancer \
        --network fortressnet \
        --source-ranges 130.211.0.0/22,35.191.0.0/16 \
        --allow tcp:80
    

  2. Delete the old allow-everything firewall.

    gcloud compute firewall-rules delete fortressnet-allow-http -q
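
To confirm the effect of the new rule, you could check that an instance's external IP no longer answers direct HTTP requests from outside. The following is a sketch only. The function name is_blocked and the 5 second timeout are assumptions for illustration, and the check should be run from a machine outside the allowed source ranges.

```shell
# is_blocked IP : succeed if a direct HTTP request to IP gets no response
# within 5 seconds, which is the expected result once only the load
# balancer and health check ranges are allowed.
is_blocked() {
    if curl -s -o /dev/null --max-time 5 "http://$1"; then
        echo "$1 still answers direct HTTP requests"
        return 1
    fi
    echo "$1 is not reachable directly"
    return 0
}

# Example:
#   is_blocked [EXTERNAL_IP]
```

Requests sent to the load balancer address should continue to succeed.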
    

(Optional) Verify that autoscaling and load balancing work

Generate some test traffic

Suppose it is morning in Europe and your web service suddenly goes viral on the internet. Generate a high number of client requests all at once from Europe.

Console

  1. Create an instance installed with the Siege load testing tool.

    1. Go to the VM instances page in the GCP Console.
    2. Click Create instance.
    3. Set the Name to europe-loadtest.
    4. For Region select europe-west1.
    5. Click Management, security, disks, networking, sole tenancy to reveal advanced settings. You should see a number of tabs.
    6. Click the Management tab.
    7. Under Automation enter the following Startup script:

      apt-get update && apt-get install -y siege
      

    8. Click Create at the bottom of the page.

  2. Get the IPV4 address of the load balancer.

    1. Go to the Load balancing page in the GCP Console.
    2. Click on fortressnet-balancer.
    3. Under Frontend copy the IPV4 address under the IP:Port column. (IPV4 addresses are of the form www.xxx.yyy.zzz.)
  3. SSH into the load testing instance.

    1. Go to the VM instances page in the GCP Console.
    2. Wait for the europe-loadtest instance to have a green checkmark under the Name column.
    3. Click SSH on europe-loadtest under the Connect column.
  4. Start siege. Target the IPV4 address of the load balancer.

    siege -c150 http://[LOAD_BALANCER_IP_ADDRESS]
    

gcloud

  1. Create an instance installed with the Siege load testing tool.

    gcloud compute instances create europe-loadtest \
        --network default \
        --zone europe-west1-c \
        --metadata startup-script='apt-get update && apt-get -y install siege'
    

  2. Get the IPV4 address of the load balancer.

    gcloud compute addresses list
    

  3. Open a new shell session where the gcloud command is available.

    1. In your new shell session, SSH into the load testing instance.

      gcloud compute ssh --zone europe-west1-c europe-loadtest
      

    2. Start siege. Target the IPV4 address of the load balancer.

      siege -c150 http://[LOAD_BALANCER_IP_ADDRESS]
      

After running the siege command you should see output declaring The server is now under siege...

[alert] Zip encoding disabled; siege requires zlib support to enable it
** SIEGE 4.0.2
** Preparing 150 concurrent users for battle.
The server is now under siege...

Monitor load balancing and autoscaling

  1. Go to the Load balancing page in the GCP Console.
  2. Click on the load balancer named fortressnet-balancer.
  3. Click the Monitoring tab.
  4. In the Backend drop-down, select fortressnet-backend-service.

It may take up to ten minutes to display enough data. Soon you should see a display similar to the following:

GCP Console monitoring display showing requests from Europe distributed equally among all three backends

What's happening here:

  1. The load test starts sending a large amount of traffic all at once. At first, the load balancer distributes requests equally among the three backends. The number of requests quickly exceeds your autoscaling limits, and may even cause your servers to return Backend 5xx errors, which show up on the monitoring display. The autoscaler starts to spin up additional instances as needed.

  2. Autoscaling catches up with capacity needs. To minimize request latency, Compute Engine load balancers try to route requests to the backend that is closest to the client. In this case, since the load test traffic originates from Europe, the load balancer prefers to route more requests to the Europe backend. As a result, autoscaling may spin up more instances in the Europe backend to handle a higher fraction of requests.
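
As a rough back-of-the-envelope model of what "catching up" means with this tutorial's settings: each instance is rated at a maximum of 100 RPS, the autoscaler targets 80% of that, and each pool is limited to between 1 and 5 instances. The sketch below estimates the pool size for a given request rate. It is a simplification, not the autoscaler's actual algorithm, and the function name target_instances is hypothetical.

```shell
# target_instances RPS : estimated instance count for one pool, assuming
# 100 RPS max per instance, 80% target utilization, and a 1 to 5 size range.
target_instances() {
    rps="$1"
    per_instance=$((100 * 80 / 100))   # 80 RPS per instance at the target
    # Integer division rounded up.
    n=$(( (rps + per_instance - 1) / per_instance ))
    if [ "$n" -lt 1 ]; then n=1; fi    # minimum number of instances
    if [ "$n" -gt 5 ]; then n=5; fi    # maximum number of instances
    echo "$n"
}

# Example:
#   target_instances 250
```

Under this model a pool receiving 250 RPS would settle at 4 instances, and anything above 400 RPS would hit the maximum of 5, at which point the load balancer has to send the excess to other regions.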

Generate test traffic somewhere else

Suppose your web service also catches on in Asia with the afternoon internet crowd. Generate a high number of requests from Asia.

Console

  1. Create another instance installed with the Siege load testing tool.

    1. Go to the VM instances page in the GCP Console.
    2. Click Create instance.
    3. Set the Name to asia-loadtest.
    4. For Region select asia-east1.
    5. Click Management, security, disks, networking, sole tenancy to reveal advanced settings. You should see a number of tabs.
    6. Click the Management tab.
    7. Under Automation enter the following Startup script:

      apt-get update && apt-get install -y siege
      

    8. Click Create at the bottom of the page.

  2. Get the IP address of the load balancer.

    1. Go to the Load balancing page in the GCP Console.
    2. Click on fortressnet-balancer.
    3. Under Frontend copy the IPV4 address under the IP:Port column. (IPV4 addresses are of the form www.xxx.yyy.zzz.)
  3. SSH into the load testing instance.

    1. Wait for the asia-loadtest instance to have a green checkmark under the Name column.
    2. Click SSH on asia-loadtest under the Connect column.
  4. Start siege. Target the IPV4 address of the load balancer.

    siege -c150 http://[LOAD_BALANCER_IP_ADDRESS]
    

gcloud

  1. In your original shell session, create another instance installed with the Siege load testing tool.

    gcloud compute instances create asia-loadtest \
        --network default \
        --zone asia-east1-c \
        --metadata startup-script='apt-get update && apt-get -y install siege'
    

  2. Get the IPV4 address of the load balancer.

    gcloud compute addresses list
    

  3. Open a new shell session where the gcloud command is available.

    1. In your new shell session, SSH into the load testing instance.

      gcloud compute ssh --zone asia-east1-c asia-loadtest
      

    2. Start siege. Target the IPV4 address of the load balancer.

      siege -c150 http://[LOAD_BALANCER_IP_ADDRESS]
      

Again, you should see output declaring The server is now under siege...

[alert] Zip encoding disabled; siege requires zlib support to enable it
** SIEGE 4.0.2
** Preparing 150 concurrent users for battle.
The server is now under siege...

Monitor load balancing and autoscaling

Go back to the load balancing monitoring display from last time. It may take up to ten minutes to display enough new data. Soon you should see a display similar to the following:

GCP Console monitoring display showing requests from Europe and Asia distributed among all three backends

What's happening here:

  1. Again, the load test sends another large number of requests all at once. At first the load balancer distributes requests equally among the existing three backends. As the number of requests exceeds your autoscaling limits, the autoscaler starts to spin up additional instances as needed.

  2. Autoscaling catches up with the new capacity needs. The load balancer still prefers to route requests to the closest backends possible. As a result, eventually the Asia backend will receive requests mostly from Asia, the Europe backend will receive requests mostly from Europe, and the US backend will receive everything left over.
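
You can also watch the autoscaler's decisions from the command line by reading each pool's current target size. This is a sketch that assumes the three pool names and regions used in this tutorial. The function name pool_sizes is a hypothetical helper.

```shell
# pool_sizes : print the current target size of each regional pool.
pool_sizes() {
    for region in us-central1 europe-west1 asia-east1; do
        size="$(gcloud compute instance-groups managed describe "${region}-pool" \
            --region "$region" --format='value(targetSize)')"
        echo "${region}-pool: $size"
    done
}

# Example:
#   pool_sizes
```

Running it every minute or so during the load tests should show the Europe and Asia pools growing while the US pool stays comparatively small.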

Cleaning up

After you've finished the autoscaling tutorial, you can clean up the resources you created on Google Cloud Platform so you won't be billed for them in the future. The following sections describe how to delete or turn off these resources.

Deleting the project

The easiest way to eliminate billing is to delete the project you created for the tutorial.

To delete the project:

  1. In the GCP Console, go to the Projects page.

  2. In the project list, select the checkbox next to the name of the project you want to delete, and then click Delete project.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting instances

To delete a Compute Engine instance:

  1. In the GCP Console, go to the VM Instances page.

  2. Click the checkbox next to the instance you want to delete.
  3. Click the Delete button at the top of the page to delete the instance.
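
If you want to keep the project but remove everything this tutorial created, the resources can be deleted from the command line, roughly in the reverse order of creation. The sketch below assumes the resource names and zones from the gcloud steps and that you replaced the original firewall rule as described in the best practice. Resources created through the Console may have different generated names. The function name cleanup_tutorial and the RUN variable are hypothetical: RUN defaults to echo, so the commands are only printed until you set RUN to an empty string.

```shell
# Dry run by default: RUN=echo prints each command instead of executing it.
# Set RUN="" (empty) to actually delete the resources.
RUN="${RUN-echo}"

cleanup_tutorial() {
    # Load balancer components, from the frontend back to the health check.
    $RUN gcloud compute forwarding-rules delete fortressnet-http-rule fortressnet-http-ipv6-rule --global -q
    $RUN gcloud compute target-http-proxies delete fortressnet-http-proxy -q
    $RUN gcloud compute url-maps delete fortressnet-balancer -q
    $RUN gcloud compute backend-services delete fortressnet-backend-service --global -q
    $RUN gcloud compute health-checks delete http-basic-check -q
    $RUN gcloud compute addresses delete fortressnet-ip fortressnet-ipv6 --global -q

    # Instance groups first, then the template they were created from.
    for region in us-central1 europe-west1 asia-east1; do
        $RUN gcloud compute instance-groups managed delete "${region}-pool" --region "$region" -q
    done
    $RUN gcloud compute instance-templates delete fort-template -q

    # Load testing instances.
    $RUN gcloud compute instances delete europe-loadtest --zone europe-west1-c -q
    $RUN gcloud compute instances delete asia-loadtest --zone asia-east1-c -q

    # Firewall rule, then the network itself.
    $RUN gcloud compute firewall-rules delete fortressnet-allow-load-balancer -q
    $RUN gcloud compute networks delete fortressnet -q
}

# Example:
#   cleanup_tutorial            # print the commands
#   RUN=""; cleanup_tutorial    # run them
```

If you kept the original fortressnet-allow-http rule, delete it as well before deleting the network.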

What's next
