Compute Engine

Load Balancing

Google Compute Engine offers server-side load balancing so you can distribute incoming network traffic across multiple virtual machine instances. Load balancing provides the following benefits with either network load balancing or HTTP(S) load balancing:

  • Scale your application
  • Support heavy traffic
  • Detect unhealthy virtual machine instances
  • Balance loads across regions
  • Route traffic to the closest virtual machine
  • Support content-based routing

Google Compute Engine load balancing uses forwarding rule resources to match certain types of traffic and forward it to a load balancer. For example, a forwarding rule can match TCP traffic destined for port 80 of a specific IP address. Traffic matched by the forwarding rule is then directed to healthy virtual machine instances.

Compute Engine load balancing is a managed service, which means its components are redundant and highly available. If a load balancing component fails, it is restarted or replaced automatically and immediately.

Google offers two types of load balancing that differ in capabilities, usage scenarios, and how you configure them. The scenarios below can help you decide whether network or HTTP(S) load balancing best meets your needs.


The gcloud command line interface

For both network and HTTP(S) load balancing, you will use the gcloud command to configure services:

  1. Install the Cloud SDK, which installs the gcloud command and other Cloud tools.

  2. Authenticate with the gcloud command. You can also specify a default project to use for gcloud commands.

    $ gcloud auth login
    $ gcloud config set project PROJECT

  3. Install the gcloud preview component:

    $ gcloud components update preview


The following situations are examples of load balancing and the type of load balancing configuration that you would need in each scenario.

Network load balancing

Representation of network load balancing

Assume that you are running a website on Apache and you are starting to get a high enough level of traffic and load that you need to add additional Apache instances to help respond to this load. You can add additional Google Compute Engine instances and configure load balancing to spread the load between these instances. In this situation, you would serve the same content from each of the instances. As your site becomes more popular, you would continue increasing the number of instances that are available to respond to requests.

In this situation, you would choose network load balancing to map incoming TCP requests on port 80 to a target pool of virtual machines in the same region. You would configure a forwarding rule resource, a target pool that lists the instances to receive traffic, and health check rules.
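A network load balancing setup of this shape might look like the following sketch. All resource names (basic-check, www-pool, www-rule, the instance names, and the region and zone) are illustrative, and exact flag names can vary between Cloud SDK versions:

```shell
# Create a health check so traffic only reaches instances
# that respond on port 80.
gcloud compute http-health-checks create basic-check

# Create a target pool in the region and attach the health check.
gcloud compute target-pools create www-pool \
    --region us-central1 \
    --health-check basic-check

# Add the existing Apache instances to the pool.
gcloud compute target-pools add-instances www-pool \
    --instances www-1,www-2 \
    --zone us-central1-a

# Create a forwarding rule that matches TCP traffic on port 80
# and directs it to the target pool.
gcloud compute forwarding-rules create www-rule \
    --region us-central1 \
    --port-range 80 \
    --target-pool www-pool
```

To grow with traffic, you would create more instances and add them to the same target pool; the forwarding rule does not change.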

Get started with network load balancing

HTTP(S) load balancing

Cross-region load balancing

Representation of cross-region load balancing

The network load balancing scenario above scales well within a single region, but extending the service across regions would require unwieldy and sometimes problematic workarounds. With HTTP(S) load balancing, you can instead use a single global IP address that intelligently routes users based on proximity. By defining a simple topology, you can increase performance and system reliability for a global user base.

In this situation, you define global forwarding rules that map to a target HTTP(S) proxy, which routes requests to the closest instances within a back-end service. The back-end service resource defines the groups of instances that are able to handle the requests.
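The chain described above (global forwarding rule, target HTTP proxy, back-end service) might be configured along these lines. The names web-service, web-map, web-proxy, and web-rule are illustrative, and the example assumes a health check named basic-check already exists; flag names can differ between Cloud SDK versions:

```shell
# A back-end service that defines the instance groups able
# to handle requests, gated by a health check.
gcloud compute backend-services create web-service \
    --http-health-checks basic-check

# A URL map that sends all requests to the back-end service.
gcloud compute url-maps create web-map \
    --default-service web-service

# A target HTTP proxy that routes requests using the URL map.
gcloud compute target-http-proxies create web-proxy \
    --url-map web-map

# A global forwarding rule with a single global IP address;
# users are routed to the closest healthy instances.
gcloud compute forwarding-rules create web-rule \
    --global \
    --target-http-proxy web-proxy \
    --port-range 80
```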

Get started with cross-region load balancing

Content-based load balancing

Representation of content-based load balancing

Content-based or content-aware load balancing uses HTTP(S) load balancing to distribute traffic to different instances based on the incoming HTTP(S) URI. For example, you have a site that serves static content (CSS, images), dynamic content, and video uploads. You can architect your load balancing configuration to serve the traffic from different sets of instances that are optimized for the type of content that they are serving.

In this situation, your global forwarding rules map to a target HTTP(S) proxy, which checks requests against a URL map to determine which back-end service is appropriate for the request. The back-end service then distributes the request to one of its resource groups that has one or more virtual machine instances.
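Building on a URL map like the one in the cross-region scenario, content-based routing is expressed as path rules. The following sketch assumes a URL map named web-map and two back-end services, www-service for static content and video-service for video; all names are illustrative:

```shell
# Add a path matcher to the URL map: requests under /video/*
# go to video-service, everything else to www-service.
gcloud compute url-maps add-path-matcher web-map \
    --path-matcher-name video-matcher \
    --default-service www-service \
    --path-rules "/video/*=video-service"
```

Each back-end service can then point at instance groups optimized for its content type, such as machines with more disk for video.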

Get started with content-based load balancing

Content-based and cross-region load balancing can work together by using multiple back-end services across multiple regions. You can build on the scenarios above to create a load balancing configuration that meets your application's needs.


Load Balancing and Protocol Forwarding

All regions
  Hourly service charge       $0.025 (5 forwarding rules included); $0.010 per additional rule
  Per GB of data processed    $0.008