Setting Up Internal Load Balancing

Internal load balancing enables you to run and scale your services behind a private load balancing IP address which is accessible only to instances internal to your Virtual Private Cloud (VPC).

For a quick introduction to Internal load balancing, see Internal Load Balancing in 5 mins.

Overview

Google Cloud Platform (GCP) now offers Internal load balancing for your TCP/UDP-based traffic. Internal load balancing enables you to run and scale your services behind a private load balancing IP address that is accessible only to your internal virtual machine instances.

Before the availability of Internal load balancing, you had two options to achieve this:

  • Option 1: Configure an external IP address and restrict access to the virtual machine instance so that only your instances could reach it. This added configuration complexity.
  • Option 2: Configure a proxy instance with an internal IP to proxy requests to the backends. This was complex to configure and manage. Furthermore, throughput was limited by the proxy instance’s bandwidth and capacity. This workaround lacked all the advantages of a managed load balancing service.

With Internal load balancing, you can configure an Internal load balancing IP to act as the frontend to your private backend instances. You no longer need a public IP for your load balanced service. Your internal client requests stay internal to your VPC network and region, likely resulting in lowered latency since all your load-balanced traffic will stay within Google’s network. Overall, your configuration becomes simpler.

Internal load balancing works with auto mode VPC networks, custom mode VPC networks, and legacy networks. Internal load balancing can also be implemented with regional managed instance groups. This allows you to autoscale across a region, making your service immune to zonal failures.

The rest of this user guide walks you through the features and configuration of Internal load balancing for TCP/UDP.

About Internal load balancing

Internal load balancing enables you to support use cases such as a traditional 3-tier web service, where your web tier uses external HTTP(S) or external TCP/UDP load balancing and the instances running your application tier or backend databases are deployed behind Internal load balancing.

3-tier web app with HTTP(S) load balancing and Internal load balancing

With Internal load balancing, you can:

  • Load balance TCP/UDP traffic using a private frontend IP
    • You can configure the Internal load-balancing IP from within your VPC network.
  • Load balance across instances in a region
    • Allows you to instantiate instances in multiple availability zones within the same region.
  • Configure health checking for your backends
    • Backend instances are health checked by GCP health checking systems
    • You can configure a TCP, SSL(TLS), HTTP, or HTTPS health check
  • Get all the benefits of a fully managed load balancing service that scales as needed to handle client traffic.
    • You no longer have to worry about load balancer availability or the load balancer being a choke point

Architecture

Internal load balancing can be implemented in a variety of ways, such as with a proxy.

In the traditional proxy model of Internal load balancing, as shown below on the left, you configure an internal IP on a load balancing device or instance(s) and your client instance connects to this IP. Traffic coming to the IP is terminated at the load balancer. The load balancer selects a backend and establishes a new connection to it. In effect, there are two connections: Client<->Load Balancer and Load Balancer<->Backend.

Internal load balancing for TCP/UDP architecture

GCP Internal load balancing distributes client instance requests to the backends using a different approach, as shown on the right. It uses lightweight load balancing built on top of the Andromeda network virtualization stack to provide software-defined load balancing that delivers traffic directly from the client instance to a backend instance.

Internal load balancing is not a device or an instance-based solution, but software-defined, fully distributed load balancing.

Deploying Internal load balancing

With Internal load balancing, you configure a private RFC1918 address as your load balancing IP and configure backend instance groups to handle requests coming to this load balancing IP from client instances.

The backend instance groups are zonal in scope, so you can configure instance groups in multiple zones in line with your availability requirements. All the instances must belong to the same VPC network and region, but can be in different subnets. The client instances originating traffic to the Internal load balancing IP must belong to the same VPC network and region as the load balancing IP and the backends, but can be in different subnets.

An Internal load balancing deployment example is shown below:

Internal load balancing deployment

In the above example, you have a VPC network called my-internal-app, consisting of a single subnet A (10.10.0.0/16) in the us-central region. You have two backend instance groups to provide availability across two zones. The load balancing IP (that is, the forwarding rule IP), 10.10.10.1, is selected from the same VPC network. An instance, 10.10.100.2, in the my-internal-app network sends a request to the load balancing IP 10.10.10.1. This request gets load balanced to an instance in one of the instance groups, IG1 or IG2.

Configuration details for the above deployment are described below.

Load balancing IP address

With Internal load balancing, the load balancing IP is a private RFC1918 address.

You can assign the IP address of an Internal load balancer, such as the forwarding rule IP, in one of the following ways:

  • You select your Internal load balancing IP

    You can specify an unallocated IP address from the region the forwarding rule is associated with as the load balancing IP. This IP address can be from any subnet in that region that is part of the overall VPC network. If you are configuring Internal load balancing in a legacy network, then you can use any unused IP in the network.

    You need to manually determine which IPs are already in use by listing all existing instance IP addresses and other forwarding rule IP addresses for the VPC network/subnet.

    You can select an internal load balancing IP by specifying an ephemeral internal IP, or you can reserve a static internal IP address that remains reserved to the project until you remove it (see the sample command after this list).

    Once you select and specify the internal IP address for your forwarding rule, it remains allocated as long as the forwarding rule exists. When the forwarding rule is deleted, an ephemeral IP address returns to the pool of available IPs for the VPC network and may be allocated to an instance or another forwarding rule, while a static internal IP address returns to the project and remains available for you to assign to another resource.

  • Load balancing auto-allocates your frontend IP

    You can have the IP be allocated automatically by creating a forwarding rule without specifying an IP address. In this case, GCP will assign an unallocated internal IP address to the forwarding rule from the VPC network/subnet that it is associated with. The IP address will remain allocated only as long as the forwarding rule exists.
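
For example, you could reserve a static internal IP ahead of time with the gcloud compute addresses create command and then reference it with the forwarding rule's --address flag. This is a sketch only: the address name my-int-lb-ip, the IP 10.128.0.100, and the subnet and region names (taken from the setup example later in this guide) are illustrative.

gcloud compute addresses create my-int-lb-ip \
    --region us-central1 \
    --subnet my-custom-subnet \
    --addresses 10.128.0.100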

Internal load balancing selection algorithm

The backend instance for a client is selected using a hashing algorithm that takes instance health into consideration.

By default, the load balancer directs connections from a client to a backend instance using a 5-tuple hash, which uses the following five parameters for hashing:

  • client source IP
  • client port
  • destination IP (the load balancing IP)
  • destination port
  • protocol (either TCP or UDP).

If you wish the load balancer to direct all traffic from a client to a specific backend instance, then use one of the following options for session affinity:

  • Hash based on 3-tuple (Client IP, Dest IP, Protocol)
  • Hash based on 2-tuple (Client IP, Dest IP)

The Session affinity section provides more details on these options.

Session affinity

As described in the previous section on the selection algorithm, the default behavior is that connections from a client get load balanced across all backend instances using a 5-tuple hash. The 5-tuple hash uses the client source IP, client port, destination IP (load balancing IP), destination port, and the protocol (either TCP or UDP).

However, in many cases, such as when a web application stores state locally on an instance, you want all traffic from a client to be load balanced to the same backend instance. Without this capability, the traffic might fail or be serviced sub-optimally.

With Internal load balancing, you can enable all traffic from a client to stick to a specific backend instance by enabling the affinity feature.

You can enable the following types of affinity:

  • Hash based on 3-tuple (client_ip_proto) (Client IP, Dest IP, Protocol)
    • Use this affinity if you want all traffic from a client to be directed to the same backend instance based on a hash of the above three parameters.
  • Hash based on 2-tuple (client_ip) (Client IP, Dest IP)
    • Use this affinity if you want all traffic from a client irrespective of the protocol to be directed to the same backend instance based on a hash of the above two parameters.

In general, if you enable 3-tuple or 2-tuple affinity, your client traffic is load balanced to the same backend, but overall the traffic may not be as evenly distributed as with the default 5-tuple hash. A given connection stays on the same backend instance as long as that instance remains healthy.
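
Session affinity is a property of the backend service. As an illustrative sketch, assuming the my-int-lb backend service and us-central1 region used in the setup example later in this guide, you could switch from the default 5-tuple hashing to 2-tuple affinity with:

gcloud compute backend-services update my-int-lb \
    --region us-central1 \
    --session-affinity client_ip

Specify client_ip_proto instead for 3-tuple affinity.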

Health Checking

Health checks determine which instances can receive new connections. The health checker probes instances at specified intervals. Instances that do not respond successfully to the probe a specified number of times in a row are marked as UNHEALTHY. No new connections are sent to the instance, though existing connections are allowed to continue. The health checker continues to poll unhealthy instances. If an instance responds successfully to the probes a specified number of times in a row, it is marked HEALTHY again and can receive new connections.

Internal load balancing supports four types of health checks:

  • TCP health checks
  • SSL (TLS) health checks
  • HTTP health checks
  • HTTPS health checks

If your traffic is HTTP(S), then HTTP(S) health checks provide the highest fidelity check because they verify that the web server is up and serving traffic, not just that the instance is healthy. Configure an SSL health check if your traffic is not HTTPS but is encrypted via SSL(TLS). For all TCP traffic that is not HTTP(S) or SSL(TLS), you can configure a TCP health check.

For an HTTP(S) health check probe to be deemed successful, the instance must return a valid HTTP response with code 200 and close the connection normally within the configured period. If it does this a specified number of times in a row, the health check returns a status of HEALTHY for that instance. If an instance fails a specified number of health check probes in a row, it is marked UNHEALTHY without any notification being sent. UNHEALTHY instances do not receive new connections, but existing connections are allowed to continue. If an instance later passes a health check (successfully responds to a specified number of health check probes), it again starts receiving new connections, again without any notification.

When you configure the health check to be of type SSL, an SSL connection is opened to each of your instances. When you configure the health check to be of type TCP, a TCP connection is opened. In both cases, if the handshake is successful, the health check probe is considered to be successful. The health check is passed if a specified number of probes are successful, and it is failed if a specified number of handshakes in a row are unsuccessful. Existing connections are allowed to continue on instances that have failed their health check. A TCP or SSL health check probe can use one of the following checks:

  • Simple handshake health check (default): the health checker attempts a simple TCP or SSL handshake. If it is successful, the instance passes that round of the probe.
  • Request/response health check: you provide a request string for the health checker to send after completing the TCP or SSL handshake. If the instance returns the response string you've configured, the instance passes that round of the probe. Both the request and response strings can be up to 1024 bytes.
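
For example, the following sketch creates a TCP health check that performs a request/response probe. The health check name, port, and strings are illustrative and assume a hypothetical backend that answers PING with PONG; they are not part of the setup example later in this guide.

gcloud compute health-checks create tcp my-tcp-rr-health-check \
    --port 6379 \
    --request "PING" \
    --response "PONG"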

High availability

Internal load balancing is a fully managed Google Cloud Platform service, so you do not need any configuration to ensure high availability of the load balancer itself. It is a fully distributed service which will scale as needed to handle client traffic. Limits are described in the Limits section.

You can configure multiple instance groups to deliver high availability for your service. Instance groups are zonal in scope, so you can configure instance groups in multiple zones to guard against instance failures in a single zone.

When you configure more than one instance group, then the instances in all these instance groups are treated as one pool of instances and Internal load balancing distributes your user requests across the healthy instances in this group using the hash algorithm described in the Internal load balancing selection algorithm section.

Instance groups in multiple zones for HA

In effect, you can think of your deployment as logically comprised of one large pool that spans one or more zones in the region where you have deployed Internal load balancing.

In the above diagram, assuming all instances are healthy, the client instances are load balanced to a pool composed of instances 1, 2, 3, and 4.

Setting up Internal load balancing

Setting up Internal load balancing

An Internal load balancing configuration consists of several components. All these components must belong to the same VPC network and region. The client instances originating traffic must belong to the same VPC network and region as the load balancing forwarding rule IP, backend services, and instance groups. The client instances do not have to be in the same subnet as the load balanced instances.

This example sets up an Internal load balancer for testing purposes.

We will configure the following:

  1. Four instances spread across two zones in the same region
  2. Instance groups for holding the instances
  3. Backend components, which include the following:
    • health check - used to monitor instance health
    • backend service - monitors instance groups and prevents them from exceeding configured usage
    • backends - hold the instance groups
  4. A forwarding rule, with an internal IP address, that sends user traffic to the backend service
  5. A firewall rule that allows traffic from the load balancer IP address range
  6. A standalone client instance that can be used to test the load balancer

After that, we'll test our configuration.

Configure a test environment

The following sections walk you through creating the components of the test environment.

Configure a VPC network and subnet

Internal load balancing works with auto mode VPC networks, custom mode VPC networks, and legacy networks. For this example, we'll use a custom mode VPC network.

Console


  1. Go to the VPC networks page in the Google Cloud Platform Console.
    Go to the VPC network page
  2. Click Create VPC network.
  3. Enter a Name of my-custom-network.
  4. Under subnets, enter a Name of my-custom-subnet.
  5. Set Region to us-central1.
  6. Enter an IP address range of 10.128.0.0/20.
  7. Click Create.

gcloud


Create a custom VPC network

gcloud compute networks create my-custom-network --subnet-mode custom

Created     [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/networks/my-custom-network].
NAME               MODE    IPV4_RANGE  GATEWAY_IPV4
my-custom-network  custom

Create a new subnet in your custom VPC network

gcloud compute networks subnets create my-custom-subnet \
    --network my-custom-network \
    --range 10.128.0.0/20 \
    --region us-central1

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/subnetworks/my-custom-subnet].
NAME              REGION       NETWORK            RANGE
my-custom-subnet  us-central1  my-custom-network  10.128.0.0/20

  • my-custom-network - the name of the VPC network you create, in which you then create a custom subnet.
  • my-custom-subnet - the name of the subnet you are creating.
  • 10.128.0.0/20 - For the range, this can be any valid RFC1918 range that does not overlap another range in the same network. If you are creating your subnet in an existing VPC network, you may have to choose a different range.

Configure firewall rules

Instances on this VPC network will not be reachable until firewall rules are created. As an example, you can allow all internal traffic between instances and allow SSH, RDP, and ICMP traffic from all sources.

Console


Create firewall rule that allows all traffic within the subnet

  1. Go to the Firewall rules page in the Google Cloud Platform Console.
    Go to the Firewall rules page
  2. Click Create firewall rule.
  3. Enter a Name of allow-all-10-128-0-0-20.
  4. Set VPC network to my-custom-network.
  5. Set Source filter to IP ranges.
  6. Set Source IP ranges to 10.128.0.0/20.
  7. Set Specified protocols and ports to tcp;udp;icmp.
  8. Click Create.

Create firewall rule that allows SSH, RDP, and ICMP from anywhere

  1. Create a second firewall rule with a Name of allow-tcp22-tcp3389-icmp.
  2. Set VPC network to my-custom-network.
  3. Set Source filter to IP ranges.
  4. Set Source IP ranges to 0.0.0.0/0 (allow from any source).
  5. Set Specified protocols and ports to tcp:22;tcp:3389;icmp.
  6. Click Create.

gcloud


Create firewall rule that allows all traffic within the subnet

gcloud compute firewall-rules create allow-all-10-128-0-0-20 \
    --network my-custom-network \
    --allow tcp,udp,icmp \
    --source-ranges 10.128.0.0/20

Create firewall rule that allows SSH, RDP, and ICMP from anywhere

gcloud compute firewall-rules create allow-tcp22-tcp3389-icmp \
    --network my-custom-network \
    --allow tcp:22,tcp:3389,icmp

Configure instances and instance groups

A production system would normally use managed instance groups based on instance templates, but this setup is quicker for initial testing.

Create two instances in each zone

For testing purposes, we'll install Apache on each instance. These instances are all created with a tag of int-lb. This tag is used later by the firewall rule.

The four instances are named ig-us-central1-1 through ig-us-central1-4.

Console


Create instances

  1. Go to the VM instances page in the Google Cloud Platform Console.
    Go to the VM instances page
  2. Click Create instance.
  3. Set Name to ig-us-central1-1.
  4. Set the Zone to us-central1-b.
  5. Click Management, disk, networking, SSH keys to reveal advanced settings.
  6. Under Management, populate the Tags field with int-lb.
  7. Set the Startup script to
    sudo apt-get update
    sudo apt-get install apache2 -y
    sudo a2ensite default-ssl
    sudo a2enmod ssl
    sudo service apache2 restart
    echo '<!doctype html><html><body><h1>ig-us-central1-1</h1></body></html>' | sudo tee /var/www/html/index.html
  8. Under Networking, set VPC network to my-custom-network and subnet to my-custom-subnet.
  9. Leave the default values for the rest of the fields.
  10. Click Create.
  11. Create ig-us-central1-2 with the same settings, except with Startup script set to
    sudo apt-get update
    sudo apt-get install apache2 -y
    sudo a2ensite default-ssl
    sudo a2enmod ssl
    sudo service apache2 restart
    echo '<!doctype html><html><body><h1>ig-us-central1-2</h1></body></html>' | sudo tee /var/www/html/index.html
  12. Create ig-us-central1-3 with the same settings, except with Zone set to us-central1-c and Startup script set to
    sudo apt-get update
    sudo apt-get install apache2 -y
    sudo a2ensite default-ssl
    sudo a2enmod ssl
    sudo service apache2 restart
    echo '<!doctype html><html><body><h1>ig-us-central1-3</h1></body></html>' | sudo tee /var/www/html/index.html
  13. Create ig-us-central1-4 with the same settings, except with Zone set to us-central1-c and Startup script set to
    sudo apt-get update
    sudo apt-get install apache2 -y
    sudo a2ensite default-ssl
    sudo a2enmod ssl
    sudo service apache2 restart
    echo '<!doctype html><html><body><h1>ig-us-central1-4</h1></body></html>' | sudo tee /var/www/html/index.html

gcloud


gcloud compute instances create ig-us-central1-1 \
    --image-family debian-9 \
    --image-project debian-cloud \
    --tags int-lb \
    --zone us-central1-b \
    --subnet my-custom-subnet \
    --metadata startup-script="#! /bin/bash
      apt-get update
      apt-get install apache2 -y
      a2ensite default-ssl
      a2enmod ssl
      service apache2 restart
      echo '<!doctype html><html><body><h1>ig-us-central1-1</h1></body></html>' | tee /var/www/html/index.html"

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-b/instances/ig-us-central1-1].
NAME             ZONE          MACHINE_TYPE  PREEMPTIBLE INTERNAL_IP EXTERNAL_IP    STATUS
ig-us-central1-1 us-central1-b n1-standard-1             10.128.0.3  23.251.150.133 RUNNING

gcloud compute instances create ig-us-central1-2 \
    --image-family debian-9 \
    --image-project debian-cloud \
    --tags int-lb \
    --zone us-central1-b \
    --subnet my-custom-subnet \
    --metadata startup-script="#! /bin/bash
      apt-get update
      apt-get install apache2 -y
      a2ensite default-ssl
      a2enmod ssl
      service apache2 restart
      echo '<!doctype html><html><body><h1>ig-us-central1-2</h1></body></html>' | tee /var/www/html/index.html"

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-b/instances/ig-us-central1-2].
NAME             ZONE          MACHINE_TYPE  PREEMPTIBLE INTERNAL_IP EXTERNAL_IP    STATUS
ig-us-central1-2 us-central1-b n1-standard-1             10.128.0.11 23.251.148.160 RUNNING

gcloud compute instances create ig-us-central1-3 \
    --image-family debian-9 \
    --image-project debian-cloud \
    --tags int-lb \
    --zone us-central1-c \
    --subnet my-custom-subnet \
    --metadata startup-script="#! /bin/bash
      apt-get update
      apt-get install apache2 -y
      a2ensite default-ssl
      a2enmod ssl
      service apache2 restart
      echo '<!doctype html><html><body><h1>ig-us-central1-3</h1></body></html>' | tee /var/www/html/index.html"

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-c/instances/ig-us-central1-3].
NAME             ZONE          MACHINE_TYPE  PREEMPTIBLE INTERNAL_IP EXTERNAL_IP    STATUS
ig-us-central1-3 us-central1-c n1-standard-1             10.128.0.12 104.196.31.214 RUNNING

gcloud compute instances create ig-us-central1-4 \
    --image-family debian-9 \
    --image-project debian-cloud \
    --tags int-lb \
    --zone us-central1-c \
    --subnet my-custom-subnet \
    --metadata startup-script="#! /bin/bash
      apt-get update
      apt-get install apache2 -y
      a2ensite default-ssl
      a2enmod ssl
      service apache2 restart
      echo '<!doctype html><html><body><h1>ig-us-central1-4</h1></body></html>' | tee /var/www/html/index.html"

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-c/instances/ig-us-central1-4].
NAME             ZONE          MACHINE_TYPE  PREEMPTIBLE INTERNAL_IP EXTERNAL_IP    STATUS
ig-us-central1-4 us-central1-c n1-standard-1             10.128.0.13 104.196.25.101 RUNNING

Create an instance group for each zone and add instances

Console


  1. Go to the Instance groups page in the Google Cloud Platform Console.
    Go to the Instance groups page
  2. Click Create instance group.
  3. Set the Name to us-ig1.
  4. Set the Zone to us-central1-b.
  5. Under Instance definition, click Select existing instances.
  6. Set VPC network to my-custom-network.
  7. Set subnet to my-custom-subnet.
  8. From VM instances select ig-us-central1-1 and ig-us-central1-2.
  9. Leave other settings as they are.
  10. Click Create.
  11. Repeat steps, but set the following:
    • Name: us-ig2
    • Zone: us-central1-c
    • Instances: ig-us-central1-3 and ig-us-central1-4.
  12. Confirm that you now have two instance groups, each with two instances.

gcloud


  1. Create the us-ig1 instance group.

    gcloud compute instance-groups unmanaged create us-ig1 \
        --zone us-central1-b
    

    Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-b/instanceGroups/us-ig1].
    NAME   ZONE          NETWORK MANAGED INSTANCES
    us-ig1 us-central1-b                 0

  2. Add ig-us-central1-1 and ig-us-central1-2 to us-ig1

    gcloud compute instance-groups unmanaged add-instances us-ig1 \
        --instances ig-us-central1-1,ig-us-central1-2 --zone us-central1-b
    

    Updated [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-b/instanceGroups/us-ig1].

  3. Create the us-ig2 instance group.

    gcloud compute instance-groups unmanaged create us-ig2 \
        --zone us-central1-c
    

    Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-c/instanceGroups/us-ig2].
    NAME   ZONE          NETWORK MANAGED INSTANCES
    us-ig2 us-central1-c                  0

  4. Add ig-us-central1-3 and ig-us-central1-4 to us-ig2

    gcloud compute instance-groups unmanaged add-instances us-ig2 \
        --instances ig-us-central1-3,ig-us-central1-4 \
        --zone us-central1-c
    

    Updated [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-c/instanceGroups/us-ig2].

Configure the load balancer

Console


Create the load balancer and configure a backend service

  1. Go to the Load balancing page in the Google Cloud Platform Console.
    Go to the Load balancing page
  2. Click Create load balancer.
  3. Under TCP load balancing, click Start configuration.
  4. Under Internet facing or internal only select Only between my VMs.
  5. Click Continue.
  6. Set the Name to my-int-lb.

Configure backend services

  1. Click Backend configuration.
  2. The Name of the backend service appears as my-int-lb.
  3. Set Region to us-central1.
  4. Set VPC network to my-custom-network.
  5. Under Backends, select instance group us-ig1.
  6. Click Add backend.
  7. Select instance group us-ig2.
  8. Under Health check, select Create health check.
    1. Set the health check Name to my-tcp-health-check.
    2. Set Protocol to TCP.
    3. Leave the other settings the same.
    4. Click Save and continue.
  9. Verify that there is a blue check mark next to Backend configuration in the Google Cloud Platform Console. If not, double-check that you have completed all the steps above.

Configure frontend services

  1. Click Frontend configuration.
  2. Under subnet, select my-custom-subnet.
  3. Under IP address, select Automatic.
  4. Set Ports to 80.
  5. Verify that there is a blue check mark next to Frontend configuration in the Google Cloud Platform Console. If not, double-check that you have completed all the steps above.

Review and finalize

  1. Click Review and finalize.
  2. Double-check your settings.
  3. Click Create.

gcloud


Create a health check

gcloud compute health-checks create tcp my-tcp-health-check \
    --port 80

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/healthChecks/my-tcp-health-check].
NAME                PROTOCOL
my-tcp-health-check TCP

Create a backend service

gcloud compute backend-services create my-int-lb \
    --load-balancing-scheme internal \
    --region us-central1 \
    --health-checks my-tcp-health-check \
    --protocol tcp

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/backendServices/my-int-lb].
NAME               BACKENDS PROTOCOL
my-int-lb          TCP

Add instance groups to your backend service

Internal load balancing spreads incoming connections based on the session affinity setting. If session affinity is not set, the load balancer spreads all connections across all available instances as evenly as possible regardless of current load.

gcloud compute backend-services add-backend my-int-lb \
    --instance-group us-ig1 \
    --instance-group-zone us-central1-b \
    --region us-central1

Updated [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/backendServices/my-int-lb].

gcloud compute backend-services add-backend my-int-lb \
    --instance-group us-ig2 \
    --instance-group-zone us-central1-c \
    --region us-central1

Updated [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/backendServices/my-int-lb].

Create a forwarding rule

Create a forwarding rule to forward traffic to the Internal load balancer. Since this command does not specify an IP address for the load balancer, one is assigned automatically. If you wish to specify one, select any unused address from the region and VPC network and specify it using the --address flag.

gcloud compute forwarding-rules create my-int-lb-forwarding-rule \
    --load-balancing-scheme internal \
    --ports 80 \
    --network my-custom-network \
    --subnet my-custom-subnet \
    --region us-central1 \
    --backend-service my-int-lb

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/forwardingRules/my-int-lb-forwarding-rule].

IPAddress: 10.128.0.6
IPProtocol: TCP
backendService: https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/backendServices/my-int-lb
creationTimestamp: '2016-06-17T12:56:44.843-07:00'
description: ''
id: '6922319202683054867'
kind: compute#forwardingRule
loadBalancingScheme: INTERNAL
name: my-int-lb-forwarding-rule
network: https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/networks/my-custom-network
ports:
- '80'
region: us-central1
selfLink: https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/forwardingRules/my-int-lb-forwarding-rule
subnetwork: https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/us-central1/subnetworks/my-custom-subnet
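
If you wanted to choose the IP address yourself instead, the same command would also include the --address flag. The address 10.128.0.50 below is illustrative; pick any unused address from my-custom-subnet.

gcloud compute forwarding-rules create my-int-lb-forwarding-rule \
    --load-balancing-scheme internal \
    --ports 80 \
    --network my-custom-network \
    --subnet my-custom-subnet \
    --region us-central1 \
    --backend-service my-int-lb \
    --address 10.128.0.50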

Configure a firewall rule to allow Internal load balancing

Configure one firewall rule to allow traffic to the load balancer and from the load balancer to the instances. Configure another rule to allow health check probes from the health checker.

Console


  1. Go to the Firewall rules page in the Google Cloud Platform Console.
    Go to the Firewall rules page
  2. Click Create firewall rule.
  3. Set Name to allow-internal-lb.
  4. Set VPC network to my-custom-network.
  5. Set Target tags to int-lb.
  6. Set Source filter to IP ranges.
  7. Set Source IP ranges to 10.128.0.0/20.
  8. Set Specified protocols and ports to tcp:80;tcp:443.
  9. Click Create.
  10. Click Create firewall rule.
  11. Set Name to allow-health-check.
  12. Set VPC network to my-custom-network.
  13. Set Target tags to int-lb.
  14. Set Source filter to IP ranges.
  15. Set Source IP ranges to 130.211.0.0/22 and 35.191.0.0/16.
  16. Set Specified protocols and ports to tcp.
  17. Click Create.

gcloud


gcloud compute firewall-rules create allow-internal-lb \
    --network my-custom-network \
    --source-ranges 10.128.0.0/20 \
    --target-tags int-lb \
    --allow tcp:80,tcp:443

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/firewalls/allow-internal-lb].
NAME               NETWORK            SRC_RANGES     RULES           SRC_TAGS  TARGET_TAGS
allow-internal-lb  my-custom-network  10.128.0.0/20  tcp:80,tcp:443            int-lb

gcloud compute firewall-rules create allow-health-check \
    --network my-custom-network \
    --source-ranges 130.211.0.0/22,35.191.0.0/16 \
    --target-tags int-lb \
    --allow tcp

NAME                NETWORK            SRC_RANGES                    RULES  SRC_TAGS  TARGET_TAGS
allow-health-check  my-custom-network  130.211.0.0/22,35.191.0.0/16  tcp              int-lb

Create a standalone client instance

For testing purposes, create a standalone client instance in the same region as the load balancer.

Console


  1. Go to the VM instances page in the Google Cloud Platform Console.
    Go to the VM instances page
  2. Click Create instance.
  3. Set Name to standalone-instance-1.
  4. Set the Zone to us-central1-b.
  5. Click Management, disk, networking, SSH keys to reveal advanced settings.
  6. Under Management, populate the Tags field with standalone.
  7. Under Networking, set VPC network to my-custom-network and subnet to my-custom-subnet.
  8. Click Create.

gcloud


gcloud compute instances create standalone-instance-1 \
    --image-family debian-9 \
    --image-project debian-cloud \
    --zone us-central1-b \
    --tags standalone \
    --subnet my-custom-subnet

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-b/instances/standalone-instance-1].
NAME                  ZONE          MACHINE_TYPE  PREEMPTIBLE INTERNAL_IP EXTERNAL_IP    STATUS
standalone-instance-1 us-central1-b n1-standard-1             10.128.0.8  23.251.150.133 RUNNING

Create a firewall rule to allow SSH connections to the standalone client instance

Console


  1. Go to the Firewall rules page in the Google Cloud Platform Console.
    Go to the Firewall rules page
  2. Click Create firewall rule.
  3. Enter a Name of allow-ssh-to-standalone.
  4. Set VPC network to my-custom-network.
  5. Set Target tags to standalone.
  6. Set Source filter to IP ranges.
  7. Set Source IP ranges to 0.0.0.0/0 (allow from any source).
  8. Set Specified protocols and ports to tcp:22.
  9. Click Create.

gcloud


gcloud compute firewall-rules create allow-ssh-to-standalone \
    --network my-custom-network \
    --target-tags standalone \
    --allow tcp:22

Created [https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/firewalls/allow-ssh-to-standalone].
NAME                     NETWORK            SRC_RANGES  RULES   SRC_TAGS  TARGET_TAGS
allow-ssh-to-standalone  my-custom-network  0.0.0.0/0   tcp:22            standalone

Delete the external IPs from your instances

When you created the instances for your instance groups, the instances needed external IPs to download and install Apache. Now that Apache is installed, and since these instances only serve an Internal load balancer, they no longer need access to the Internet, so you can remove the external IPs from the instances.

Console


  1. Go to the VM instances page in the Google Cloud Platform Console.
    Go to the VM instances page
  2. Click on ig-us-central1-1 to get to the console for that instance.
  3. Click Edit.
  4. Set External IP to None.
  5. Click Save.
  6. Repeat for ig-us-central1-2.
  7. Repeat for ig-us-central1-3.
  8. Repeat for ig-us-central1-4.

gcloud


gcloud compute instances delete-access-config ig-us-central1-1 \
    --access-config-name external-nat --zone us-central1-b
gcloud compute instances delete-access-config ig-us-central1-2 \
    --access-config-name external-nat --zone us-central1-b
gcloud compute instances delete-access-config ig-us-central1-3 \
    --access-config-name external-nat --zone us-central1-c
gcloud compute instances delete-access-config ig-us-central1-4 \
    --access-config-name external-nat --zone us-central1-c

Test your load balancer

From your local machine, connect to your standalone client instance.

gcloud compute --project [PROJECT_ID] ssh --zone us-central1-b standalone-instance-1

Use curl to contact the load balancer's internal IP address. Repeat the command a few times until you see the different instances responding.

$ curl 10.128.0.6
<!doctype html><html><body><h1>ig-us-central1-2</h1></body></html>
$ curl 10.128.0.6
<!doctype html><html><body><h1>ig-us-central1-1</h1></body></html>
$ curl 10.128.0.6
<!doctype html><html><body><h1>ig-us-central1-4</h1></body></html>
$ curl 10.128.0.6
<!doctype html><html><body><h1>ig-us-central1-4</h1></body></html>
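
You can also ask the load balancer which backend instances it currently considers healthy by querying the backend service:

gcloud compute backend-services get-health my-int-lb \
    --region us-central1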

New gcloud command parameters for Internal load balancing

Internal load balancing uses the same forwarding rule, backend service, and health check resources as HTTP(S) load balancing, but with some new command parameters. This section explains only the new parameters. Existing parameters are explained in the gcloud command-line tool reference.

Forwarding rule

For Internal load balancing, the forwarding rule points directly to a backend service in the same region as the forwarding rule. This section describes the new parameters in the context of the create command. The commands for get, list, describe, and delete remain the same. See the forwarding-rules gcloud command for explanations of other parameters.

New forwarding-rules create parameters

gcloud compute forwarding-rules create NAME \
    --load-balancing-scheme internal \
    [--address] \
    --region \
    [--protocol] \
    --ports PORT,[PORT,…] \
    --backend-service NAME \
    [--subnetwork] \
    [--network]
  • --load-balancing-scheme — for Internal load balancing, this parameter must be specified as internal. Initially, this parameter is supported only for Internal load balancing.
  • --address — the target IP address of the Internal load balancer. When the load balancing scheme is internal, this can only be an RFC1918 IP address belonging to the subnet configured for the forwarding rule. In this case, an external address cannot be used. If this parameter is unspecified and the load balancing scheme is internal, the IP address for the forwarding rule is automatically allocated from the internal IP range of the subnet or network configured for the forwarding rule.
  • --region — the region where the forwarding rule is created. Global forwarding rules are not supported for Internal load balancing.
  • --protocol (default: tcp) — The protocol, either tcp or udp, that the Internal load balancer will process.
  • --ports — A parameter that lets you specify 1-5 ports in a comma-separated list. Only packets destined for listed ports are forwarded.
  • --backend-service — the backend service that will receive all traffic directed to the forwarding rule.
  • --subnetwork — The subnet that this forwarding rule applies to. When configured for Internal load balancing, the internal IP of the forwarding rule is assigned from this subnet. If the network is in auto subnet mode, the subnet is optional because it can be determined from the region. However, if the network is in custom subnet mode, a subnet must be specified. Instances configured as backends for an Internal load balancer may belong to different subnets in the same region.
  • --network — The network that this forwarding rule applies to. When configured for Internal load balancing, the internal IP of the forwarding rule is assigned from this network. If this field is not specified, the default network will be used. In the absence of the default network, this field must be specified.
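
As an illustrative sketch combining these parameters, an internal forwarding rule for a UDP-based service might look like the following. The rule name, port, and backend service name are assumptions; my-udp-backend-service would have to be a backend service created with --protocol udp in the same region.

gcloud compute forwarding-rules create my-udp-forwarding-rule \
    --load-balancing-scheme internal \
    --region us-central1 \
    --protocol udp \
    --ports 53 \
    --backend-service my-udp-backend-service \
    --network my-custom-network \
    --subnet my-custom-subnet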

Set the target of a forwarding rule

The forwarding-rules set-target command lets you change which backend service the forwarding rule directs to. For Internal load balancing, the target of the forwarding rule must be a backend service and the --backend-service and --region parameters are required.

Console


Use the gcloud command-line tool for this step.

gcloud


gcloud compute forwarding-rules set-target FORWARDING_RULE_NAME \
    --region REGION \
    --backend-service BACKEND_SERVICE_NAME

Backend service

The forwarding rule directs your client request to the backend service. In turn, the backend service directs this client request to a healthy instance in the backend. To be considered healthy, an instance must have passed its health check.

New backend-services create parameters

See the backend-services gcloud command for explanations of other parameters.

gcloud compute backend-services create NAME \
    --load-balancing-scheme internal \
    --region \
    --health-checks \
    --protocol \
    [--session-affinity] \
    [--timeout]
  • --load-balancing-scheme internal (required) — Specifies whether the load balancer is performing internal or external load balancing. Only Internal load balancing is supported for this release. A value of internal is required.
  • --region (required) — The region where the backend service resides. For Internal load balancing, the instances, instance groups, backend service, load balancer, and client instances must be in the same region and the same VPC network. They do not have to be in the same zone or subnet as each other.
  • --health-checks (required) — The name of the health check the backend service uses to monitor instances. The --health-checks parameter is required. The --http-health-checks and --https-health-checks parameters are not supported.
  • --protocol (default: http) — The protocol, either tcp or udp, that the load balancer will process. It must match the --protocol parameter specified for the forwarding rule. Because the default value is http and only tcp and udp are supported for Internal load balancing, you must specify this parameter explicitly.
  • --session-affinity — If unspecified, new connections to the load balancer from the same instance are distributed evenly across backend instances. To cause connections from the same client instance to end up on the same backend instance, specify a session affinity of client_ip or client_ip_proto. See Session affinity for more details.
  • --timeout (default=30s) — Specifies how many seconds to wait for a response from the backend instance group before considering it a failed request.
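
As an illustrative sketch combining these parameters (the backend service name, affinity, and timeout values are assumptions, not part of the earlier setup):

gcloud compute backend-services create my-affinity-int-lb \
    --load-balancing-scheme internal \
    --region us-central1 \
    --health-checks my-tcp-health-check \
    --protocol tcp \
    --session-affinity client_ip \
    --timeout 60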

New parameters for backend-services add-backend command

See the backend-services add-backend command for explanations of other parameters.

gcloud compute backend-services add-backend [INSTANCE_GROUP] \
    --balancing-mode connection \
    --instance-group-zone \
    --region
  • --balancing-mode — only a balancing mode of CONNECTION is supported for Internal load balancing. Balances based on the total number of outstanding connections. Instance utilization and current ongoing connections to the instance are not taken into account. Incoming connections are spread as evenly as possible across all available instances regardless of current load. Established connections stay on the same instance regardless of new incoming connections. The --max-rate and --max-rate-per-instance parameters are not supported for Internal load balancing.
  • --instance-group-zone — the zone of the instance group you are associating with the backend.
  • --region — you must specify the region of the backend. This is the same region as the rest of the Internal load balancer components.
  • --instance-group-region — the region of the regional instance group you are associating with the backend.
  • The following parameters are not valid for Internal load balancing:
    • --capacity-scaler
    • --max-rate
    • --max-rate-per-instance
    • --max-utilization

NOTE: Commands for get, list, describe, remove-backend, and delete now take an additional --region flag when used for the corresponding regional backend services.

Connection draining for backend services

You can enable connection draining on backend services to ensure minimal interruption to your users when an instance is removed from an instance group, either manually or by an autoscaler. Connection draining is only available for TCP connections. Connection draining is not supported for UDP traffic. To learn more about connection draining, read the Enabling Connection Draining documentation.
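
Connection draining is set on the backend service. For example, assuming the my-int-lb backend service created earlier, a drain timeout of 120 seconds (an illustrative value) could be configured with:

gcloud compute backend-services update my-int-lb \
    --region us-central1 \
    --connection-draining-timeout 120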

Health checks

Health checks determine which instances can receive new connections. The health checker probes instances at specified intervals. Instances that do not respond successfully to the probe a specified number of times in a row are marked as UNHEALTHY.

You can configure a TCP, SSL, HTTP, or HTTPS health check for determining the health of your instances.

  • If the service running on your backend instances is based on HTTP, use an HTTP health check
  • If the service running on your backend instances is based on HTTPS, use an HTTPS health check
  • If the service running on your backend instances uses SSL, use an SSL health check
  • Unless you have an explicit reason to use a different kind of health check, use a TCP health check

Once configured, health check probes are sent on a regular basis on the specified port to all the instances in the configured instance groups.

See Health checking for more detailed information on how health checks work.

Health-checks create parameters

gcloud compute health-checks create [tcp | ssl | http | https] my-health-check \
    ...options

If you are encrypting traffic between the load balancer and your instances, use an SSL or HTTPS health check.

  • --check-interval (default=5s) — How often to send a health check probe to an instance. For example, specifying 10s will send the probe every 10 seconds. Valid units for this flag are s for seconds and m for minutes.
  • --healthy-threshold (default=2) — The number of consecutive successful health check probes before an unhealthy instance is marked as HEALTHY.
  • --port (default=80 for TCP and HTTP, 443 for SSL and HTTPS) — The TCP port number that this health check monitors.
  • --request — Only valid for TCP and SSL health checks. An optional string of up to 1024 characters that the health checker can send to the instance. The health checker then looks for a reply from the instance of the string provided in the --response field. If --response is not configured, the health checker does not wait for a response and regards the probe as successful if the TCP or SSL handshake was successful.
  • --response — Only valid for TCP and SSL health checks. An optional string of up to 1024 characters that the health checker expects to receive from the instance. If the response is not received exactly, the health check probe fails. If --response is configured, but not --request, the health checker will wait for a response anyway. Unless your system automatically sends out a message in response to a successful handshake, always configure --response to match an explicit --request.
  • --timeout (default=5s) — If the health checker doesn't receive a valid response from the instance within this interval, the probe is considered a failure. For example, specifying 10s will cause the checker to wait 10 seconds before considering the probe a failure. Valid units for this flag are s for seconds and m for minutes.
  • --unhealthy-threshold (default=2) — The number of consecutive health check probe failures before a healthy instance is marked as UNHEALTHY.
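
For example, the following sketch creates a TCP health check that probes port 3306 every 10 seconds and requires three consecutive successes or failures before changing an instance's state. The name and values are illustrative.

gcloud compute health-checks create tcp my-custom-tcp-check \
    --port 3306 \
    --check-interval 10s \
    --timeout 5s \
    --healthy-threshold 3 \
    --unhealthy-threshold 3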

Health check source IPs and firewall rules

For Internal load balancing, the health check probes to your load balanced instances come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16. Your firewall rules must allow these connections.

The section Configure a firewall rule to allow Internal load balancing covers this step.

Internal load balancing with regional instance groups

You can use internal load balancing with regional instance groups. The gcloud commands are as follows.

This command creates a regional managed instance group in us-central1, which can then be autoscaled. Substitute one of your instance templates for [INSTANCE_TEMPLATE].

    gcloud compute instance-groups managed create us-central1-ig1 \
        --region us-central1 \
        --template [INSTANCE_TEMPLATE] \
        --size 2
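
Autoscaling could then be enabled on the regional instance group with a command like the following; the replica limit and CPU target are illustrative values.

    gcloud compute instance-groups managed set-autoscaling us-central1-ig1 \
        --region us-central1 \
        --max-num-replicas 10 \
        --target-cpu-utilization 0.6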

This command creates a backend service in us-central1. A backend containing the regional instance group can then be added.

    gcloud compute backend-services create my-int-lb \
        --load-balancing-scheme internal \
        --region us-central1 \
        --health-checks my-tcp-health-check \
        --protocol tcp

This command adds the regional instance group to the backend service.

    gcloud compute backend-services add-backend my-int-lb \
        --instance-group us-central1-ig1 \
        --instance-group-region us-central1 \
        --region us-central1

You can then attach this backend service to a forwarding rule with load-balancing-scheme set to internal:

    gcloud compute forwarding-rules create internal-lb-forwarding-rule \
        --load-balancing-scheme internal \
        --ports 80 \
        --network my-custom-network \
        --subnet my-custom-subnet \
        --region us-central1 \
        --backend-service my-int-lb

Restrictions

  • Client instances and backend instances have to be on Google Cloud Platform. Sending traffic from clients on-premises (outside of GCP) to the Internal load balancer in GCP is not supported in this release. This feature will be supported in future releases.
  • The Internal load balancer IP cannot be the next-hop IP of a manually configured route. If a route is configured that has the IP of the Internal load balancer as its target, traffic matching that route will be dropped.
  • You cannot send traffic through a VPN tunnel to your load balancer IP.
  • Clients in a region cannot access an internal load balancer in a different region. All the instances in the Internal load balancing configuration must belong to the same VPC network and region, but can be in different subnets.

Limits

  • A maximum number of 50 Internal load balancer forwarding rules is allowed per network.

  • A maximum of 250 backends is allowed per Internal load balancer forwarding rule.

Pricing

Normal load balancing pricing applies.

FAQ

Q: What protocols are supported for Internal load balancing?

  • Internal load balancing is supported for TCP and UDP traffic

Q: Is service discovery supported for Internal load balancing?

  • Service name registration and discovery is not supported. Your client instances need to be configured to access Internal load balancing directly via the Internal load balancing frontend IP (see the example below).
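
    For example, a client could look up the frontend IP of the forwarding rule created in the setup example above with:

    gcloud compute forwarding-rules describe my-int-lb-forwarding-rule \
        --region us-central1 \
        --format="value(IPAddress)"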

Q: Can I use target pools with Internal load balancing?

  • No. For Internal load balancing, the forwarding rule must point to a backend service; target pools are not supported.