Application Load Balancer overview

The Application Load Balancer is a proxy-based Layer 7 load balancer that enables you to run and scale your services. The Application Load Balancer distributes HTTP and HTTPS traffic to backends hosted on a variety of Google Cloud platforms—such as Compute Engine, Google Kubernetes Engine (GKE), Cloud Storage, and Cloud Run—as well as external backends connected over the internet or by using hybrid connectivity.

Application Load Balancers are available in the following modes of deployment:

Deployment mode | Network service tier | Load-balancing scheme | IP address | Frontend ports | Links

External Application Load Balancer (load balances traffic coming from clients on the internet)
Global external | Premium | EXTERNAL_MANAGED | IPv4; IPv6 termination | HTTP on 80 or 8080; HTTPS on 443 | Architecture details
Regional external | Standard | EXTERNAL_MANAGED | IPv4 | HTTP on 80 or 8080; HTTPS on 443 | Architecture details
Classic | Premium or Standard | EXTERNAL | IPv4; IPv6 termination (Premium Tier only) | HTTP on 80 or 8080; HTTPS on 443 | Architecture details

Internal Application Load Balancer (load balances traffic within your VPC network or networks connected to your VPC network)
Regional internal (always regional) | Premium | INTERNAL_MANAGED | IPv4 | HTTP on 80 or 8080; HTTPS on 443 | Architecture details
Cross-region internal* | Premium | INTERNAL_MANAGED | IPv4 | HTTP on 80 or 8080; HTTPS on 443 | Architecture details

The load-balancing scheme is an attribute on the forwarding rule and the backend service of a load balancer and indicates whether the load balancer can be used for internal or external traffic. The term *_MANAGED in the load-balancing scheme indicates that the load balancer is implemented as a managed service either on Google Front Ends (GFEs) or on the open source Envoy proxy. In a load-balancing scheme that is *_MANAGED, requests are routed either to the GFE or to the Envoy proxy.
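
As a rough illustration of how the scheme names encode these two properties, the following Python sketch maps each deployment mode from the table to its load-balancing scheme and derives the internal/external and `*_MANAGED` attributes from the name. The dictionary and helper names are hypothetical and exist only for this example; they are not part of any Google Cloud API.

```python
# Scheme values are copied from the deployment-mode table above.
# SCHEMES, is_managed, and is_internal are illustrative names only.
SCHEMES = {
    "global external": "EXTERNAL_MANAGED",
    "regional external": "EXTERNAL_MANAGED",
    "classic": "EXTERNAL",
    "regional internal": "INTERNAL_MANAGED",
    "cross-region internal": "INTERNAL_MANAGED",
}


def is_managed(scheme: str) -> bool:
    """Return True when the scheme name matches the *_MANAGED pattern."""
    return scheme.endswith("_MANAGED")


def is_internal(scheme: str) -> bool:
    """Return True when the scheme is meant for internal traffic."""
    return scheme.startswith("INTERNAL")


if __name__ == "__main__":
    for mode, scheme in SCHEMES.items():
        traffic = "internal" if is_internal(scheme) else "external"
        print(f"{mode}: {scheme} ({traffic}, managed={is_managed(scheme)})")
```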

* The load balancer uses global resources and can be deployed in one or multiple Google Cloud regions that you choose.

External Application Load Balancers

External Application Load Balancers are implemented using Google Front Ends (GFEs) or managed proxies. Global external Application Load Balancers and classic Application Load Balancers use GFEs that are distributed globally, operating together by using Google's global network and control plane. GFEs offer multi-region load balancing in the Premium tier, directing traffic to the closest healthy backend that has capacity and terminating HTTP(S) traffic as close as possible to your users. Global external Application Load Balancers and regional external Application Load Balancers use the open source Envoy proxy software to enable advanced traffic management capabilities.
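
The idea of "the closest healthy backend that has capacity" can be sketched as a filter followed by a minimum. The following Python example is a simplified model under stated assumptions, not a description of how GFEs are implemented: the `Backend` fields, the latency-based notion of distance, and the request-count notion of capacity are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Backend:
    name: str
    distance_ms: float  # hypothetical proximity measure; lower means closer
    healthy: bool
    in_flight: int      # requests currently being served
    capacity: int       # maximum concurrent requests (assumed capacity model)


def pick_backend(backends: Iterable[Backend]) -> Optional[Backend]:
    """Return the closest backend that is healthy and below capacity."""
    candidates = [b for b in backends if b.healthy and b.in_flight < b.capacity]
    # default=None covers the case where no backend qualifies.
    return min(candidates, key=lambda b: b.distance_ms, default=None)


if __name__ == "__main__":
    pool = [
        Backend("nearby-unhealthy", 5.0, False, 0, 100),
        Backend("nearby-full", 10.0, True, 100, 100),
        Backend("farther-ok", 40.0, True, 20, 100),
    ]
    chosen = pick_backend(pool)
    print(chosen.name if chosen else "no backend available")
```

In this toy pool, the two nearest backends are skipped because one is unhealthy and the other is full, so the request goes to the farther backend.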

External Application Load Balancers support the following capabilities:

The following diagram shows a sample external Application Load Balancer architecture.

External Application Load Balancer architecture.

For a complete overview, see Architecture overview for External Application Load Balancers.

Internal Application Load Balancers

The internal Application Load Balancers are Envoy proxy-based regional Layer 7 load balancers that enable you to run and scale your HTTP application traffic behind an internal IP address. Internal Application Load Balancers support backends in one region, but can be configured to be globally accessible by clients from any Google Cloud region.

The load balancer distributes traffic to backends hosted on Google Cloud, on-premises, or in other cloud environments. Internal Application Load Balancers also support the following features:

  • Locality policies. Within a backend instance group or network endpoint group, you can configure how requests are distributed to member instances or endpoints. For details, see Traffic management.
  • Global access. When global access is enabled, clients from any region can access the load balancer. For details, see Enable global access.
  • Access from connected networks. You can make your load balancer accessible to clients from networks beyond its own Google Cloud Virtual Private Cloud (VPC) network. The other networks must be connected to the load balancer's VPC network by using either VPC Network Peering, Cloud VPN, or Cloud Interconnect. For details, see Access connected networks.
  • Compatibility with GKE by using Ingress (fully orchestrated). For details, see Configure Ingress for internal Application Load Balancers.
Internal Application Load Balancer architecture.

For a complete overview, see Architecture overview for internal Application Load Balancers.

Use cases

The following sections depict some common use cases for Application Load Balancers.

Three-tier web services

You can deploy a combination of Application Load Balancers and Network Load Balancers to support traditional three-tier web services. The following example shows how you can deploy each tier, depending on your traffic type:

  • Web tier. The application's frontend is served by an external Application Load Balancer with instance group backends. Traffic enters from the internet and is proxied from the load balancer to a set of instance group backends in various regions. These backends send HTTP(S) traffic to a set of internal Application Load Balancers.
  • Application tier. The application's middleware is deployed and scaled by using an internal Application Load Balancer and instance group backends. The load balancers distribute the traffic to middleware instance groups. These middleware instance groups then send the traffic to internal passthrough Network Load Balancers.
  • Database tier. The Network Load Balancers serve as frontends for the database tier. They distribute traffic to data storage backends in various regions.
Layer 7-based routing in a three-tier web application.

Workloads with jurisdictional compliance

Some workloads have regulatory or compliance requirements that mandate that network configurations and traffic termination reside in a specific region. For these workloads, a regional external Application Load Balancer is often the preferred option because it provides the jurisdictional controls that they require.

Advanced traffic management

The Application Load Balancers support advanced traffic management features that give you fine-grained control over how your traffic is handled. These capabilities include the following:

  • You can update how traffic is managed without needing to modify your application code.
  • You can intelligently route traffic based on HTTP(S) parameters, such as host, path, headers, and other request parameters. For example, you can use Cloud Storage buckets to handle any static video content, and you can use instance groups or NEGs to handle all other requests.
  • You can mitigate risks when deploying a new version of your application by using weight-based traffic splitting. For example, you can send 95% of the traffic to the previous version of your service and 5% to the new version of your service. After you validate that the new version works as expected, you can gradually shift the percentages until 100% of the traffic reaches the new version of your service. Traffic splitting is typically used for deploying new versions, A/B testing, service migration, modernizing legacy services, and similar processes.
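
To make the 95%/5% split concrete, the following Python sketch simulates weight-based selection between two service versions. It only models the effect of the weights. The function name, the version labels, and the use of a seeded random generator are assumptions for the example and do not reflect how the load balancer is configured or implemented.

```python
import random
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick one version name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that repeated runs are reproducible
    split = {"previous": 95, "new": 5}
    draws = [pick_version(split, rng) for _ in range(10_000)]
    share_new = draws.count("new") / len(draws)
    print(f"share of requests sent to the new version: {share_new:.3f}")
```

Shifting traffic then amounts to changing the weights, for example to `{"previous": 50, "new": 50}` and finally to `{"previous": 0, "new": 100}`.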

The following example shows path-based routing implemented by using an internal Application Load Balancer. Each path is handled by a different backend.

Path-based routing with internal Application Load Balancers.
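
Path-based routing can be pictured as a longest-prefix lookup from the request path to a backend. The following Python sketch is a deliberate simplification: the route table, the backend names, and the matching rules are hypothetical, and the sketch does not reproduce the load balancer's actual URL map syntax or semantics.

```python
from typing import Dict

# Hypothetical route table: path prefix -> backend name.
ROUTES = {
    "/video": "video-backend",
    "/video/hd": "video-hd-backend",
    "/images": "images-backend",
}
DEFAULT_BACKEND = "web-backend"


def route(path: str, routes: Dict[str, str], default: str) -> str:
    """Return the backend for the longest prefix that matches whole path segments."""
    best = None
    for prefix in routes:
        # Match the prefix itself or anything below it, but not "/videos" for "/video".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else default


if __name__ == "__main__":
    for p in ("/video/clip.mp4", "/video/hd/clip.mp4", "/images/logo.png", "/checkout"):
        print(p, "->", route(p, ROUTES, DEFAULT_BACKEND))
```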


Migrating legacy services to Google Cloud

Migrating an existing service to Google Cloud enables you to free up on-premises capacity and reduce the cost and burden of maintaining an on-premises infrastructure. You can temporarily set up a hybrid deployment that allows you to route traffic to both your current on-premises service and a corresponding Google Cloud service endpoint.

The following diagram demonstrates this setup with an internal Application Load Balancer. With an internal load balancer, you can configure weight-based traffic splitting to divide traffic across the two services. You start by sending 0% of the traffic to the Google Cloud service and 100% to the on-premises service, and then gradually increase the proportion of traffic sent to the Google Cloud service. Eventually, 100% of the traffic goes to the Google Cloud service, and you can retire the on-premises service.

Migrate legacy services to Google Cloud.
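
The gradual shift from 0% to 100% can be planned as a sequence of weight pairs. The following Python sketch generates such a schedule. The fixed step size and the function name are assumptions for illustration; in practice, each step would be applied only after you validate the previous one.

```python
from typing import List, Tuple


def ramp_schedule(step_percent: int) -> List[Tuple[int, int]]:
    """Return (cloud_percent, on_prem_percent) pairs from 0/100 up to 100/0."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be between 1 and 100")
    schedule = []
    cloud = 0
    while cloud < 100:
        schedule.append((cloud, 100 - cloud))
        cloud = min(100, cloud + step_percent)
    schedule.append((100, 0))  # final state: the on-premises service can be retired
    return schedule


if __name__ == "__main__":
    for cloud, on_prem in ramp_schedule(25):
        print(f"Google Cloud: {cloud:3d}%   on-premises: {on_prem:3d}%")
```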

Load balancing for GKE applications

There are three ways to deploy Application Load Balancers for GKE clusters:

Load balancing for Cloud Run, Cloud Functions, and App Engine applications

You can use an Application Load Balancer as the frontend for your Google Cloud serverless applications. This lets you configure your serverless applications to serve requests from a dedicated IP address that is not shared with any other services.

To set this up, you use a serverless NEG as the load balancer's backend. The following diagrams show how a serverless application is integrated with an Application Load Balancer.

Global external

This diagram shows how a serverless NEG fits into a global external Application Load Balancer architecture.

Global external Application Load Balancer architecture for serverless apps.

Regional external

This diagram shows how a serverless NEG fits into a regional external Application Load Balancer architecture. This load balancer supports only Cloud Run backends.

Regional external Application Load Balancer architecture for serverless apps.

Regional internal

This diagram shows how a serverless NEG fits into a regional internal Application Load Balancer architecture. This load balancer supports only Cloud Run backends.

Internal Application Load Balancer architecture for serverless apps.
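
The backend restrictions described for each mode can be summarized as a small lookup. In the following Python sketch, the entries for the regional modes come from the text above. The entry for the global external mode is an assumption inferred from this section's heading, and the names are hypothetical.

```python
# Serverless platforms that each load balancer mode can front with a serverless NEG.
SUPPORTED_SERVERLESS = {
    # Assumption: inferred from the section heading, not stated per mode in the text.
    "global external": {"Cloud Run", "Cloud Functions", "App Engine"},
    # Stated in the text: these modes support only Cloud Run backends.
    "regional external": {"Cloud Run"},
    "regional internal": {"Cloud Run"},
}


def can_front(mode: str, platform: str) -> bool:
    """Return True if the given mode is listed as supporting the platform."""
    return platform in SUPPORTED_SERVERLESS.get(mode, set())


if __name__ == "__main__":
    print(can_front("regional internal", "Cloud Run"))   # True
    print(can_front("regional external", "App Engine"))  # False
```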


Load balancing to backends outside Google Cloud

Application Load Balancers support load-balancing traffic to endpoints that extend beyond Google Cloud, such as on-premises data centers and other cloud environments. External backends are typically accessible in one of the following ways:

  • Accessible over the public internet. For these endpoints, you use an internet NEG as the load balancer's backend. The internet NEG is configured to point to a single FQDN:Port or IP:Port endpoint on the external backend. Internet NEGs can be global or regional.

    The following diagram demonstrates how to connect to external backends accessible over the public internet using a global internet NEG.

    Global external Application Load Balancer with an external backend.

    For more details, see Internet NEGs overview.

  • Accessible by using hybrid connectivity (Cloud Interconnect or Cloud VPN). For these endpoints, you use a hybrid NEG as the load balancer's backend. The hybrid NEG is configured to point to IP:Port endpoints on the external backend.

    The following diagrams demonstrate how to connect to external backends accessible by using Cloud Interconnect or Cloud VPN.

    External

    Hybrid connectivity with global external Application Load Balancers.

    Internal

    Hybrid connectivity with internal Application Load Balancers.

    For more details, see Hybrid NEGs overview.
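
The two endpoint formats mentioned above (FQDN:Port and IP:Port) can be told apart with a short parser. The following Python sketch classifies an endpoint string and lists which NEG types accept that format according to this section. It handles IPv4-style `host:port` strings only, the function names are hypothetical, and the format alone does not determine the right NEG type: that also depends on whether the backend is reached over the public internet or over hybrid connectivity.

```python
import ipaddress
from typing import List, Tuple


def parse_endpoint(endpoint: str) -> Tuple[str, str, int]:
    """Split 'host:port' and report whether the host is an IP address or an FQDN."""
    host, sep, port_text = endpoint.rpartition(":")
    if not sep or not host or not port_text.isdigit():
        raise ValueError(f"expected host:port, got {endpoint!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    try:
        ipaddress.ip_address(host)
        kind = "IP:Port"
    except ValueError:
        kind = "FQDN:Port"
    return kind, host, port


def neg_types_accepting(endpoint: str) -> List[str]:
    """List the NEG types whose endpoint format matches, per the text above."""
    kind, _, _ = parse_endpoint(endpoint)
    if kind == "IP:Port":
        return ["internet NEG", "hybrid NEG"]
    return ["internet NEG"]  # FQDN:Port is described only for internet NEGs


if __name__ == "__main__":
    for ep in ("backend.example.com:443", "203.0.113.10:80"):
        print(ep, "->", parse_endpoint(ep)[0], neg_types_accepting(ep))
```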

Integration with Private Service Connect

Private Service Connect allows private consumption of services across VPC networks that belong to different groups, teams, projects, or organizations. You can use Private Service Connect to access Google APIs and services or managed services in another VPC network.

You can use a global external Application Load Balancer to access services that are published by using Private Service Connect. For more information, see About Private Service Connect backends.

You can use an internal Application Load Balancer to send requests to supported regional Google APIs and services. For more information, see Access Google APIs through backends.