Cloud Load Balancing extensions overview

Service Extensions lets you use extensions to instruct supported Application Load Balancers to send a callout from the load balancing data path to callout backend services. This page provides an overview of Cloud Load Balancing extensions.

You can configure Application Load Balancers to use the following types of extensions:

  • Route extensions, which run early in the request processing lifecycle
  • Traffic extensions, which run just before the load balancer sends requests to backends or after it receives responses from them

Supported Application Load Balancers

Service Extensions supports extensions for the following Application Load Balancers:

Application Load Balancer                         Route extensions   Traffic extensions
Global external Application Load Balancer                            Yes
Regional external Application Load Balancer       Yes                Yes
Regional internal Application Load Balancer       Yes                Yes
Cross-region internal Application Load Balancer   Yes (Preview)      Yes (Preview)
Classic Application Load Balancer

Extensibility points in the load balancing data path

Service Extensions supports extensions in different stages of the load balancing data path.

Figure 1 shows how Service Extensions supports extensions in the routing and traffic management stages for regional external Application Load Balancers, regional internal Application Load Balancers, and cross-region internal Application Load Balancers.

Figure 1. Regional external Application Load Balancers, regional internal Application Load Balancers, and cross-region internal Application Load Balancers support extensions in the routing and traffic management stages.

Figure 2 shows how Service Extensions supports extensions in the traffic management stage for global external Application Load Balancers.

Figure 2. Global external Application Load Balancers support extensions in the traffic management stage.

How route extensions work

Route extensions run first in the request processing path: the load balancer invokes them when it receives the request headers, before it evaluates the URL map.

After a load balancer invokes a route extension for a request, it does the following:

  • Selects the backend service by evaluating the URL map
  • Applies Google Cloud Armor policies for the selected backend service
  • Applies Identity-Aware Proxy (IAP) policies for the selected backend service
  • Performs fault injection
  • Performs request header transformations and resolves custom request header variables
  • Invokes traffic extensions, if they exist in the processing path of the selected backend service
  • Performs URL rewrites
  • Performs redirects or routing to the selected backend service and applies timeouts and retry policies in the URL map and other load balancing settings for the backend service
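
The steps above can be sketched as a small pipeline model. This is an illustration only: the stage names and the `request_stages` function are hypothetical labels for the listed steps, not Google Cloud API identifiers, and the sketch simply assumes the steps run in the order in which they are listed.

```python
from __future__ import annotations

# Hypothetical labels for the steps listed above; not real API identifiers.
ROUTE_EXTENSION = "route_extension"

STAGES_AFTER_ROUTE_EXTENSION = [
    "select_backend_service_from_url_map",
    "cloud_armor_policies",
    "iap_policies",
    "fault_injection",
    "request_header_transformations",
    "traffic_extensions",  # only if configured for the selected backend service
    "url_rewrites",
    "redirect_or_route_to_backend",
]


def request_stages(has_route_extension: bool, has_traffic_extension: bool) -> list[str]:
    """Return the modeled stage order for a single request."""
    stages = [ROUTE_EXTENSION] if has_route_extension else []
    for stage in STAGES_AFTER_ROUTE_EXTENSION:
        # Traffic extensions are skipped when none exist in the processing path.
        if stage == "traffic_extensions" and not has_traffic_extension:
            continue
        stages.append(stage)
    return stages


if __name__ == "__main__":
    for position, stage in enumerate(request_stages(True, True), start=1):
        print(position, stage)
```

In this model the route extension always comes before URL map evaluation, which is why a route extension can influence backend selection while the later stages cannot undo it.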

How traffic extensions work

Load balancers run traffic extensions last in the request processing path and first in the response processing path.

These extensions let you modify the headers and payloads of both requests and responses without affecting the choice of backend service. You can also use traffic extensions for custom logging by specifying the information that you want to log, the format, and the external provider.

Before a load balancer invokes a traffic extension on the request path for a request, it does the following:

  • Performs fault injection
  • Performs request header transformations and resolves custom request header variables
  • Selects a backend service for the request
  • Applies Google Cloud Armor policies for the selected backend service
  • Applies IAP policies for the selected backend service
  • Applies Cloud CDN caching policies for the selected backend service in the case of global external Application Load Balancers

After a load balancer invokes a traffic extension on the request path for a request, it does the following:

  • Performs URL rewrites
  • Performs header manipulation according to the URL map
  • Performs redirects or routing to the selected backend service while applying timeouts and retry policies in the URL map and the load balancing settings for the backend service
  • Performs request mirroring

After a load balancer invokes a traffic extension on the response path for a request, it does the following:

  • Performs response header transformations and resolves custom response header variables
  • Performs logging by using Cloud Logging
  • Performs Cloud CDN caching in the case of global external Application Load Balancers
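
The three lists above can be summarized in a short model of where a traffic extension sits on each path. This is a sketch under stated assumptions: the stage names, `path_for`, `steps_before`, and `steps_after` are hypothetical, and the stages are assumed to run in the order in which they are listed.

```python
from __future__ import annotations

TRAFFIC_EXTENSION = "traffic_extension"

# Stages that apply only to global external Application Load Balancers.
CDN_ONLY = {"cloud_cdn_caching_policies", "cloud_cdn_caching"}

# Hypothetical labels for the request-path steps listed above.
REQUEST_PATH = [
    "fault_injection",
    "request_header_transformations",
    "select_backend_service",
    "cloud_armor_policies",
    "iap_policies",
    "cloud_cdn_caching_policies",
    TRAFFIC_EXTENSION,
    "url_rewrites",
    "url_map_header_manipulation",
    "redirect_or_route_to_backend",
    "request_mirroring",
]

# Hypothetical labels for the response-path steps listed above.
RESPONSE_PATH = [
    TRAFFIC_EXTENSION,
    "response_header_transformations",
    "cloud_logging",
    "cloud_cdn_caching",
]


def path_for(path: list[str], is_global_external: bool) -> list[str]:
    """Drop Cloud CDN stages unless the load balancer is global external."""
    return [s for s in path if is_global_external or s not in CDN_ONLY]


def steps_before(path: list[str], stage: str) -> list[str]:
    """Stages that run before the given stage."""
    return path[: path.index(stage)]


def steps_after(path: list[str], stage: str) -> list[str]:
    """Stages that run after the given stage."""
    return path[path.index(stage) + 1 :]
```

For example, `steps_before(path_for(REQUEST_PATH, False), TRAFFIC_EXTENSION)` lists everything that a regional load balancer has already done by the time the callout is sent, which is why a traffic extension cannot change the selected backend service.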

Limitations for extensions

  • A forwarding rule can have only one LbTrafficExtension resource and one LbRouteExtension resource.
  • The callout backend service must be in the same project as the forwarding rule.
  • Route extensions cannot override the processing mode of the ext_proc stream.
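
The first two limitations lend themselves to a pre-deployment check. The following is a minimal sketch, not a Google Cloud API: the `ExtensionAttachment` record and the `find_violations` function are hypothetical, and the sketch assumes you have already collected the planned extension resources into a list. The ext_proc processing-mode limitation is not modeled.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionAttachment:
    """Hypothetical record of one extension resource attached to a forwarding rule."""

    kind: str  # "LbTrafficExtension" or "LbRouteExtension"
    forwarding_rule: str
    forwarding_rule_project: str
    callout_backend_project: str


def find_violations(attachments: list[ExtensionAttachment]) -> list[str]:
    """Return human-readable descriptions of limitation violations."""
    violations = []

    # At most one resource of each kind per forwarding rule.
    counts = Counter(
        (a.forwarding_rule_project, a.forwarding_rule, a.kind) for a in attachments
    )
    for (_project, rule, kind), n in sorted(counts.items()):
        if n > 1:
            violations.append(f"{rule}: {n} {kind} resources (at most 1 allowed)")

    # The callout backend service must share the forwarding rule's project.
    for a in attachments:
        if a.callout_backend_project != a.forwarding_rule_project:
            violations.append(
                f"{a.forwarding_rule}: callout backend service is in project "
                f"{a.callout_backend_project}, expected {a.forwarding_rule_project}"
            )
    return violations
```

A forwarding rule with one route extension and one traffic extension in the same project yields an empty list; a second traffic extension on the same rule, or a callout backend service in another project, each adds one entry.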

Recommended optimizations for extensions

Integrating an extension into the load balancing processing path incurs additional latency for requests and responses. Each type of data that the extension service processes, including request headers, request body, response headers, and response body, adds latency.

Consider the following optimizations to minimize the latency:

  • Configure the extension to process only the data that you need. For example, to modify only request headers, set the supported_events field in the extension to REQUEST_HEADERS.
  • Deploy extensions in the same zones as the regular destination backend service for the load balancer. When using a cross-region internal Application Load Balancer, place the extension service backends in the same region as the load balancer's proxy-only subnets.
  • When using a global external Application Load Balancer, place the extension service backends in the same geographic regions as the load balancer's regular destination VMs, Google Kubernetes Engine (GKE) workloads, and Cloud Run functions.
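
The first optimization, processing only the data that you need, can be sketched as follows. This is a hedged illustration: `minimal_supported_events` and `extension_config` are hypothetical helpers, the dictionary is not the real resource schema, and the event names other than REQUEST_HEADERS are assumed by analogy with the data types named in this section.

```python
from __future__ import annotations

# REQUEST_HEADERS is named in the text above; the other three names are
# assumptions that mirror the data types listed in this section.
ALL_EVENTS = ("REQUEST_HEADERS", "REQUEST_BODY", "RESPONSE_HEADERS", "RESPONSE_BODY")


def minimal_supported_events(needs: set[str]) -> list[str]:
    """Return only the events the extension needs, in a stable order."""
    unknown = needs - set(ALL_EVENTS)
    if unknown:
        raise ValueError(f"unknown events: {sorted(unknown)}")
    return [event for event in ALL_EVENTS if event in needs]


def extension_config(name: str, needs: set[str]) -> dict:
    """Build a hypothetical extension entry limited to the needed events."""
    return {"name": name, "supported_events": minimal_supported_events(needs)}


if __name__ == "__main__":
    # An extension that only rewrites request headers subscribes to one event.
    print(extension_config("header-rewriter", {"REQUEST_HEADERS"}))
```

Each event left out of the list is one fewer type of data sent to the extension service, which is where the latency saving described above comes from.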

What's next