Service Extensions enables Application Load Balancers to send callouts to extension backend services to inject custom processing in the processing path. This page describes how to configure callout extensions.
For an overview of Application Load Balancer callout extensions, see Cloud Load Balancing extensions overview.
Before you begin
Ensure that you have either a project owner or editor role or the following Compute Engine IAM roles:
- To create instances: compute.instanceAdmin.v1
- To create Cloud Load Balancing components: compute.networkAdmin
Enable these APIs: Compute Engine API and Network Services API.
gcloud services enable compute.googleapis.com networkservices.googleapis.com
Create and configure an Application Load Balancer that supports callouts. For this example, set up a regional internal Application Load Balancer with VM instance group backends.
For route extensions only. Set up an additional backend service and update the URL map to add a host matcher that routes traffic to this backend service for all traffic with the HTTP host matching service-extensions.com. Run the following commands:
gcloud compute instances create l7-ilb-backend2-vm \
    --zone=us-west1-a \
    --network=lb-network \
    --subnet=backend-subnet \
    --tags=allow-ssh,load-balanced-backend \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --metadata=startup-script='#! /bin/bash
apt-get update
apt-get install apache2 -y
a2ensite default-ssl
a2enmod ssl
echo "Page served from second backend service" | tee /var/www/html/index.html
systemctl restart apache2'

gcloud compute instance-groups unmanaged create l7-ilb-backend-service2-ig \
    --zone us-west1-a

gcloud compute instance-groups unmanaged add-instances l7-ilb-backend-service2-ig \
    --zone=us-west1-a \
    --instances=l7-ilb-backend2-vm

gcloud compute backend-services create l7-ilb-backend-service2 \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --protocol=HTTP \
    --health-checks=l7-ilb-basic-check \
    --health-checks-region=us-west1 \
    --region=us-west1

gcloud compute backend-services add-backend l7-ilb-backend-service2 \
    --balancing-mode=UTILIZATION \
    --instance-group=l7-ilb-backend-service2-ig \
    --instance-group-zone=us-west1-a \
    --region=us-west1

gcloud compute url-maps add-path-matcher l7-ilb-map \
    --path-matcher-name=callouts \
    --default-service=l7-ilb-backend-service2 \
    --new-hosts=service-extensions.com \
    --region=us-west1
Set up an extension backend service
For this example, a basic Python-based extension server implementing
Envoy's ext_proc gRPC API is available. The source code for this server is
in the Python Samples GitHub repository of Google Cloud. A Docker container
image with this server is at
us-docker.pkg.dev/service-extensions/ext-proc/ext-proc-sample-python.
To create and set up an extension backend service, follow these steps:
Create a virtual machine (VM) instance for the extension backend service that's running the sample Python extension server. Use the gcloud compute instances create-with-container command.
gcloud compute instances create-with-container callouts-vm \
    --container-image=us-docker.pkg.dev/service-extensions/ext-proc/ext-proc-sample-python:latest \
    --network=lb-network \
    --subnet=backend-subnet \
    --zone=us-west1-a \
    --tags=allow-ssh,load-balanced-backend
Create an unmanaged instance group. Use the gcloud compute instance-groups unmanaged create command. For example:
gcloud compute instance-groups unmanaged create callouts-ig \
    --zone=us-west1-a
Set the named ports for the unmanaged instance group. Use the gcloud compute instance-groups unmanaged set-named-ports command.
gcloud compute instance-groups unmanaged set-named-ports callouts-ig \
    --named-ports=http:80,grpc:443 \
    --zone=us-west1-a
Add the VM instance to the unmanaged instance group. Use the gcloud compute instance-groups unmanaged add-instances command. For example:
gcloud compute instance-groups unmanaged add-instances callouts-ig \
    --zone=us-west1-a \
    --instances=callouts-vm
Create an HTTP health check for the instance. Use the gcloud compute health-checks create http command. For example:
gcloud compute health-checks create http callouts-hc \
    --region=us-west1 \
    --port=80
Create an extension backend service that uses the HTTP/2 protocol. Use the gcloud compute backend-services create command.
gcloud compute backend-services create l7-ilb-callout-service \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --protocol=HTTP2 \
    --port-name=grpc \
    --health-checks=callouts-hc \
    --health-checks-region=us-west1 \
    --region=us-west1
Add the instance group with the extension server as a backend to the backend service. Use the gcloud compute backend-services add-backend command. The following example adds an instance group running the ext_proc service:
gcloud compute backend-services add-backend l7-ilb-callout-service \
    --balancing-mode=UTILIZATION \
    --instance-group=callouts-ig \
    --instance-group-zone=us-west1-a \
    --region=us-west1
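The backend service in these steps selects its serving port by name (--port-name=grpc), and the instance group maps that name to a port number (--named-ports=http:80,grpc:443). As a rough illustration of how the two settings relate, the following Python sketch parses the named-ports string and resolves a port name. The function names are hypothetical and exist only for this example; the load balancer performs the actual resolution.

```python
def parse_named_ports(spec: str) -> dict[str, int]:
    """Parse a 'name:port,name:port' string into a name -> port mapping."""
    ports = {}
    for item in spec.split(","):
        name, _, port = item.strip().partition(":")
        if not name or not port.isdigit():
            raise ValueError(f"malformed named port: {item!r}")
        ports[name] = int(port)
    return ports


def resolve_port(spec: str, port_name: str) -> int:
    """Return the port number that a backend service's port name maps to."""
    ports = parse_named_ports(spec)
    if port_name not in ports:
        raise KeyError(f"instance group has no named port {port_name!r}")
    return ports[port_name]


if __name__ == "__main__":
    named_ports = "http:80,grpc:443"  # value passed to set-named-ports
    # Port that --port-name=grpc on l7-ilb-callout-service resolves to.
    print(resolve_port(named_ports, "grpc"))
    # Same number as the --port=80 setting of the callouts-hc health check.
    print(resolve_port(named_ports, "http"))
```

A port name that is missing from the instance group raises an error in this sketch, which mirrors the kind of mismatch worth checking for if callouts never reach the extension server.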
Configure a callout extension
You need to create the following resources to configure an extension for an existing Application Load Balancer:
- An extension backend service resource whose backends run the ext_proc gRPC API
- An LbTrafficExtension or LbRouteExtension resource that points to both of these resources:
  - A forwarding rule to attach to
  - An extension backend service to call
The LbTrafficExtension and LbRouteExtension resources group related
extension services into one or more chains. Each extension chain selects
the traffic to act on by using Common Expression Language (CEL)
match conditions. The load balancer evaluates a request against each chain's
match condition sequentially. When a request matches the conditions
defined by a chain, all extensions in the chain act on the request. Only one
chain matches a given request.
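The chain selection described above can be modeled with a short Python sketch. It assumes that the first chain whose condition matches, in configured order, is the one selected, and it uses plain Python predicates as stand-ins for CEL expressions. The Chain class, the select_chain function, chain2, and the extension names ext21 and ext22 are hypothetical and are not part of the Service Extensions API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Chain:
    name: str
    matches: Callable[[dict], bool]  # stand-in for a CEL match condition
    extensions: list[str] = field(default_factory=list)


def select_chain(chains: list[Chain], request: dict) -> Optional[Chain]:
    """Return the first chain whose condition matches, or None."""
    for chain in chains:  # evaluated in the configured order
        if chain.matches(request):
            return chain  # at most one chain is selected per request
    return None


chains = [
    Chain("chain1", lambda r: r["path"].startswith("/extensions"), ["ext11"]),
    Chain("chain2", lambda r: r["host"] == "example.com", ["ext21", "ext22"]),
]

if __name__ == "__main__":
    request = {"host": "example.com", "path": "/extensions/a"}
    chosen = select_chain(chains, request)
    # Both conditions hold for this request, but chain1 is listed first,
    # so only the extensions in chain1 act on it.
    print(chosen.name, chosen.extensions)
```

Under this model, the order of chains matters whenever match conditions overlap, so more specific conditions are best listed first.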
For information about the limits related to callout extensions, see the Quotas and limits page.
The extension resource references the load balancer forwarding rule to attach to. After you configure the resource, the load balancer starts sending matching requests to extension services.
Configure a route extension
The following example shows how to configure a route extension that is called
when the path starts with /extensions. The route extension server on callouts-vm
changes the Host header to service-extensions.com, sets the path to /, and then
instructs the load balancer to recompute the route. The traffic then flows to
l7-ilb-backend-service2 instead of l7-ilb-backend-service.
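To make the rerouting behavior concrete, here is a minimal Python model of what the route extension server does to a request. It uses plain dictionaries rather than the ext_proc message types, and the function name and the "host" and "path" keys are assumptions made for this sketch. In Envoy's ext_proc API, the equivalent would be header mutations combined with the clear_route_cache flag on the response, but how the sample server implements this is not shown here.

```python
def rewrite_for_reroute(headers: dict[str, str]) -> tuple[dict[str, str], bool]:
    """Return (mutated headers, recompute_route) for one matching request."""
    mutated = dict(headers)  # copy so the caller's dict is left untouched
    mutated["host"] = "service-extensions.com"  # selects the new host matcher
    mutated["path"] = "/"                       # path served by the second backend
    recompute_route = True  # ask the load balancer to re-evaluate the URL map
    return mutated, recompute_route


if __name__ == "__main__":
    original = {"host": "10.1.2.99", "path": "/extensions"}  # made-up client request
    mutated, recompute = rewrite_for_reroute(original)
    print(mutated, recompute)
```

Without the recompute step, the load balancer would keep the route it had already chosen for the original host and path, and the header changes alone would not move the request to l7-ilb-backend-service2.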
Check if there's a match for /extensions in the URL map.
Establish an SSH connection to the client VM.
gcloud compute ssh CLIENT_VM \
    --zone=ZONE
Replace CLIENT_VM with the name of the client VM and ZONE with the zone of the VM.
Run the following curl command against the forwarding rule in the client VM:
curl FORWARDING_RULE_IP/extensions
Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.
The output should be similar to the following, which indicates that there is no match for /extensions in the URL map.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
...
Close the SSH connection.
To start configuring a route extension, define the extension in a YAML file and associate it with the forwarding rule.
cat >route.yaml <<EOF
name: route-ext
forwardingRules:
- https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
loadBalancingScheme: INTERNAL_MANAGED
extensionChains:
- name: "chain1"
  matchCondition:
    celExpression: 'request.path.startsWith("/extensions")'
  extensions:
  - name: 'ext11'
    authority: ext11.com
    service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
    failOpen: false
    timeout: 0.1s
EOF
Replace PROJECT_ID with the project ID.
Import the route extension. Use the gcloud beta service-extensions lb-route-extensions import command.
gcloud beta service-extensions lb-route-extensions import route-ext \
    --source=route.yaml \
    --location=us-west1
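The PROJECT_ID placeholder appears in both the forwarding rule URL and the backend service URL of route.yaml, so it is easy to replace one and miss the other. If you prefer to script the substitution before running the import, a sketch along the following lines might help. The function name is hypothetical, and "my-project" is a made-up project ID.

```python
PLACEHOLDER = "PROJECT_ID"


def fill_project_id(text: str, project_id: str) -> str:
    """Replace every PROJECT_ID placeholder in text with project_id."""
    if not project_id or any(c in project_id for c in "/ \t\n"):
        raise ValueError(f"unusable project ID: {project_id!r}")
    if PLACEHOLDER not in text:
        raise ValueError("text contains no PROJECT_ID placeholder")
    return text.replace(PLACEHOLDER, project_id)


if __name__ == "__main__":
    line = (
        "service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID"
        "/regions/us-west1/backendServices/l7-ilb-callout-service"
    )
    print(fill_project_id(line, "my-project"))  # "my-project" is a made-up ID
```

The same function can be applied to the full contents of the YAML file, since it replaces all occurrences in one pass.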
Verify that the route extension works as expected. Establish an SSH connection to the client VM and use the same curl command:
curl FORWARDING_RULE_IP/extensions
The output should be similar to the following, indicating that the traffic matched the virtual host service-extensions.com and reached the l7-ilb-backend-service2 service even though the original request did not specify that host.
Page served from second backend service
To validate that the extension targets only requests with the /extensions prefix, repeat the curl command without the path prefix.
curl FORWARDING_RULE_IP
You should see an output like the following:
Page served from: l7-ilb-backend-example-1c7t
Configure a traffic extension
The following example shows how to configure a traffic extension that is called
when the host matches example.com. The traffic extension server on callouts-vm
adds a response header, hello: service-extensions, to the responses for
matching requests.
Check the response for example.com before the extension is configured.
Establish an SSH connection to the client VM.
gcloud compute ssh CLIENT_VM \
    --zone=ZONE
Replace CLIENT_VM with the name of the client VM and ZONE with the zone of the VM.
Run the following curl command against the forwarding rule in the client VM:
curl -D - -H "host: example.com" FORWARDING_RULE_IP
Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.
The output should be similar to the following.
HTTP/1.1 200 OK
...
content-length: 46
content-type: text/html
via: 1.1 google

Page served from: l7-ilb-backend-example-1c7t
Close the SSH connection.
To start configuring a traffic extension, define the extension in a YAML file and associate it with the forwarding rule.
cat >traffic.yaml <<EOF
name: traffic-ext
forwardingRules:
- https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
loadBalancingScheme: INTERNAL_MANAGED
extensionChains:
- name: "chain1"
  matchCondition:
    celExpression: 'request.host == "example.com"'
  extensions:
  - name: 'ext11'
    authority: ext11.com
    service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
    failOpen: false
    timeout: 0.1s
    supportedEvents:
    - RESPONSE_HEADERS
EOF
Replace PROJECT_ID with the project ID.
Import the traffic extension. Use the gcloud beta service-extensions lb-traffic-extensions import command.
gcloud beta service-extensions lb-traffic-extensions import traffic-ext \
    --source=traffic.yaml \
    --location=us-west1
Verify that the traffic extension works as expected. Establish an SSH connection to the client VM and use the same curl command:
curl -D - -H "host: example.com" FORWARDING_RULE_IP
The output should include the hello: service-extensions response header.
HTTP/1.1 200 OK
...
content-length: 46
content-type: text/html
hello: service-extensions
via: 1.1 google

Page served from: l7-ilb-backend-example-1c7t
To validate that the extension targets only example.com traffic, repeat the curl command without the host header.
curl -D - FORWARDING_RULE_IP
The output is similar to the following:
HTTP/1.1 200 OK
...
content-length: 46
content-type: text/html
via: 1.1 google

Page served from: l7-ilb-backend-example-1c7t
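The two verification steps above compare curl -D - output by eye. If you want to automate that comparison, for example in a smoke test, the following Python sketch parses captured output and checks for the hello: service-extensions header. The function names and the sample strings are assumptions for this illustration. The parser is deliberately simple and keeps only the last value of a repeated header.

```python
def parse_response_headers(raw: str) -> dict[str, str]:
    """Parse the header block of `curl -D -` output into a lowercase-keyed dict."""
    headers = {}
    lines = raw.replace("\r\n", "\n").split("\n")
    for line in lines[1:]:  # skip the status line
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def has_extension_header(raw: str) -> bool:
    """Return True if the traffic extension's response header is present."""
    return parse_response_headers(raw).get("hello") == "service-extensions"


WITH_EXTENSION = (
    "HTTP/1.1 200 OK\r\ncontent-length: 46\r\ncontent-type: text/html\r\n"
    "hello: service-extensions\r\nvia: 1.1 google\r\n\r\n"
    "Page served from: l7-ilb-backend-example-1c7t\n"
)
WITHOUT_EXTENSION = (
    "HTTP/1.1 200 OK\r\ncontent-length: 46\r\ncontent-type: text/html\r\n"
    "via: 1.1 google\r\n\r\n"
    "Page served from: l7-ilb-backend-example-1c7t\n"
)

if __name__ == "__main__":
    print(has_extension_header(WITH_EXTENSION))
    print(has_extension_header(WITHOUT_EXTENSION))
```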
Limitations for callout extensions
- A forwarding rule can have only one LbTrafficExtension resource and one LbRouteExtension resource.
- Cross-project referencing between callout extensions and the forwarding rule is not supported.
- The backend services used for callout extensions must be in the same project as the forwarding rule.
- The extension backend service cannot use Google Cloud Armor, IAP, or Cloud CDN policies.
- The extension backend service must use HTTP/2 as the protocol.
- Route extensions cannot override the processing mode of the ext_proc stream.
- You cannot change some headers. For a complete list, see Limitations with header manipulation.
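The same-project limitations can be checked before an import by comparing the project segment of each resource URL in the extension definition. The following Python sketch shows one way to do that. The function names are hypothetical, and project-a and project-b are made-up project IDs. The sketch covers only the project check, not the other limitations in the list.

```python
import re

_PROJECT_RE = re.compile(r"/projects/([^/]+)/")


def project_of(resource_url: str) -> str:
    """Extract the project ID from a Compute Engine resource URL."""
    match = _PROJECT_RE.search(resource_url)
    if match is None:
        raise ValueError(f"no project segment in {resource_url!r}")
    return match.group(1)


def uses_single_project(forwarding_rules: list[str], services: list[str]) -> bool:
    """Return True if every referenced resource lives in the same project."""
    projects = {project_of(url) for url in [*forwarding_rules, *services]}
    return len(projects) == 1


if __name__ == "__main__":
    base = "https://www.googleapis.com/compute/v1/projects"
    rule = f"{base}/project-a/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule"
    same = f"{base}/project-a/regions/us-west1/backendServices/l7-ilb-callout-service"
    other = f"{base}/project-b/regions/us-west1/backendServices/l7-ilb-callout-service"
    print(uses_single_project([rule], [same]))
    print(uses_single_project([rule], [other]))
```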
Recommended optimizations for callout extensions
Integrating an extension into the load balancing processing path incurs additional latency for requests and responses. Each type of data that the extension service processes, including request headers, request body, response headers, and response body, adds latency.
Consider the following optimizations to minimize the latency:
- Configure the extension to process only the data that you need. For example, to modify only request headers, set the supportedEvents field in the extension to REQUEST_HEADERS.
- Deploy callouts in the same zones as the regular destination backend service for the load balancer. When using a cross-region internal Application Load Balancer, place the extension service backends in the same region as the load balancer's proxy-only subnets.
- When using a global external Application Load Balancer, place the extension service backends in the geographic regions where the regular load balancer's destination VMs, Google Kubernetes Engine (GKE) workloads, and Cloud Run functions are located.
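As an illustration of the first optimization, the following Python sketch maps the kinds of data an extension needs to the smallest list of events to subscribe to. REQUEST_HEADERS and RESPONSE_HEADERS appear on this page; the two body event names are assumed here by analogy, and the mapping table and function name are hypothetical.

```python
# Data type processed by the extension -> event name to list in the extension.
# The *_BODY names are assumptions made by analogy with the header events.
EVENT_FOR_DATA = {
    "request headers": "REQUEST_HEADERS",
    "request body": "REQUEST_BODY",
    "response headers": "RESPONSE_HEADERS",
    "response body": "RESPONSE_BODY",
}


def minimal_supported_events(needed: list[str]) -> list[str]:
    """Return the deduplicated event list for the data the extension needs."""
    events = []
    for item in needed:
        key = item.strip().lower()
        if key not in EVENT_FOR_DATA:
            raise ValueError(f"unknown data type: {item!r}")
        event = EVENT_FOR_DATA[key]
        if event not in events:  # keep first-seen order, drop repeats
            events.append(event)
    return events


if __name__ == "__main__":
    # An extension that only adds a response header needs a single event.
    print(minimal_supported_events(["response headers"]))
```

Each event left out of the list is one less exchange with the extension service on the request path, which is where the latency saving comes from.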