Service Extensions enables Application Load Balancers to send callouts to extension backend services so that you can insert custom processing into the processing path. This page describes how to configure callout extensions.
For an overview of Application Load Balancer callout extensions, see Cloud Load Balancing extensions overview.
Before you begin
Ensure that you have either a project owner or editor role or the following Compute Engine IAM roles:
- To create instances: compute.instanceAdmin.v1
- To create Cloud Load Balancing components: compute.networkAdmin
Enable these APIs: Compute Engine API and Network Services API.
Console
In the Google Cloud console, go to the Enable access to APIs page.
Follow the instructions.
gcloud
Use the gcloud services enable command:
gcloud services enable compute.googleapis.com networkservices.googleapis.com
Create and configure an Application Load Balancer that supports callouts. For this example, set up a regional internal Application Load Balancer with VM instance group backends.
For route extensions only: set up an additional backend service, and update the URL map with a host matcher that routes all traffic whose HTTP host matches the specified condition to this backend service.
Console
In the Google Cloud console, go to the Create an instance page.
Specify the following sample values:
- Name: l7-ilb-backend2-vm
- Tags: allow-ssh and load-balanced-backend
- Zone: us-west1-a
- Network: lb-network
- Subnetwork: backend-subnet
- Image: debian-11
- Family: debian-cloud
Advanced options > Management > Automation:
#! /bin/bash
apt-get update
apt-get install apache2 -y
a2ensite default-ssl
a2enmod ssl
echo "Page served from second backend service" | tee /var/www/html/index.html
systemctl restart apache2
Create an unmanaged instance group.
Specify the following sample values:
- Name: l7-ilb-backend-service2-ig
- Zone: us-west1-a
Add the new VM to the instance group.
For VM instances, specify l7-ilb-backend2-vm.
In the Google Cloud console, go to the Load balancing page.
Update the load balancer by creating a backend service and adding a backend to it.
For the backend service, specify the following sample values:
- Name: l7-ilb-backend-service2
- Internet facing or internal only: Only between my VMs
- Protocol: HTTP
- Region: us-west1
- Health check > Name: l7-ilb-basic-check
- Health check > Region: us-west1
For the backend, specify the following sample values:
- Instance group: l7-ilb-backend-service2-ig
- Balancing mode: Utilization
Add a host matcher to the URL map of the backend service.
Specify the following sample values:
- Name: l7-ilb-map
- Host: service-extensions.com
- Path: callouts
- Protocol: HTTP
- Backend: l7-ilb-backend-service2
gcloud
Create a VM instance. Use the gcloud compute instances create command with the following sample values:
gcloud compute instances create l7-ilb-backend2-vm \
    --zone=us-west1-a \
    --network=lb-network \
    --subnet=backend-subnet \
    --tags=allow-ssh,load-balanced-backend \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --metadata=startup-script='#! /bin/bash
apt-get update
apt-get install apache2 -y
a2ensite default-ssl
a2enmod ssl
echo "Page served from second backend service" | tee /var/www/html/index.html
systemctl restart apache2'
Create an unmanaged instance group. Use the gcloud compute instance-groups unmanaged create command with the following sample values:
gcloud compute instance-groups unmanaged create l7-ilb-backend-service2-ig \
    --zone us-west1-a
Add the new VM to the instance group. Use the gcloud compute instance-groups unmanaged add-instances command with the following sample values:
gcloud compute instance-groups unmanaged add-instances l7-ilb-backend-service2-ig \
    --zone=us-west1-a \
    --instances=l7-ilb-backend2-vm
Create a backend service. Use the gcloud compute backend-services create command with the following sample values:
gcloud compute backend-services create l7-ilb-backend-service2 \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --protocol=HTTP \
    --health-checks=l7-ilb-basic-check \
    --health-checks-region=us-west1 \
    --region=us-west1
Add a backend to the backend service. Use the gcloud compute backend-services add-backend command with the following sample values:
gcloud compute backend-services add-backend l7-ilb-backend-service2 \
    --balancing-mode=UTILIZATION \
    --instance-group=l7-ilb-backend-service2-ig \
    --instance-group-zone=us-west1-a \
    --region=us-west1
Add a host matcher to the URL map of the backend service. Use the gcloud compute url-maps add-path-matcher command with the following sample values:
gcloud compute url-maps add-path-matcher l7-ilb-map \
    --path-matcher-name=callouts \
    --default-service=l7-ilb-backend-service2 \
    --new-hosts=service-extensions.com \
    --region=us-west1
Set up an extension backend service
This example uses a basic Python-based extension server that implements Envoy's Ext Proc gRPC API. A Docker container with this server is available at us-docker.pkg.dev/service-extensions/ext-proc/service-callout-basic-example-python:latest. The server comes from the Service Extensions GitHub repository of Google Cloud, which also contains several other Python-based examples of ext-proc servers that perform tasks such as header mutation and body mutation.
To create and set up an extension backend service, follow these steps:
Create a virtual machine (VM) instance for the extension backend service that's running the sample Python extension server.
Console
Create an instance by using a container image.
In the Google Cloud console, go to the Create an instance page.
Specify the following sample values:
- Name: callouts-vm
- Zone: us-west1-a
- Network: lb-network
- Subnetwork: backend-subnet
- Tags: allow-ssh and load-balanced-backend
- Container image: us-docker.pkg.dev/service-extensions/ext-proc/service-callout-basic-example-python:latest
gcloud
Create an instance by using a container image. Use the gcloud compute instances create-with-container command with the following sample values:
gcloud compute instances create-with-container callouts-vm \
    --container-image=us-docker.pkg.dev/service-extensions/ext-proc/service-callout-basic-example-python:latest \
    --network=lb-network \
    --subnet=backend-subnet \
    --zone=us-west1-a \
    --tags=allow-ssh,load-balanced-backend
Add the VM to an unmanaged instance group.
Console
Create an unmanaged instance group.
In the Google Cloud console, go to the Instance groups page.
Specify the following sample values:
- Name: callouts-ig
- Zone: us-west1-a
Set a port for the instance group.
For Port mapping, specify these port names and values: http:80 and grpc:443.
Add the new VM to the instance group.
For VM instances, specify callouts-vm.
gcloud
Create an unmanaged instance group. Use the gcloud compute instance-groups unmanaged create command with the following sample values:
gcloud compute instance-groups unmanaged create callouts-ig \
    --zone=us-west1-a
Set a port for the instance group. Use the gcloud compute instance-groups unmanaged set-named-ports command with the following sample values:
gcloud compute instance-groups unmanaged set-named-ports callouts-ig \
    --named-ports=http:80,grpc:443 \
    --zone=us-west1-a
Add the new VM instance to the unmanaged instance group. Use the gcloud compute instance-groups unmanaged add-instances command with the following sample values:
gcloud compute instance-groups unmanaged add-instances callouts-ig \
    --zone=us-west1-a \
    --instances=callouts-vm
Update the load balancer by creating a backend service and adding a backend.
Console
Create an extension backend service that uses the HTTP/2 protocol and has an HTTP health check.
In the Google Cloud console, go to the Load balancing page.
Add a backend service with the following sample values:
- Name: l7-ilb-callout-service
- Internet facing or internal only: Only between my VMs
- Protocol: HTTP2
- Port name: grpc
- Region: us-west1
- Health check > Name: callouts-hc
- Health check > Port number: 80
Add the instance group with the extension server as a backend to the backend service. The instance group runs the ext proc service.
Specify the following sample values:
- Instance group: callouts-ig
- Balancing mode: Utilization
gcloud
Create a basic HTTP health check for the instance. Use the gcloud compute health-checks create http command with the following sample values:
gcloud compute health-checks create http callouts-hc \
    --region=us-west1 \
    --port=80
Create an extension backend service that uses the HTTP/2 protocol. Use the gcloud compute backend-services create command:
gcloud compute backend-services create l7-ilb-callout-service \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --protocol=HTTP2 \
    --port-name=grpc \
    --health-checks=callouts-hc \
    --health-checks-region=us-west1 \
    --region=us-west1
Add the instance group with the extension server as a backend to the backend service. The instance group runs the ext proc service. Use the gcloud compute backend-services add-backend command with the following sample values:
gcloud compute backend-services add-backend l7-ilb-callout-service \
    --balancing-mode=UTILIZATION \
    --instance-group=callouts-ig \
    --instance-group-zone=us-west1-a \
    --region=us-west1
Configure a callout extension
To configure an extension for an existing Application Load Balancer, create the following resources:
- An extension backend service resource whose backends run the ext_proc gRPC API
- An LbTrafficExtension or LbRouteExtension resource that points to both of these resources:
  - A forwarding rule to attach to
  - An extension backend service to call
The LbTrafficExtension and LbRouteExtension resources group related extension services into one or more chains. Each extension chain selects the traffic to act on by using Common Expression Language (CEL) match conditions. The load balancer evaluates a request against each chain's match condition sequentially. When a request matches the conditions defined by a chain, all extensions in the chain act on the request. Only one chain matches a given request.
For information about the limits related to callout extensions, see the Quotas and limits page.
The extension resource references the load balancer forwarding rule to attach to. After you configure the resource, the load balancer starts sending matching requests to extension services.
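The chain selection described in this section can be modeled in a few lines of Python. This is a minimal sketch, not the load balancer's implementation: the Chain class, the select_chain function, the second chain, and the use of plain Python predicates in place of CEL expressions are all assumptions made for illustration. It also assumes that the first matching chain wins, which is how the sequential, single-match description reads.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A request is modeled as a plain dict with "host" and "path" keys (assumption).
Request = Dict[str, str]


@dataclass
class Chain:
    """Hypothetical stand-in for one extension chain."""
    name: str
    # Python predicate standing in for the chain's CEL match condition.
    match: Callable[[Request], bool]
    extensions: List[str] = field(default_factory=list)


def select_chain(chains: List[Chain], request: Request) -> Optional[Chain]:
    """Evaluate chains in order and return the first one that matches."""
    for chain in chains:
        if chain.match(request):
            return chain  # at most one chain handles a given request
    return None


# Two illustrative chains; the predicates mirror the CEL expressions
# request.path.startsWith("/extensions") and request.host == "example.com".
chains = [
    Chain("chain1", lambda r: r["path"].startswith("/extensions"), ["ext11"]),
    Chain("chain2", lambda r: r["host"] == "example.com", ["ext21", "ext22"]),
]

matched = select_chain(chains, {"host": "example.com", "path": "/extensions/a"})
print(matched.name if matched else "no chain matched")
```

In this model, a request that satisfies both predicates is handled only by chain1, because evaluation stops at the first match.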
Configure a route extension
The following example shows how to configure a route extension that is called when the path matches /extensions. The route extension server in the callouts-vm changes the Host header to service-extensions.com, sets the path to /, and then instructs the load balancer to recompute the route. The traffic then flows to l7-ilb-backend-service2 instead of l7-ilb-backend-service.
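The effect of this route extension can be illustrated with a toy model. The following sketch is an assumption-laden simplification: the HOST_RULES table, the function names, and the sample IP address 10.1.2.99 are hypothetical, and the real extension server communicates with Envoy over gRPC rather than through a direct function call. The point it shows is that the backend is chosen after the extension has rewritten the Host header.

```python
from typing import Dict, Tuple

# Toy URL map (assumption): one host rule plus a default backend service.
HOST_RULES: Dict[str, str] = {"service-extensions.com": "l7-ilb-backend-service2"}
DEFAULT_SERVICE = "l7-ilb-backend-service"


def select_backend(host: str) -> str:
    """Pick a backend service from the Host header, as a host matcher would."""
    return HOST_RULES.get(host, DEFAULT_SERVICE)


def route_extension(host: str, path: str) -> Tuple[str, str]:
    """Mimic the sample extension: rewrite Host and path for /extensions."""
    if path.startswith("/extensions"):
        return "service-extensions.com", "/"
    return host, path


def handle_request(host: str, path: str) -> Tuple[str, str]:
    """Run the extension first, then compute the route from the result."""
    host, path = route_extension(host, path)
    return select_backend(host), path


print(handle_request("10.1.2.99", "/extensions"))
print(handle_request("10.1.2.99", "/"))
```

Requests outside the /extensions prefix keep their original Host header in this model, so they continue to reach the default backend service.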
Check if there's a match for /extensions in the URL map.
Establish an SSH connection to the client VM.
Console
- In the Google Cloud console, go to the VM instances page.
- In the list of virtual machine instances, click SSH in the row of the instance that you want to connect to.
gcloud
Use the gcloud compute ssh command.
gcloud compute ssh CLIENT_VM \
    --zone=ZONE
Replace CLIENT_VM with the name of the client VM and ZONE with the zone of the VM.
Run the following curl command against the forwarding rule from the client VM:
curl FORWARDING_RULE_IP/extensions
Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.
The output should be similar to the following, which indicates that there is no match for /extensions in the URL map.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
...
Close the SSH connection.
Configure the route extension.
Console
In the Google Cloud console, go to the Service Extensions page.
Click Create extension.
A wizard opens to guide you through some initial steps.
For the product, select Load Balancing. Then, click Continue.
A list of Application Load Balancers that support callouts appears.
Select a load balancer type. For regional load balancers, also specify the region. Click Continue.
For the service extension type, select Route extensions, click Continue, and then click Done.
The Create service extension form opens. Notice that the preceding selections, which appear at the top of the page, are not editable.
In the Basics section, do the following:
Specify a unique name for the service extension.
The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens and must not end with a hyphen.
Optional: Enter a brief description of the extension by using a maximum of 1,024 characters.
Optional: In the Labels section, click Add label. Then, in the row that appears, do the following:
- For Key, enter a key name.
- For Value, enter a value for the key.
To add more key-value pairs, with the maximum limit being 64, click Add label.
For more information about labels, see Create and update labels for projects.
For Forwarding rules, select one or more forwarding rules to associate with the extension.
Forwarding rules that are already associated with another extension cannot be selected and appear disabled.
For Extension chains, add one or more extension chains to execute for a matching request.
To add an extension chain, click Add an extension chain, do the following, and then click Done:
For Extension chain name, specify a unique name.
The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.
For Match condition, specify a Common Expression Language (CEL) expression to match requests for which the extension chain is executed.
For more information, click Get syntax help or see CEL matcher language reference.
Add an extension to execute for a matching request. For route extensions, you can specify only one extension.
Under Extensions, do the following, and then click Done:
For Extension name, specify a unique name.
The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.
For Authority, enter the authority header from the gRPC request sent from Envoy to the extension service.
For Backend service, select a backend service created by following the instructions in Set up an extension backend service.
For Timeout, specify a value between 10 and 1000 milliseconds, after which a message on the stream times out while Envoy is still waiting for a response from the ext_proc service.
For Forward headers, click Add header, and then add HTTP headers to forward to the extension (from the client or the backend). If a header is not specified, all headers are sent.
Optional: For Fail open, if you want the extension to fail open, select Enabled. In this case, if the call to the extension fails or times out, request or response processing continues without error. Any subsequent extensions in the extension chain are also executed.
This field is unselected by default. In this case, if response headers have not been delivered to the downstream client, a generic 500 error is returned to the client. If response headers have been delivered, the HTTP stream to the downstream client is reset.
Click Create extension.
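The naming rule for the service extension given in the Basics step can be checked before you open the form. The following is a minimal sketch: the function name is hypothetical, and the regular expression is one reading of the stated rule (a lowercase letter, followed by up to 62 lowercase letters, numbers, or hyphens, not ending in a hyphen), not a validator taken from the product.

```python
import re

# First character: lowercase letter. Then up to 62 more characters from
# [a-z0-9-], where the last character must not be a hyphen.
_NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_extension_name(name: str) -> bool:
    """Return True if the name satisfies the rule as written on this page."""
    return _NAME_RE.fullmatch(name) is not None


for candidate in ["route-ext", "Route-ext", "route-", "1route"]:
    print(candidate, is_valid_extension_name(candidate))
```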
gcloud
Define the extension in a YAML file and associate it with the forwarding rule. Use the sample values provided.
cat >route.yaml <<EOF
name: route-ext
forwardingRules:
- https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
loadBalancingScheme: INTERNAL_MANAGED
extensionChains:
- name: "chain1"
  matchCondition:
    celExpression: 'request.path.startsWith("/extensions")'
  extensions:
  - name: 'ext11'
    authority: ext11.com
    service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
    failOpen: false
    timeout: 0.1s
EOF
Replace PROJECT_ID with the project ID.
Import the route extension. Use the gcloud service-extensions lb-route-extensions import command with the following sample values.
gcloud service-extensions lb-route-extensions import route-ext \
    --source=route.yaml \
    --location=us-west1
Verify that the route extension works as expected. Establish an SSH connection to the client VM and use the same curl command:
curl FORWARDING_RULE_IP/extensions
The output should be similar to the following, which indicates that the traffic matched the virtual host service-extensions.com and reached the l7-ilb-backend-service2 service, even though the original request did not specify that host.
Page served from second backend service
To validate that the extension targets only requests with the /extensions prefix, repeat the curl command without the path prefix.
curl FORWARDING_RULE_IP
You should see an output like the following:
Page served from: l7-ilb-backend-example-1c7t
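If you prefer to generate the route.yaml definition from the gcloud steps instead of editing it by hand, a small script can substitute the project ID and check the timeout. The following is a sketch under stated assumptions: the function names are hypothetical, the resource names are the sample values from this page, and the 10 to 1000 millisecond range is taken from the Console instructions and assumed to apply to the YAML timeout field as well.

```python
ROUTE_TEMPLATE = """\
name: route-ext
forwardingRules:
- https://www.googleapis.com/compute/v1/projects/{project_id}/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
loadBalancingScheme: INTERNAL_MANAGED
extensionChains:
- name: "chain1"
  matchCondition:
    celExpression: 'request.path.startsWith("/extensions")'
  extensions:
  - name: 'ext11'
    authority: ext11.com
    service: https://www.googleapis.com/compute/v1/projects/{project_id}/regions/us-west1/backendServices/l7-ilb-callout-service
    failOpen: false
    timeout: {timeout}
"""


def format_timeout(timeout_ms: int) -> str:
    """Convert milliseconds to a duration string such as '0.1s'."""
    if not 10 <= timeout_ms <= 1000:
        raise ValueError("timeout must be between 10 and 1000 milliseconds")
    return f"{timeout_ms / 1000:g}s"


def render_route_extension(project_id: str, timeout_ms: int = 100) -> str:
    """Fill the sample route extension definition with a project ID."""
    return ROUTE_TEMPLATE.format(
        project_id=project_id, timeout=format_timeout(timeout_ms)
    )


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(render_route_extension("my-project"))
```

You can redirect the printed text to route.yaml and then import it with the gcloud command shown in the gcloud steps.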
Configure a traffic extension
The following example shows how to configure a traffic extension that is called when the host matches example.com. The traffic extension server in the callouts-vm adds the response header hello: service-extensions to responses for matching requests.
Check if there's a match for example.com in the URL map.
Establish an SSH connection to the client VM.
Console
- In the Google Cloud console, go to the VM instances page.
- In the list of virtual machine instances, click SSH in the row of the instance that you want to connect to.
gcloud
Use the gcloud compute ssh command.
gcloud compute ssh CLIENT_VM \
    --zone=ZONE
Replace CLIENT_VM with the name of the client VM and ZONE with the zone of the VM.
Run the following curl command against the forwarding rule from the client VM:
curl -D - -H "host: example.com" FORWARDING_RULE_IP
Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.
The output should be similar to the following.
HTTP/1.1 200 OK
...
content-length: 46
content-type: text/html
via: 1.1 google
Page served from: l7-ilb-backend-example-1c7t
Close the SSH connection.
Configure the traffic extension.
Console
In the Google Cloud console, go to the Service Extensions page.
Click Create extension.
A wizard opens to guide you through some initial steps.
For the product, select Load Balancing. Then, click Continue.
A list of Application Load Balancers that support callouts appears.
Select a load balancer type. For regional load balancers, also specify the region. Click Continue.
For the service extension type, select Traffic extensions, click Continue, and then click Done.
The Create service extension form opens. Notice that the preceding selections, which appear at the top of the page, are not editable.
In the Basics section, do the following:
Specify a unique name for the service extension.
The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens and must not end with a hyphen.
Optional: Enter a brief description of the extension by using up to 1,024 characters.
Optional: In the Labels section, click Add label. Then, in the row that appears, do the following:
- For Key, enter a key name.
- For Value, enter a value for the key.
To add more key-value pairs, with the maximum limit being 64, click Add label.
For more information about labels, see Create and update labels for projects.
For Forwarding rules, select one or more forwarding rules to associate with the extension.
Forwarding rules that are already associated with another extension cannot be selected and appear disabled.
For Extension chains, add one or more extension chains to execute for a matching request.
To add an extension chain, click Add an extension chain, do the following, and then click Done:
For Extension chain name, specify a unique name.
The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.
To match requests for which the extension chain is executed, for Match condition, specify a Common Expression Language (CEL) expression.
For more information, click Get syntax help or see CEL matcher language reference.
Add one or more extensions to execute for a matching request.
For each extension, under Extensions, do the following, and then click Done:
For Extension name, specify a unique name.
The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.
For Authority, enter the authority header from the gRPC request sent from Envoy to the extension service.
For Backend service, select a backend service created by following the instructions in Set up an extension backend service.
For Timeout, specify a value between 10 and 1000 milliseconds, after which a message on the stream times out.
For Events, select one or more HTTP event types that call the extension.
For Forward headers, click Add header, and then add HTTP headers to forward to the extension (from the client or the backend). If a header is not specified, all headers are sent.
Optional: For Fail open, if you want the extension to fail open, select Enabled. In this case, if the call to the extension fails or times out, request or response processing continues without error. Any subsequent extensions in the extension chain are also executed.
This field is unselected by default. In this case, if response headers have not been delivered to the downstream client, a generic 500 error is returned to the client. If response headers have been delivered, the HTTP stream to the downstream client is reset.
Click Create extension.
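The Fail open behavior described in the preceding steps reduces to a small decision table. The following sketch only restates that description in code; the Outcome enum and the on_callout_failure function are hypothetical names, and the model does not reflect how Envoy implements the behavior internally.

```python
from enum import Enum


class Outcome(Enum):
    CONTINUE = "continue processing, run remaining extensions in the chain"
    HTTP_500 = "return a generic 500 error to the client"
    RESET_STREAM = "reset the HTTP stream to the downstream client"


def on_callout_failure(fail_open: bool, response_headers_sent: bool) -> Outcome:
    """Describe what happens when an extension call fails or times out."""
    if fail_open:
        return Outcome.CONTINUE
    if not response_headers_sent:
        return Outcome.HTTP_500
    return Outcome.RESET_STREAM


for fail_open in (True, False):
    for sent in (False, True):
        print(fail_open, sent, on_callout_failure(fail_open, sent).name)
```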
gcloud
Define the extension in a YAML file and associate it with the forwarding rule. Use the sample values provided.
cat >traffic.yaml <<EOF
name: traffic-ext
forwardingRules:
- https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
loadBalancingScheme: INTERNAL_MANAGED
extensionChains:
- name: "chain1"
  matchCondition:
    celExpression: 'request.host == "example.com"'
  extensions:
  - name: 'ext11'
    authority: ext11.com
    service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
    failOpen: false
    timeout: 0.1s
    supportedEvents:
    - RESPONSE_HEADERS
EOF
Replace PROJECT_ID with the project ID.
Import the traffic extension. Use the gcloud service-extensions lb-traffic-extensions import command with the following sample values.
gcloud service-extensions lb-traffic-extensions import traffic-ext \
    --source=traffic.yaml \
    --location=us-west1
Verify that the traffic extension works as expected. Establish an SSH connection to the client VM and use the same curl command:
curl -D - -H "host: example.com" FORWARDING_RULE_IP
The output should include the hello: service-extensions response header.
HTTP/1.1 200 OK
...
content-length: 46
content-type: text/
hello: service-extensions
via: 1.1 google
Page served from: l7-ilb-backend-example-1c7t
To validate that the extension targets only example.com traffic, repeat the curl command without the host header.
curl -D - FORWARDING_RULE_IP
The output is similar to the following:
HTTP/1.1 200 OK
...
content-length: 46
content-type: text/html
via: 1.1 google
Page served from: l7-ilb-backend-example-1c7t
Limitations for callout extensions
- A forwarding rule can have only one LbTrafficExtension resource and one LbRouteExtension resource.
- Cross-project referencing between callout extensions and the forwarding rule is not supported.
- The backend services used for callout extensions must be in the same project as the forwarding rule.
- The extension backend service cannot use Google Cloud Armor, IAP, or Cloud CDN policies.
- The extension backend service must use HTTP/2 as the protocol.
- Route extensions cannot override the processing mode of the ext_proc stream.
- You cannot change some headers. For a complete list, see Limitations with header manipulation.
- An immediate response is supported only for traffic extensions and not for route extensions. If your route extension server responds to a processing request with a processing response that contains an immediate response, Envoy ignores the processing response.
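The same-project limitation in the preceding list can be checked locally before you import an extension definition. The following is a rough sketch that covers only that one limitation: the function names are hypothetical, and it assumes that resources are referenced by full URLs of the form used in the sample YAML files on this page.

```python
import re
from typing import List, Optional

# Extract the project ID from a resource URL such as
# https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/...
_PROJECT_RE = re.compile(r"/projects/([^/]+)/")


def project_of(resource_url: str) -> Optional[str]:
    """Return the project ID embedded in a resource URL, if any."""
    match = _PROJECT_RE.search(resource_url)
    return match.group(1) if match else None


def cross_project_errors(forwarding_rules: List[str], service: str) -> List[str]:
    """List forwarding rules whose project differs from the backend service's."""
    service_project = project_of(service)
    errors = []
    for rule in forwarding_rules:
        if project_of(rule) != service_project:
            errors.append(f"{rule} is not in project {service_project}")
    return errors


base = "https://www.googleapis.com/compute/v1/projects"
rule = f"{base}/project-a/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule"
service = f"{base}/project-b/regions/us-west1/backendServices/l7-ilb-callout-service"
print(cross_project_errors([rule], service))
```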
Recommended optimizations for callout extensions
Integrating an extension into the load balancing processing path incurs additional latency for requests and responses. Each type of data that the extension service processes, including request headers, request body, response headers, and response body, adds latency.
Consider the following optimizations to minimize the latency:
- Configure the extension to process only the data that you need. For example, to modify only request headers, set the supported_events field in the extension to REQUEST_HEADERS.
- Deploy callouts in the same zones as the regular destination backend service for the load balancer. When using a cross-region internal Application Load Balancer, place the extension service backends in the same region as the load balancer's proxy-only subnets.
- When using a global external Application Load Balancer, place the extension service backends in the geographic regions where the regular load balancer's destination VMs, Google Kubernetes Engine (GKE) workloads, and Cloud Run functions are located.
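The first optimization, processing only the data that you need, can be made explicit when you assemble an extension definition. The following sketch is illustrative only: the function names are hypothetical, REQUEST_HEADERS and RESPONSE_HEADERS are the event names that appear on this page, and the two body event names are assumed by analogy and should be checked against the API reference.

```python
from typing import List


def minimal_supported_events(
    request_headers: bool = False,
    request_body: bool = False,
    response_headers: bool = False,
    response_body: bool = False,
) -> List[str]:
    """Return only the event types that the extension actually needs."""
    wanted = [
        ("REQUEST_HEADERS", request_headers),
        ("REQUEST_BODY", request_body),      # assumed name
        ("RESPONSE_HEADERS", response_headers),
        ("RESPONSE_BODY", response_body),    # assumed name
    ]
    return [name for name, needed in wanted if needed]


def to_yaml_fragment(events: List[str], indent: str = "    ") -> str:
    """Render the events as a supportedEvents block for a YAML definition."""
    if not events:
        raise ValueError("select at least one event type")
    lines = [f"{indent}supportedEvents:"]
    lines += [f"{indent}- {event}" for event in events]
    return "\n".join(lines)


# An extension that only rewrites request headers subscribes to one event.
print(to_yaml_fragment(minimal_supported_events(request_headers=True)))
```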
What's next
- View Python-based examples of ext-proc servers in the Service Extensions GitHub repository.
- Manage callout extensions