Configure a callout extension

Service Extensions lets Application Load Balancers send callouts to extension backend services, which injects custom processing into the load balancing path. This page describes how to configure callout extensions.

For an overview of Application Load Balancer callout extensions, see Cloud Load Balancing extensions overview.

Before you begin

  1. Ensure that you have either a project owner or editor role or the following Compute Engine IAM roles:

  2. Enable these APIs: Compute Engine API and Network Services API.

    gcloud services enable compute.googleapis.com networkservices.googleapis.com
    
  3. Create and configure an Application Load Balancer that supports callouts. For this example, set up a regional internal Application Load Balancer with VM instance group backends.

  4. Create a client VM for testing.

  5. For route extensions only: set up an additional backend service, and update the URL map with a host matcher that routes all traffic whose HTTP host is service-extensions.com to this backend service.

    Run the following commands:

    gcloud compute instances create l7-ilb-backend2-vm \
        --zone=us-west1-a \
        --network=lb-network \
        --subnet=backend-subnet \
        --tags=allow-ssh,load-balanced-backend \
        --image-family=debian-11 \
        --image-project=debian-cloud \
        --metadata=startup-script='#! /bin/bash
            apt-get update
            apt-get install apache2 -y
            a2ensite default-ssl
            a2enmod ssl
            echo "Page served from second backend service" | tee /var/www/html/index.html
            systemctl restart apache2'
    
    gcloud compute instance-groups unmanaged create l7-ilb-backend-service2-ig \
        --zone=us-west1-a
    
    gcloud compute instance-groups unmanaged add-instances l7-ilb-backend-service2-ig \
        --zone=us-west1-a \
        --instances=l7-ilb-backend2-vm
    
    gcloud compute backend-services create l7-ilb-backend-service2 \
        --load-balancing-scheme=INTERNAL_MANAGED \
        --protocol=HTTP \
        --health-checks=l7-ilb-basic-check \
        --health-checks-region=us-west1 \
        --region=us-west1
    
    gcloud compute backend-services add-backend l7-ilb-backend-service2 \
        --balancing-mode=UTILIZATION \
        --instance-group=l7-ilb-backend-service2-ig \
        --instance-group-zone=us-west1-a \
        --region=us-west1
    
    gcloud compute url-maps add-path-matcher l7-ilb-map \
        --path-matcher-name=callouts \
        --default-service=l7-ilb-backend-service2 \
        --new-hosts=service-extensions.com \
        --region=us-west1
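
Before continuing, it can help to confirm that both APIs from step 2 are enabled. The following is a minimal sketch, not part of the official procedure: missing_apis is a hypothetical helper name, and it assumes that the list of enabled service names arrives on standard input, one name per line.

```shell
# Print each required API that is absent from the list of enabled APIs
# read on standard input (one service name per line).
# missing_apis is a hypothetical helper, not a gcloud command.
missing_apis() {
    enabled=$(cat)
    for api in compute.googleapis.com networkservices.googleapis.com; do
        # -x matches the whole line, -F treats the name as a literal string.
        printf '%s\n' "$enabled" | grep -qxF "$api" || printf '%s\n' "$api"
    done
}

# Intended use (requires an authenticated gcloud CLI):
#   gcloud services list --enabled --format="value(config.name)" | missing_apis
```

If the function prints nothing, both APIs are already enabled; otherwise it prints the names that still need gcloud services enable.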
    

Set up an extension backend service

For this example, a basic Python-based extension server that implements Envoy's ext_proc gRPC API is available. The source code for this server is in the Python Samples GitHub repository of Google Cloud. A Docker container image with this server is at us-docker.pkg.dev/service-extensions/ext-proc/ext-proc-sample-python.

To create and set up an extension backend service, follow these steps:

  1. Create a virtual machine (VM) instance for the extension backend service that's running the sample Python extension server. Use the gcloud compute instances create-with-container command.

    gcloud compute instances create-with-container callouts-vm \
        --container-image=us-docker.pkg.dev/service-extensions/ext-proc/ext-proc-sample-python:latest \
        --network=lb-network \
        --subnet=backend-subnet \
        --zone=us-west1-a \
        --tags=allow-ssh,load-balanced-backend
    
  2. Create an unmanaged instance group. Use the gcloud compute instance-groups unmanaged create command. For example:

    gcloud compute instance-groups unmanaged create callouts-ig \
        --zone=us-west1-a
    
  3. Set the named ports for the unmanaged instance group. Use the gcloud compute instance-groups unmanaged set-named-ports command.

    gcloud compute instance-groups unmanaged set-named-ports callouts-ig \
        --named-ports=http:80,grpc:443 \
        --zone=us-west1-a
    
  4. Add the VM instance to the unmanaged instance group. Use the gcloud compute instance-groups unmanaged add-instances command. For example:

    gcloud compute instance-groups unmanaged add-instances callouts-ig \
        --zone=us-west1-a \
        --instances=callouts-vm
    
  5. Create an HTTP health check for the instance. Use the gcloud compute health-checks create http command. For example:

    gcloud compute health-checks create http callouts-hc \
        --region=us-west1 \
        --port=80
    
  6. Create an extension backend service that uses the HTTP/2 protocol. Use the gcloud compute backend-services create command.

    gcloud compute backend-services create l7-ilb-callout-service \
        --load-balancing-scheme=INTERNAL_MANAGED \
        --protocol=HTTP2 \
        --port-name=grpc \
        --health-checks=callouts-hc \
        --health-checks-region=us-west1 \
        --region=us-west1
    
  7. Add the instance group with the extension server as a backend to the backend service. Use the gcloud compute backend-services add-backend command. The following example adds an instance group running the ext_proc service:

    gcloud compute backend-services add-backend l7-ilb-callout-service \
        --balancing-mode=UTILIZATION \
        --instance-group=callouts-ig \
        --instance-group-zone=us-west1-a \
        --region=us-west1
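
After the steps above, the extension backend service should report its instance as healthy before any extension is attached. The following sketch counts healthy backends in the output of gcloud compute backend-services get-health. It assumes that the output contains one "healthState: HEALTHY" line per healthy instance; count_healthy is a hypothetical helper name.

```shell
# Count backends reported as HEALTHY in get-health output read on stdin.
# Assumption: each healthy instance appears as a "healthState: HEALTHY" line.
# "healthState: UNHEALTHY" does not match this pattern.
count_healthy() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status at 0.
    grep -c 'healthState: HEALTHY' || true
}

# Intended use (requires an authenticated gcloud CLI):
#   gcloud compute backend-services get-health l7-ilb-callout-service \
#       --region=us-west1 | count_healthy
```

A result of 0 shortly after creation can simply mean that the health check has not completed yet, so retry after a short wait before troubleshooting.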
    

Configure a callout extension

You need to create the following resources to configure an extension for an existing Application Load Balancer:

  • An extension backend service resource whose backends run the ext_proc gRPC API
  • An LbTrafficExtension or LbRouteExtension resource that points to both of these resources:
    • A forwarding rule to attach to
    • An extension backend service to call

The LbTrafficExtension and LbRouteExtension resources group related extension services into one or more chains. Each extension chain selects the traffic to act on by using Common Expression Language (CEL) match conditions. The load balancer evaluates a request against each chain's match condition in sequence. When a request matches the condition defined by a chain, all extensions in that chain act on the request. Only one chain matches a given request.
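
The sequential evaluation described above can be illustrated with a small local simulation. This is only a sketch: it replaces CEL match conditions with simple path-prefix tests, and select_chain and the NAME=PREFIX argument format are hypothetical, chosen for illustration. It shows the first chain whose condition matches being selected and later chains being skipped.

```shell
# Print the name of the first chain whose path prefix matches the request path.
# Usage: select_chain REQUEST_PATH "NAME=PREFIX" ["NAME=PREFIX" ...]
# Simplification: real chains use CEL expressions, not bare prefixes.
select_chain() {
    path=$1
    shift
    for chain in "$@"; do
        name=${chain%%=*}      # text before the first "="
        prefix=${chain#*=}     # text after the first "="
        case $path in
            "$prefix"*) printf '%s\n' "$name"; return 0 ;;
        esac
    done
    return 1   # no chain matched
}

select_chain /extensions/a 'chain1=/extensions' 'chain2=/'   # prints "chain1"
select_chain /index.html   'chain1=/extensions' 'chain2=/'   # prints "chain2"
```

Because evaluation stops at the first match, order matters: a broad condition such as the "/" prefix placed first would shadow every chain after it.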

For information about the limits related to callout extensions, see the Quotas and limits page.

The extension resource references the load balancer forwarding rule to attach to. After you configure the resource, the load balancer starts sending matching requests to extension services.

Configure a route extension

The following example shows how to configure a route extension that is called when the request path starts with /extensions. The route extension server on callouts-vm changes the Host header to service-extensions.com, sets the path to /, and then instructs the load balancer to recompute the route. The traffic then flows to l7-ilb-backend-service2 instead of l7-ilb-backend-service.

  1. Check if there's a match for /extensions in the URL map.

    1. Establish an SSH connection to the client VM.

      gcloud compute ssh CLIENT_VM \
          --zone=ZONE
      

      Replace CLIENT_VM with the name of the client VM and ZONE with the zone of that VM.

    2. Run the following curl command against the forwarding rule in the client VM:

      curl FORWARDING_RULE_IP/extensions
      

      Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.

      The output should be similar to the following, which indicates that there is no match for /extensions in the URL map.

      <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
      <html><head>
      <title>404 Not Found</title>
      </head><body>
      ...
      
    3. Close the SSH connection.

  2. To start configuring a route extension, define the extension in a YAML file and associate it with the forwarding rule.

    cat >route.yaml <<EOF
        name: route-ext
        forwardingRules:
        - https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
        loadBalancingScheme: INTERNAL_MANAGED
        extensionChains:
        - name: "chain1"
          matchCondition:
            celExpression: 'request.path.startsWith("/extensions")'
          extensions:
          - name: 'ext11'
            authority: ext11.com
            service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
            failOpen: false
            timeout: 0.1s
    EOF
    

    Replace PROJECT_ID with the project ID.

  3. Import the route extension. Use the gcloud beta service-extensions lb-route-extensions import command.

    gcloud beta service-extensions lb-route-extensions import route-ext \
        --source=route.yaml \
        --location=us-west1
      
  4. Verify that the route extension works as expected. Establish an SSH connection to the client VM and use the same curl command:

    curl FORWARDING_RULE_IP/extensions
    

    The output should be similar to the following. It indicates that the rewritten request matched the virtual host service-extensions.com and reached the l7-ilb-backend-service2 service, even though the original request did not match that host.

    Page served from second backend service
    

    To validate that the extension targets only requests with the /extensions prefix, repeat the curl command without the path prefix.

    curl FORWARDING_RULE_IP
    

    You should see an output like the following:

    Page served from: l7-ilb-backend-example-1c7t
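
Instead of replacing PROJECT_ID by hand in step 2, the route extension definition can be generated from a shell function. This is a sketch that assumes the resource names used on this page (l7-ilb-forwarding-rule, l7-ilb-callout-service, us-west1); render_route_yaml is a hypothetical helper, not a gcloud feature.

```shell
# Print the route extension definition for the project ID given as $1.
# The YAML mirrors route.yaml from step 2; only the project ID varies.
render_route_yaml() {
    project=$1
    base="https://www.googleapis.com/compute/v1/projects/${project}/regions/us-west1"
    cat <<EOF
name: route-ext
forwardingRules:
- ${base}/forwardingRules/l7-ilb-forwarding-rule
loadBalancingScheme: INTERNAL_MANAGED
extensionChains:
- name: "chain1"
  matchCondition:
    celExpression: 'request.path.startsWith("/extensions")'
  extensions:
  - name: 'ext11'
    authority: ext11.com
    service: ${base}/backendServices/l7-ilb-callout-service
    failOpen: false
    timeout: 0.1s
EOF
}

# Intended use (requires a configured gcloud CLI):
#   render_route_yaml "$(gcloud config get-value project)" > route.yaml
```

Generating the file this way keeps the forwarding rule and backend service URLs in the same project, which the limitations on this page require.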
    

Configure a traffic extension

The following example shows how to configure a traffic extension that is called when the request host matches example.com. The traffic extension server on callouts-vm adds a response header, hello: service-extensions, to responses for matching requests.

  1. Check if there's a match for example.com in the URL map.

    1. Establish an SSH connection to the client VM.

      gcloud compute ssh CLIENT_VM \
          --zone=ZONE
      

      Replace CLIENT_VM with the name of the client VM and ZONE with the zone of that VM.

    2. Run the following curl command against the forwarding rule in the client VM:

      curl -D - -H "host: example.com" FORWARDING_RULE_IP
      

      Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.

      The output should be similar to the following.

      HTTP/1.1 200 OK
      ...
      content-length: 46
      content-type: text/html
      via: 1.1 google
      
      Page served from: l7-ilb-backend-example-1c7t
      
    3. Close the SSH connection.

  2. To start configuring a traffic extension, define the extension in a YAML file and associate it with the forwarding rule.

    cat >traffic.yaml <<EOF
        name: traffic-ext
        forwardingRules:
        - https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
        loadBalancingScheme: INTERNAL_MANAGED
        extensionChains:
        - name: "chain1"
          matchCondition:
            celExpression: 'request.host == "example.com"'
          extensions:
          - name: 'ext11'
            authority: ext11.com
            service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
            failOpen: false
            timeout: 0.1s
            supportedEvents:
            - RESPONSE_HEADERS
    EOF
    

    Replace PROJECT_ID with the project ID.

  3. Import the traffic extension. Use the gcloud beta service-extensions lb-traffic-extensions import command.

    gcloud beta service-extensions lb-traffic-extensions import traffic-ext \
        --source=traffic.yaml \
        --location=us-west1
      
  4. Verify that the traffic extension works as expected. Establish an SSH connection to the client VM and use the same curl command:

    curl -D - -H "host: example.com" FORWARDING_RULE_IP
    

    The output should include the hello: service-extensions response header.

    HTTP/1.1 200 OK
    ...
    content-length: 46
    content-type: text/html
    hello: service-extensions
    via: 1.1 google
    
    Page served from: l7-ilb-backend-example-1c7t
    

    To validate that the extension targets only example.com traffic, repeat the curl command without the host header.

    curl -D - FORWARDING_RULE_IP
    

    The output is similar to the following:

    HTTP/1.1 200 OK
    ...
    content-length: 46
    content-type: text/html
    via: 1.1 google
    
    Page served from: l7-ilb-backend-example-1c7t
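
The manual check in step 4 can also be scripted. The following sketch tests whether a set of response headers contains the header that the traffic extension adds. has_header is a hypothetical helper; it assumes the headers arrive on standard input in the form that curl -D - prints them.

```shell
# Succeed if the HTTP headers on stdin contain the header line given as $1.
# Header names are case-insensitive, and curl ends header lines with CR,
# so carriage returns are stripped before the whole-line comparison.
has_header() {
    tr -d '\r' | grep -qixF "$1"
}

# Intended use from the client VM (FORWARDING_RULE_IP is a placeholder):
#   curl -s -D - -o /dev/null -H "host: example.com" FORWARDING_RULE_IP \
#       | has_header "hello: service-extensions" && echo "extension applied"
```

Running the same pipeline without the host header should leave the check failing, which mirrors the negative test above.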
    

Limitations for callout extensions

  • A forwarding rule can have only one LbTrafficExtension resource and one LbRouteExtension resource.
  • Cross-project referencing between callout extensions and the forwarding rule is not supported.
  • The backend services used for callout extensions must be in the same project as the forwarding rule.
  • The extension backend service cannot use Google Cloud Armor, IAP, or Cloud CDN policies.
  • The extension backend service must use HTTP/2 as the protocol.
  • Route extensions cannot override the processing mode of the ext_proc stream.
  • You cannot change some headers. For a complete list, see Limitations with header manipulation.

Recommended optimizations for callout extensions

Integrating an extension into the load balancing processing path incurs additional latency for requests and responses. Each type of data that the extension service processes, including request headers, request body, response headers, and response body, adds latency.

Consider the following optimizations to minimize the latency:

  • Configure the extension to process only the data that you need. For example, to modify only request headers, set the supportedEvents field in the extension to REQUEST_HEADERS.
  • Deploy callouts in the same zones as the regular destination backend service for the load balancer. When using a cross-region internal Application Load Balancer, place the extension service backends in the same region as the load balancer's proxy-only subnets.
  • When using a global external Application Load Balancer, place the extension service backends in the geographic regions where the regular load balancer's destination VMs, Google Kubernetes Engine (GKE) workloads, and Cloud Run functions are located.
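
To judge whether these optimizations help, one option is to compare request latency before and after attaching an extension. The sketch below averages request times reported by curl's %{time_total} write-out variable. mean_ms is a hypothetical helper, and five sequential requests are an arbitrary sample size chosen for illustration, not a benchmarking recommendation.

```shell
# Read one request time in seconds per line and print the mean in milliseconds.
# Prints nothing if no input lines are received.
mean_ms() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n * 1000 }'
}

# Intended use from the client VM (FORWARDING_RULE_IP is a placeholder):
#   for i in 1 2 3 4 5; do
#       curl -s -o /dev/null -w '%{time_total}\n' \
#           -H "host: example.com" FORWARDING_RULE_IP
#   done | mean_ms
```

Comparing the mean for a host that matches the extension chain with the mean for a host that does not gives a rough estimate of the latency that the callout adds.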

What's next