Configure a callout extension

Service Extensions lets Application Load Balancers send callouts to extension backend services, which insert custom processing into the load balancing path. This page describes how to configure callout extensions.

For an overview of Application Load Balancer callout extensions, see Cloud Load Balancing extensions overview.

Before you begin

  1. Ensure that you have either a project owner or editor role or the following Compute Engine IAM roles:

  2. Enable these APIs: Compute Engine API and Network Services API.

    Console

    1. In the Google Cloud console, go to the Enable access to APIs page.


    2. Follow the instructions.

    gcloud

    Use the gcloud services enable command:

    gcloud services enable compute.googleapis.com networkservices.googleapis.com
    
  3. Create and configure an Application Load Balancer that supports callouts. For this example, set up a regional internal Application Load Balancer with VM instance group backends.

  4. Create a client VM for testing.

  5. For route extensions only: Set up an additional backend service, and update the URL map to add a host matcher that routes all traffic whose HTTP host matches the specified condition to this backend service.

    Console

    1. In the Google Cloud console, go to the Create an instance page.


      Specify the following sample values:

      • Name: l7-ilb-backend2-vm
      • Tags: allow-ssh and load-balanced-backend
      • Zone: us-west1-a
      • Network: lb-network
      • Subnetwork: backend-subnet
      • Image family: debian-11
      • Image project: debian-cloud
      • Advanced options > Management > Automation:

        #! /bin/bash
        apt-get update
        apt-get install apache2 -y
        a2ensite default-ssl
        a2enmod ssl
        echo "Page served from second backend service" | tee /var/www/html/index.html
        systemctl restart apache2
        
    2. Create an unmanaged instance group.

      Specify the following sample values:

      • Name: l7-ilb-backend-service2-ig
      • Zone: us-west1-a
    3. Add the new VM to the instance group.

      For VM instances, specify l7-ilb-backend2-vm.

    4. In the Google Cloud console, go to the Load balancing page.


    5. Update the load balancer by creating a backend service and adding a backend to it.

      For the backend service, specify the following sample values:

      • Name: l7-ilb-backend-service2
      • Internet facing or internal only: Only between my VMs
      • Protocol: HTTP
      • Region: us-west1
      • Health check > Name: l7-ilb-basic-check
      • Health check > Region: us-west1

      For the backend, specify the following sample values:

      • Instance group: l7-ilb-backend-service2-ig
      • Balancing mode: Utilization
    6. Add a host matcher to the URL map of the load balancer.

      Specify the following sample values:

      • Name: l7-ilb-map
      • Host: service-extensions.com
      • Path matcher name: callouts
      • Protocol: HTTP
      • Backend: l7-ilb-backend-service2

    gcloud

    1. Create a VM instance. Use the gcloud compute instances create command with the following sample values:

      gcloud compute instances create l7-ilb-backend2-vm \
        --zone=us-west1-a \
        --network=lb-network \
        --subnet=backend-subnet \
        --tags=allow-ssh,load-balanced-backend \
        --image-family=debian-11 \
        --image-project=debian-cloud \
        --metadata=startup-script='#! /bin/bash
            apt-get update
            apt-get install apache2 -y
            a2ensite default-ssl
            a2enmod ssl
            echo "Page served from second backend service" | tee /var/www/html/index.html
            systemctl restart apache2'
      
    2. Create an unmanaged instance group. Use the gcloud compute instance-groups unmanaged create command with the following sample values:

      gcloud compute instance-groups unmanaged create l7-ilb-backend-service2-ig \
        --zone=us-west1-a
      
    3. Add the new VM to the instance group. Use the gcloud compute instance-groups unmanaged add-instances command with the following sample values:

      gcloud compute instance-groups unmanaged add-instances l7-ilb-backend-service2-ig \
        --zone=us-west1-a \
        --instances=l7-ilb-backend2-vm
      
    4. Create a backend service. Use the gcloud compute backend-services create command with the following sample values:

      gcloud compute backend-services create l7-ilb-backend-service2 \
        --load-balancing-scheme=INTERNAL_MANAGED \
        --protocol=HTTP \
        --health-checks=l7-ilb-basic-check \
        --health-checks-region=us-west1 \
        --region=us-west1
      
    5. Add a backend to the backend service. Use the gcloud compute backend-services add-backend command with the following sample values:

      gcloud compute backend-services add-backend l7-ilb-backend-service2 \
        --balancing-mode=UTILIZATION \
        --instance-group=l7-ilb-backend-service2-ig \
        --instance-group-zone=us-west1-a \
        --region=us-west1
      
    6. Add a host matcher to the URL map of the load balancer. Use the gcloud compute url-maps add-path-matcher command with the following sample values:

      gcloud compute url-maps add-path-matcher l7-ilb-map \
        --path-matcher-name=callouts \
        --default-service=l7-ilb-backend-service2 \
        --new-hosts=service-extensions.com \
        --region=us-west1
        

Set up an extension backend service

For this example, a basic Python-based extension server that implements Envoy's ext_proc gRPC API is available. A Docker container image with this server is at us-docker.pkg.dev/service-extensions/ext-proc/service-callout-basic-example-python:latest. The Service Extensions GitHub repository of Google Cloud contains several other Python-based examples of ext_proc servers for tasks such as header mutation and body mutation.

To create and set up an extension backend service, follow these steps:

  1. Create a virtual machine (VM) instance for the extension backend service that's running the sample Python extension server.

    Console

    Create an instance by using a container image.

    1. In the Google Cloud console, go to the Create an instance page.


    2. Specify the following sample values:

      • Name: callouts-vm
      • Zone: us-west1-a
      • Network: lb-network
      • Subnetwork: backend-subnet
      • Tags: allow-ssh and load-balanced-backend
      • Container image: us-docker.pkg.dev/service-extensions/ext-proc/service-callout-basic-example-python:latest

    gcloud

    Create an instance by using a container image. Use the gcloud compute instances create-with-container command with the following sample values:

    gcloud compute instances create-with-container callouts-vm \
      --container-image=us-docker.pkg.dev/service-extensions/ext-proc/service-callout-basic-example-python:latest \
      --network=lb-network \
      --subnet=backend-subnet \
      --zone=us-west1-a \
      --tags=allow-ssh,load-balanced-backend
    
  2. Add the VM to an unmanaged instance group.

    Console

    Create an unmanaged instance group.

    1. In the Google Cloud console, go to the Instance groups page.


      Specify the following sample values:

      • Name: callouts-ig
      • Zone: us-west1-a
    2. Set a port for the instance group.

      For Port mapping, specify these port names and values: http:80 and grpc:443.

    3. Add the new VM to the instance group.

      For VM instances, specify callouts-vm.

    gcloud

    1. Create an unmanaged instance group. Use the gcloud compute instance-groups unmanaged create command with the following sample values:

      gcloud compute instance-groups unmanaged create callouts-ig \
        --zone=us-west1-a
      
    2. Set a port for the instance group. Use the gcloud compute instance-groups unmanaged set-named-ports command with the following sample values:

      gcloud compute instance-groups unmanaged set-named-ports callouts-ig \
        --named-ports=http:80,grpc:443 \
        --zone=us-west1-a
      
    3. Add the new VM instance to the unmanaged instance group. Use the gcloud compute instance-groups unmanaged add-instances command with the following sample values:

      gcloud compute instance-groups unmanaged add-instances callouts-ig \
        --zone=us-west1-a \
        --instances=callouts-vm
      
  3. Update the load balancer by creating a backend service and adding a backend.

    Console

    Create an extension backend service that uses the HTTP/2 protocol and has an HTTP health check.

    1. In the Google Cloud console, go to the Load balancing page.


    2. Add a backend service with the following sample values:

      • Name: l7-ilb-callout-service
      • Internet facing or internal only: Only between my VMs
      • Protocol: HTTP2
      • Port name: grpc
      • Region: us-west1
      • Health check > Name: callouts-hc
      • Health check > Port number: 80
    3. Add the instance group with the extension server as a backend to the backend service. The instance group runs the ext_proc service.

      Specify the following sample values:

      • Instance group: callouts-ig
      • Balancing mode: Utilization

    gcloud

    1. Create a basic HTTP health check for the instance. Use the gcloud compute health-checks create http command with the following sample values:

      gcloud compute health-checks create http callouts-hc \
        --region=us-west1 \
        --port=80
      
    2. Create an extension backend service that uses the HTTP/2 protocol. Use the gcloud compute backend-services create command.

      gcloud compute backend-services create l7-ilb-callout-service \
        --load-balancing-scheme=INTERNAL_MANAGED \
        --protocol=HTTP2 \
        --port-name=grpc \
        --health-checks=callouts-hc \
        --health-checks-region=us-west1 \
        --region=us-west1
      
    3. Add the instance group with the extension server as a backend to the backend service. The instance group runs the ext_proc service. Use the gcloud compute backend-services add-backend command with the following sample values:

      gcloud compute backend-services add-backend l7-ilb-callout-service \
        --balancing-mode=UTILIZATION \
        --instance-group=callouts-ig \
        --instance-group-zone=us-west1-a \
        --region=us-west1
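
The backend service above refers to its serving port by name (grpc), while the health check targets port 80 directly. The following Python sketch illustrates how a port name can be resolved against the named ports set on the instance group. The helper names parse_named_ports and resolve_port are hypothetical and exist only for this illustration; they are not part of any Google Cloud library.

```python
def parse_named_ports(spec: str) -> dict[str, int]:
    """Parse a 'name:port,name:port' string into a name -> port mapping."""
    ports: dict[str, int] = {}
    for item in spec.split(","):
        name, _, port = item.strip().partition(":")
        if not name or not port.isdigit():
            raise ValueError(f"invalid named port: {item!r}")
        ports[name] = int(port)
    return ports


def resolve_port(named_ports: dict[str, int], port_name: str) -> int:
    """Return the port number that a backend service's port name refers to."""
    if port_name not in named_ports:
        raise LookupError(f"instance group has no named port {port_name!r}")
    return named_ports[port_name]


# Sample values from this page: --named-ports=http:80,grpc:443 and --port-name=grpc.
named_ports = parse_named_ports("http:80,grpc:443")
print(resolve_port(named_ports, "grpc"))
```

With the sample values, the grpc port name resolves to port 443, which is where the load balancer sends callouts to the extension server.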
      

Configure a callout extension

You need to create the following resources to configure an extension for an existing Application Load Balancer:

  • An extension backend service resource whose backends run the ext_proc gRPC API
  • An LbTrafficExtension or LbRouteExtension resource that points to both of these resources:
    • A forwarding rule to attach to
    • An extension backend service to call

The LbTrafficExtension and LbRouteExtension resources group related extension services into one or more chains. Each extension chain uses a Common Expression Language (CEL) match condition to select the traffic to act on. The load balancer evaluates a request against each chain's match condition in sequence. When a request matches the condition defined by a chain, all extensions in that chain act on the request. Only one chain matches a given request.

For information about the limits related to callout extensions, see the Quotas and limits page.

The extension resource references the load balancer forwarding rule to attach to. After you configure the resource, the load balancer starts sending matching requests to extension services.
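
To make the chain selection described above more concrete, here is a minimal Python sketch. It assumes that chains are checked in order and that the first chain whose condition matches is the one that runs. Plain Python predicates stand in for CEL match conditions, and the Request and Chain classes, the select_chain function, and the chain2, ext21, and ext22 names are hypothetical illustrations, not load balancer APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Request:
    host: str
    path: str


@dataclass
class Chain:
    name: str
    match: Callable[[Request], bool]  # stand-in for a CEL match condition
    extensions: list[str] = field(default_factory=list)


def select_chain(chains: list[Chain], request: Request) -> Optional[Chain]:
    """Return the first chain whose condition matches, or None if none match."""
    for chain in chains:
        if chain.match(request):
            return chain
    return None


chains = [
    Chain("chain1", lambda r: r.path.startswith("/extensions"), ["ext11"]),
    Chain("chain2", lambda r: r.host == "example.com", ["ext21", "ext22"]),
]

# This request satisfies both conditions, but only one chain is selected.
selected = select_chain(chains, Request(host="example.com", path="/extensions/a"))
print(selected.name if selected else None)
```

In this sketch, all extensions listed in the selected chain would act on the request, and later chains are not consulted.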

Configure a route extension

The following example shows how to configure a route extension that is called when the request path starts with /extensions. The route extension server on callouts-vm changes the Host header to service-extensions.com, sets the path to /, and then instructs the load balancer to recompute the route. The traffic then flows to l7-ilb-backend-service2 instead of l7-ilb-backend-service.
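
As a rough mental model of this flow, the following Python sketch simulates a request passing through a toy URL map before and after the route callout. It is not the ext_proc protocol or the sample server's code: the HOST_RULES table, the route_callout and handle functions, and the client-supplied host 10.1.2.99 are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Toy stand-in for the URL map in this example: host rule -> backend service.
HOST_RULES = {"service-extensions.com": "l7-ilb-backend-service2"}
DEFAULT_SERVICE = "l7-ilb-backend-service"


@dataclass
class Request:
    host: str
    path: str


def select_backend(request: Request) -> str:
    """Pick a backend service by host, falling back to the default service."""
    return HOST_RULES.get(request.host, DEFAULT_SERVICE)


def route_callout(request: Request) -> tuple[Request, bool]:
    """Stand-in for the extension: rewrite host and path, ask for a new route."""
    rewritten = Request(host="service-extensions.com", path="/")
    return rewritten, True


def handle(request: Request) -> tuple[str, str]:
    """Return the backend service and path that the request ends up with."""
    backend = select_backend(request)
    if request.path.startswith("/extensions"):  # the chain's match condition
        request, recompute = route_callout(request)
        if recompute:
            backend = select_backend(request)  # route is computed again
    return backend, request.path


print(handle(Request(host="10.1.2.99", path="/extensions")))
print(handle(Request(host="10.1.2.99", path="/")))
```

The first call ends at l7-ilb-backend-service2 with path /, and the second stays on l7-ilb-backend-service, which mirrors the behavior that the verification steps in this section check with curl.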

  1. Check if there's a match for /extensions in the URL map.

    1. Establish an SSH connection to the client VM.

      Console

      1. In the Google Cloud console, go to the VM instances page.


      2. In the list of virtual machine instances, click SSH in the row of the instance that you want to connect to.

      gcloud

      Use the gcloud compute ssh command.

      gcloud compute ssh CLIENT_VM \
        --zone=ZONE
      

      Replace CLIENT_VM with the name of the client VM and ZONE with the zone of the VM.

    2. Run the following curl command against the forwarding rule in the client VM:

      curl FORWARDING_RULE_IP/extensions
      

      Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.

      The output should be similar to the following, which indicates that there is no match for /extensions in the URL map.

      <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
      <html><head>
      <title>404 Not Found</title>
      </head><body>
      ...
      
    3. Close the SSH connection.

  2. Configure the route extension.

    Console

    1. In the Google Cloud console, go to the Service Extensions page.


    2. Click Create extension.

      A wizard opens to guide you through some initial steps.

    3. For the product, select Load Balancing. Then, click Continue.

      A list of Application Load Balancers that support callouts appears.

    4. Select a load balancer type. For regional load balancers, also specify the region. Click Continue.

    5. For the service extension type, select Route extensions, click Continue, and then click Done.

      The Create service extension form opens. Notice that the preceding selections, which appear at the top of the page, are not editable.

    6. In the Basics section, do the following:

      1. Specify a unique name for the service extension.

        The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens and must not end with a hyphen.

      2. Optional: Enter a brief description about the extension by using a maximum of 1,024 characters.

      3. Optional: In the Labels section, click Add label. Then, in the row that appears, do the following:

        • For Key, enter a key name.
        • For Value, enter a value for the key.

        To add more key-value pairs, click Add label. You can add a maximum of 64 key-value pairs.

        For more information about labels, see Create and update labels for projects.

    7. For Forwarding rules, select one or more forwarding rules to associate with the extension.

      Forwarding rules that are already associated with another extension cannot be selected and appear disabled.

    8. For Extension chains, add one or more extension chains to execute for a matching request.

      To add an extension chain, click Add an extension chain, do the following, and then click Done:

      • For Extension chain name, specify a unique name.

        The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.

      • For Match condition, specify a Common Expression Language (CEL) expression to match requests for which the extension chain is executed.

        For more information, click Get syntax help or see CEL matcher language reference.

      • Add an extension to execute for a matching request. For route extensions, you can specify only one extension.

        Under Extensions, do the following, and then click Done:

        • For Extension name, specify a unique name.

          The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.

        • For Authority, enter the authority header from the gRPC request sent from Envoy to the extension service.

        • For Backend service, select a backend service created by following the instructions in Set up an extension backend service.

        • For Timeout, specify a value between 10 and 1000 milliseconds, after which a message on the stream times out while Envoy is still waiting for a response from the ext_proc service.

        • For Forward headers, click Add header, and then add HTTP headers to forward to the extension (from the client or the backend). If you don't specify any headers, all headers are sent.

        • Optional: For Fail open, if you want the extension to fail open, select Enabled. In this case, if the call to the extension fails or times out, request or response processing continues without error. Any subsequent extensions in the extension chain are also executed.

          This field is unselected by default. In this case, if response headers have not been delivered to the downstream client, a generic 500 error is returned to the client. If response headers have been delivered, the HTTP stream to the downstream client is reset.

    9. Click Create extension.

    gcloud

    1. Define the extension in a YAML file and associate it with the forwarding rule. Use the sample values provided.

      cat >route.yaml <<EOF
          name: route-ext
          forwardingRules:
          - https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
          loadBalancingScheme: INTERNAL_MANAGED
          extensionChains:
          - name: "chain1"
            matchCondition:
              celExpression: 'request.path.startsWith("/extensions")'
            extensions:
            - name: 'ext11'
              authority: ext11.com
              service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
              failOpen: false
              timeout: 0.1s
      EOF
      

      Replace PROJECT_ID with the project ID.

    2. Import the route extension. Use the gcloud service-extensions lb-route-extensions import command with the following sample values.

      gcloud service-extensions lb-route-extensions import route-ext \
          --source=route.yaml \
          --location=us-west1
      
  3. Verify that the route extension works as expected. Establish an SSH connection to the client VM and use the same curl command:

    curl FORWARDING_RULE_IP/extensions
    

    The output should be similar to the following, which indicates that the traffic matched the virtual host service-extensions.com and reached the l7-ilb-backend-service2 service, even though the original request did not specify that host.

    Page served from second backend service
    

    To validate that the extension targets only requests with the /extensions path prefix, repeat the curl command without the path prefix.

    curl FORWARDING_RULE_IP
    

    You should see an output like the following:

    Page served from: l7-ilb-backend-example-1c7t
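
The naming and timeout rules listed in the console steps of this section can be checked locally before you create an extension. The following Python sketch encodes them as stated on this page: a name of at most 63 characters that starts with a lowercase letter, contains only lowercase letters, numbers, and hyphens, and doesn't end with a hyphen, and a timeout between 10 and 1000 milliseconds. The function names are hypothetical, and the duration parsing assumes the simple seconds format used in the YAML samples, such as 0.1s.

```python
import re
from decimal import Decimal

# One lowercase letter, then up to 62 more characters, the last not a hyphen.
NAME_RE = re.compile(r"[a-z]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_name(name: str) -> bool:
    """Check a name against the rule described in the console steps."""
    return NAME_RE.fullmatch(name) is not None


def timeout_ms(duration: str) -> Decimal:
    """Convert a duration such as '0.1s' to milliseconds."""
    if not duration.endswith("s"):
        raise ValueError(f"expected a duration in seconds, got {duration!r}")
    return Decimal(duration[:-1]) * 1000


def is_valid_timeout(duration: str) -> bool:
    """Check that the timeout is between 10 and 1000 milliseconds."""
    return Decimal(10) <= timeout_ms(duration) <= Decimal(1000)


for name in ("route-ext", "chain1", "Route-ext", "route-"):
    print(name, is_valid_name(name))
print("0.1s", is_valid_timeout("0.1s"))
```

Decimal is used instead of float so that values such as 0.1s convert to milliseconds without rounding surprises.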
    

Configure a traffic extension

The following example shows how to configure a traffic extension that is called when the host matches example.com. The traffic extension server on callouts-vm adds a response header, hello: service-extensions, to responses for matching requests.

  1. Check if there's a match for example.com in the URL map.

    1. Establish an SSH connection to the client VM.

      Console

      1. In the Google Cloud console, go to the VM instances page.


      2. In the list of virtual machine instances, click SSH in the row of the instance that you want to connect to.

      gcloud

      Use the gcloud compute ssh command.

      gcloud compute ssh CLIENT_VM \
        --zone=ZONE
      

      Replace CLIENT_VM with the name of the client VM and ZONE with the zone of the VM.

    2. Run the following curl command against the forwarding rule in the client VM:

      curl -D - -H "host: example.com" FORWARDING_RULE_IP
      

      Replace FORWARDING_RULE_IP with the IP address of the forwarding rule. To find the IP address, use the gcloud compute forwarding-rules describe command.

      The output should be similar to the following.

      HTTP/1.1 200 OK
      ...
      content-length: 46
      content-type: text/html
      via: 1.1 google
      
      Page served from: l7-ilb-backend-example-1c7t
      
    3. Close the SSH connection.

  2. Configure the traffic extension.

    Console

    1. In the Google Cloud console, go to the Service Extensions page.


    2. Click Create extension.

      A wizard opens to guide you through some initial steps.

    3. For the product, select Load Balancing. Then, click Continue.

      A list of Application Load Balancers that support callouts appears.

    4. Select a load balancer type. For regional load balancers, also specify the region. Click Continue.

    5. For the service extension type, select Traffic extensions, click Continue, and then click Done.

      The Create service extension form opens. Notice that the preceding selections, which appear at the top of the page, are not editable.

    6. In the Basics section, do the following:

      1. Specify a unique name for the service extension.

        The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens and must not end with a hyphen.

      2. Optional: Enter a brief description about the extension by using up to 1,024 characters.

      3. Optional: In the Labels section, click Add label. Then, in the row that appears, do the following:

        • For Key, enter a key name.
        • For Value, enter a value for the key.

        To add more key-value pairs, click Add label. You can add a maximum of 64 key-value pairs.

        For more information about labels, see Create and update labels for projects.

    7. For Forwarding rules, select one or more forwarding rules to associate with the extension.

      Forwarding rules that are already associated with another extension cannot be selected and appear disabled.

    8. For Extension chains, add one or more extension chains to execute for a matching request.

      To add an extension chain, click Add an extension chain, do the following, and then click Done:

      • For Extension chain name, specify a unique name.

        The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.

      • To match requests for which the extension chain is executed, for Match condition, specify a Common Expression Language (CEL) expression.

        For more information, click Get syntax help or see CEL matcher language reference.

      • Add one or more extensions to execute for a matching request.

        For each extension, under Extensions, do the following, and then click Done:

        • For Extension name, specify a unique name.

          The name must conform with RFC-1034, use only lower-case letters, numbers, and hyphens, and have a maximum length of 63 characters. Additionally, the first character must be a letter and the last a letter or a number.

        • For Authority, enter the authority header from the gRPC request sent from Envoy to the extension service.

        • For Backend service, select a backend service created by following the instructions in Set up an extension backend service.

        • For Timeout, specify a value between 10 and 1000 milliseconds after which a message on the stream times out.

        • For Events, select one or more HTTP event types that call the extension.

        • For Forward headers, click Add header, and then add HTTP headers to forward to the extension (from the client or the backend). If you don't specify any headers, all headers are sent.

        • Optional: For Fail open, if you want the extension to fail open, select Enabled. In this case, if the call to the extension fails or times out, request or response processing continues without error. Any subsequent extensions in the extension chain are also executed.

          This field is unselected by default. In this case, if response headers have not been delivered to the downstream client, a generic 500 error is returned to the client. If response headers have been delivered, the HTTP stream to the downstream client is reset.

    9. Click Create extension.

    gcloud

    1. Define the extension in a YAML file and associate it with the forwarding rule. Use the sample values provided.

      cat >traffic.yaml <<EOF
          name: traffic-ext
          forwardingRules:
          - https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule
          loadBalancingScheme: INTERNAL_MANAGED
          extensionChains:
          - name: "chain1"
            matchCondition:
              celExpression: 'request.host == "example.com"'
            extensions:
            - name: 'ext11'
              authority: ext11.com
              service: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-west1/backendServices/l7-ilb-callout-service
              failOpen: false
              timeout: 0.1s
              supportedEvents:
              - RESPONSE_HEADERS
      EOF
      

      Replace PROJECT_ID with the project ID.

    2. Import the traffic extension. Use the gcloud service-extensions lb-traffic-extensions import command with the following sample values.

      gcloud service-extensions lb-traffic-extensions import traffic-ext \
          --source=traffic.yaml \
          --location=us-west1
      
  3. Verify that the traffic extension works as expected. Establish an SSH connection to the client VM and use the same curl command:

    curl -D - -H "host: example.com" FORWARDING_RULE_IP
    

    The output should include the hello: service-extensions response header.

    HTTP/1.1 200 OK
    ...
    content-length: 46
    content-type: text/html
    hello: service-extensions
    via: 1.1 google
    
    Page served from: l7-ilb-backend-example-1c7t
    

    To validate that the extension targets only example.com traffic, repeat the curl command without the host header.

    curl -D - FORWARDING_RULE_IP
    

    The output is similar to the following:

    HTTP/1.1 200 OK
    ...
    content-length: 46
    content-type: text/html
    via: 1.1 google
    
    Page served from: l7-ilb-backend-example-1c7t
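
If you want to script the verification instead of reading the curl output by eye, a small parser is enough. The following Python sketch extracts the header block from text in the shape that curl -D - prints and checks for the hello header. The parse_response_headers function is a hypothetical helper, and the sample string is reconstructed from the output shown in this section, with the elided lines left out.

```python
def parse_response_headers(raw: str) -> dict[str, str]:
    """Parse the header block of `curl -D -` output into a name -> value map."""
    head = raw.replace("\r\n", "\n").split("\n\n", 1)[0]
    headers: dict[str, str] = {}
    for line in head.split("\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


sample = (
    "HTTP/1.1 200 OK\r\n"
    "content-length: 46\r\n"
    "content-type: text/html\r\n"
    "hello: service-extensions\r\n"
    "via: 1.1 google\r\n"
    "\r\n"
    "Page served from: l7-ilb-backend-example-1c7t\n"
)

headers = parse_response_headers(sample)
print(headers.get("hello") == "service-extensions")
```

Only the lines before the first blank line are treated as headers, so the colon in the response body doesn't produce a false match.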
    

Limitations for callout extensions

  • A forwarding rule can have only one LbTrafficExtension resource and one LbRouteExtension resource.
  • Cross-project referencing between callout extensions and the forwarding rule is not supported.
  • The backend services used for callout extensions must be in the same project as the forwarding rule.
  • The extension backend service cannot use Google Cloud Armor, IAP, or Cloud CDN policies.
  • The extension backend service must use HTTP/2 as the protocol.
  • Route extensions cannot override the processing mode of the ext_proc stream.
  • You cannot change some headers. For a complete list, see Limitations with header manipulation.
  • An immediate response is supported only for traffic extensions and not for route extensions. If your route extension server responds to a processing request with a processing response that contains an immediate response, Envoy ignores the processing response.
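
The same-project limitations above lend themselves to a quick check on the resource URLs in an extension definition before you import it. The following Python sketch is one way to do that. The function names are hypothetical, and my-project and other-project are placeholder project IDs, not values from this page.

```python
import re

PROJECT_RE = re.compile(r"/projects/([^/]+)/")


def project_of(resource_url: str) -> str:
    """Extract the project ID from a Compute Engine resource URL."""
    match = PROJECT_RE.search(resource_url)
    if match is None:
        raise ValueError(f"no project found in {resource_url!r}")
    return match.group(1)


def cross_project_services(forwarding_rules: list[str], services: list[str]) -> list[str]:
    """Return the services whose project differs from any forwarding rule's project."""
    rule_projects = {project_of(url) for url in forwarding_rules}
    return [s for s in services if any(project_of(s) != p for p in rule_projects)]


BASE = "https://www.googleapis.com/compute/v1/projects"
rules = [f"{BASE}/my-project/regions/us-west1/forwardingRules/l7-ilb-forwarding-rule"]
services = [
    f"{BASE}/my-project/regions/us-west1/backendServices/l7-ilb-callout-service",
    f"{BASE}/other-project/regions/us-west1/backendServices/l7-ilb-callout-service",
]

# Only the service in the other project is reported.
print(cross_project_services(rules, services))
```

An empty result means every extension backend service is in the same project as the forwarding rules that the extension attaches to.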

Recommended optimizations for callout extensions

Integrating an extension into the load balancing processing path incurs additional latency for requests and responses. Each type of data that the extension service processes, including request headers, request body, response headers, and response body, adds latency.

Consider the following optimizations to minimize the latency:

  • Configure the extension to process only the data that you need. For example, to modify only request headers, set the supportedEvents field of the extension to REQUEST_HEADERS.
  • Deploy callouts in the same zones as the regular destination backend service for the load balancer. When using a cross-region internal Application Load Balancer, place the extension service backends in the same region as the load balancer's proxy-only subnets.
  • When using a global external Application Load Balancer, place the extension service backends in the geographic regions where the regular load balancer's destination VMs, Google Kubernetes Engine (GKE) workloads, and Cloud Run functions are located.
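
To illustrate the first optimization, the following Python sketch maps the kinds of data that an extension needs to the smallest list of events to request. REQUEST_HEADERS and RESPONSE_HEADERS appear on this page; the REQUEST_BODY and RESPONSE_BODY names are assumed to follow the same pattern. The minimal_events function is a hypothetical helper.

```python
# Data type named on this page -> event name (body event names are assumed).
EVENT_FOR_DATA = {
    "request headers": "REQUEST_HEADERS",
    "request body": "REQUEST_BODY",
    "response headers": "RESPONSE_HEADERS",
    "response body": "RESPONSE_BODY",
}


def minimal_events(needed: set[str]) -> list[str]:
    """Return only the events for the data that the extension needs."""
    unknown = needed - EVENT_FOR_DATA.keys()
    if unknown:
        raise ValueError(f"unknown data types: {sorted(unknown)}")
    return [event for data, event in EVENT_FOR_DATA.items() if data in needed]


# An extension that only rewrites request headers asks for a single event.
print(minimal_events({"request headers"}))
```

The resulting list is what you would place under the events field of the extension, so that the load balancer doesn't send data the extension never inspects.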

What's next