Configure advanced traffic management with Envoy

This document provides information about how to configure advanced traffic management for your Traffic Director deployment.

Before you begin

Before you configure advanced traffic management, follow the instructions in Prepare to set up Traffic Director with Envoy, including configuring Traffic Director and any virtual machine (VM) hosts or Google Kubernetes Engine (GKE) clusters that you need. Create the required Google Cloud resources.

Advanced traffic management feature availability differs according to the request protocol that you choose. You configure this protocol when you set up routing by using the target HTTP or HTTPS proxy, target gRPC proxy, or target TCP proxy resource:

  • With the target HTTP proxy and target HTTPS proxy, all the features described in this document are available.
  • With the target gRPC proxy, some features are available.
  • With the target TCP proxy, no advanced traffic management features are available.

For more information, see Traffic Director features and Advanced traffic management. For an end-to-end setup guide, see Configure advanced traffic management with proxyless gRPC services.

Set up traffic splitting

These instructions assume the following:

  • Your Traffic Director deployment has a URL map called review-url-map.
  • The URL map sends all traffic to one backend service called review1, which serves as the default backend service.
  • You plan to route 5% of traffic to a new version of a service. That service is running on a backend VM or endpoint in a network endpoint group (NEG) associated with the backend service review2.
  • No host rules or path matchers are used.

If you are splitting traffic to a new service that has not been referenced by the URL map before, first add the new service to weightedBackendServices and give it a weight of 0. Then, gradually increase the weight assigned to that service.

To set up traffic splitting, follow these steps.

Console

  1. In the Google Cloud Console, go to the Traffic Director page.

    Go to Traffic Director

  2. Click Routing rule maps.

  3. Click Create routing rule map.

  4. On the Create a routing rule map page, enter a Name.

  5. In the Protocol menu, select HTTP.

  6. Select an existing forwarding rule.

  7. Under Routing rules, select Advanced host, path and route rule.

  8. Under Hosts and path matchers, click Add hosts and path matcher. This adds a new path matcher that you can configure to split traffic.

  9. Add the following settings to the Path matcher field:

        - defaultService: global/backendServices/review1
          name: matcher1
          routeRules:
          - priority: 2
            matchRules:
            - prefixMatch: ''
            routeAction:
              weightedBackendServices:
              - backendService: global/backendServices/review1
                weight: 95
              - backendService: global/backendServices/review2
                weight: 5
    
  10. Click Done.

  11. Click Save.

After you are satisfied with the new version, you can gradually adjust the weights of the two services and eventually send all traffic to review2.

gcloud

  1. Run the gcloud export command to get the URL map configuration:

    gcloud compute url-maps export review-url-map \
        --destination=review-url-map-config.yaml
    
  2. Add the following section to the review-url-map-config.yaml file:

         hostRules:
         - description: ''
           hosts:
           - '*'
           pathMatcher: matcher1
         pathMatchers:
         - defaultService: global/backendServices/review1
           name: matcher1
           routeRules:
           - priority: 2
             matchRules:
             - prefixMatch: ''
             routeAction:
               weightedBackendServices:
               - backendService: global/backendServices/review1
                 weight: 95
               - backendService: global/backendServices/review2
                 weight: 5
    
  3. Update the URL map:

    gcloud compute url-maps import review-url-map \
        --source=review-url-map-config.yaml
    

After you are satisfied with the new version, you can gradually adjust the weights of the two services and eventually send all traffic to review2.
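The weight-based selection that the URL map performs can be sketched as weight-proportional random choice over the weightedBackendServices list. The following Python sketch is an illustrative model of the behavior, not Envoy's implementation; the service names mirror the example above:

```python
import random

def pick_backend(weighted_backends, rng):
    """Pick one backend service with probability proportional to its weight.

    weighted_backends mirrors the weightedBackendServices list in the
    URL map: a list of (backend_service, weight) pairs.
    """
    total = sum(weight for _, weight in weighted_backends)
    point = rng.uniform(0, total)
    for backend, weight in weighted_backends:
        if point < weight:
            return backend
        point -= weight
    return weighted_backends[-1][0]  # guard against float rounding at the edge

# The 95/5 split from the URL map above.
split = [("global/backendServices/review1", 95),
         ("global/backendServices/review2", 5)]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
counts = {name: 0 for name, _ in split}
for _ in range(10000):
    counts[pick_backend(split, rng)] += 1
# counts now shows roughly 9,500 requests for review1 and 500 for review2.
```

As you shift the weights toward review2, the proportion of requests that it receives grows accordingly; a weight of 100 for review2 and 0 for review1 sends all traffic to the new version.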

Set up circuit breaking

Circuit breaking lets you set failure thresholds to prevent client requests from overloading your backends. After requests reach a limit that you set, the client stops allowing new connections or sending additional requests, giving your backends time to recover.

As a result, circuit breaking prevents cascading failures by returning an error to the client rather than overloading a backend. This lets some traffic be served while providing time for managing the overload situation, such as handling a traffic spike by increasing capacity through autoscaling.

In the following example, you set the circuit breakers as follows:

  • Maximum requests per connection: 100
  • Maximum number of connections: 1000
  • Maximum pending requests: 200
  • Maximum requests: 1000
  • Maximum retries: 3
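The way these limits interact can be sketched as a simple admission check: admit up to Maximum requests in flight, queue up to Maximum pending requests, and reject everything beyond that. The following Python model is illustrative only; Envoy enforces these thresholds internally, and this hypothetical class ignores the connection and retry limits:

```python
class CircuitBreakerSketch:
    """Simplified model of the request-related circuit breaker limits."""

    def __init__(self, max_requests=1000, max_pending_requests=200):
        self.max_requests = max_requests
        self.max_pending_requests = max_pending_requests
        self.active_requests = 0
        self.pending_requests = 0

    def try_start_request(self):
        """Admit, queue, or reject a request against the configured limits."""
        if self.active_requests < self.max_requests:
            self.active_requests += 1
            return "started"
        if self.pending_requests < self.max_pending_requests:
            self.pending_requests += 1
            return "queued"
        return "rejected"  # fail fast instead of overloading the backend

breaker = CircuitBreakerSketch()
outcomes = [breaker.try_start_request() for _ in range(1250)]
# 1,000 requests start, 200 queue, and the remaining 50 are rejected.
```

Rejecting the excess requests at the client is what gives the backends time to recover instead of receiving an ever-growing queue of work.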

To set up circuit breaking, follow these steps.

Console

  1. In the Cloud Console, go to the Traffic Director page.

    Go to Traffic Director

  2. Click the name of the backend service that you want to update.

  3. Click Edit.

  4. Click Advanced configurations.

  5. Under Circuit breakers, select the Enable checkbox.

  6. Click Edit.

    1. In Max requests per connection, enter 100.
    2. In Max connections, enter 1000.
    3. In Max pending requests, enter 200.
    4. In Max requests, enter 1000.
    5. In Max retries, enter 3.
  7. Click Save, and then click Save again.

gcloud

  1. Run the gcloud export command to export the backend service configuration. Replace BACKEND_SERVICE_NAME with the name of the backend service.

     gcloud compute backend-services export BACKEND_SERVICE_NAME \
         --destination=BACKEND_SERVICE_NAME-config.yaml --global
    
  2. Update the BACKEND_SERVICE_NAME-config.yaml file as follows:

     affinityCookieTtlSec: 0
     backends:
     - balancingMode: UTILIZATION
       capacityScaler: 1.0
       group: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroups/INSTANCE_GROUP_NAME
       maxUtilization: 0.8
     circuitBreakers:
       maxConnections: 1000
       maxPendingRequests: 200
       maxRequests: 1000
       maxRequestsPerConnection: 100
       maxRetries: 3
     connectionDraining:
       drainingTimeoutSec: 300
     healthChecks:
       - https://www.googleapis.com/compute/v1/projects/PROJECT_ID/global/healthChecks/HEALTH_CHECK_NAME
     loadBalancingScheme: INTERNAL_SELF_MANAGED
     localityLbPolicy: ROUND_ROBIN
     name: BACKEND_SERVICE_NAME
     port: 80
     portName: http
     protocol: HTTP
     sessionAffinity: NONE
     timeoutSec: 30
    
  3. Update the backend service config file:

    gcloud compute backend-services import BACKEND_SERVICE_NAME \
        --source=BACKEND_SERVICE_NAME-config.yaml --global
    

Circuit breaking example

The following circuit breaker example is for a use case in which a shopping cart service has one instance group as its backend. The circuit breaker settings indicate the limits on the following:

  • Maximum requests per connection: 100
  • Maximum number of connections: 1000
  • Maximum pending requests: 200
  • Maximum requests: 20000
  • Maximum retries: 3

To set up the circuit breaking example, follow these steps.

Console

  1. In the Cloud Console, go to the Traffic Director page.

    Go to Traffic Director

  2. Click the name of the backend service that you want to update.

  3. Click Edit.

  4. Click Advanced configurations.

  5. Under Circuit breakers, select the Enable checkbox.

  6. Click Edit.

    1. In Max requests per connection, enter 100.
    2. In Max connections, enter 1000.
    3. In Max pending requests, enter 200.
    4. In Max requests, enter 20000.
    5. In Max retries, enter 3.
  7. Click Save, and then click Save again.

gcloud

  1. Run the gcloud export command to export the backend service configuration. Replace BACKEND_SERVICE_NAME with the name of the backend service.

    gcloud compute backend-services export BACKEND_SERVICE_NAME \
        --destination=BACKEND_SERVICE_NAME-config.yaml --global
    
  2. Update the BACKEND_SERVICE_NAME-config.yaml file as follows:

     affinityCookieTtlSec: 0
     backends:
     - balancingMode: UTILIZATION
       capacityScaler: 1.0
       group: https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroups/INSTANCE_GROUP_NAME
       maxUtilization: 0.8
     circuitBreakers:
       maxConnections: 1000
       maxPendingRequests: 200
       maxRequests: 20000
       maxRequestsPerConnection: 100
       maxRetries: 3
     connectionDraining:
       drainingTimeoutSec: 300
     healthChecks:
       - https://www.googleapis.com/compute/v1/projects/PROJECT_ID/global/healthChecks/HEALTH_CHECK_NAME
     loadBalancingScheme: INTERNAL_SELF_MANAGED
     localityLbPolicy: ROUND_ROBIN
     name: BACKEND_SERVICE_NAME
     port: 80
     portName: http
     protocol: HTTP
     sessionAffinity: NONE
     timeoutSec: 30
    
  3. Update the backend service config file:

    gcloud compute backend-services import BACKEND_SERVICE_NAME \
        --source=BACKEND_SERVICE_NAME-config.yaml --global
    

Set up session affinity based on HTTP_COOKIE

Advanced traffic management lets you configure session affinity based on a provided cookie.

To set up session affinity using HTTP_COOKIE, follow these steps.

Console

  1. In the Cloud Console, go to the Traffic Director page.

    Go to Traffic Director

  2. Click the name of the backend service that you want to update.

  3. Click Edit.

  4. Click Advanced configurations.

  5. Under Session affinity, select HTTP cookie.

  6. Under Locality Load balancing policy, select Ring hash.

    1. In the HTTP Cookie name field, enter http_cookie.
    2. In the HTTP Cookie path field, enter /cookie_path.
    3. In the HTTP Cookie TTL field, enter 100.
    4. In the Minimum ring size field, enter 10000.
  7. Click Save.

gcloud

  1. Run the gcloud export command to export the backend service configuration. Replace BACKEND_SERVICE_NAME with the name of the backend service.

    gcloud compute backend-services export BACKEND_SERVICE_NAME \
        --destination=BACKEND_SERVICE_NAME-config.yaml --global
    
  2. Update the YAML file as follows:

    sessionAffinity: 'HTTP_COOKIE'
    localityLbPolicy: 'RING_HASH'
    consistentHash:
      httpCookie:
        name: 'http_cookie'
        path: '/cookie_path'
        ttl:
          seconds: 100
          nanos: 30
      minimumRingSize: 10000
    
  3. Import the backend service config file:

    gcloud compute backend-services import BACKEND_SERVICE_NAME \
        --source=BACKEND_SERVICE_NAME-config.yaml --global
    

Set up outlier detection

Outlier detection controls the eviction of unhealthy hosts from the load-balancing pool. Traffic Director does this by using a set of policies that specify the criteria for the eviction of unhealthy backend VMs or endpoints in NEGs, along with criteria defining when a backend or endpoint is considered healthy enough to receive traffic again.

In the following example, the backend service has one instance group as its backend. The outlier detection setting specifies that outlier detection analysis runs every second. If an endpoint returns five consecutive 5xx errors, it is ejected from load-balancing consideration for 30 seconds the first time. After that, the ejection time for the same endpoint is 30 seconds multiplied by the number of times that it has been ejected.
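The ejection arithmetic can be sketched as follows. This Python model is illustrative only, not Envoy's implementation; it counts consecutive 5xx responses and multiplies baseEjectionTime by the endpoint's ejection count:

```python
class OutlierDetectorSketch:
    """Simplified model of the outlier detection policy described above."""

    def __init__(self, consecutive_errors=5, base_ejection_time_sec=30):
        self.consecutive_errors = consecutive_errors
        self.base_ejection_time_sec = base_ejection_time_sec
        self.error_streak = 0
        self.ejection_count = 0

    def record(self, status_code):
        """Return the ejection duration in seconds if this response
        triggers an ejection, or None otherwise."""
        if 500 <= status_code <= 599:
            self.error_streak += 1
            if self.error_streak >= self.consecutive_errors:
                self.error_streak = 0
                self.ejection_count += 1
                return self.base_ejection_time_sec * self.ejection_count
        else:
            self.error_streak = 0  # any success resets the streak
        return None

detector = OutlierDetectorSketch()
results = [detector.record(503) for _ in range(10)]
ejections = [t for t in results if t is not None]
# Ten consecutive 503s cause two ejections: 30 s the first time, 60 s the second.
```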

To set up outlier detection on the backend service resource, follow these steps.

Console

  1. In the Cloud Console, go to the Traffic Director page.

    Go to Traffic Director

  2. Click the name of a service.

  3. Click Edit.

  4. Click Advanced configurations.

  5. Select the Outlier detection checkbox.

  6. Click Edit.

    1. Set Consecutive errors to 5.
    2. Set Interval to 1000 milliseconds.
    3. Set Base ejection time to 30000 milliseconds.
  7. Click Save, and then click Save again.

gcloud

  1. Run the gcloud export command to export the backend service configuration. Replace BACKEND_SERVICE_NAME with the name of the backend service.

    gcloud compute backend-services export BACKEND_SERVICE_NAME \
        --destination=BACKEND_SERVICE_NAME-config.yaml --global
    
  2. Update the YAML file as follows, substituting the name of the backend service for BACKEND_SERVICE_NAME:

    name: BACKEND_SERVICE_NAME
    loadBalancingScheme: INTERNAL_SELF_MANAGED
    backends:
    - balancingMode: UTILIZATION
      capacityScaler: 1.0
      group: $INSTANCE_GROUP_URL
    healthChecks:
    - $HEALTH_CHECK_URL
    port: 80
    portName: http
    protocol: HTTP
    outlierDetection:
      consecutiveErrors: 5
      interval:
        seconds: 1
        nanos: 0
      baseEjectionTime:
        seconds: 30
        nanos: 0
    
  3. Import the backend service config file:

    gcloud compute backend-services import BACKEND_SERVICE_NAME \
        --source=BACKEND_SERVICE_NAME-config.yaml --global
    

Set the locality load-balancing policy

Use the locality load-balancing policy to choose a load-balancing algorithm based on the locality weight and priority provided by Traffic Director. For example, you can perform weighted round robin among healthy endpoints or do consistent hashing.
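The key property of ring (consistent) hashing is that removing a backend only remaps the requests that hashed to it. The following Python sketch illustrates this; the hash function and replica count are illustrative stand-ins, not what Envoy uses internally:

```python
import bisect
import hashlib

def stable_hash(key):
    # Deterministic 64-bit hash; a stand-in for Envoy's internal hashing.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

def build_ring(backends, replicas=100):
    """Place each backend at `replicas` points on the hash ring."""
    return sorted((stable_hash(f"{b}:{i}"), b)
                  for b in backends for i in range(replicas))

def pick(ring, request_key):
    """Walk clockwise from the request's hash to the next backend point."""
    hashes = [h for h, _ in ring]
    index = bisect.bisect(hashes, stable_hash(request_key)) % len(ring)
    return ring[index][1]

ring = build_ring(["backend-a", "backend-b", "backend-c"])
smaller_ring = build_ring(["backend-a", "backend-b"])

# The same key always lands on the same backend, and keys that did not map
# to backend-c keep their backend after backend-c is removed.
```

This stability is what makes RING_HASH a good fit for cookie-based session affinity: a given cookie keeps hashing to the same backend even as other backends come and go.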

In the following example, the backend service has one instance group as its backend. The locality load-balancing policy is set to RING_HASH.

To set the locality load-balancing policy, follow these steps.

Console

  1. In the Cloud Console, go to the Traffic Director page.

    Go to Traffic Director

  2. Click the name of a service.

  3. Click Edit.

  4. Click Advanced configurations.

  5. Under Traffic policy, in the Locality load balancing policy menu, select Ring hash.

  6. Click Save.

gcloud

  1. Run the gcloud export command to export the backend service configuration. Replace BACKEND_SERVICE_NAME with the name of the backend service.

    gcloud compute backend-services export BACKEND_SERVICE_NAME \
        --destination=BACKEND_SERVICE_NAME-config.yaml --global
    
  2. Update the BACKEND_SERVICE_NAME-config.yaml file as follows:

    name: shopping-cart-service
    loadBalancingScheme: INTERNAL_SELF_MANAGED
    backends:
    - balancingMode: UTILIZATION
      capacityScaler: 1.0
      group: $INSTANCE_GROUP_URL
    healthChecks:
    - $HEALTH_CHECK_URL
    port: 80
    portName: http
    protocol: HTTP
    localityLbPolicy: RING_HASH
    
  3. Import the backend service config file:

    gcloud compute backend-services import BACKEND_SERVICE_NAME \
        --source=BACKEND_SERVICE_NAME-config.yaml --global
    

For more information about how the locality load-balancing policy works, see the documentation for the backendService resource.

Set up config filtering based on MetadataFilters match

MetadataFilters are enabled with forwarding rules and HttpRouteRuleMatch. Use this feature to control a particular forwarding rule or route rule so that the control plane sends the forwarding rule or route rule only to proxies whose node metadata matches the metadata filter setting. If you do not specify any MetadataFilters, the rule is sent to all Envoy proxies.

This feature makes it easy to operate a staged deployment of a configuration. For example, create a forwarding rule named forwarding-rule1, which you want to be pushed only to Envoys whose node metadata contains app: review and version: canary.
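The filter semantics can be sketched as follows: MATCH_ALL requires every filterLabel to match the proxy's node metadata, while MATCH_ANY requires at least one. The Python model below is an illustrative simplification, not Traffic Director's implementation:

```python
def metadata_matches(filter_match_criteria, filter_labels, node_metadata):
    """Decide whether one metadataFilter selects a proxy.

    filter_labels: list of (name, value) pairs from the filter.
    node_metadata: dict of the Envoy node's metadata (string values only).
    """
    hits = [node_metadata.get(name) == value for name, value in filter_labels]
    if filter_match_criteria == "MATCH_ALL":
        return all(hits)
    if filter_match_criteria == "MATCH_ANY":
        return any(hits)
    raise ValueError(f"unknown criteria: {filter_match_criteria}")

labels = [("app", "review"), ("version", "canary")]
canary_node = {"app": "review", "version": "canary"}
production_node = {"app": "review", "version": "production"}

# MATCH_ALL selects only the canary node; MATCH_ANY also selects the
# production node because its "app" label matches.
```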

To add MetadataFilters to a forwarding rule, follow these steps.

gcloud

  1. Run the gcloud export command to get the forwarding rule config:

    gcloud compute forwarding-rules export forwarding-rule1 \
        --destination=forwarding-rule1-config.yaml \
        --global
    
  2. Delete the forwarding rule:

    gcloud compute forwarding-rules delete forwarding-rule1 \
        --global
    
  3. Update the forwarding-rule1-config.yaml file.

    The following example creates a MATCH_ALL metadata filter:

     metadataFilters:
     - filterMatchCriteria: 'MATCH_ALL'
       filterLabels:
       - name: 'app'
         value: 'review'
       - name: 'version'
         value: 'canary'
    

    The following example creates a MATCH_ANY metadata filter:

     metadataFilters:
     - filterMatchCriteria: 'MATCH_ANY'
       filterLabels:
       - name: 'app'
         value: 'review'
       - name: 'version'
         value: 'production'
    
  4. Remove all output-only fields from the forwarding-rule1-config.yaml file. For more information, see the documentation for gcloud compute forwarding-rules import.

  5. Run the gcloud import command to update the forwarding-rule1-config.yaml file:

    gcloud compute forwarding-rules import forwarding-rule1 \
        --source=forwarding-rule1-config.yaml \
        --global
    
  6. Use the following instructions to add node metadata to the Envoy configuration before you start Envoy. Only string values are supported.

    a. For a VM-based deployment, in bootstrap_template.yaml, add the following under the metadata section:

       app: 'review'
       version: 'canary'
    

    b. For a Google Kubernetes Engine-based or Kubernetes-based deployment, in trafficdirector_istio_sidecar.yaml, add the following under the env section:

       - name: ISTIO_META_app
         value: 'review'
       - name: ISTIO_META_version
         value: 'canary'
    

Metadata filtering examples

Use the following instructions for a scenario in which multiple projects are in the same Shared VPC network and you want each service project's Traffic Director resources to be visible to proxies in the same project.

The Shared VPC setup is as follows:

  • Host project name: vpc-host-project
  • Service projects: project1, project2
  • Backend services with backend instances or endpoints running an xDS-compliant proxy in project1 and project2

To configure Traffic Director to isolate project1, follow these steps.

gcloud

  1. Create all forwarding rules in project1 with the following metadata filter:

         metadataFilters:
         - filterMatchCriteria: 'MATCH_ALL'
           filterLabels:
           - name: 'project_name'
             value: 'project1'
           - name: 'version'
             value: 'production'
    
  2. When you configure the proxies deployed to instances or endpoints in project1, include the following metadata in the node metadata section of the bootstrap file:

       project_name: 'project1'
       version: 'production'
    
  3. Verify that the proxies already deployed in project2 did not receive the forwarding rule created in the first step. To do this, try to access services in project1 from a system running a proxy in project2. For information about verifying that a Traffic Director configuration is functioning correctly, see Verifying the configuration.

To test a new configuration on a subset of proxies before you make it available to all proxies, follow these steps.

gcloud

  1. Start the proxies that you are using for testing with the following node metadata. Do not include this node metadata for proxies that you are not using for testing.

      version: 'test'
    
  2. For each new forwarding rule that you want to test, include the following metadata filter:

      metadataFilters:
      - filterMatchCriteria: 'MATCH_ALL'
        filterLabels:
        - name: 'version'
          value: 'test'
    
  3. Test the new configuration by sending traffic to the test proxies, and make any necessary changes. If the new configuration is working correctly, only the proxies that you test receive the new configuration. The remaining proxies do not receive the new configuration and are not able to use it.

  4. When you confirm that the new configuration works correctly, remove the metadata filter associated with it. This lets all proxies receive the new configuration.

Troubleshooting

Use this information to troubleshoot when traffic is not being routed according to the route rules and traffic policies that you configured.

Symptoms:

  • Increased traffic to services whose rules have a higher priority than the rule in question.
  • An unexpected increase in 4xx and 5xx HTTP responses for a given route rule.

Solution: Because route rules are interpreted in priority order, review the priority assigned to each rule.

When you define route rules, make sure that rules with a higher priority (that is, a lower priority number) do not inadvertently route traffic that a subsequent route rule is meant to handle. Consider the following example:

  • First route rule

    • Route rule match pathPrefix = /shopping/
    • Route action: send traffic to backend service service-1
    • Rule priority: 4
  • Second route rule

    • Route rule match regexMatch = /shopping/cart/ordering/.*
    • Route action: send traffic to backend service service-2
    • Rule priority: 8

In this case, a request with the path /shopping/cart/ordering/cart.html is routed to service-1. Even though the second rule's regular expression also matches the request, it is ignored because the first rule has the higher priority (priority 4 is evaluated before priority 8).
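The priority-ordered evaluation can be sketched as follows. The rule representation and matching helpers are hypothetical simplifications (URL maps support many more match types), but the priority behavior mirrors the example above:

```python
import re

def route(path, route_rules):
    """Evaluate route rules in priority order (lowest number first) and
    return the service of the first rule that matches."""
    for rule in sorted(route_rules, key=lambda r: r["priority"]):
        if "prefixMatch" in rule and path.startswith(rule["prefixMatch"]):
            return rule["service"]
        if "regexMatch" in rule and re.fullmatch(rule["regexMatch"], path):
            return rule["service"]
    return None  # no rule matched

rules = [
    {"priority": 4, "prefixMatch": "/shopping/", "service": "service-1"},
    {"priority": 8, "regexMatch": r"/shopping/cart/ordering/.*",
     "service": "service-2"},
]

# The prefix rule wins because priority 4 is evaluated before priority 8,
# even though the regular expression also matches the path.
```

Giving the regular-expression rule the lower priority number would reverse the outcome, which is why reviewing priorities is the first troubleshooting step.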

What's next