Traffic management overview for internal HTTP(S) load balancers

Internal HTTP(S) Load Balancing supports advanced traffic management functionality that enables you to use the following features:
  • Traffic steering. Intelligently route traffic based on HTTP(S) parameters (for example, host, path, headers, and other request parameters).
  • Traffic actions. Perform request-based and response-based actions (for example, redirects and header transformations).
  • Traffic policies. Fine-tune load balancing behavior (for example, advanced load balancing algorithms).

You can set up these features by using URL maps and backend services.

Use case examples

Traffic management addresses many use cases. This section provides a few high-level examples.

Traffic steering: header-based routing

Traffic steering allows you to direct traffic to service instances based on HTTP(S) parameters such as request headers. For example, if a user's device is a mobile device with user-agent:Mobile in the request header, traffic steering can send that traffic to service instances designated to handle mobile traffic, and send traffic that doesn't have user-agent:Mobile to instances designated to handle traffic from other devices.
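
As a rough illustration of the mobile example above, header-based steering could be expressed as a route rule inside a URL map path matcher. This is a minimal sketch, not a tested configuration: the matcher name, the backend service names (service-mobile, service-other), and the priority value are hypothetical, and the header match assumes the request carries exactly user-agent:Mobile, as in the example.

```yaml
# Hypothetical path matcher fragment of a regional URL map.
# All names below are placeholders chosen for illustration.
name: matcher1
# Requests without user-agent:Mobile match no route rule and are sent here.
defaultService: regions/us-west1/backendServices/service-other
routeRules:
- priority: 0
  matchRules:
  - prefixMatch: /                 # any path
    headerMatches:
    - headerName: User-Agent
      exactMatch: Mobile           # header value must be exactly "Mobile"
  routeAction:
    weightedBackendServices:
    - backendService: regions/us-west1/backendServices/service-mobile
      weight: 100
```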

Figure: Cloud Load Balancing traffic steering

Traffic actions: weight-based traffic splitting

Deploying a new version of an existing production service generally incurs some risk. Even if your tests pass in staging, you probably don't want to subject 100% of your users to the new version immediately. Internal HTTP(S) Load Balancing allows you to define percentage-based traffic splits across multiple backend services.

For example, you can send 95% of the traffic to the previous version of your service and 5% to the new version of your service. After you've validated that the new production version works as expected, you can gradually shift the percentages until 100% of the traffic reaches the new version of your service. Traffic splitting is typically used for deploying new versions, A/B testing, service migration, and similar processes.

Figure: Cloud Load Balancing traffic splitting

Traffic policies: request mirroring

Your organization might have specific compliance requirements mandating that all traffic be mirrored to an additional service that can, for example, record the request details in a database for later replay.
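
A mirroring setup of this kind might look like the following route rule. Treat it as a sketch under assumptions: service-a and service-mirror are hypothetical backend service names, and the rule is assumed to match every request path.

```yaml
# Hypothetical routeRules fragment: serve from service-a and mirror
# every request to a separate recording service.
routeRules:
- priority: 0
  matchRules:
  - prefixMatch: /
  routeAction:
    weightedBackendServices:
    - backendService: regions/us-west1/backendServices/service-a
      weight: 100
    # The mirrored copy is sent in addition to the request served by service-a.
    requestMirrorPolicy:
      backendService: regions/us-west1/backendServices/service-mirror
```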

Traffic management components

At a high level, internal HTTP(S) load balancers provide traffic management by leveraging regional URL maps and regional backend services resources.

You can set up traffic steering and traffic actions by using regional URL maps. Google Cloud resources that are associated with URL maps include the following:

  • Route rule
  • Rule match
  • Rule action

You can set up traffic policies by using regional backend services. Google Cloud resources that are associated with backend services include the following:

  • Locality load balancer policy
  • Consistent hash load balancer settings
  • Circuit breakers
  • Outlier detection

The following diagram shows the resources that are used to implement each feature.

Figure: Cloud Load Balancing data model

Routing requests to backends

In Internal HTTP(S) Load Balancing, the backend for your traffic is determined by using a two-phased approach:

  • The load balancer selects a backend service with backends. The backends can be Compute Engine virtual machine (VM) instances in an unmanaged instance group, Compute Engine VMs in a managed instance group (MIG), or containers by means of a Google Kubernetes Engine (GKE) node in a network endpoint group (NEG). The load balancer chooses a backend service based on rules defined in a regional URL map.
  • The backend service selects a backend instance based on policies defined in a regional backend service.

When you configure routing, you can choose between the following modes:

  • Simple host and path rule
  • Advanced host, path, and route rule

For each URL map, you can choose to use simple host and path rules or advanced host, path, and route rules. The two modes are mutually exclusive: each URL map can use only one mode.

Simple host and path rule

In a simple host and path rule, URL maps work as described in the URL map overview.

The following diagram shows the logical flow of a simple host and path rule.

Figure: Simple URL map flow

A request is initially evaluated by using host rules. A host is the domain specified by the request. If the request host matches one of the entries in the hosts field, the associated path matcher is used.

Next, the path matcher is evaluated. Path rules are evaluated on the longest-path-matches-first basis, and you can specify path rules in any order. After the most specific match is found, the request is routed to the corresponding backend service. If the request does not match, the default backend service is used.

A typical simple host and path rule might look something like the following, where video traffic goes to video-backend-service, and all other traffic goes to web-backend-service.

$ gcloud compute url-maps describe l7-ilb-map
defaultService: regions/us-west1/backendServices/web-backend-service
hostRules:
- hosts:
  - '*'
  pathMatcher: pathmap
name: l7-ilb-map
pathMatchers:
- defaultService: regions/us-west1/backendServices/web-backend-service
  name: pathmap
  pathRules:
  - paths:
    - /video
    - /video/*
    service: regions/us-west1/backendServices/video-backend-service
region: regions/us-west1
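
To make the longest-path-matches-first behavior concrete, the following sketch shows a path matcher with a second, more specific path rule. The video-hd-backend-service name is hypothetical and exists only for illustration.

```yaml
# Hypothetical pathMatchers fragment. The order of pathRules does not matter.
pathMatchers:
- name: pathmap
  defaultService: regions/us-west1/backendServices/web-backend-service
  pathRules:
  # Listed first, but less specific.
  - paths:
    - /video/*
    service: regions/us-west1/backendServices/video-backend-service
  # A request for /video/hd/clip matches both rules; this longer path wins.
  - paths:
    - /video/hd/*
    service: regions/us-west1/backendServices/video-hd-backend-service
```

A request for /images/logo.png matches neither path rule, so it goes to the path matcher's default service.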

Advanced host, path, and route rule

Advanced host, path, and route rules provide additional configuration options compared to simple host and path rules. These options enable more advanced traffic management patterns and also modify some of the semantics. For example, route rules have an associated priority value and are interpreted in priority order (rather than by using longest-path-matches-first semantics).

As in the earlier simple host and path rule example, you can configure advanced traffic management by using a regional URL map. For example, the following URL map configures routing where 95% of the traffic is routed to one backend service, and 5% of the traffic is routed to another backend service.

$ gcloud compute url-maps describe l7-ilb-map
defaultService: regions/us-west1/backendServices/service-a
hostRules:
- hosts:
  - '*'
  pathMatcher: matcher1
name: l7-ilb-map
pathMatchers:
- defaultService: regions/us-west1/backendServices/service-a
  name: matcher1
  routeRules:
  - matchRules:
    - prefixMatch: ''
    routeAction:
      weightedBackendServices:
      - backendService: regions/us-west1/backendServices/service-a
        weight: 95
      - backendService: regions/us-west1/backendServices/service-b
        weight: 5
region: regions/us-west1

Host rules

When a request reaches your load balancer, the request's host field is evaluated against the hostRules defined in the URL map. Each host rule consists of a list of one or more hosts and a single path matcher (pathMatcher). If no hostRules are defined, the request is routed to the defaultService.

For more information, see hostRules[] and defaultService in the regional URL map API documentation.
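
The relationship between host rules and the default service can be sketched as follows. The host names and path matcher names are hypothetical, and the pathMatchers definitions that the host rules refer to are omitted for brevity.

```yaml
# Used when the request's host matches none of the hostRules below.
defaultService: regions/us-west1/backendServices/web-backend-service
hostRules:
# One host rule can list several hosts but names exactly one path matcher.
- hosts:
  - video.example.com
  - media.example.com
  pathMatcher: video-matcher
- hosts:
  - www.example.com
  pathMatcher: web-matcher
# Each pathMatcher named above must be defined under pathMatchers (not shown).
```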

Path matchers

After a request matches a host rule, the load balancer evaluates the path matcher corresponding to the host.

A path matcher is made up of the following:

  • One or more path rules (pathRules) or route rules (routeRules).
  • A default service (defaultService), which is the default backend service that is used when no other backend services match.

For more information, see pathMatchers[], pathMatchers[].pathRules[], and pathMatchers[].routeRules[] in the regional URL map API documentation.

Path rules

Path rules (pathRules) specify one or more URL paths, such as / or /video. Path rules are generally intended for the type of simple host and path-based routing described previously.

For more information, see pathRules[] in the regional URL map API documentation.

Route rules

A route rule (routeRules) matches information in an incoming request and makes a routing decision based on the match.

Route rules can contain a variety of different match rules (matchRules) and a variety of different route actions (routeAction).

A match rule evaluates the incoming request based on the HTTP(S) request's path, headers, and query parameters. Match rules support various types of matches (for example, prefix match) as well as modifiers (for example, case insensitivity). This enables you to, for example, send HTTP(S) requests to a set of backends based on the presence of a custom-defined HTTP header.

If you have multiple route rules, the load balancer executes them in priority order (based on the priority field), which allows you to specify custom logic for matching, routing, and other actions.

Within a given route rule, when the first match is made, the load balancer stops evaluating the match rules, and any remaining match rules are ignored.

Google Cloud performs the following actions:

  1. Looks for the first match rule that matches the request.
  2. Stops looking at any other match rules.
  3. Applies the actions in the corresponding route actions.
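
The effect of priority ordering can be sketched with two route rules whose prefixes overlap. The backend service names and the priority values are hypothetical.

```yaml
# Hypothetical routeRules fragment: evaluation follows priority, not path length.
routeRules:
# The lower number is evaluated first, so /video/hd requests stop here.
- priority: 4
  matchRules:
  - prefixMatch: /video/hd
  routeAction:
    weightedBackendServices:
    - backendService: regions/us-west1/backendServices/video-hd-service
      weight: 100
# Reached only by requests that did not match the rule above.
- priority: 25
  matchRules:
  - prefixMatch: /video
  routeAction:
    weightedBackendServices:
    - backendService: regions/us-west1/backendServices/video-service
      weight: 100
```

If the two priority values were swapped, a request for /video/hd would match the /video prefix first and never reach the more specific rule.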

Route rules have several components, as described in the following list. Each entry gives the component name, its API field name in parentheses, and a description.

  • Priority (priority): A number from 0 through 2,147,483,647 (that is, (2^31)-1) assigned to a route rule within a given path matcher.

    The priority determines the order of route rule evaluation. The priority of a rule decreases as its number increases so that a rule with priority 4 is evaluated before a rule with priority 25. The first rule that matches the request is applied.

    Priority numbers can have gaps. You cannot create more than one rule with the same priority.
  • Description (description): An optional description of up to 1,024 characters.
  • Service (service): The full or partial URL of the backend service resource to which traffic is directed if this rule is matched.
  • Match rules (matchRules): One or more rules that are evaluated against the request. These matchRules can match all or a subset of the request's HTTP attributes, such as the path, HTTP headers, and query (GET) parameters.

    Within a matchRule, all matching criteria must be met for the routeRule's routeActions to take effect. If a routeRule has multiple matchRules, the routeActions of the routeRule take effect when a request matches any of the routeRule's matchRules.
  • Route action (routeAction): Allows you to specify what actions to take when the match rule criteria are met. These actions include traffic splitting, URL rewrites, retries, request mirroring, fault injection, and CORS policies.
  • Redirect action (urlRedirect): You can configure an action to respond with an HTTP redirect when the match rule criteria are met. This field cannot be used in conjunction with a route action.
  • Header action (headerAction): You can configure request and response header transformation rules when the criteria within matchRules are met.
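
The matching semantics described for matchRules (all criteria within one match rule, any one of several match rules) can be sketched as follows. The header name X-Client-Tier, the paths, and the backend service name are hypothetical.

```yaml
# Hypothetical routeRules fragment showing AND within a matchRule
# and OR across matchRules.
routeRules:
- priority: 10
  matchRules:
  # Match rule 1: the path prefix AND the header must both match.
  - prefixMatch: /api
    headerMatches:
    - headerName: X-Client-Tier
      exactMatch: beta
  # Match rule 2: an alternative. Matching either rule 1 or rule 2 is enough.
  - prefixMatch: /beta
  routeAction:
    weightedBackendServices:
    - backendService: regions/us-west1/backendServices/service-beta
      weight: 100
```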

For more information, see the following fields in the regional URL map API documentation:

  • routeRules[]
  • routeRules[].priority
  • routeRules[].description
  • routeRules[].service
  • routeRules[].matchRules[]
  • routeRules[].routeAction
  • routeRules[].urlRedirect
  • routeRules[].headerAction

Match rules

Match rules (matchRules) match one or more attributes of a request and take actions specified in the route rule. The following list provides some examples of request attributes that can be matched by using match rules:

  • Host: A host name is the domain name portion of a URL. In the request, the host name comes from the Host header, as shown in this example curl command, where LOAD_BALANCER_IP is a placeholder for the load-balanced IP address and HOST_NAME is a placeholder for the host name:

    curl -v http://LOAD_BALANCER_IP/ --header 'Host: HOST_NAME'

  • Path: Paths follow the host name; for example, /images. The rule can specify whether the entire path or only the leading portion of the path needs to match.

  • Other HTTP request parameters, such as HTTP headers, which allow cookie matching, as well as matching based on query parameters (GET variables).

For a complete list of supported match rules, see pathMatchers[].routeRules[].matchRules[] in the regional URL map API documentation.
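
The following sketch shows a few of these match types side by side. The paths, the query parameter name, and the header name are hypothetical, and the fragment is meant to be read inside a single route rule.

```yaml
# Hypothetical matchRules fragment (part of one route rule).
matchRules:
# The entire path must equal /images/logo.png.
- fullPathMatch: /images/logo.png
# Only the leading portion must match; ignoreCase also accepts /Images/...
- prefixMatch: /images
  ignoreCase: true
# Path prefix plus a query parameter, for example /search?version=beta
- prefixMatch: /search
  queryParameterMatches:
  - name: version
    exactMatch: beta
# Path prefix plus the presence of a header, whatever its value.
- prefixMatch: /
  headerMatches:
  - headerName: X-Debug
    presentMatch: true
```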

Route actions

Route actions are specific actions to take when a route rule matches the attributes of a request.

Each entry in the following list gives the route action, its API field name in parentheses, and a description.

  • Redirects (urlRedirect): Returns a configurable 3xx response code. It also sets the Location response header with the appropriate URI, replacing the host and path as specified in the redirect action.
  • URL rewrites (urlRewrite): Rewrites the host name portion of the URL, the path portion of the URL, or both, before sending a request to the selected backend service.
  • Header transformations (headerAction): Adds or removes request headers before sending a request to the backend service. Can also add or remove response headers after receiving a response from the backend service.
  • Traffic mirroring (requestMirrorPolicy): In addition to forwarding the request to the selected backend service, sends an identical request to the configured mirror backend service on a fire-and-forget basis. The load balancer doesn't wait for a response from the backend to which it sends the mirrored request.

    Mirroring is useful for testing a new version of a backend service. You can also use it to debug production errors on a debug version of your backend service, rather than on the production version.
  • Weighted traffic splitting (weightedBackendServices): Allows traffic for a matched rule to be distributed to multiple backend services, proportional to a user-defined weight assigned to the individual backend service.

    This capability is useful for configuring staged deployments or A/B testing. For example, the route action could be configured such that 99% of the traffic is sent to a service that's running a stable version of an application, while 1% of the traffic is sent to a separate service running a newer version of that application.
  • Retries (retryPolicy): Configures the conditions under which the load balancer retries failed requests, how long the load balancer waits before retrying, and the maximum number of retries permitted.
  • Timeout (timeout): Specifies the timeout for the selected route. Timeout is computed from the time that the request is fully processed up until the time that the response is fully processed. Timeout includes all retries.
  • Fault injection (faultInjectionPolicy): Introduces errors when servicing requests to simulate failures, including high latency, service overload, service failures, and network partitioning. This feature is useful for testing the resiliency of a service to simulated faults.
  • Delay injection (faultInjectionPolicy): Introduces delays for a user-defined portion of requests before sending the request to the selected backend service.
  • Abort injection (faultInjectionPolicy): Responds directly to a fraction of requests with user-defined HTTP status codes instead of forwarding those requests to the backend service.
  • Security policies (corsPolicy): Cross-origin resource sharing (CORS) policies handle Internal HTTP(S) Load Balancing settings for enforcing CORS requests.
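
As an illustration of delay and abort injection, a fault injection policy might be attached to a route action as sketched below. The percentages, the delay, the status code, and the backend service name are arbitrary values chosen for the example, not recommendations.

```yaml
# Hypothetical routeAction fragment for resiliency testing.
routeAction:
  weightedBackendServices:
  - backendService: regions/us-west1/backendServices/service-a
    weight: 100
  faultInjectionPolicy:
    # Delay 10% of requests by 5 seconds before forwarding them.
    delay:
      fixedDelay:
        seconds: 5
      percentage: 10.0
    # Answer 1% of requests directly with HTTP 503 instead of forwarding them.
    abort:
      httpStatus: 503
      percentage: 1.0
```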

You can specify one of the following route actions (referred to as Primary actions in the Google Cloud Console):

  • Route traffic to a single service (service).
  • Split traffic between multiple services (weightedBackendServices weight:x, where x < 100).
  • Redirect URLs (urlRedirect).

In addition, you can combine any one of the previously mentioned route actions with one or more of the following route actions (referred to as Add-on actions in the Cloud Console):

  • Mirror traffic (requestMirrorPolicy).
  • Rewrite URL host/path (urlRewrite).
  • Retry failed requests (retryPolicy).
  • Set timeout (timeout).
  • Introduce faults to a percentage of the traffic (faultInjectionPolicy).
  • Add CORS policy (corsPolicy).
  • Manipulate request/response headers (headerAction).
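
The following sketch shows one route rule that uses a redirect as its only action, and a second rule that combines a traffic split with several add-on actions. All paths, host names, header names, backend service names, and numeric values are hypothetical.

```yaml
# Hypothetical routeRules fragment.
routeRules:
# Rule 1: primary action only (redirect). No routeAction is set on this rule.
- priority: 0
  matchRules:
  - prefixMatch: /old-docs
  urlRedirect:
    hostRedirect: docs.example.com
    prefixRedirect: /docs                            # replaces the matched prefix
    redirectResponseCode: MOVED_PERMANENTLY_DEFAULT  # HTTP 301
# Rule 2: primary action (traffic split) combined with add-on actions.
- priority: 1
  matchRules:
  - prefixMatch: /api/v1
  routeAction:
    weightedBackendServices:
    - backendService: regions/us-west1/backendServices/service-a
      weight: 95
    - backendService: regions/us-west1/backendServices/service-b
      weight: 5
    # Add-on: rewrite /api/v1/... to /v1/... before forwarding.
    urlRewrite:
      pathPrefixRewrite: /v1
    # Add-on: retry on 5xx responses, up to 3 times, 2 seconds per try.
    retryPolicy:
      retryConditions:
      - 5xx
      numRetries: 3
      perTryTimeout:
        seconds: 2
    # Add-on: overall timeout for the route, including all retries.
    timeout:
      seconds: 10
  # Add-on: add a request header before forwarding to the backend.
  headerAction:
    requestHeadersToAdd:
    - headerName: X-Route
      headerValue: api-v1
      replace: true
```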

For more information about the configuration and semantics of route actions, see the following in the regional URL map API documentation:

  • urlRedirect
  • urlRewrite
  • headerAction
  • requestMirrorPolicy
  • weightedBackendServices
  • retryPolicy
  • timeout
  • faultInjectionPolicy
  • corsPolicy

Traffic policies

By using backend service resources, you can configure traffic policies, which enable fine-tuned load balancing within an instance group or network endpoint group (NEG). These policies only take effect after a backend service has been selected by using your regional URL map (as described previously).

Traffic policies enable you to:

  • Control the load balancing algorithm among instances within the backend service.
  • Control the volume of connections to an upstream service.
  • Control the eviction of unhealthy hosts from a backend service.

The following traffic policy features are configured in the regional backend service.

Each entry in the following list gives the traffic policy, its API field name in parentheses, and a description.

  • Load balancing policy (localityLbPolicy): For a backend service, traffic distribution is based on a load balancing mode and a load balancing policy.

    The backend service first directs traffic to a backend (instance group or NEG) according to the backend's balancing mode. After a backend is selected, traffic is then distributed among the instances or endpoints in that backend according to the load balancing policy.

    The balancing mode allows the load balancer to first select a locality, such as a Google Cloud zone. The load balancing policy then determines a specific backend VM or endpoint in a NEG.

    Various load balancing algorithms (such as round robin, least request, and others) are supported. For a complete list of algorithms, see localityLbPolicy in the regional backend service API documentation.
  • Session affinity (consistentHash): Includes HTTP cookie-based affinity, HTTP header-based affinity, client IP address affinity, and generated cookie affinity. Session affinity provides a best-effort attempt to send requests from a particular client to the same backend for as long as the backend is healthy and has capacity.

    For more information about session affinity, see consistentHash in the regional backend service API documentation.
  • Outlier detection (outlierDetection): A set of policies that specify the criteria for eviction of unhealthy backend VMs or endpoints in NEGs, along with criteria defining when a backend or endpoint is considered healthy enough to receive traffic again.

    For more information about outlier detection, see outlierDetection in the regional backend service API documentation.
  • Circuit breaking (circuitBreakers): Sets upper limits on the volume of connections and requests per connection to a backend service.

    For more information about circuit breaking, see circuitBreakers in the regional backend service API documentation.
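
A backend service that uses these traffic policies might be described as in the following sketch. The service name, the cookie name, and every numeric threshold are hypothetical values for illustration. The sketch pairs cookie-based affinity with RING_HASH because this document notes that session affinity takes effect only with the MAGLEV or RING_HASH policies.

```yaml
# Hypothetical regional backend service (backends and health checks omitted).
name: service-a
loadBalancingScheme: INTERNAL_MANAGED
protocol: HTTP
# Load balancing policy applied within the selected backend.
localityLbPolicy: RING_HASH
# Session affinity based on an HTTP cookie.
sessionAffinity: HTTP_COOKIE
consistentHash:
  httpCookie:
    name: session-id
    ttl:
      seconds: 3600
  minimumRingSize: 1024
# Circuit breaking: upper limits on connections and requests per connection.
circuitBreakers:
  maxConnections: 1000
  maxRequestsPerConnection: 100
# Outlier detection: eject a host after 5 consecutive errors, checked every
# 10 seconds, for a base period of 30 seconds, ejecting at most half the hosts.
outlierDetection:
  consecutiveErrors: 5
  interval:
    seconds: 10
  baseEjectionTime:
    seconds: 30
  maxEjectionPercent: 50
```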

Configuring traffic management

You can use the Cloud Console, gcloud, or the Cloud Load Balancing API to configure traffic management. Within your chosen configuration environment, you set up traffic management by using YAML configurations. A URL map and a backend service each has its own YAML file. Depending on your desired functionality, you need to write either a URL map YAML, a backend service YAML, or both.
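
With gcloud, one possible workflow is to export the existing URL map to a file, edit it, and import it again. The sketch below shows a minimal URL map file with that workflow in comments. The file name url-map.yaml is hypothetical, and the resource names reuse the examples from earlier in this document.

```yaml
# url-map.yaml (hypothetical file name)
#
# Possible workflow, shown as comments because the commands run in a shell:
#   gcloud compute url-maps export l7-ilb-map \
#       --destination=url-map.yaml --region=us-west1
#   (edit this file)
#   gcloud compute url-maps import l7-ilb-map \
#       --source=url-map.yaml --region=us-west1
name: l7-ilb-map
defaultService: regions/us-west1/backendServices/service-a
hostRules:
- hosts:
  - '*'
  pathMatcher: matcher1
pathMatchers:
- name: matcher1
  defaultService: regions/us-west1/backendServices/service-a
```

A backend service YAML file can be handled the same way with the gcloud compute backend-services import command.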

For help writing these YAML files, you can use the YAML examples in the Cloud Console.

Accessing the YAML examples in the Cloud Console

To access YAML examples in the Cloud Console:

  1. Go to the Load balancing page in the Google Cloud Console.
  2. Under HTTP(S) Load Balancing, click Start configuration.
  3. Select Only between my VMs. This setting means that the load balancer is internal.
  4. Click Continue.
  5. In the Routing rules configuration, select Advanced host, path and route rule.
  6. Click Add hosts and path matcher.
  7. Click the Code guidance link.

The Path matcher YAML examples page appears.

Limitations

  • RouteRule.service does not currently work. The workaround is to use RouteRule.weightedBackendServices with a single WeightedBackendService.
  • Path regular expressions (pathMatchers.routeRules.matchRules.regexMatch) aren't supported in Internal HTTP(S) Load Balancing.
  • UrlMap.defaultRouteAction and UrlMap.defaultUrlRedirect don't currently work. You must specify UrlMap.defaultService for handling traffic that does not match any of the hosts in UrlMap.hostRules[] in that UrlMap.
  • UrlMap.pathMatchers[].defaultRouteAction and UrlMap.pathMatchers[].defaultUrlRedirect do not currently work. You must specify UrlMap.pathMatchers[].defaultService for handling traffic that does not match any of the routeRules for that pathMatcher.
  • If the value of BackendService.sessionAffinity is not NONE, and BackendService.localityLbPolicy is set to a load balancing policy other than MAGLEV or RING_HASH, the session affinity settings don't take effect.
  • The gcloud compute backend-services import command doesn't delete top-level fields of the resource, such as the backend service and the URL map. For example, if you create a backend service with settings for circuitBreakers, you can update those settings by using a subsequent gcloud compute backend-services import command. However, you can't delete those settings from the backend service. You can delete and recreate the resource without the circuitBreakers settings.
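
The workaround for the first item in the list above, routing to a single service through weightedBackendServices, can be sketched as follows. The priority, the path prefix, and the backend service name are hypothetical.

```yaml
# Hypothetical route rule that sends all matching traffic to one service
# without using the routeRules[].service field.
routeRules:
- priority: 0
  matchRules:
  - prefixMatch: /
  routeAction:
    weightedBackendServices:
    # A single entry receives all of the matched traffic.
    - backendService: regions/us-west1/backendServices/service-a
      weight: 100
```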

What's next