gcloud alpha compute backend-services create

NAME
gcloud alpha compute backend-services create - create a backend service
SYNOPSIS
gcloud alpha compute backend-services create BACKEND_SERVICE_NAME [--affinity-cookie-ttl=AFFINITY_COOKIE_TTL] [--bypass-cache-on-request-headers=BYPASS_CACHE_ON_REQUEST_HEADERS] [--no-cache-key-include-host] [--cache-key-include-http-header=[HEADER_FIELD_NAME,…]] [--cache-key-include-named-cookie=[NAMED_COOKIE,…]] [--no-cache-key-include-protocol] [--no-cache-key-include-query-string] [--cache-mode=CACHE_MODE] [--client-ttl=CLIENT_TTL] [--compression-mode=COMPRESSION_MODE] [--connection-drain-on-failover] [--connection-draining-timeout=CONNECTION_DRAINING_TIMEOUT] [--connection-persistence-on-unhealthy-backends=CONNECTION_PERSISTENCE_ON_UNHEALTHY_BACKENDS] [--custom-request-header=CUSTOM_REQUEST_HEADER] [--custom-response-header=CUSTOM_RESPONSE_HEADER] [--default-ttl=DEFAULT_TTL] [--description=DESCRIPTION] [--drop-traffic-if-unhealthy] [--[no-]enable-cdn] [--[no-]enable-logging] [--[no-]enable-strong-affinity] [--failover-ratio=FAILOVER_RATIO] [--health-checks=HEALTH_CHECK,[…]] [--http-health-checks=HTTP_HEALTH_CHECK,[…]] [--https-health-checks=HTTPS_HEALTH_CHECK,[…]] [--iap=disabled|enabled,[oauth2-client-id=OAUTH2-CLIENT-ID,oauth2-client-secret=OAUTH2-CLIENT-SECRET]] [--idle-timeout-sec=IDLE_TIMEOUT_SEC] [--ip-address-selection-policy=IP_ADDRESS_SELECTION_POLICY] [--load-balancing-scheme=LOAD_BALANCING_SCHEME; default="EXTERNAL"] [--locality-lb-policy=LOCALITY_LB_POLICY] [--logging-optional=LOGGING_OPTIONAL] [--logging-optional-fields=[LOGGING_OPTIONAL_FIELDS,…]] [--logging-sample-rate=LOGGING_SAMPLE_RATE] [--max-ttl=MAX_TTL] [--[no-]negative-caching] [--negative-caching-policy=[[CODE=TTL],…]] [--network=NETWORK] [--port-name=PORT_NAME] [--protocol=PROTOCOL] [--[no-]request-coalescing] [--serve-while-stale=SERVE_WHILE_STALE] [--service-bindings=SERVICE_BINDING,[…]] [--service-lb-policy=SERVICE_LOAD_BALANCING_POLICY] [--session-affinity=SESSION_AFFINITY] [--signed-url-cache-max-age=SIGNED_URL_CACHE_MAX_AGE] [--subsetting-policy=SUBSETTING_POLICY; default="NONE"] 
[--subsetting-subset-size=SUBSETTING_SUBSET_SIZE] [--timeout=TIMEOUT; default="30s"] [--tracking-mode=TRACKING_MODE] [--cache-key-query-string-blacklist=[QUERY_STRING,…] | --cache-key-query-string-whitelist=QUERY_STRING,[…]] [--global | --region=REGION] [--global-health-checks | --health-checks-region=HEALTH_CHECKS_REGION] [GCLOUD_WIDE_FLAG]
DESCRIPTION
(ALPHA) gcloud alpha compute backend-services create creates a backend service. A backend service defines how Cloud Load Balancing distributes traffic. The backend service configuration contains a set of values, such as the protocol used to connect to backends, various distribution and session settings, health checks, and timeouts. These settings provide fine-grained control over how your load balancer behaves. Most of the settings have default values that allow for easy configuration if you need to get started quickly.

After you create a backend service, you add backends by using gcloud compute backend-services add-backend.

For more information about the available settings, see https://cloud.google.com/load-balancing/docs/backend-service.
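
As a minimal sketch of the workflow described above, the following script assembles a create command for a global backend service and prints it instead of running it. The names web-backend and web-hc are hypothetical, and the health check is assumed to exist already; only flags documented on this page are used.

```shell
# Hypothetical resource names; replace with your own.
BACKEND_SERVICE="web-backend"
HEALTH_CHECK="web-hc"   # assumed to exist already

# Assemble the command from flags documented on this page.
CREATE_CMD="gcloud alpha compute backend-services create ${BACKEND_SERVICE} \
--load-balancing-scheme=EXTERNAL_MANAGED \
--protocol=HTTP \
--port-name=http \
--health-checks=${HEALTH_CHECK} \
--timeout=30s \
--global"

# Print the command for review; to execute it, run: eval "$CREATE_CMD"
echo "$CREATE_CMD"
```

Printing first makes it easy to review the flag combination before anything is created in the project.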

POSITIONAL ARGUMENTS
BACKEND_SERVICE_NAME
Name of the backend service to create.
FLAGS
--affinity-cookie-ttl=AFFINITY_COOKIE_TTL
If session-affinity is set to "generated_cookie", this flag sets the TTL, in seconds, of the resulting cookie. A setting of 0 indicates that the cookie should be transient. See $ gcloud topic datetimes for information on duration formats.
--bypass-cache-on-request-headers=BYPASS_CACHE_ON_REQUEST_HEADERS
Bypass the cache when the specified request headers are matched - e.g. Pragma or Authorization headers. Up to 5 headers can be specified.

The cache is bypassed for all cdnPolicy.cacheMode settings.

Note that requests that include these headers will always fill from origin, and may result in a large number of cache misses if the specified headers are common to many requests.

Values are case-insensitive.

The header name must be a valid HTTP header field token (per RFC 7230).

For the list of restricted headers, see the list of required header name properties in How custom headers work.

A header name must not appear more than once in the list of added headers.

--cache-key-include-host
Enable including host in cache key. If enabled, requests to different hosts will be cached separately. Can only be applied for global resources. Enabled by default, use --no-cache-key-include-host to disable.
--cache-key-include-http-header=[HEADER_FIELD_NAME,…]
Specifies a comma-separated list of HTTP headers, by field name, to include in cache keys. Only the request URL is included in the cache key by default.
--cache-key-include-named-cookie=[NAMED_COOKIE,…]
Specifies a comma-separated list of HTTP cookie names to include in cache keys. The name=value pair is used in the cache key Cloud CDN generates. Cookies are not included in cache keys by default.
--cache-key-include-protocol
Enable including protocol in cache key. If enabled, http and https requests will be cached separately. Can only be applied for global resources. Enabled by default, use --no-cache-key-include-protocol to disable.
--cache-key-include-query-string
Enable including query string in cache key. If enabled, the query string parameters will be included according to --cache-key-query-string-whitelist and --cache-key-query-string-blacklist. If neither is set, the entire query string will be included. If disabled, then the entire query string will be excluded. Can only be applied for global resources. Enabled by default, use --no-cache-key-include-query-string to disable.
--cache-mode=CACHE_MODE
Specifies the cache setting for all responses from this backend. CACHE_MODE must be one of:
CACHE_ALL_STATIC
Automatically cache static content, including common image formats, media (video and audio), web assets (JavaScript and CSS). Requests and responses that are marked as uncacheable, as well as dynamic content (including HTML), aren't cached.
FORCE_CACHE_ALL
Cache all content, ignoring any "private", "no-store" or "no-cache" directives in Cache-Control response headers. Warning: this may result in Cloud CDN caching private, per-user (user identifiable) content. You should only enable this on backends that are not serving private or dynamic content, such as storage buckets.
USE_ORIGIN_HEADERS
Require the origin to set valid caching headers to cache content. Responses without these headers aren't cached at Google's edge, and require a full trip to the origin on every request, potentially impacting performance and increasing load on the origin server.
--client-ttl=CLIENT_TTL
Specifies a separate client (for example, browser client) TTL, separate from the TTL for Cloud CDN's edge caches.

This allows you to set a shorter TTL for browsers/clients, and to have those clients revalidate content against Cloud CDN on a more regular basis, without requiring revalidation at the origin.

The value of clientTtl cannot be set to a value greater than that of maxTtl, but can be equal.

Any cacheable response has its max-age/s-maxage directives adjusted down to the client TTL value if necessary; an Expires header will be replaced with a suitable max-age directive.

The maximum allowed value is 31,622,400s (1 year).

When creating a new backend with CACHE_ALL_STATIC and the field is unset, or when switching to that mode and the field is unset, a default value of 3600 is used.

When the cache mode is set to "USE_ORIGIN_HEADERS", you must omit this field.

--compression-mode=COMPRESSION_MODE
Compress text responses using Brotli or gzip compression, based on the client's Accept-Encoding header. Two modes are supported: AUTOMATIC (recommended) - automatically uses the best compression based on the Accept-Encoding header sent by the client. In most cases, this will result in Brotli compression being favored. DISABLED - disables compression. Existing compressed responses cached by Cloud CDN will not be served to clients. COMPRESSION_MODE must be one of: DISABLED, AUTOMATIC.
--connection-drain-on-failover
Applicable only for backend service-based external and internal passthrough Network Load Balancers as part of a connection tracking policy. Only applicable when the backend service protocol is TCP. Not applicable to any other load balancer. Enabled by default, this option instructs the load balancer to allow established TCP connections to persist for up to 300 seconds on instances or endpoints in primary backends during failover, and on instances or endpoints in failover backends during failback. For details, see: Connection draining on failover and failback for internal passthrough Network Load Balancers and Connection draining on failover and failback for external passthrough Network Load Balancers.
--connection-draining-timeout=CONNECTION_DRAINING_TIMEOUT
Connection draining timeout to be used during removal of VMs from instance groups. This guarantees that for the specified time all existing connections to a VM will remain untouched, but no new connections will be accepted. Set the timeout to zero to disable connection draining. Enable the feature by specifying a timeout of up to one hour. If the flag is omitted, the API default value (0s) is used. See $ gcloud topic datetimes for information on duration formats.
--connection-persistence-on-unhealthy-backends=CONNECTION_PERSISTENCE_ON_UNHEALTHY_BACKENDS
Specifies connection persistence when backends are unhealthy. The default value is DEFAULT_FOR_PROTOCOL. CONNECTION_PERSISTENCE_ON_UNHEALTHY_BACKENDS must be one of: DEFAULT_FOR_PROTOCOL, NEVER_PERSIST, ALWAYS_PERSIST.
--custom-request-header=CUSTOM_REQUEST_HEADER
Specifies an HTTP header to be added by your load balancer. This flag can be repeated to specify multiple headers. For example:
gcloud alpha compute backend-services create NAME --custom-request-header "header-name: value" --custom-request-header "another-header:"
--custom-response-header=CUSTOM_RESPONSE_HEADER
Custom headers that the external Application Load Balancer adds to proxied responses. For the list of headers, see Creating custom headers.

Variables are not case-sensitive.

--default-ttl=DEFAULT_TTL
Specifies the default TTL for cached content served by this origin for responses that do not have an existing valid TTL (max-age or s-maxage).

The default value is 3600s for cache modes that allow a default TTL to be defined.

The value of defaultTtl cannot be set to a value greater than that of maxTtl, but can be equal.

When the cacheMode is set to FORCE_CACHE_ALL, the defaultTtl overwrites the TTL set in all responses.

A TTL of "0" means always revalidate.

The maximum allowed value is 31,622,400s (1 year). Infrequently accessed objects may be evicted from the cache before the defined TTL.

When creating a new backend with CACHE_ALL_STATIC or FORCE_CACHE_ALL and the field is unset, or when updating an existing backend to use these modes and the field is unset, a default value of 3600 is used. When the cache mode is set to "USE_ORIGIN_HEADERS", you must omit this field.

--description=DESCRIPTION
An optional, textual description for the backend service.
--drop-traffic-if-unhealthy
Applicable only for backend service-based external and internal passthrough Network Load Balancers as part of a connection tracking policy. Not applicable to any other load balancer. This option instructs the load balancer to drop packets when all instances or endpoints in primary and failover backends do not pass their load balancer health checks. For details, see: Dropping traffic when all backend VMs are unhealthy for internal passthrough Network Load Balancers and Dropping traffic when all backend VMs are unhealthy for external passthrough Network Load Balancers.
--[no-]enable-cdn
Enable or disable Cloud CDN for the backend service. Only available for backend services with --load-balancing-scheme=EXTERNAL that use a --protocol of HTTP, HTTPS, or HTTP2. Cloud CDN caches HTTP responses at the edge of Google's network. Cloud CDN is disabled by default. Use --enable-cdn to enable and --no-enable-cdn to disable.
--[no-]enable-logging
The logging options for the load balancer traffic served by this backend service. If logging is enabled, logs will be exported to Cloud Logging. Disabled by default. This field cannot be specified for global external proxy Network Load Balancers. Use --enable-logging to enable and --no-enable-logging to disable.
--[no-]enable-strong-affinity
Enable or disable strong session affinity. This is only available for loadbalancingScheme EXTERNAL. Use --enable-strong-affinity to enable and --no-enable-strong-affinity to disable.
--failover-ratio=FAILOVER_RATIO
Applicable only to backend service-based external passthrough Network Load Balancers and internal passthrough Network Load Balancers as part of a failover policy. Not applicable to any other load balancer. This option defines the ratio used to control when failover and failback occur. For details, see: Failover ratio for internal passthrough Network Load Balancers and Failover ratio for external passthrough Network Load Balancer overview.
--health-checks=HEALTH_CHECK,[…]
Specifies a list of health check objects for checking the health of the backend service. Currently at most one health check can be specified. Health checks need not be for the same protocol as that of the backend service.
--http-health-checks=HTTP_HEALTH_CHECK,[…]
Specifies a list of legacy HTTP health check objects for checking the health of the backend service.

Legacy health checks are not recommended for backend services. It is possible to use a legacy health check on a backend service for an Application Load Balancer if that backend service uses instance groups. For more information, refer to this guide: https://cloud.google.com/load-balancing/docs/health-check-concepts#lb_guide.

--https-health-checks=HTTPS_HEALTH_CHECK,[…]
Specifies a list of legacy HTTPS health check objects for checking the health of the backend service.

Legacy health checks are not recommended for backend services. It is possible to use a legacy health check on a backend service for an Application Load Balancer if that backend service uses instance groups. For more information, refer to this guide: https://cloud.google.com/load-balancing/docs/health-check-concepts#lb_guide.

--iap=disabled|enabled,[oauth2-client-id=OAUTH2-CLIENT-ID,oauth2-client-secret=OAUTH2-CLIENT-SECRET]
Configure Identity Aware Proxy (IAP) for external HTTP(S) load balancing. You can configure IAP to be enabled or disabled (default). If enabled, you can provide values for oauth2-client-id and oauth2-client-secret. For example, --iap=enabled,oauth2-client-id=foo,oauth2-client-secret=bar turns IAP on, and --iap=disabled turns it off. For more information, see https://cloud.google.com/iap/.
--idle-timeout-sec=IDLE_TIMEOUT_SEC
Specifies how long to keep a connection tracking table entry while there is no matching traffic (in seconds). Applicable only for backend service-based external and internal passthrough Network Load Balancers as part of a connection tracking policy.
--ip-address-selection-policy=IP_ADDRESS_SELECTION_POLICY
Specifies a preference for traffic sent from the proxy to the backend (or from the client to the backend for proxyless gRPC).

Can only be set if load balancing scheme is INTERNAL_SELF_MANAGED, INTERNAL_MANAGED or EXTERNAL_MANAGED.

The possible values are:

IPV4_ONLY
  Only send IPv4 traffic to the backends of the backend service,
  regardless of traffic from the client to the proxy. Only IPv4
  health checks are used to check the health of the backends.
PREFER_IPV6
  Prioritize the connection to the endpoint's IPv6 address over its IPv4
  address (provided there is a healthy IPv6 address).
IPV6_ONLY
  Only send IPv6 traffic to the backends of the backend service,
  regardless of traffic from the client to the proxy. Only IPv6
  health checks are used to check the health of the backends.

IP_ADDRESS_SELECTION_POLICY must be one of: IPV4_ONLY, PREFER_IPV6, IPV6_ONLY.

--load-balancing-scheme=LOAD_BALANCING_SCHEME; default="EXTERNAL"
Specifies the load balancer type. Choose EXTERNAL for the classic Application Load Balancers, the external passthrough Network Load Balancers, and the global external proxy Network Load Balancers. Choose EXTERNAL_MANAGED for the Envoy-based global and regional external Application Load Balancers, and the regional external proxy Network Load Balancers. Choose INTERNAL for the internal passthrough Network Load Balancers. Choose INTERNAL_MANAGED for Envoy-based internal load balancers such as the internal Application Load Balancers and the internal proxy Network Load Balancers. Choose INTERNAL_SELF_MANAGED for Traffic Director. For more information, refer to this guide: https://cloud.google.com/load-balancing/docs/choosing-load-balancer. LOAD_BALANCING_SCHEME must be one of: INTERNAL, EXTERNAL, INTERNAL_SELF_MANAGED, EXTERNAL_MANAGED, INTERNAL_MANAGED.
--locality-lb-policy=LOCALITY_LB_POLICY
The load balancing algorithm used within the scope of the locality. LOCALITY_LB_POLICY must be one of: INVALID_LB_POLICY, ROUND_ROBIN, LEAST_REQUEST, RING_HASH, RANDOM, ORIGINAL_DESTINATION, MAGLEV, WEIGHTED_MAGLEV.
--logging-optional=LOGGING_OPTIONAL
This field can only be specified if logging is enabled for the backend service. Configures whether all, none, or a subset of optional fields should be added to the reported logs. Default is EXCLUDE_ALL_OPTIONAL. This field can only be specified for internal and external passthrough Network Load Balancers. LOGGING_OPTIONAL must be one of: EXCLUDE_ALL_OPTIONAL, INCLUDE_ALL_OPTIONAL, CUSTOM.
--logging-optional-fields=[LOGGING_OPTIONAL_FIELDS,…]
This field can only be specified if logging is enabled for the backend service and "--logging-optional" was set to CUSTOM. Contains a comma-separated list of optional fields you want to include in the logs. For example: serverInstance, serverGkeDetails.cluster, serverGkeDetails.pod.podNamespace. This can only be specified for internal and external passthrough Network Load Balancers.
--logging-sample-rate=LOGGING_SAMPLE_RATE
This field can only be specified if logging is enabled for the backend service. The value of the field must be a float in the range [0, 1]. This configures the sampling rate of requests to the load balancer where 1.0 means all logged requests are reported and 0.0 means no logged requests are reported. The default value is 1.0 when logging is enabled and 0.0 otherwise.
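
The range rule for the sample rate can be checked before the flags are assembled. The following is a minimal sketch, not part of gcloud: the helper name valid_sample_rate is hypothetical, and it accepts plain decimal notation only.

```shell
# Returns 0 if the argument is a plain decimal number in the range [0, 1].
valid_sample_rate() {
  awk -v r="$1" 'BEGIN { exit !(r ~ /^[0-9]*\.?[0-9]+$/ && r + 0 >= 0 && r + 0 <= 1) }'
}

SAMPLE_RATE="0.5"   # report half of the logged requests
if valid_sample_rate "$SAMPLE_RATE"; then
  LOGGING_FLAGS="--enable-logging --logging-sample-rate=${SAMPLE_RATE}"
else
  echo "sample rate must be in [0, 1]: ${SAMPLE_RATE}" >&2
  LOGGING_FLAGS=""
fi

# These flags would be appended to a backend-services create command.
echo "$LOGGING_FLAGS"
```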
--max-ttl=MAX_TTL
Specifies the maximum allowed TTL for cached content served by this origin.

The default value is 86400 for cache modes that support a max TTL.

Cache directives that attempt to set a max-age or s-maxage higher than this, or an Expires header more than maxTtl seconds in the future, are capped at the value of maxTtl, as if it were the value of an s-maxage Cache-Control directive.

A TTL of "0" means always revalidate.

The maximum allowed value is 31,622,400s (1 year). Infrequently accessed objects may be evicted from the cache before the defined TTL.

When creating a new backend with CACHE_ALL_STATIC and the field is unset, or when updating an existing backend to use these modes and the field is unset, a default value of 86400 is used. When the cache mode is set to "USE_ORIGIN_HEADERS" or "FORCE_CACHE_ALL", you must omit this field.
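
The ordering rules among the three TTL flags (clientTtl and defaultTtl may not exceed maxTtl, and none may exceed 31,622,400s) can be checked locally before calling gcloud. This is a rough sketch under the assumption that all values are given in whole seconds; the function name check_ttls is hypothetical, and it only makes sense for CACHE_ALL_STATIC, where all three flags may be set.

```shell
MAX_ALLOWED_TTL=31622400   # upper bound stated for these flags, in seconds

# check_ttls CLIENT_TTL DEFAULT_TTL MAX_TTL (whole seconds)
# Prints "ok" or the first violated rule; returns non-zero on a violation.
check_ttls() {
  client_ttl=$1
  default_ttl=$2
  max_ttl=$3
  for v in "$client_ttl" "$default_ttl" "$max_ttl"; do
    if [ "$v" -lt 0 ] || [ "$v" -gt "$MAX_ALLOWED_TTL" ]; then
      echo "TTL out of range: $v"
      return 1
    fi
  done
  if [ "$client_ttl" -gt "$max_ttl" ]; then
    echo "client-ttl exceeds max-ttl"
    return 1
  fi
  if [ "$default_ttl" -gt "$max_ttl" ]; then
    echo "default-ttl exceeds max-ttl"
    return 1
  fi
  echo "ok"
}

# The documented defaults for default-ttl and max-ttl, with a shorter client TTL.
check_ttls 600 3600 86400
```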

--[no-]negative-caching
Negative caching allows per-status code cache TTLs to be set, in order to apply fine-grained caching for common errors or redirects. This can reduce the load on your origin and improve the end-user experience by reducing response latency.

Negative caching applies to a set of 3xx, 4xx, and 5xx status codes that are typically useful to cache.

Status codes not listed here cannot have their TTL explicitly set and aren't cached, in order to avoid cache poisoning attacks.

HTTP success codes (HTTP 2xx) are handled by the values of defaultTtl and maxTtl.

When the cache mode is set to CACHE_ALL_STATIC or USE_ORIGIN_HEADERS, these values apply to responses with the specified response code that lack any cache-control or expires headers.

When the cache mode is set to FORCE_CACHE_ALL, these values apply to all responses with the specified response code, and override any caching headers.

Cloud CDN applies the following default TTLs to these status codes:

  • HTTP 300 (Multiple Choices), 301, 308 (Permanent Redirects): 10m
  • HTTP 404 (Not Found), 410 (Gone), 451 (Unavailable For Legal Reasons): 120s
  • HTTP 405 (Method Not Allowed), 421 (Misdirected Request), 501 (Not Implemented): 60s

These defaults can be overridden in cdnPolicy.negativeCachingPolicy.

Use --negative-caching to enable and --no-negative-caching to disable.

--negative-caching-policy=[[CODE=TTL],…]
Sets a cache TTL for the specified HTTP status code.

NegativeCaching must be enabled to configure the negativeCachingPolicy.

If you omit the policy and leave negativeCaching enabled, Cloud CDN's default cache TTLs are used.

Note that when specifying an explicit negative caching policy, make sure that you specify a cache TTL for all response codes that you want to cache. Cloud CDN doesn't apply any default negative caching when a policy exists.

CODE is the HTTP status code to define a TTL against. Only HTTP status codes 300, 301, 308, 404, 405, 410, 421, 451, and 501 can be specified as values, and you cannot specify a status code more than once.

TTL is the time to live (in seconds) for which to cache responses for the specified CODE. The maximum allowed value is 1800s (30 minutes), noting that infrequently accessed objects may be evicted from the cache before the defined TTL.
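
The constraints on CODE and TTL listed above can be checked before the policy is passed to gcloud. The sketch below is illustrative only: validate_policy is a hypothetical helper, and it assumes TTLs are written as whole seconds.

```shell
ALLOWED_CODES="300 301 308 404 405 410 421 451 501"
MAX_NEGATIVE_TTL=1800   # 30 minutes, in seconds

# validate_policy "CODE=TTL,CODE=TTL,..."
# Prints "ok" when every entry uses an allowed code once and a TTL of at most 1800.
validate_policy() {
  seen=""
  old_ifs=$IFS
  IFS=,
  set -- $1          # split the list on commas
  IFS=$old_ifs
  for entry in "$@"; do
    case "$entry" in *=*) ;; *) echo "malformed entry: $entry"; return 1 ;; esac
    code=${entry%%=*}
    ttl=${entry#*=}
    case " $ALLOWED_CODES " in *" $code "*) ;; *) echo "unsupported code: $code"; return 1 ;; esac
    case " $seen " in *" $code "*) echo "duplicate code: $code"; return 1 ;; esac
    case "$ttl" in ''|*[!0-9]*) echo "invalid TTL: $ttl"; return 1 ;; esac
    [ "$ttl" -le "$MAX_NEGATIVE_TTL" ] || { echo "TTL too large: $ttl"; return 1; }
    seen="$seen $code"
  done
  echo "ok"
}

POLICY="404=60,410=120,301=600"
validate_policy "$POLICY" && echo "--negative-caching --negative-caching-policy=${POLICY}"
```

Because Cloud CDN applies no default negative caching once a policy exists, the list should name every status code that is meant to be cached.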

--network=NETWORK
Network that this backend service applies to. It can only be set if the load-balancing-scheme is INTERNAL.
--port-name=PORT_NAME
Backend services for Application Load Balancers and proxy Network Load Balancers must reference exactly one named port if using instance group backends.

Each instance group backend exports one or more named ports, which map a user-configurable name to a port number. The backend service's named port subscribes to one named port on each instance group. The resolved port number can differ among instance group backends, based on each instance group's named port list.

When omitted, a backend service subscribes to a named port called http.

The named port for a backend service is either ignored or cannot be set for these load balancing configurations:

  • For any load balancer, if the backends are not instance groups (for example, GCE_VM_IP_PORT NEGs).
  • For any type of backend on a backend service for internal or external passthrough Network Load Balancers.

See also https://cloud.google.com/load-balancing/docs/backend-service#named_ports.

--protocol=PROTOCOL
Protocol for incoming requests.

If the load-balancing-scheme is INTERNAL (Internal passthrough Network Load Balancer), the protocol must be one of: TCP, UDP, UNSPECIFIED.

If the load-balancing-scheme is INTERNAL_SELF_MANAGED (Traffic Director), the protocol must be one of: HTTP, HTTPS, HTTP2, GRPC.

If the load-balancing-scheme is INTERNAL_MANAGED (Internal Application Load Balancer), the protocol must be one of: HTTP, HTTPS, HTTP2.

If the load-balancing-scheme is EXTERNAL and region is not set (Classic Application Load Balancer and global external proxy Network Load Balancer), the protocol must be one of: HTTP, HTTPS, HTTP2, SSL, TCP.

If the load-balancing-scheme is EXTERNAL and region is set (External passthrough Network Load Balancer), the protocol must be one of: TCP, UDP, UNSPECIFIED.

If the load-balancing-scheme is EXTERNAL_MANAGED (Envoy-based global and regional external Application Load Balancers), the protocol must be one of: HTTP, HTTPS, HTTP2.
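
The scheme-to-protocol rules above can be summarized as a small lookup. This is a sketch that mirrors the lists on this page, not a gcloud feature; the function names allowed_protocols and protocol_allowed are hypothetical, and a non-empty second argument stands for "region is set".

```shell
# allowed_protocols SCHEME [REGION]
# Prints the protocols this page lists for the given load balancing scheme.
allowed_protocols() {
  scheme=$1
  region=${2:-}
  case "$scheme" in
    INTERNAL)              echo "TCP UDP UNSPECIFIED" ;;
    INTERNAL_SELF_MANAGED) echo "HTTP HTTPS HTTP2 GRPC" ;;
    INTERNAL_MANAGED|EXTERNAL_MANAGED) echo "HTTP HTTPS HTTP2" ;;
    EXTERNAL)
      if [ -n "$region" ]; then
        echo "TCP UDP UNSPECIFIED"          # regional: external passthrough
      else
        echo "HTTP HTTPS HTTP2 SSL TCP"     # global: classic ALB / proxy NLB
      fi ;;
    *) echo "unknown scheme: $scheme" >&2; return 1 ;;
  esac
}

# protocol_allowed PROTOCOL SCHEME [REGION]; returns 0 if the pairing is listed.
protocol_allowed() {
  case " $(allowed_protocols "$2" "${3:-}") " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

allowed_protocols EXTERNAL
allowed_protocols EXTERNAL us-central1
```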

--[no-]request-coalescing
Enables request coalescing to the backend (recommended).

Request coalescing (or collapsing) combines multiple concurrent cache fill requests into a small number of requests to the origin. This can improve performance by putting less load on the origin and backend infrastructure. However, coalescing adds a small amount of latency when multiple requests to the same URL are processed, so for latency-critical applications it may not be desirable.

Defaults to true.

Use --request-coalescing to enable and --no-request-coalescing to disable.

--serve-while-stale=SERVE_WHILE_STALE
Serve existing content from the cache (if available) when revalidating content with the origin; this allows content to be served more quickly, and also allows content to continue to be served if the backend is down or reporting errors.

This setting defines the default serve-stale duration for any cached responses that do not specify a stale-while-revalidate directive. Stale responses that exceed the TTL configured here will not be served without first being revalidated with the origin. The default limit is 86400s (1 day), which will allow stale content to be served up to this limit beyond the max-age (or s-maxage) of a cached response.

The maximum allowed value is 604800 (1 week).

Set this to zero (0) to disable serve-while-stale.

--service-bindings=SERVICE_BINDING,[…]
List of service bindings to be attached to this backend service. Can only be set if load balancing scheme is INTERNAL_SELF_MANAGED. If set, lists of backends and health checks must be both empty.
--service-lb-policy=SERVICE_LOAD_BALANCING_POLICY
Service load balancing policy to be applied to this backend service. Can only be set if load balancing scheme is EXTERNAL_MANAGED, INTERNAL_MANAGED, or INTERNAL_SELF_MANAGED. Only available for global backend services.
--session-affinity=SESSION_AFFINITY
The type of session affinity to use. Supports both TCP and UDP. SESSION_AFFINITY must be one of:
CLIENT_IP
Route requests to instances based on the hash of the client's IP address.
CLIENT_IP_NO_DESTINATION
Directs a particular client's request to the same backend VM based on a hash created on the client's IP address only. This is used in L4 ILB as Next-Hop scenarios. It differs from the CLIENT_IP option in that CLIENT_IP uses a hash based on both the client's IP address and the destination address.
CLIENT_IP_PORT_PROTO
(Applicable if --load-balancing-scheme is INTERNAL) Connections from the same client IP with the same IP protocol and port will go to the same backend VM while that VM remains healthy.
CLIENT_IP_PROTO
(Applicable if --load-balancing-scheme is INTERNAL) Connections from the same client IP with the same IP protocol will go to the same backend VM while that VM remains healthy.
GENERATED_COOKIE
(Applicable if --load-balancing-scheme is INTERNAL_MANAGED, INTERNAL_SELF_MANAGED, EXTERNAL_MANAGED, or EXTERNAL) If the --load-balancing-scheme is EXTERNAL or EXTERNAL_MANAGED, routes requests to backend VMs or endpoints in a NEG, based on the contents of the GCLB cookie set by the load balancer. Only applicable when --protocol is HTTP, HTTPS, or HTTP2. If the --load-balancing-scheme is INTERNAL_MANAGED or INTERNAL_SELF_MANAGED, routes requests to backend VMs or endpoints in a NEG, based on the contents of the GCILB cookie set by the proxy. (If no cookie is present, the proxy chooses a backend VM or endpoint and sends a Set-Cookie response for future requests.) If the --load-balancing-scheme is INTERNAL_SELF_MANAGED, routes requests to backend VMs or endpoints in a NEG, based on the contents of a cookie set by Traffic Director. This session affinity is only valid if the load balancing locality policy is either RING_HASH or MAGLEV.
HEADER_FIELD
(Applicable if --load-balancing-scheme is INTERNAL_MANAGED, EXTERNAL_MANAGED, or INTERNAL_SELF_MANAGED) Route requests to backend VMs or endpoints in a NEG based on the value of the HTTP header named in the --custom-request-header flag. This session affinity is only valid if the load balancing locality policy is either RING_HASH or MAGLEV and the backend service's consistent hash specifies the name of the HTTP header.
HTTP_COOKIE
(Applicable if --load-balancing-scheme is INTERNAL_MANAGED, EXTERNAL_MANAGED or INTERNAL_SELF_MANAGED) Route requests to backend VMs or endpoints in a NEG, based on an HTTP cookie named in the HTTP_COOKIE flag (with the optional --affinity-cookie-ttl flag). If the client has not provided the cookie, the proxy generates the cookie and returns it to the client in a Set-Cookie header. This session affinity is only valid if the load balancing locality policy is either RING_HASH or MAGLEV and the backend service's consistent hash specifies the HTTP cookie.
NONE
Session affinity is disabled.
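
As an illustration of how cookie-based affinity combines with --affinity-cookie-ttl, the sketch below assembles the relevant flags and prints a create command without running it. The resource names web-backend and web-hc are hypothetical, and the value GENERATED_COOKIE is assumed to be the accepted spelling of the "generated_cookie" setting mentioned under --affinity-cookie-ttl.

```shell
AFFINITY="GENERATED_COOKIE"   # assumed spelling of the generated-cookie choice
COOKIE_TTL="1800"             # seconds; 0 would make the cookie transient

AFFINITY_FLAGS="--session-affinity=${AFFINITY}"
# The cookie TTL is only meaningful for cookie-based affinity types.
case "$AFFINITY" in
  GENERATED_COOKIE|HTTP_COOKIE)
    AFFINITY_FLAGS="${AFFINITY_FLAGS} --affinity-cookie-ttl=${COOKIE_TTL}" ;;
esac

# Print the command for review instead of executing it.
echo "gcloud alpha compute backend-services create web-backend --global --protocol=HTTP --health-checks=web-hc ${AFFINITY_FLAGS}"
```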
--signed-url-cache-max-age=SIGNED_URL_CACHE_MAX_AGE
The amount of time up to which the response to a signed URL request will be cached in the CDN. After this time period, the Signed URL will be revalidated before being served. Cloud CDN will internally act as though all responses from this backend had a Cache-Control: public, max-age=[TTL] header, regardless of any existing Cache-Control header. The actual headers served in responses will not be altered. If unspecified, the default value is 3600s.

For example, specifying 12h will cause the responses to signed URL requests to be cached in the CDN up to 12 hours. See $ gcloud topic datetimes for information on duration formats.

This flag only affects signed URL requests.

--subsetting-policy=SUBSETTING_POLICY; default="NONE"
Specifies the algorithm used for subsetting. Default value is NONE which implies that subsetting is disabled. For Layer 4 Internal Load Balancing, if subsetting is enabled, only the algorithm CONSISTENT_HASH_SUBSETTING can be specified. SUBSETTING_POLICY must be one of: NONE, CONSISTENT_HASH_SUBSETTING.
--subsetting-subset-size=SUBSETTING_SUBSET_SIZE
Number of backends per backend group assigned to each proxy instance or each service mesh client. Can only be set if subsetting policy is CONSISTENT_HASH_SUBSETTING and load balancing scheme is either INTERNAL_MANAGED or INTERNAL_SELF_MANAGED.
--timeout=TIMEOUT; default="30s"
Applicable to all load balancing products except passthrough Network Load Balancers. For internal passthrough Network Load Balancers (load-balancing-scheme set to INTERNAL) and external passthrough Network Load Balancers (global not set and load-balancing-scheme set to EXTERNAL), timeout is ignored.

If the protocol is HTTP, HTTPS, or HTTP2, timeout is a request/response timeout for HTTP(S) traffic, meaning the amount of time that the load balancer waits for a backend to return a full response to a request. If WebSockets traffic is supported, the timeout parameter sets the maximum amount of time that a WebSocket can be open (idle or not).

For example, for HTTP, HTTPS, or HTTP2 traffic, specifying a timeout of 10s means that backends have 10 seconds to respond to the load balancer's requests. The load balancer retries the HTTP GET request one time if the backend closes the connection or times out before sending response headers to the load balancer. If the backend sends response headers or if the request sent to the backend is not an HTTP GET request, the load balancer does not retry. If the backend does not reply at all, the load balancer returns a 502 Bad Gateway error to the client.

If the protocol is SSL or TCP, timeout is an idle timeout.

The full range of timeout values allowed is 1 - 2,147,483,647 seconds.

--tracking-mode=TRACKING_MODE
Specifies the connection key used for connection tracking. The default value is PER_CONNECTION. Applicable only for backend service-based external and internal passthrough Network Load Balancers as part of a connection tracking policy. For details, see: Connection tracking mode for internal passthrough Network Load Balancers and Connection tracking mode for external passthrough Network Load Balancers. TRACKING_MODE must be one of: PER_CONNECTION, PER_SESSION.
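
The connection tracking flags are typically set together on a regional backend service for a passthrough Network Load Balancer. The following sketch prints one such combination for an internal passthrough load balancer; the names tcp-backend and tcp-hc are hypothetical, the health check is assumed to exist in the same region, and whether a given combination is accepted depends on the load balancer configuration.

```shell
REGION="us-central1"   # example region; substitute your own

# Assemble a regional create command with a connection tracking policy.
TRACKING_CMD="gcloud alpha compute backend-services create tcp-backend \
--load-balancing-scheme=INTERNAL \
--protocol=TCP \
--region=${REGION} \
--health-checks=tcp-hc \
--health-checks-region=${REGION} \
--tracking-mode=PER_SESSION \
--connection-persistence-on-unhealthy-backends=NEVER_PERSIST"

# Print the command for review; to execute it, run: eval "$TRACKING_CMD"
echo "$TRACKING_CMD"
```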
At most one of these can be specified:
--cache-key-query-string-blacklist=[QUERY_STRING,…]
Specifies a comma-separated list of query string parameters to exclude in cache keys. All other parameters will be included. Either specify --cache-key-query-string-whitelist or --cache-key-query-string-blacklist, not both. '&' and '=' will be percent encoded and not treated as delimiters. Can only be applied for global resources.
--cache-key-query-string-whitelist=QUERY_STRING,[…]
Specifies a comma-separated list of query string parameters to include in cache keys. All other parameters will be excluded. Either specify --cache-key-query-string-whitelist or --cache-key-query-string-blacklist, not both. '&' and '=' will be percent encoded and not treated as delimiters. Can only be applied for global resources.
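
Because the two query string flags are mutually exclusive, a script that builds the command can enforce that rule itself. This is a minimal sketch; query_string_flag is a hypothetical helper, and the parameter names are examples only.

```shell
# query_string_flag INCLUDE_LIST EXCLUDE_LIST
# Prints the single matching flag, nothing if both lists are empty,
# and fails if both lists are given.
query_string_flag() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    echo "specify an include list or an exclude list, not both" >&2
    return 1
  elif [ -n "$1" ]; then
    echo "--cache-key-query-string-whitelist=$1"
  elif [ -n "$2" ]; then
    echo "--cache-key-query-string-blacklist=$2"
  fi
}

# Keep only the id and lang parameters in the cache key.
query_string_flag "id,lang" ""
```

When neither list is supplied, no flag is emitted and the entire query string stays in the cache key, which matches the default described under --cache-key-include-query-string.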
At most one of these can be specified:
--global
If set, the backend service is global.
--region=REGION
Region of the backend service to create. Overrides the default compute/region property value for this command invocation.
At most one of these can be specified:
--global-health-checks
If set, the health checks are global.
--health-checks-region=HEALTH_CHECKS_REGION
Region of the health checks to operate on. If not specified, you might be prompted to select a region (interactive mode only).

To avoid prompting when this flag is omitted, you can set the compute/region property:

gcloud config set compute/region REGION

A list of regions can be fetched by running:

gcloud compute regions list

To unset the property, run:

gcloud config unset compute/region

Alternatively, the region can be stored in the environment variable CLOUDSDK_COMPUTE_REGION.

GCLOUD WIDE FLAGS
These flags are available to all commands: --access-token-file, --account, --billing-project, --configuration, --flags-file, --flatten, --format, --help, --impersonate-service-account, --log-http, --project, --quiet, --trace-token, --user-output-enabled, --verbosity.

Run $ gcloud help for details.

NOTES
This command is currently in alpha and might change without notice. If this command fails with API permission errors despite specifying the correct project, you might be trying to access an API with an invitation-only early access allowlist. These variants are also available:
gcloud compute backend-services create
gcloud beta compute backend-services create