How Requests are Handled

This document describes how your application should serve and respond to requests.

Serving requests

Your application is responsible for starting a web server and handling requests. You can use any web framework that’s available for your development language.

App Engine runs multiple instances of your application, and each instance has its own web server for handling requests. Any request can be routed to any instance, so consecutive requests from the same user are not necessarily sent to the same instance. An instance can handle multiple requests concurrently. The number of instances is adjusted automatically as traffic changes.
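As a rough illustration of the serving model described above, the following sketch uses only the Node.js built-in http module. The handler keeps no per-user state in memory, since consecutive requests may land on different instances. The names handleRequest, resolvePort, and createServer are hypothetical, and reading the port from a PORT environment variable with a fallback of 8080 is an assumption that this page does not state.

```javascript
'use strict';
const http = require('http');

// Hypothetical handler. It stores nothing in process memory between
// requests, because the next request from the same user may be routed
// to a different instance.
function handleRequest(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello from App Engine\n');
}

// Assumption: the listening port comes from PORT, falling back to 8080.
function resolvePort(env) {
  const parsed = Number.parseInt(env.PORT, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 8080;
}

// Builds the server without starting it, so the caller decides when to listen.
function createServer() {
  return http.createServer(handleRequest);
}

// To start serving:
//   createServer().listen(resolvePort(process.env));
```

Any shared state (sessions, counters, caches that must be consistent) would need to live in an external store rather than in the instance.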

Request headers

An incoming HTTP request includes the HTTP headers sent by the client. For security purposes, some headers are sanitized or amended by intermediate proxies before they reach the application.

Removed headers

Headers that match the following pattern are removed from the request:

  • X-Google-*

In addition, some selected headers that match the following pattern are removed from the request:

  • X-Appengine-*

Added headers

App Engine adds the following headers to all requests:

Via: "1.1 google"


X-AppEngine-Country

Country from which the request originated, as an ISO 3166-1 alpha-2 country code. App Engine determines this code from the client's IP address. The country information is not derived from the WHOIS database, so an IP address that has country information in the WHOIS database might not have country information in the X-AppEngine-Country header. Your application should handle the special country code ZZ (unknown country).


X-AppEngine-Region

Name of the region from which the request originated. This value only makes sense in the context of the country in X-AppEngine-Country. For example, if the country is US and the region is ca, then ca means California, not Canada. The complete list of valid region values is found in the ISO 3166-2 standard. Might be null if no location is discovered.


X-AppEngine-City

Name of the city from which the request originated. For example, a request from the city of Mountain View might have the header value mountain view. There is no canonical list of valid values for this header. Might be null if no location is discovered.


X-AppEngine-CityLatLong

Latitude and longitude of the city from which the request originated. For example, for a request from Mountain View, this string might be "37.386051,-122.083851". Might be null if no location is discovered.
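The location headers described above could be read along the following lines. This is a sketch under assumptions, not a definitive implementation: parseLocation is a hypothetical helper, and the region, city, and coordinate header names (X-AppEngine-Region, X-AppEngine-City, X-AppEngine-CityLatLong) are assumed here alongside X-AppEngine-Country. Node.js exposes incoming header names in lowercase, so the lookups use lowercase keys.

```javascript
'use strict';

// Hypothetical helper: turns the location headers into one object.
// Missing values and the unknown-country code ZZ are reported as null.
function parseLocation(headers) {
  const rawCountry = headers['x-appengine-country'];
  const country =
    typeof rawCountry === 'string' && rawCountry.toUpperCase() !== 'ZZ'
      ? rawCountry.toUpperCase()
      : null;

  // The coordinates arrive as one "latitude,longitude" string.
  let coordinates = null;
  const rawLatLong = headers['x-appengine-citylatlong'];
  if (typeof rawLatLong === 'string') {
    const parts = rawLatLong.split(',').map((part) => part.trim());
    const valid =
      parts.length === 2 &&
      parts.every((part) => part !== '' && Number.isFinite(Number(part)));
    if (valid) {
      coordinates = { latitude: Number(parts[0]), longitude: Number(parts[1]) };
    }
  }

  return {
    country,
    // The region is only meaningful together with the country.
    region: headers['x-appengine-region'] || null,
    city: headers['x-appengine-city'] || null,
    coordinates,
  };
}
```

In a request handler this would be called as parseLocation(req.headers). Callers should treat every field as optional.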


X-Cloud-Trace-Context

A unique identifier for the request used for Stackdriver Trace and Stackdriver Logging.

X-Forwarded-For: [CLIENT_IP(s)], [global forwarding rule IP]

A comma-delimited list of IP addresses through which the client request has been routed. The first IP in this list is generally the IP of the client that created the request. The subsequent IPs provide information about proxy servers that also handled the request before it reached the application server. For example:

X-Forwarded-For: clientIp, proxy1Ip, proxy2Ip
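A minimal sketch of reading this header, assuming the comma-delimited format shown above. The helper names parseForwardedFor and clientIp are hypothetical. Because the first entry is only generally the client's address, and a client can send its own X-Forwarded-For value, the result should be treated as a hint rather than as proof of identity.

```javascript
'use strict';

// Splits an X-Forwarded-For value into a list of hop addresses.
function parseForwardedFor(headerValue) {
  if (typeof headerValue !== 'string') return [];
  return headerValue
    .split(',')
    .map((ip) => ip.trim())
    .filter((ip) => ip.length > 0);
}

// Returns the first hop, which is generally the originating client,
// or null when the header is absent or empty.
function clientIp(headerValue) {
  const hops = parseForwardedFor(headerValue);
  return hops.length > 0 ? hops[0] : null;
}
```

In a handler, this would be called as clientIp(req.headers['x-forwarded-for']).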

X-Forwarded-Proto: [http | https]

Shows http or https based on the protocol the client used to connect to your application.

The Google Cloud Load Balancer terminates all HTTPS connections, and then forwards traffic to App Engine instances over HTTP. For example, if a user requests access to your site via https://[MY-PROJECT-ID], the X-Forwarded-Proto header value is https.
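Because the instance always receives plain HTTP, the application has to rely on X-Forwarded-Proto to tell how the client connected. The sketch below shows one way to compute an HTTPS redirect target. The names isSecureRequest and httpsRedirectTarget are hypothetical, and building the URL from the Host header is an assumption made for illustration.

```javascript
'use strict';

// True when the client connected to the load balancer over HTTPS.
function isSecureRequest(headers) {
  return headers['x-forwarded-proto'] === 'https';
}

// Returns the HTTPS URL to redirect to, or null when no redirect is
// needed (already secure) or possible (no Host header).
function httpsRedirectTarget(req) {
  if (isSecureRequest(req.headers)) return null;
  const host = req.headers.host;
  if (!host) return null;
  return `https://${host}${req.url}`;
}
```

A handler could respond with status 301 and a Location header set to the returned value whenever it is not null.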


Responses

As explained below, there are limits that apply to the response you generate, and the response may be modified before it is returned to the client.


Disabling buffering

By default, all responses from App Engine are buffered in 64 KB blocks. In some cases, it might make sense to disable buffering and stream bytes directly to the client. This is generally preferred when using hanging GETs or Server-Sent Events (SSEs). To disable buffering, set the X-Accel-Buffering response header to no.

X-Accel-Buffering: no
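For a Server-Sent Events endpoint, the header could be set together with the usual event-stream headers, as in this sketch. The names startEventStream and formatEvent are hypothetical, and the Cache-Control value is a common choice for event streams rather than something this page requires.

```javascript
'use strict';

// Sends the response headers for an unbuffered event stream.
// `res` is a Node.js http.ServerResponse (or anything with writeHead).
function startEventStream(res) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'X-Accel-Buffering': 'no', // ask the proxy not to buffer this response
  });
}

// Encodes one event. Each payload line gets its own "data:" field,
// and a blank line marks the end of the event.
function formatEvent(data) {
  const lines = String(data).split('\n').map((line) => `data: ${line}`);
  return lines.join('\n') + '\n\n';
}
```

After calling startEventStream(res), a handler would write events with res.write(formatEvent(payload)) and leave the response open.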

HTTP Strict Transport Security

For security reasons, all applications should encourage clients to connect over HTTPS. You can use the Strict-Transport-Security header to instruct the browser to prefer HTTPS over HTTP for a given page or an entire domain, for example:

Strict-Transport-Security: max-age=31536000; includeSubDomains

You can use the helmet library, which handles setting HTTP security headers.
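If you prefer not to add a dependency, the header can also be set by hand. The following sketch builds the value shown above and wraps a plain Node.js request handler. The names hstsValue and withHsts, and their option names, are hypothetical and are not part of helmet.

```javascript
'use strict';

// Builds a Strict-Transport-Security value. The defaults reproduce
// "max-age=31536000; includeSubDomains" (one year, all subdomains).
function hstsValue({ maxAgeSeconds = 31536000, includeSubDomains = true } = {}) {
  const parts = [`max-age=${maxAgeSeconds}`];
  if (includeSubDomains) parts.push('includeSubDomains');
  return parts.join('; ');
}

// Wraps a (req, res) handler so every response carries the header.
function withHsts(handler, options) {
  const value = hstsValue(options);
  return (req, res) => {
    res.setHeader('Strict-Transport-Security', value);
    return handler(req, res);
  };
}
```

A wrapped handler can be passed to http.createServer in place of the original one.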

Quotas and limits

Google App Engine automatically allocates resources to your application as traffic increases. However, this is bound by the following restrictions:

  • App Engine reserves automatic scaling capacity for applications with low latency, where the application responds to requests in less than one second. Applications that combine very high latency (for example, more than one second per request for many requests) with high throughput require Silver, Gold, or Platinum support. Customers with this level of support can request higher throughput limits by contacting their support representative.

  • Applications that are heavily CPU-bound may also incur some additional latency in order to efficiently share resources with other applications on the same servers. Requests for static files are exempt from these latency limits.

Each incoming request to the application counts toward the Requests limit. Data sent in response to a request counts toward the Outgoing Bandwidth (billable) limit.

Both HTTP and HTTPS (secure) requests count toward the Requests, Incoming Bandwidth (billable), and Outgoing Bandwidth (billable) limits. The GCP Console Quota Details page also reports Secure Requests, Secure Incoming Bandwidth, and Secure Outgoing Bandwidth as separate values for informational purposes. Only HTTPS requests count toward these values. For more information, see the Quotas page.

Request limits

  • A maximum of ~15KB in request headers is allowed.
  • The total size of the request is limited to ~32MB.
  • All HTTP/2 requests will be translated into HTTP/1.1 requests when forwarded to the application server.
  • SSL connections are terminated at the load balancer. Traffic from the load balancer is sent to the instance over an encrypted channel, and then forwarded to the application server over HTTP. The X-Forwarded-Proto header tells you whether the original request used HTTP or HTTPS.
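A client or upstream service that sends large payloads could check them against the request size limit before sending. The sketch below is only an approximation: fitsRequestLimit is a hypothetical helper, it treats "~32MB" as 32 * 1024 * 1024 bytes (an assumption), and it measures the body alone, whereas the limit applies to the total request.

```javascript
'use strict';

// Assumption: "~32MB" is interpreted as 32 MiB for this estimate.
const APPROX_MAX_REQUEST_BYTES = 32 * 1024 * 1024;

// Returns true when the body (a string or Buffer) is no larger than the
// limit. Strings are measured in UTF-8 bytes, not in characters.
function fitsRequestLimit(body, limit = APPROX_MAX_REQUEST_BYTES) {
  return Buffer.byteLength(body) <= limit;
}
```

Leaving some headroom below the limit is advisable, since headers also count toward the total request size.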

Response limits

  • Responses are buffered in 64 KB blocks.
  • The response size is unlimited.
  • The response time limit is one hour.

Not supported

The following features are not supported by App Engine flexible environment:

  • HTTP/2 traffic to the backend service
  • WebSockets
  • HTTP requests that directly access instances
