REGION_ID is an abbreviated code that Google assigns
based on the region you select when you create your app. The code does not
correspond to a country or province, even though some region IDs may appear
similar to commonly used country and province codes. For apps created after
REGION_ID.r is included in
App Engine URLs. For existing apps created before this date, the
region ID is optional in the URL.
Learn more about region IDs.
This document describes how your App Engine application receives requests and sends responses. For more details, see the Request Headers reference.
If your application uses services, you can address requests to a specific service or a specific version of that service. For more information about service addressability, see How Requests are Routed.
Your application is responsible for starting a web server and handling requests. You can use any web framework that is available for your development language.
App Engine runs multiple instances of your application, and each
instance has its own web server for handling requests. Any request can be routed
to any instance, so consecutive requests from the same user are not necessarily
sent to the same instance. An instance can handle multiple requests
concurrently. The number of instances can be adjusted automatically as traffic
changes.
The following example is a very basic one-file Sinatra application that responds
to all GET requests from web clients to the root path "/" by displaying the
Hello world! message:
require "sinatra"

get "/" do
  "Hello world!"
end
Quotas and limits
App Engine automatically allocates resources to your application as traffic increases. However, this is bound by the following restrictions:
App Engine reserves automatic scaling capacity for applications with low latency, where the application responds to requests in less than one second.
Applications that are heavily CPU-bound may also incur some additional latency in order to efficiently share resources with other applications on the same servers. Requests for static files are exempt from these latency limits.
Each incoming request to the application counts toward the Requests limit. Data sent in response to a request counts toward the Outgoing Bandwidth (billable) limit.
Both HTTP and HTTPS (secure) requests count toward the Requests, Incoming Bandwidth (billable), and Outgoing Bandwidth (billable) limits. The Google Cloud console Quota Details page also reports Secure Requests, Secure Incoming Bandwidth, and Secure Outgoing Bandwidth as separate values for informational purposes. Only HTTPS requests count toward these values. For more information, see the Quotas page.
The following limits apply specifically to the use of request handlers:
- A maximum of ~15KB in request headers is allowed.
- The total size of the request is limited to ~32MB.
- All HTTP/2 requests will be translated into HTTP/1.1 requests when forwarded to the application server.
- SSL connections end at the load balancer. Traffic from the load balancer is sent to the instance over an encrypted channel, and then forwarded to the application server over HTTP. The X-Forwarded-Proto header tells you whether the original request was HTTP or HTTPS.
- Responses are buffered in 64 KB blocks.
- The response size is unlimited.
- The response time limit is one hour.
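Because SSL ends at the load balancer, the application server itself only ever sees plain HTTP, so the scheme of the original request has to be read from the X-Forwarded-Proto header. The following is a minimal sketch of that check, assuming a Rack-style env hash in which the header appears under the key HTTP_X_FORWARDED_PROTO. The helper name original_request_https? is hypothetical, and the handling of a comma-separated value is a defensive assumption rather than documented App Engine behavior.

```ruby
# Rack-style servers expose request headers in the env hash with an HTTP_
# prefix, upper-cased, and with dashes replaced by underscores.
FORWARDED_PROTO_KEY = "HTTP_X_FORWARDED_PROTO"

# Hypothetical helper: returns true when the original client request
# used HTTPS, according to the X-Forwarded-Proto header.
def original_request_https?(env)
  # Assumption: if several values are present, the first one describes
  # the client-facing hop. A missing header is treated as "not HTTPS".
  proto = env.fetch(FORWARDED_PROTO_KEY, "").split(",").first.to_s.strip.downcase
  proto == "https"
end

puts original_request_https?("HTTP_X_FORWARDED_PROTO" => "https")
puts original_request_https?("HTTP_X_FORWARDED_PROTO" => "http")
```

In a Sinatra route, the same env hash is available as request.env, so the helper could be called as original_request_https?(request.env).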
Unsupported HTTP requests
The following features are not supported by App Engine flexible environment:
- HTTP/2 traffic to the backend service.
- HTTP requests that directly access instances.
An incoming HTTP request includes the HTTP headers sent by the client. For security purposes, some headers are sanitized or amended by intermediate proxies before they reach the application.
For more information, see the Request headers reference.
Forcing HTTPS connections
For security reasons, all applications should encourage clients to connect over
HTTPS. To instruct the browser to prefer HTTPS over HTTP for a given page
or entire domain, set the
Strict-Transport-Security header in your responses. For example:
Strict-Transport-Security: max-age=31536000; includeSubDomains
Strict-Transport-Security is enabled by default for responses that are
generated from your code. For more information, see the
config.force_ssl configuration method.
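If your framework does not set the header for you, one way to add it is a small Rack-style middleware. The sketch below is illustrative only: the class name StrictTransportSecurity is hypothetical, and the header value is the example value shown above. It relies only on the Rack calling convention (an object that responds to call(env) and returns a status, headers, and body), so it does not need any gem to run.

```ruby
# Hypothetical Rack-style middleware that adds the
# Strict-Transport-Security header to every response.
class StrictTransportSecurity
  HEADER_NAME = "strict-transport-security"
  HEADER_VALUE = "max-age=31536000; includeSubDomains"

  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)

    # Copy the headers so the downstream app's hash is not mutated.
    headers = headers.dup

    # Keep any value the app already set, whatever its letter case.
    already_set = headers.keys.any? { |key| key.to_s.downcase == HEADER_NAME }
    headers[HEADER_NAME] = HEADER_VALUE unless already_set

    [status, headers, body]
  end
end

# Stand-in for a real application, used here only to show the effect.
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["Hello world!"]] }
_status, headers, _body = StrictTransportSecurity.new(app).call({})
puts headers["strict-transport-security"]
```

In a Sinatra application, a middleware of this shape would typically be registered with use StrictTransportSecurity.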
Handling asynchronous background work
Background work is any work that your app performs for a request after you have delivered your HTTP response. Avoid performing background work in your app, and review your code to make sure all asynchronous operations finish before you deliver your response.
For long-running jobs, we recommend using Cloud Tasks. With Cloud Tasks, HTTP requests are long-lived and return a response only after any asynchronous work ends.
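As an illustration of finishing all asynchronous operations before delivering the response, the sketch below starts two threads and joins them before the handler returns. The method name handle_request and the two work items are hypothetical stand-ins, and the return value follows the Rack convention of status, headers, and body. This is one possible pattern, not a prescribed implementation.

```ruby
# Hypothetical request handler that starts work on separate threads
# and waits for all of it before building the response.
def handle_request
  results = Queue.new # thread-safe queue for collecting results

  workers = [
    Thread.new { results << "audit log written" },
    Thread.new { results << "cache refreshed" }
  ]

  # Block until every thread has finished, so that no work is still
  # running after the response has been delivered.
  workers.each(&:join)

  completed = []
  completed << results.pop until results.empty?

  [200, { "content-type" => "text/plain" }, [completed.sort.join("\n")]]
end

status, _headers, body = handle_request
puts status
puts body.first
```

Joining inside the handler keeps the request open for as long as the work runs, which is why work that may approach the response time limit is better handed off as described above.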