How Requests are Handled

This document describes how your App Engine application receives requests and sends responses. For more details, see the Request Headers and Responses reference.

If your application uses services, you can address requests to a specific service or a specific version of that service. For more information about service addressability, see How Requests are Routed.

Handling requests

Your application is responsible for starting a webserver and handling requests. You can use any web framework that’s available for your development language.

App Engine runs multiple instances of your application, and each instance has its own web server for handling requests. Any request can be routed to any instance, so consecutive requests from the same user are not necessarily sent to the same instance. An instance can handle multiple requests concurrently. The number of instances can be adjusted automatically as traffic changes.

The Go runtime for App Engine uses the standard http package as an interface between your Go program and the App Engine servers. When App Engine receives a web request for your application, it invokes the http.Handler associated with the request URL. The URL must also be specified as a Go handler in the application's app.yaml configuration file.

The following example is a complete Go app that outputs a hard-coded HTML string to the user:

package hello

import (

func init() {
	http.HandleFunc("/", hello)

func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "<h1>Hello, world</h1>")

Quotas and limits

App Engine automatically allocates resources to your application as traffic increases. However, this is bound by the following restrictions:

  • App Engine reserves automatic scaling capacity for applications with low latency, where the application responds to requests in less than one second. Applications with very high latency, such as over one second per request for many requests, and high throughput require Silver, Gold, or Platinum support. Customers with this level of support can request higher throughput limits by contacting their support representative.

  • Applications that are heavily CPU-bound may also incur some additional latency in order to efficiently share resources with other applications on the same servers. Requests for static files are exempt from these latency limits.

Each incoming request to the application counts toward the Requests limit. Data sent in response to a request counts toward the Outgoing Bandwidth (billable) limit.

Both HTTP and HTTPS (secure) requests count toward the Requests, Incoming Bandwidth (billable), and Outgoing Bandwidth (billable) limits. The GCP Console Quota Details page also reports Secure Requests, Secure Incoming Bandwidth, and Secure Outgoing Bandwidth as separate values for informational purposes. Only HTTPS requests count toward these values. For more information, see the Quotas page.

The following limits apply specifically to the use of request handlers:

Limit Amount
request size 32 megabytes
response size 32 megabytes
request duration 60 seconds
maximum total number of files (app files and static files) 10,000 total
1,000 per directory
maximum size of an application file 32 megabytes
maximum size of a static file 32 megabytes
maximum total size of all application and static files first 1 gigabyte is free
$ 0.026 per gigabyte per month after first 1 gigabyte

Response limits

Dynamic responses are limited to 32MB. If a script handler generates a response larger than this limit, the server sends back an empty response with a 500 Internal Server Error status code. This limitation does not apply to responses that serve data from the Blobstore or Google Cloud Storage .

Request headers

An incoming HTTP request includes the HTTP headers sent by the client. For security purposes, some headers are sanitized or amended by intermediate proxies before they reach the application.

For more information, see the Request headers reference.

Request responses

App Engine calls the handler with a Request and a ResponseWriter, then waits for the handler to write to the ResponseWriter and return. When the handler returns, the data in the ResponseWriter's internal buffer is sent to the user.

This is practically the same as when writing normal Go programs that use the http package.

There are limits that apply to the response you generate, and the response may be modified before it is returned to the client.

For more information, see the Request responses reference.

Streaming Responses

App Engine does not support streaming responses where data is sent in incremental chunks to the client while a request is being processed. All data from your code is collected as described above and sent as a single HTTP response.

Response compression

If the client sends HTTP headers with the original request indicating that the client can accept compressed (gzipped) content, App Engine compresses the handler response data automatically and attaches the appropriate response headers. It uses both the Accept-Encoding and User-Agent request headers to determine if the client can reliably receive compressed responses.

Custom clients can indicate that they are able to receive compressed responses by specifying both Accept-Encoding and User-Agent headers with a value of gzip. The Content-Type of the response is also used to determine whether compression is appropriate; in general, text-based content types are compressed, whereas binary content types are not.

When responses are compressed automatically by App Engine, the Content- Encoding header is added to the response.

Specifying a request deadline

A request handler has a limited amount of time to generate and return a response to a request, typically around 60 seconds. Once the deadline has been reached, the request handler is interrupted. When a Go request handler exceeds the deadline, its process is terminated and the runtime environment returns an HTTP 500 Internal Server Error to the client.

While a request can take as long as 60 seconds to respond, App Engine is optimized for applications with short-lived requests, typically those that take a few hundred milliseconds. An efficient app responds quickly for the majority of requests. An app that doesn't will not scale well with App Engine's infrastructure.

Refer to Dealing with DeadlineExceededErrors for common DeadlineExceededError causes and suggested workarounds.


The log package has Debugf, Infof, Warningf, Errorf, and Criticalf functions that print messages into your application's logs. You can view and analyze logs in the GCP Console Logs page, and download logs using the gcloud app logs read command:

gcloud app logs read --logs=request_log

The following example demonstrates an HTTP handler that constructs an appengine.Context from the *http.Request and logs the requested URL.

package hello

import (


func init() {
	http.HandleFunc("/", Logger)

func Logger(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	log.Infof(ctx, "Requested URL: %v", r.URL)

The environment

The Go runtime provides access to environment information through the appengine.Context interface. See the appengine package reference for details.

Send feedback about...

App Engine standard environment for Go