How requests are handled

Region ID

The REGION_ID is an abbreviated code that Google assigns based on the region you select when you create your app. The code does not correspond to a country or province, even though some region IDs may appear similar to commonly used country and province codes. For apps created after February 2020, REGION_ID.r is included in App Engine URLs. For existing apps created before this date, the region ID is optional in the URL.

Learn more about region IDs.

This document describes how your App Engine application receives requests and sends responses. For more details, see the Request Headers and Responses reference.

If your application uses services, you can address requests to a specific service or a specific version of that service. For more information about service addressability, see How Requests are Routed.

Handling requests

Your application is responsible for starting a webserver and handling requests. You can use any web framework that is available for your development language.

App Engine runs multiple instances of your application, and each instance has its own web server for handling requests. Any request can be routed to any instance, so consecutive requests from the same user are not necessarily sent to the same instance. An instance can handle multiple requests concurrently. The number of instances can be adjusted automatically as traffic changes. You can also change the number of concurrent requests an instance can handle by setting the max_concurrent_requests element in your app.yaml file.

Go

The Go runtime for App Engine uses the standard http package as an interface between your Go program and the App Engine servers. When App Engine receives a web request for your application, it invokes the http.Handler associated with the request URL.

The following example is a complete Go app that outputs a hard-coded HTML string to the user:


// Sample helloworld is an App Engine app.
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)


func main() {
	http.HandleFunc("/", indexHandler)

	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
		log.Printf("Defaulting to port %s", port)
	}

	log.Printf("Listening on port %s", port)
	if err := http.ListenAndServe(":"+port, nil); err != nil {
		log.Fatal(err)
	}
}



// indexHandler responds to requests with our greeting.
func indexHandler(w http.ResponseWriter, r *http.Request) {
	if r.URL.Path != "/" {
		http.NotFound(w, r)
		return
	}
	fmt.Fprint(w, "Hello, World!")
}

Java

If you are using the legacy bundled services, consider the following.

When App Engine receives a web request for your application, it invokes the servlet that corresponds to the URL, as described in the application's web.xml file in the WEB-INF/ directory. App Engine supports the Java Servlet 2.5 and 3.1 API specifications, which it uses to provide the request data to the servlet and accept the response data.

App Engine runs multiple instances of your application, and each instance has its own web server for handling requests. Any request can be routed to any instance, so consecutive requests from the same user are not necessarily sent to the same instance. The number of instances can be adjusted automatically as traffic changes.

The following example servlet class displays a simple message on the user's browser.

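import java.io.IOException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// With the @WebServlet annotation, webapp/WEB-INF/web.xml is no longer required.
@WebServlet(name = "requests", description = "Requests: Trivial request", urlPatterns = "/requests")
public class RequestsServlet extends HttpServlet {

  // Responds to GET requests on /requests with a plain-text greeting.
  @Override
  public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    resp.setContentType("text/plain");
    resp.getWriter().println("Hello, world");
  }
}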

Node.js

The following sample contains the JavaScript code to start a server and respond to all GET requests from web clients to the root path ('/') by displaying the "Hello, world!" message. The server listens on the port set in the PORT environment variable, or on port 8080 if PORT is not set:

const express = require('express');

const app = express();

app.get('/', (req, res) => {
  res.status(200).send('Hello, world!').end();
});

// Start the server
const PORT = parseInt(process.env.PORT) || 8080;
app.listen(PORT, () => {
  console.log(`App listening on port ${PORT}`);
  console.log('Press Ctrl+C to quit.');
});

Importantly, in the last few lines, the code has the server listen on the port specified by the process.env.PORT variable. This is an environment variable set by the App Engine runtime. If your server does not listen on this port, it will not be able to receive requests.

PHP

The server determines which PHP handler script to run by comparing the URL of the request to the URL patterns in the app's app.yaml configuration file. It then runs the script populated with the request data. The server puts the request data in environment variables and the standard input stream. The script performs actions appropriate to the request, then prepares a response and puts it on the standard output stream.

The following example is a PHP script that responds to any HTTP request with the message 'Hello World!':

<?php

echo 'Hello World!';

The following example uses the Slim framework to respond to an HTTP request.

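<?php

use Psr\Http\Message\ResponseInterface as Response;
use Psr\Http\Message\ServerRequestInterface as Request;
use Slim\Factory\AppFactory;

// Assumes dependencies were installed with Composer into ./vendor.
require __DIR__ . '/vendor/autoload.php';

$app = AppFactory::create();
$app->addRoutingMiddleware();
$app->addErrorMiddleware(true, true, true);

$app->get('/', function (Request $request, Response $response) {
    // Use the null coalescing operator (PHP 7 and later) to default the name.
    // http://php.net/manual/en/language.operators.comparison.php#language.operators.comparison.coalesce
    $name = $request->getQueryParams()['name'] ?? 'World';
    $response->getBody()->write("Hello, $name!");
    return $response;
});
$app->run();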

Python

The following example is a Python app, built with the Flask framework, that responds to HTTP requests to the root path ('/') with the message 'Hello World!':

# Copyright 2018 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from flask import Flask


# If `entrypoint` is not defined in app.yaml, App Engine will look for an app
# called `app` in `main.py`.
app = Flask(__name__)


@app.route('/')
def hello():
    """Return a friendly HTTP greeting."""
    return 'Hello World!'


if __name__ == '__main__':
    # This is used when running locally only. When deploying to Google App
    # Engine, a webserver process such as Gunicorn will serve the app. You
    # can configure startup instructions by adding `entrypoint` to app.yaml.
    app.run(host='127.0.0.1', port=8080, debug=True)
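
The Go and Node.js samples on this page read the PORT environment variable before they start listening. The following is a minimal Python sketch of the same lookup for cases where you start the web server yourself. The `resolve_port` helper is hypothetical (it is not part of Flask or App Engine), and the 8080 fallback simply mirrors the other samples.

```python
import os


def resolve_port(environ=os.environ, default=8080):
    """Return the port to listen on, preferring the PORT environment variable.

    `resolve_port` is an illustrative helper, not an App Engine API.
    """
    raw = environ.get("PORT", "")
    # Fall back to the default when PORT is unset or empty.
    return int(raw) if raw.strip() else default
```

With the Flask app above, this could be used as `app.run(host='127.0.0.1', port=resolve_port())` when running locally.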

Ruby

The following example is a very basic one-file Sinatra application that responds to all GET requests from web clients to the root path "/" by displaying the "Hello world!" message:

require "sinatra"

get "/" do
  "Hello world!"
end

Quotas and limits

App Engine automatically allocates resources to your application as traffic increases. However, this is bound by the following restrictions:

  • App Engine reserves automatic scaling capacity for applications with low latency, where the application responds to requests in less than one second.

  • Applications that are heavily CPU-bound may also incur some additional latency in order to efficiently share resources with other applications on the same servers. Requests for static files are exempt from these latency limits.

Each incoming request to the application counts toward the Requests limit. Data sent in response to a request counts toward the Outgoing Bandwidth (billable) limit.

Both HTTP and HTTPS (secure) requests count toward the Requests, Incoming Bandwidth (billable), and Outgoing Bandwidth (billable) limits. The Google Cloud console Quota Details page also reports Secure Requests, Secure Incoming Bandwidth, and Secure Outgoing Bandwidth as separate values for informational purposes. Only HTTPS requests count toward these values. For more information, see the Quotas page.

The following limits apply specifically to the use of request handlers:

| Limit | Amount |
|---|---|
| Request size | 32 megabytes |
| Response size | 32 megabytes |
| Request timeout | Depends on the type of scaling your app uses |
| Maximum total number of files (app files and static files) | 10,000 total; 1,000 per directory |
| Maximum size of an application file | 32 megabytes |
| Maximum size of a static file | 32 megabytes |
| Maximum total size of all application and static files | First 1 gigabyte is free; $0.026 per gigabyte per month after the first 1 gigabyte |
| Pending request timeout | 10 seconds |
| Maximum size of a single request header field | 8 kilobytes for second-generation runtimes in the standard environment. Requests to these runtimes with header fields exceeding 8 kilobytes will return HTTP 400 errors. |

Request limits

All HTTP/2 requests will be translated into HTTP/1.1 requests when forwarded to the application server.

Response limits

  • Dynamic responses are limited to 32 MB. If a script handler generates a response larger than this limit, the server sends back an empty response with a 500 Internal Server Error status code. This limitation does not apply to responses that serve data from Cloud Storage or the legacy Blobstore API if it is available in your runtime.

  • The response header limit is 8 KB for second-generation runtimes. Response headers that exceed this limit will return HTTP 502 errors, and the logs show the message "upstream sent too big header while reading response header from upstream".

Request headers

An incoming HTTP request includes the HTTP headers sent by the client. For security purposes, some headers are sanitized or amended by intermediate proxies before they reach the application.

For more information, see the Request headers reference.

Handling request timeouts

App Engine is optimized for applications with short-lived requests, typically those that take a few hundred milliseconds. An efficient app responds quickly for the majority of requests. An app that doesn't will not scale well with App Engine's infrastructure. To ensure this level of performance, there is a system-imposed maximum request timeout within which every app must respond.

Go

If your app exceeds this deadline, App Engine interrupts the request handler.

For Go request handlers, the process is stopped, and the runtime environment returns an HTTP 500 Internal Server Error to the client.

Java

If your app exceeds this deadline, App Engine interrupts the request handler.

When using the legacy bundled services, the Java runtime environment interrupts the servlet by throwing a com.google.apphosting.api.DeadlineExceededException. If there is no request handler to catch this exception, the runtime environment will return an HTTP 500 server error to the client.

If there is a request handler and the DeadlineExceededException is caught, then the runtime environment gives the request handler time (less than a second) to prepare a custom response. If the request handler takes more than a second after raising the exception to prepare a custom response, a HardDeadlineExceededError will be raised.

Both DeadlineExceededExceptions and HardDeadlineExceededErrors will force termination of the request and stop the instance.

To find out how much time remains before the deadline, the application can import com.google.apphosting.api.ApiProxy and call ApiProxy.getCurrentEnvironment().getRemainingMillis(). This is useful if the application is planning to start on some work that might take too long; if you know it takes five seconds to process a unit of work but getRemainingMillis() returns less time, there's no point starting that unit of work.
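
The "check the remaining time before starting a unit of work" pattern is not specific to Java. As a language-neutral illustration, here is a minimal Python sketch. The `RequestDeadline` class and `process_units` function are hypothetical names; the sketch tracks its own budget with a monotonic clock and does not call any App Engine API such as getRemainingMillis().

```python
import time


class RequestDeadline:
    """Tracks the time left before a request deadline (hypothetical helper)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + timeout_seconds

    def remaining_seconds(self):
        # Never report a negative budget once the deadline has passed.
        return max(0.0, self._deadline - self._clock())

    def has_time_for(self, estimated_seconds):
        return self.remaining_seconds() >= estimated_seconds


def process_units(units, deadline, seconds_per_unit, handle):
    """Process units in order, stopping before one that would overrun.

    Returns the units that were not started so the caller can defer them.
    """
    for index, unit in enumerate(units):
        if not deadline.has_time_for(seconds_per_unit):
            return units[index:]
        handle(unit)
    return []
```

The units that are returned unprocessed can then be handed to another request or to a task queue instead of being started and interrupted.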

Node.js

If your app exceeds this deadline, App Engine interrupts the request handler.

PHP

If a PHP script exceeds this deadline, the TIMEOUT bit on the connection status bitfield is set. Your script then has a short second deadline to clean up any long-running tasks and return a response to the user.

If your script hasn't returned a response by the second deadline, the handler is stopped and a default error response is returned.

Python

If your app exceeds this deadline, App Engine interrupts the request handler.

Ruby

If your app exceeds this deadline, App Engine interrupts the request handler.

Responses

Go

App Engine calls the handler with a Request and a ResponseWriter, then waits for the handler to write to the ResponseWriter and return. When the handler returns, the data in the ResponseWriter's internal buffer is sent to the user.

This is practically the same as when writing normal Go programs that use the http package.

There are size limits that apply to the response you generate, and the response may be modified before it is returned to the client.

For more information, see the Request responses reference.

Java

App Engine calls the servlet with a request object and a response object, then waits for the servlet to populate the response object and return. When the servlet returns, the data on the response object is sent to the user.

There are size limits that apply to the response you generate, and the response may be modified before it is returned to the client.

For more information, see the Request responses reference.

Node.js

There are size limits that apply to the response you generate, and the response may be modified before it is returned to the client.

For more information, see the Request responses reference.

PHP

App Engine calls the script with the $_REQUEST array populated, buffers any output from the script, and when the script completes execution, sends the buffered output to the end user.

There are size limits that apply to the response you generate, and the response may be modified before it is returned to the client.

For more information, see the Request responses reference.

Python

App Engine calls the handler script with a Request and waits for the script to return; all data written to the standard output stream is sent as the HTTP response.

There are size limits that apply to the response you generate, and the response may be modified before it is returned to the client.

For more information, see the Request responses reference.

Ruby

There are size limits that apply to the response you generate, and the response may be modified before it is returned to the client.

For more information, see the Request responses reference.

Streaming responses

App Engine does not support streaming responses where data is sent in incremental chunks to the client while a request is being processed. All data from your code is collected as described above and sent as a single HTTP response.

Response compression

For responses that are returned by your code, App Engine compresses data in the response if both of the following conditions are true:

  • The request contains the Accept-Encoding header that includes gzip as a value.
  • The response contains text-based data such as HTML, CSS, or JavaScript.
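
The two conditions above can be sketched as a small predicate. This is an illustration of the rule as described, not App Engine's actual implementation: the set of content types treated as text-based, the handling of `q=0`, and the function names are all assumptions, and the request headers are assumed to be a plain dict keyed by `Accept-Encoding`.

```python
# Assumption: which media types count as "text-based" is not specified here.
TEXT_BASED_PREFIXES = ("text/", "application/javascript")


def accepts_gzip(accept_encoding):
    """Return True if gzip is listed in an Accept-Encoding value with q > 0."""
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        quality = params.strip().lower()
        if quality.startswith("q="):
            try:
                # Standard HTTP semantics: q=0 means "not acceptable".
                return float(quality[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def should_compress(request_headers, content_type):
    """Apply both conditions: client accepts gzip and the body is text-based."""
    accept = request_headers.get("Accept-Encoding", "")
    media_type = content_type.split(";")[0].strip().lower()
    return accepts_gzip(accept) and media_type.startswith(TEXT_BASED_PREFIXES)
```

For example, `should_compress({"Accept-Encoding": "gzip, deflate"}, "text/html; charset=utf-8")` is true under these assumptions, while the same request for `image/png` is not.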

For responses that are returned by an App Engine static file or directory handler, response data is compressed if all of the following conditions are true:

  • The request includes Accept-Encoding with gzip as one of its values.
  • The client is capable of receiving the response data in a compressed format. The Google Frontend maintains a list of clients that are known to have problems with compressed responses. These clients will not receive compressed data from static handlers in your app, even if the request headers contain Accept-Encoding: gzip.
  • The response contains text-based data such as HTML, CSS, or JavaScript.

Note the following:

  • A client can force text-based content types to be compressed by setting both the Accept-Encoding and User-Agent request headers to gzip.

  • If a request doesn't specify gzip in the Accept-Encoding header, App Engine will not compress the response data.

  • The Google Frontend caches responses from App Engine static file and directory handlers. Depending on a variety of factors, such as which type of response data is cached first, which Vary headers you have specified in the response, and which headers are included in the request, a client could request compressed data but receive uncompressed data, and the other way around. For more information, see Response caching.

Response caching

The Google Frontend, and potentially the user's browser and other intermediate caching proxy servers, will cache your app's responses as instructed by standard caching headers that you specify in the response. You can specify these response headers either through your framework, directly in your code, or through App Engine static file and directory handlers.

In the Google Frontend, the cache key is the full URL of the request.

Caching static content

To ensure that clients always receive updated static content as soon as it is published, we recommend that you serve static content from versioned directories, such as css/v1/styles.css. The Google Frontend will not validate the cache (check for updated content) until the cache expires. Even after the cache expires, the cache will not be updated until the content at the request URL changes.

The following response headers that you can set in app.yaml influence how and when the Google Frontend caches content:

  • Cache-Control: This header should be set to public for the Google Frontend to cache content; content may also be cached by the Google Frontend unless you specify a Cache-Control private or no-store directive. If you don't set this header in app.yaml, App Engine automatically adds it for all responses handled by a static file or directory handler. For more information, see Headers added or replaced.

  • Vary: To enable the cache to return different responses for a URL based on headers that are sent in the request, set one or more of the following values in the Vary response header: Accept, Accept-Encoding, Origin, or X-Origin.

    Due to the potential for high cardinality, data will not be cached for other Vary values.

    For example:

    1. You specify the following response header:

      Vary: Accept-Encoding

    2. Your app receives a request that contains the Accept-Encoding: gzip header. App Engine returns a compressed response and the Google Frontend caches the gzipped version of the response data. All subsequent requests for this URL that contain the Accept-Encoding: gzip header will receive the gzipped data from the cache until the cache becomes invalidated (due to the content changing after the cache expires).

    3. Your app receives a request that does not contain the Accept-Encoding header. App Engine returns an uncompressed response and the Google Frontend caches the uncompressed version of the response data. All subsequent requests for this URL that do not contain the Accept-Encoding header will receive the uncompressed data from the cache until the cache becomes invalidated.

    If you do not specify a Vary response header, the Google Frontend creates a single cache entry for the URL and will use it for all requests regardless of the headers in the request. For example:

    1. You do not specify the Vary: Accept-Encoding response header.
    2. A request contains the Accept-Encoding: gzip header, and the gzipped version of the response data will be cached.
    3. A second request does not contain the Accept-Encoding: gzip header. However, because the cache contains a gzipped version of the response data, the response will be gzipped even though the client requested uncompressed data.

The headers in the request also influence caching:

  • If the request contains an Authorization header, the content will not be cached by the Google Frontend.
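
The interaction between the URL cache key, the Vary response header, and the Authorization request header can be illustrated with a toy model. This is a sketch for reasoning about the behavior described above, not the Google Frontend's implementation. The `FrontendCacheModel` class is hypothetical, request header names are assumed to be lowercase, and header values are compared by exact match, which is a simplification.

```python
# Vary values for which the text above says data can be cached.
CACHEABLE_VARY = {"accept", "accept-encoding", "origin", "x-origin"}


class FrontendCacheModel:
    """Toy model of a URL-keyed cache that honors the Vary response header."""

    def __init__(self):
        self._entries = {}
        self._vary_by_url = {}

    def _key(self, url, request_headers, vary):
        # The key is the full URL plus the request's value for each Vary header.
        values = tuple((name, request_headers.get(name, "")) for name in vary)
        return (url, values)

    def lookup(self, url, request_headers):
        if "authorization" in request_headers:
            return None
        vary = self._vary_by_url.get(url)
        if vary is None:
            return None
        return self._entries.get(self._key(url, request_headers, vary))

    def store(self, url, request_headers, vary_header, body):
        # Requests that carry an Authorization header are never cached.
        if "authorization" in request_headers:
            return False
        names = [v.strip().lower() for v in vary_header.split(",") if v.strip()]
        vary = tuple(sorted(names))
        # Other Vary values are not cached because of their high cardinality.
        if any(name not in CACHEABLE_VARY for name in vary):
            return False
        self._vary_by_url[url] = vary
        self._entries[self._key(url, request_headers, vary)] = body
        return True
```

With `Vary: Accept-Encoding`, a gzipped entry stored for a request with `accept-encoding: gzip` is not returned to a request without that header. With an empty Vary value, the model keeps a single entry per URL and returns it to every request, which reproduces the mismatch described in the second example.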

Cache expiration

By default, the caching headers that App Engine static file and directory handlers add to responses instruct clients and web proxies such as the Google Frontend to expire the cache after 10 minutes.

After a file is transmitted with a given expiration time, there is generally no way to clear it out of web-proxy caches, even if the user clears their own browser cache. Re-deploying a new version of the app will not reset any caches. Therefore, if you ever plan to modify a static file, it should have a short (less than one hour) expiration time. In most cases, the default 10-minute expiration time is appropriate.

You can change the default expiration for all static file and directory handlers by specifying the default_expiration element in your app.yaml file. To set specific expiration times for individual handlers, specify the expiration element within the handler element in your app.yaml file.

The time value that you specify in the expiration element is used to set the Cache-Control and Expires HTTP response headers.
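
To make the relationship between an expiration time and the two headers concrete, here is a minimal Python sketch that derives both values from a lifetime in seconds. The `expiration_headers` function is hypothetical, and the exact directives that App Engine emits (for example, whether `public` is always included) are an assumption here.

```python
from email.utils import formatdate


def expiration_headers(expiration_seconds, now_epoch):
    """Build Cache-Control and Expires values for a lifetime in seconds.

    Illustrative only: the directive format is assumed, not taken from
    App Engine's implementation.
    """
    return {
        "Cache-Control": f"public, max-age={expiration_seconds}",
        # HTTP dates are expressed in GMT.
        "Expires": formatdate(now_epoch + expiration_seconds, usegmt=True),
    }
```

For the default 10-minute expiration, `expiration_seconds` would be 600, and the Expires value is the current time plus 600 seconds.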

If your app uses the Java 11/17 runtime and the legacy bundled services, you can specify the static-files element in your appengine-web.xml file.

Forcing HTTPS connections

For security reasons, all applications should encourage clients to connect over HTTPS. To instruct the browser to prefer HTTPS over HTTP for a given page or entire domain, set the Strict-Transport-Security header in your responses. For example:

Strict-Transport-Security: max-age=31536000; includeSubDomains

To set this header for any static content that is served by your app, add the header to your app's static file and directory handlers.

Go

To set this header for responses that are generated from your code, use the secureheader package.

Java

Most app frameworks and web servers provide support for setting this header for responses that are generated from your code. For information about the Strict-Transport-Security header in Spring Boot, see HTTP Strict Transport Security (HSTS).

Node.js

To set this header for responses that are generated from your code, use the helmet package.

PHP

For information about setting headers in a response that is generated from your script, see the instructions that are provided by the web framework you use for your app. If you aren't using a web framework, use the PHP header function.

Python

To set this header for responses that are generated from your code, use the flask-talisman library.
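
If you prefer not to add a dependency, the header can also be attached by a small piece of WSGI middleware. The following is a minimal sketch using only the WSGI calling convention; `HSTSMiddleware` is a hypothetical class, and the header value is the example value shown earlier in this section.

```python
HSTS_VALUE = "max-age=31536000; includeSubDomains"


class HSTSMiddleware:
    """WSGI middleware that adds Strict-Transport-Security to every response."""

    def __init__(self, app, value=HSTS_VALUE):
        self.app = app
        self.value = value

    def __call__(self, environ, start_response):
        def start_with_hsts(status, headers, exc_info=None):
            # Drop any existing value so the header is not duplicated.
            kept = [
                (name, val)
                for name, val in headers
                if name.lower() != "strict-transport-security"
            ]
            kept.append(("Strict-Transport-Security", self.value))
            return start_response(status, kept, exc_info)

        return self.app(environ, start_with_hsts)
```

With a Flask app, the middleware could be applied as `app.wsgi_app = HSTSMiddleware(app.wsgi_app)`.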

Ruby

In Rails, Strict-Transport-Security is enabled by default for responses that are generated from your code. For more information, see the config.force_ssl configuration method.

Handling asynchronous background work

Background work is any work that your app performs for a request after you have delivered your HTTP response. Avoid performing background work in your app, and review your code to make sure all asynchronous operations finish before you deliver your response.
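
One way to make sure asynchronous operations finish before the response is delivered is to wait on them explicitly inside the handler. The following is a minimal Python sketch using the standard library's thread pool. The `handle_request` function, its `tasks` argument, and the dictionary it returns are illustrative assumptions, not an App Engine or framework API.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(tasks):
    """Run side tasks concurrently and respond only after all have finished."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() blocks until each task completes and re-raises its errors,
        # so no work is still running when the response is built.
        results = [future.result() for future in futures]
    return {"status": 200, "results": results}
```

Because the handler blocks on every future, the time the tasks take counts toward the request timeout, which is why longer jobs are better moved out of the request.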

For long-running jobs, we recommend using Cloud Tasks. With Cloud Tasks, HTTP requests are long-lived and return a response only after any asynchronous work ends.