Backends Python API Overview

The Backend API is deprecated as of March 13, 2014. Although Google will continue to support the Backend API in accordance with our terms of service, it is strongly recommended that all new applications use the Modules API instead.

For information on converting existing apps using the Backend API to the Modules API, see Converting Apps to Modules.

Backends are designed for applications that need faster performance, large amounts of addressable memory, and continuous or long-running background processes. They are exempt from request deadlines and have access to more memory (up to 1GB) and CPU (up to 4.8GHz) than normal instances. Unlike normal instances, backends are billed for uptime rather than CPU usage.

A backend may be configured as either resident or dynamic. Resident backends run continuously, allowing you to rely on the state of their memory over time and perform complex initialization. Dynamic backends come into existence when they receive a request and are turned down when idle. Dynamic backends are ideal for work that is intermittent or driven by user activity. For more information about the differences between resident and dynamic backends, see Types of Backends and also the discussion of Startup and Shutdown.

Backends do not automatically scale in response to request volume. Instead, you must specify the number of backend instances you require and change this number by performing an update or configure command. The number of instances is usually set in proportion to the size of a dataset, the degree of processing power required, and your budget for your application.

Backends are configured using backends.yaml. You can list each backend and specify its properties, such as the number of instances, the memory and CPU class, whether it is public or private, and other options. Backends share the handlers defined in app.yaml with the main application version. You can place your backends in a separate application root directory if you want to avoid sharing code or handlers, or simply mark the relevant handlers with login: admin.

Properties of backends

The following table compares backend instances to default App Engine instances.


Deadlines:
  • Default instances: 60-second deadline for HTTP requests, 10-minute deadline for tasks.
  • Backend instances: Requests to backends can run indefinitely. A backend can choose to handle `/_ah/start` and execute a program or script for many hours without returning an HTTP response code.

CPU:
  • Default instances: Flexible, billed by instance hours.
  • Backend instances: Configurable from 600MHz–4.8GHz, included in the hourly price of the instance uptime. See Instance Classes for more information.

Memory:
  • Default instances: Low memory cap (128MB).
  • Backend instances: Configurable memory limit, from 128MB to 1GB of memory per instance. See Instance Classes for more information.

Residence:
  • Default instances: Instances are evicted from memory based on usage patterns.
  • Backend instances: You can configure backends to use resident instances, which remain in memory, so state is preserved across requests. When backends are restarted, you usually have 30 seconds to finish before shutdown (see Shutdown for more information).

Startup and Shutdown:
  • Default instances: Instances are created on demand to handle requests and automatically turned down when idle.
  • Backend instances: Backends are sent a start request automatically by App Engine in the form of an empty request to `/_ah/start`. A backend that is stopped with appcfg backends stop or using Shutdown in the Administration Console has 30 seconds to finish handling requests before it is forcibly terminated. See Backend States for more information about startup and shutdown.

Instance Addressability:
  • Default instances: Instances are anonymous.
  • Backend instances: Instances are individually addressable at a URL of this form:


If you have set up a wildcard subdomain mapping for a custom domain, you can also address a backend or any of its instances via a URL of the form


where instance is a small integer identifying a particular instance and where backend_name is the name specified in the backend configuration file.

Alternatively, you can use the URL form:


You can reliably cache state in each instance and retrieve it in subsequent requests.

Scaling:
  • Default instances: App Engine scales the number of instances automatically in response to processing volume.
  • Backend instances: You configure the number of instances of each backend in `backends.yaml`. The number of instances usually corresponds to the size of a dataset being held in memory or the desired throughput for offline work. A dynamic backend is configured with a maximum number of instances; the number of live instances scales with the processing volume. You can adjust the number of instances of a backend very quickly, without stopping instances that are currently running, using the `configure` command.

Public vs. Private HTTP Requests:
  • Default instances: Instances can handle private and public requests.
  • Backend instances: Instances handle private requests by default, but you can configure them to handle public requests.


App Engine provides the following command-line tools for interacting with the backends configured in backends.yaml.

When running, pass --backends to enable backends support.

appcfg backends <dir> update [backend]
Creates or updates a backend configured in backends.yaml. You can update a single backend by specifying a backend name, or all of the backends by running this command without specifying a backend. If the backend was in the START state before the update, it remains started after the update; if it was in the STOP state, it remains stopped after the update. Updating a backend causes instances that are currently running to shut down before new instances come up. Instances will not be shut down until the final step of the update, so that any incremental errors during an update do not result in downtime.
appcfg backends <dir> rollback <backend>
Rolls back a backend update that was interrupted by the user or stopped due to a configuration error. Updates that have been successfully applied are not eligible for rollback.
appcfg backends <dir> list
Lists all the backends configured for the app specified in dir/app.yaml.
appcfg backends <dir> start <backend>
Sets the backend state to START, allowing it to receive HTTP requests. Resident backends start immediately. Dynamic backends do not start until the first user request arrives. Has no effect if the backend was already started.
appcfg backends <dir> stop <backend>
Sets the backend state to STOP and shuts down any running instances. The stopped backend cannot receive HTTP requests; if it receives a request, it returns a 404 response. This command has no effect if the backend was already stopped.
appcfg backends <dir> delete <backend>
Deletes the indicated backend. After issuing this command, requests to the deleted backend are routed to the default app version.
appcfg backends <dir> configure <backend>

Dynamically updates settings in backends.yaml without having to stop the backend. Does not affect any code or handlers. Supports only the following configuration settings:

  • instances
  • options: public
  • options: failfast

Backend states

A backend can be in one of two states: START or STOP. You can view and change the state of a backend using appcfg or the Backends tab in the Administration Console. The state controls whether a backend instance is considered active and capable of handling requests, or is disabled and shutting down.


Each backend instance is created in response to a start request, which is an empty GET request to /_ah/start. App Engine sends this special request to bring the instance into existence; users cannot send requests to /_ah/start.

Backend instances must respond to the start request before they can handle another request. The start request can be used for two purposes:

  1. To start a program that runs indefinitely, without accepting further requests
  2. To initialize an instance before it receives additional traffic
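As an illustration of the second purpose, the following is a minimal sketch of a WSGI application that warms an in-memory cache when it receives the start request. It uses only the Python standard library. The names STATE and load_initial_state are hypothetical and are not part of App Engine; a real backend would load its own data at that point.

```python
# Minimal WSGI sketch: initialize instance state on the start request.
# STATE and load_initial_state() are hypothetical, for illustration only.
STATE = {}


def load_initial_state():
    """Stand-in for expensive initialization, such as loading a dataset."""
    return {"ready": True, "items": [1, 2, 3]}


def application(environ, start_response):
    path = environ.get("PATH_INFO", "")
    if path == "/_ah/start":
        # Finish initialization first, then answer with a 2xx status so
        # the instance is considered started.
        STATE.update(load_initial_state())
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"started"]
    # Ordinary requests can rely on the state loaded at startup.
    body = ("items: %d" % len(STATE.get("items", []))).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

Because the start request must complete before the instance handles other traffic, any work done in that branch delays the first user request; keeping it short also helps with restarts.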

Resident backends and dynamic backends start up differently. When you start a resident backend, App Engine immediately sends a /_ah/start request to each backend instance. When you start a dynamic backend, App Engine allows the backend to accept traffic, but the /_ah/start request is not sent to an instance until App Engine receives the first user request that invokes the backend. Multiple dynamic instances are only started as necessary, in order to handle increased traffic.

When an instance responds to the /_ah/start request with an HTTP status code of 200–299 or 404, it is considered to have successfully started and can handle additional requests. If it responds otherwise, App Engine terminates the instance and re-issues the start request if necessary. Resident backends are restarted immediately, while dynamic backends are restarted only when needed for serving traffic. You can specify a different start handler for each backend in backends.yaml. This syntax is experimental.
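The success rule for the start request can be written as a one-line predicate. This is only a restatement of the paragraph above in code; the function name start_succeeded is hypothetical and is not an App Engine API.

```python
def start_succeeded(status_code):
    """Return True if a start-request response counts as a successful start.

    Follows the rule described above: any status from 200 to 299, or 404.
    """
    return 200 <= status_code <= 299 or status_code == 404
```

Any other status, such as a 500 raised during initialization, would cause the instance to be terminated and the start request to be re-issued if necessary.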


When App Engine needs to turn down a backend instance, existing requests are given 30 seconds to complete, and new requests immediately return 404. The shutdown process may be triggered by a variety of planned and unplanned events, such as:

  • You manually stop the backend using the Administration Console or appcfg backends <dir> stop.
  • You update the backend using appcfg backends <dir> update.
  • Your backend exceeds the maximum memory for its configured class.
  • Your application runs out of Backends Usage quota.
  • The machine running the backend is restarted, forcing your backend to move to a different machine.
  • App Engine needs to move your backend to a different machine to improve load distribution.

If possible, App Engine notifies the backend 30 seconds before terminating it. There are two ways to receive this notification. First, the is_shutting_down() method from google.appengine.api.runtime begins returning True. Second, if you have registered a shutdown hook, it will be called. It's a good idea to register the shutdown hook in your start request.

If you are handling a request, App Engine pauses that request and runs the hook. If you are not handling a request, App Engine sends an /_ah/stop request, which runs the shutdown hook. The /_ah/stop request bypasses normal handling logic and cannot be handled by user code; its sole purpose is to invoke the shutdown hook. If you raise an exception in your shutdown hook while handling another request, it will bubble up into the request, where you can catch it.

If you are using threadsafe in Python 2.7, raising an exception from a shutdown hook copies that exception to all threads.

The following code sample demonstrates a basic shutdown hook:

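from google.appengine.api import apiproxy_stub_map
from google.appengine.api import runtime

def my_shutdown_hook():
  # Cancel any API calls that are still in flight.
  apiproxy_stub_map.apiproxy.CancelApiCalls()
  save_state()  # hypothetical helper that checkpoints in-memory state
  # May want to raise an exception

runtime.set_shutdown_hook(my_shutdown_hook)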


Alternatively, the following sample demonstrates how to use the is_shutting_down() method:

while more_work_to_do and not runtime.is_shutting_down():
  do_some_work()  # hypothetical: one unit of work, ideally followed by a checkpoint

Backend uptime

App Engine attempts to keep backends running indefinitely. However, at this time there is no guaranteed uptime for backends. Hardware and software failures that cause early termination or frequent restarts can occur without prior warning and may take considerable time to resolve; thus, you should construct your application in a way that tolerates these failures. The App Engine team will provide more guidance on expected backend uptime as statistics become available.

Good strategies for avoiding downtime due to backend restarts include:

  • Load balancing across multiple backend instances
  • Configuring more backend instances than are normally required to handle your traffic patterns
  • Writing fall-back logic that uses cached results when a backend is unavailable
  • Reducing the amount of time it takes for your backend to start up and shut down
  • Duplicating the same state in more than one backend instance
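The fall-back strategy in the list above can be sketched as follows. This is an illustration under stated assumptions, not an App Engine API: BackendUnavailable, fetch_with_fallback, and the call_backend callable are hypothetical names, and the cache is a plain dictionary standing in for whatever cache the application uses.

```python
class BackendUnavailable(Exception):
    """Hypothetical error raised by the caller's own backend client."""


def fetch_with_fallback(call_backend, cache, key):
    """Return (value, served_from_cache).

    Tries the backend first. If the backend is unavailable and a cached
    result exists for the key, the cached result is returned instead.
    """
    try:
        value = call_backend(key)
    except BackendUnavailable:
        if key in cache:
            return cache[key], True
        raise  # nothing cached, so the failure is passed to the caller
    cache[key] = value  # remember the fresh result for later fall-back
    return value, False
```

Returning a flag alongside the value lets the caller decide whether a possibly stale result is acceptable for the request at hand.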

It's also important to recognize that the shutdown hook is not always able to run before a backend terminates. In rare cases, an outage can occur that prevents App Engine from providing 30 seconds of shutdown time. Thus, we recommend periodically checkpointing the state of your backend and using it primarily as an in-memory cache rather than a reliable data store.
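A minimal sketch of periodic checkpointing follows. It is written against the standard library only, so the shutdown check and the storage call are passed in as parameters: is_shutting_down stands in for runtime.is_shutting_down, and save is a hypothetical callable that persists a byte string somewhere durable. The function name and the "total" key are also illustrative.

```python
import pickle


def process_with_checkpoints(items, state, save, is_shutting_down, every=100):
    """Process items, checkpointing every `every` items and once at the end.

    Stops early when is_shutting_down() returns True.
    Returns the number of items processed.
    """
    processed = 0
    for item in items:
        if is_shutting_down():
            break
        state["total"] = state.get("total", 0) + item
        processed += 1
        if processed % every == 0:
            save(pickle.dumps(state))  # periodic checkpoint
    save(pickle.dumps(state))  # final checkpoint, also reached on shutdown
    return processed
```

With this shape, an instance that is terminated without warning loses at most the work done since the last periodic checkpoint.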

Addressing backends

A backend instance can be targeted with HTTP requests to http://[instance]-dot-[backend_name]-dot-[your_app_id], or at your application's custom domain. If you target a backend without targeting an instance using http://[backend_name]-dot-[your_app_id], App Engine selects the first available instance of the backend.

The Backends API provides functions to retrieve the address of a backend or instance. This allows application versions to send requests to backends, a backend to target another backend, or one instance of a backend to target another instance. This works in both the development and production environments.
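One reason to target a specific instance is that it holds a particular slice of in-memory state. The following sketch maps a key to a stable instance index so that requests for the same key always reach the same instance. It assumes instance indexes run from 0 to num_instances - 1, which this page does not state explicitly; instance_for_key is a hypothetical helper, not part of the Backends API.

```python
import zlib


def instance_for_key(key, num_instances):
    """Map a string key to a stable index in range(num_instances)."""
    if num_instances < 1:
        raise ValueError("num_instances must be at least 1")
    # crc32 gives the same result in every process, unlike the built-in
    # hash(), which may vary between interpreter runs.
    checksum = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return checksum % num_instances
```

Note that changing the configured number of instances remaps most keys with this scheme, so cached state would need to be rebuilt after a resize.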

The BACKEND_ID and INSTANCE_ID environment variables contain the backend name and instance index of the instance handling the request.

Note: Google recommends using the HTTPS protocol to send requests to your app. Google does not issue SSL certificates for double-wildcard domains hosted at Therefore with HTTPS you must use the string "-dot-" instead of "." to separate subdomains, as shown in the examples below. You can use a simple "." with your own custom domain or with HTTP addresses.
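To make the separator rule concrete, here is a small sketch that assembles a hostname from its parts. It is an illustration of the URL shape only and does not replace the address functions in the Backends API. The function name backend_hostname is hypothetical, and app_host is a placeholder for your application's host name.

```python
def backend_hostname(app_host, backend_name, instance=None, sep="-dot-"):
    """Build the hostname for a backend, or for one instance of it.

    sep is "-dot-" by default (the form needed for HTTPS, per the note
    above); pass "." for a custom domain or plain HTTP.
    """
    parts = [backend_name, app_host]
    if instance is not None:
        parts.insert(0, str(instance))
    return sep.join(parts)


# Usage with a placeholder host name:
# backend_hostname("myapp.example.com", "worker", 3)
#   -> "3-dot-worker-dot-myapp.example.com"
```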

Public and private backends

Backends can be public, that is, directly exposed in some way to users, or they can be private, not user-facing at all. Backends are private by default, since they typically function as a component inside an application, rather than acting as its public face. Private backends can be accessed by application administrators, instances of the application, and by App Engine APIs and services (such as Task Queue tasks and Cron jobs) without any special configuration. Backends are not primarily intended for user-facing traffic, but you can make a backend public for testing or for interacting with an external system.

A backend that is used for offline processing can typically be taken down, updated, or resized without any user-visible effects. When a backend is incorporated into a user-visible request flow, updating it requires more care. Most updates to a backend require existing instances to shut down and new instances to start up. If startup and shutdown are fast, then updates can be performed rapidly with minimal impact on user traffic. You can update certain backend settings without restarting the backend using the appcfg backends configure command.

Background threads

Code running on a backend can start a background thread, a thread that can "outlive" the request that spawns it. Background threads allow backend instances to perform arbitrary periodic or scheduled tasks or to continue working in the background after a request has returned to the user.

A background thread's os.environ and logging entries are independent of those of the spawning thread.

The background thread API is defined in google.appengine.api.background_thread. Its BackgroundThread class is like the regular Python threading.Thread class, but can "outlive" the request that spawns it:

from google.appengine.api import background_thread

def f(arg1, arg2, **kwargs):
  pass  # ...something useful...

t = background_thread.BackgroundThread(target=f, args=["foo", "bar"])
t.start()

There is a function start_new_background_thread which creates a background thread and starts it:

from google.appengine.api import background_thread

def f(arg1, arg2, **kwargs):
  pass  # ...something useful...

tid = background_thread.start_new_background_thread(f, ["foo", "bar"])

The maximum number of concurrent background threads is 10.

Monitoring resource usage

The Instances Console section of the Administration Console provides visibility into how backend instances are performing. By selecting your backend in the version/backend dropdown, you can see the memory and CPU usage of each instance, uptime, number of requests, and other statistics. You can also manually initiate the shutdown process for any instance.

You can also use the Runtime API to access statistics showing the CPU and memory usage of your backend instances. These statistics help you understand how resource usage responds to requests or work performed, and how to regulate the amount of data stored in memory in order to stay below the memory limit of your backend class.

Periodic logging

Application logs are automatically flushed periodically during backend requests. You can tune the flush settings, or force an immediate flush, using the Logs API. When a flush occurs, a new log entry is created at the time of the flush, containing any log messages that had not been flushed yet. These entries show up in the Logs Console marked with flush, and include the start time of the request that generated the flush.

Fetching Request Logs and Application Logs

For programmatic access to the request and application logs for your application, you can use the Logs API, in particular its fetch() function. This feature allows you to retrieve logs using various filters, such as request ID, timestamp, and version ID.

Administering backends

The Backends Console section of the Administration Console provides a list of all backends and gives you the option to start, stop, or delete a backend. Backends are listed alphabetically, along with any options you choose when configuring the backend (such as the option to make the backend dynamic).

Billing, quotas, and limits


Backends are priced based on an hourly rate determined by the backend class. The following table describes the cost for each class:

Class configuration    Memory limit    CPU limit    Cost per hour per instance
B1                     128MB           600MHz       $0.05
B2 (default)           256MB           1.2GHz       $0.10
B4                     512MB           2.4GHz       $0.20
B4_1G                  1024MB          2.4GHz       $0.30
B8                     1024MB          4.8GHz       $0.40

In general, backend usage is billed on an hourly basis based on the backend's uptime. Billing begins when the backend starts and ends fifteen minutes after the backend shuts down. Runtime overhead is counted against the instance memory limit. This will be higher for Java than for Python.

Billing is slightly different in resident and dynamic backends:

  • For resident backends, billing ends fifteen minutes after the backend is shut down.
  • For dynamic backends, billing ends fifteen minutes after the last request has finished processing.
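As a rough worked example of these rules, the sketch below estimates the cost of one continuous run. The hourly rates are copied from the table above. Everything else is an assumption made for illustration: that partial hours are billed proportionally with no rounding, that every instance runs for the whole period, and that the fifteen-minute tail applies once per run. The names HOURLY_RATE, TAIL_HOURS, and estimate_cost are hypothetical.

```python
from decimal import Decimal

# Hourly rates per instance, taken from the class table above (USD).
HOURLY_RATE = {
    "B1": Decimal("0.05"),
    "B2": Decimal("0.10"),
    "B4": Decimal("0.20"),
    "B4_1G": Decimal("0.30"),
    "B8": Decimal("0.40"),
}
TAIL_HOURS = Decimal("0.25")  # billing continues for fifteen minutes


def estimate_cost(backend_class, instances, uptime_hours):
    """Estimate the cost of one continuous run, in USD.

    Assumes proportional billing of partial hours, which this page
    does not confirm.
    """
    billed_hours = Decimal(str(uptime_hours)) + TAIL_HOURS
    return HOURLY_RATE[backend_class] * instances * billed_hours
```

For instance, two B2 instances running for 10 hours would be billed for 10.25 hours each under these assumptions, giving 0.10 × 2 × 10.25 = 2.05 USD.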

Changing the class of a backend also impacts billing:

  • If you switch from a higher class to a lower class, you pay the higher price for fifteen minutes after the change.
  • If you switch from a lower class to a higher class, you begin paying the higher price when the higher-class backend receives its first request.

Quotas and limits

Backends are exempt from the 60-second deadline for user requests and the 10-minute deadline for tasks, and can run indefinitely. Backends are subject to the same API quotas, limits, and call deadlines as normal instances, with the following exceptions:

  • Backends are allowed to make up to 100 simultaneous API calls
  • Task Queue tasks have a 24-hour deadline when sent to a backend

The default limits for the number and size of backends depend on the location of the application and whether it has billing enabled.

Resource Type Free App Limit Billed App Limit
Backends per application 5 20 20
Instances per backend 20 200 25
Total configured backend memory, either resident or dynamic, per app 10GB 200GB 50GB
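The sketch below checks a planned configuration against the free-app limits in the first column of the table above. It assumes that "total configured backend memory" means the sum, over all backends, of the instance count times the memory limit of the backend's class, and that 1GB equals 1024MB; neither is stated on this page. The function and constant names are hypothetical.

```python
# Memory limit per class in MB, from the class table earlier on this page.
CLASS_MEMORY_MB = {"B1": 128, "B2": 256, "B4": 512, "B4_1G": 1024, "B8": 1024}

# Free-app limits from the table above.
FREE_MAX_BACKENDS = 5
FREE_MAX_INSTANCES = 20
FREE_MAX_MEMORY_MB = 10 * 1024  # 10GB, assuming 1GB = 1024MB


def check_free_limits(backends):
    """Return a list of problems for a list of (name, class, instances)."""
    problems = []
    if len(backends) > FREE_MAX_BACKENDS:
        problems.append("too many backends: %d" % len(backends))
    total_mb = 0
    for name, backend_class, instances in backends:
        if instances > FREE_MAX_INSTANCES:
            problems.append("%s: too many instances: %d" % (name, instances))
        total_mb += CLASS_MEMORY_MB[backend_class] * instances
    if total_mb > FREE_MAX_MEMORY_MB:
        problems.append("total memory too high: %dMB" % total_mb)
    return problems
```

An empty list means the configuration fits within the free limits under these assumptions; for example, five B8 instances total 5120MB, which is below the 10GB cap.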
