Python 2.7 has reached end of support and will be deprecated on January 31, 2026. After deprecation, you won't be able to deploy Python 2.7 applications, even if your organization previously used an organization policy to re-enable deployments of legacy runtimes. Your existing Python 2.7 applications will continue to run and receive traffic after their deprecation date. We recommend that you migrate to the latest supported version of Python.

Managing App Resources

App Engine generates usage reports about your application's performance and resource utilization. Listed below are potential strategies for managing your resources more efficiently. For more information, see the pricing page.

Viewing usage reports

When evaluating application performance, you should check the number of instances the application is running, and how the application consumes resources.

View the dashboard usage reports

View the Instances page

The following sections suggest some strategies for managing resources.

Managing dynamic instance scaling

Decreasing latency

Application latency impacts the number of instances that are required to handle your traffic. By decreasing latency, you can reduce the number of instances used to serve your application. Cloud Trace is a useful tool to view data about latency and understand potential changes to decrease it.

After using Cloud Trace to view your latency, try some of the following strategies to reduce latency:

Increase caching of frequently accessed shared data - In other words, use App Engine Memcache. Setting your application's cache-control headers can also have a significant impact on how efficiently your data is cached by servers and browsers. Caching responses for even a few seconds can improve how efficiently your application serves traffic. Python applications should also make use of caching in the runtime.
Use App Engine Memcache more efficiently - Use batch calls for get, set, delete, and similar operations instead of a series of individual calls. Consider using the Memcache Async API.
Use tasks for non-request-bound functionality - If your application performs work that can be done outside the scope of a user-facing request, put it in a task. Sending this work to Task Queue instead of waiting for it to complete before returning a response can significantly reduce user-facing latency. Task Queue also gives you much more control over execution rates and helps smooth out your load.
Use Firestore in Datastore mode (Datastore) more efficiently - See the Managing Datastore usage section below for more detail.
Execute multiple URL Fetch calls in parallel:
- Instead of handling multiple URL Fetch calls one at a time inside individual user-facing requests, batch them together and handle them in parallel in an offline task via async URL Fetch.
- Use the async URL Fetch API.
For HTTP sessions, write asynchronously.
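
The advice about caching in the runtime can be sketched as a small in-process cache with a time-to-live. This is only an illustration, not an App Engine API: `TTLCache`, `get_or_compute`, and `load_greeting` are hypothetical names, and the injectable clock is an assumption made so that expiry is easy to test. Instance memory is not shared between instances, so a cache like this would complement Memcache rather than replace it.

```python
import time


class TTLCache(object):
    """Hypothetical in-process cache; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # key -> (expiry_timestamp, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: skip the expensive call
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value


def load_greeting():
    # Stand-in for an expensive lookup (Datastore, URL Fetch, and so on).
    return "hello"


cache = TTLCache(ttl_seconds=5)
greeting = cache.get_or_compute("greeting", load_greeting)
```

This sketch is not thread-safe as written; if concurrent requests are enabled, access to the shared dictionary would need a lock.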

Change auto-scaling performance settings

The app.yaml configuration file contains several settings you can use to adjust the trade-off between performance and resource load for a specific version of your app. For a list of the available auto-scaling settings, see scaling elements. Watch the App Engine New Scheduler Settings video to see the effects of these settings.

Enable concurrent requests in Python

Your application's instances can serve multiple requests concurrently in Python. Enabling this setting decreases the number of instances needed to serve traffic for your application, but your application must be thread-safe for this to work correctly. Read about how to use concurrent requests by enabling threadsafe in your app.yaml file.
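
In practice, being thread-safe mostly means that state shared between requests is protected from concurrent modification. The sketch below is a generic illustration using the standard `threading` module, not App Engine-specific code; `HitCounter` and `simulate_requests` are hypothetical names. Without the lock, the read-modify-write in `increment` could lose updates when several request threads run it at once.

```python
import threading


class HitCounter(object):
    """Hypothetical shared counter that is safe to update from many threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # The lock makes the read-modify-write below atomic across threads.
        with self._lock:
            self._count += 1

    def value(self):
        with self._lock:
            return self._count


def simulate_requests(counter, threads=8, hits_per_thread=1000):
    """Call increment() from several threads, as concurrent requests would."""
    def worker():
        for _ in range(hits_per_thread):
            counter.increment()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value()
```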

Configuring Task Queue settings

The default settings for Task Queue are tuned for performance. With these defaults, when you put several tasks into a queue simultaneously, they will likely cause new Frontend Instances to start. Here are some suggestions for how to tune Task Queue to conserve Instance Hours:

Set the X-AppEngine-FailFast header on tasks that are not latency sensitive. This header instructs the scheduler to immediately fail the request if an existing instance is not available. Task Queue will retry and back off until an existing instance becomes available to service the request. Note, however, that when requests with X-AppEngine-FailFast set occupy existing instances, requests without that header set may still cause new instances to be started.

Configure your Task Queue's settings.

If you set the "rate" parameter to a lower value, Task Queue will execute your tasks at a slower rate.
If you set the "max_concurrent_requests" parameter to a lower value, fewer tasks will be executed simultaneously.
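
To see why a lower "rate" spreads work out over time, the sketch below models a generic token bucket in which every task is enqueued at time zero. This is an assumed simplification, not Task Queue's actual scheduler: `dispatch_times` and its `bucket_size` parameter are hypothetical names chosen for illustration, and the model ignores "max_concurrent_requests", retries, and task duration.

```python
def dispatch_times(num_tasks, rate_per_second, bucket_size=1):
    """Return the earliest start time, in seconds, for each queued task.

    Models a token bucket that starts full: up to bucket_size tasks start
    at once, then one more task may start every 1 / rate_per_second seconds.
    """
    if rate_per_second <= 0 or bucket_size < 1:
        raise ValueError("rate_per_second must be > 0 and bucket_size >= 1")
    times = []
    for i in range(num_tasks):
        if i < bucket_size:
            times.append(0.0)  # covered by tokens already in the bucket
        else:
            times.append((i - bucket_size + 1) / float(rate_per_second))
    return times


# Halving the rate doubles the spacing between tasks after the initial burst.
fast = dispatch_times(5, rate_per_second=2, bucket_size=2)  # [0.0, 0.0, 0.5, 1.0, 1.5]
slow = dispatch_times(5, rate_per_second=1, bucket_size=2)  # [0.0, 0.0, 1.0, 2.0, 3.0]
```

Under this model, a slower start rate gives existing instances more time to finish earlier tasks, which is why it is less likely to trigger new instances.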

Serve static content where possible

Static content serving in Python is handled by specialized App Engine infrastructure, which does not consume Instance Hours. If you need to set custom headers, use the Blobstore API. The actual serving of the Blob response does not consume Instance Hours.

Managing application storage

App Engine calculates storage costs based on the size of entities in the Datastore, the size of Datastore indexes, the size of tasks in the task queue, and the amount of data stored in Blobstore. Here are some things you can do to make sure you don't store more data than necessary:

Delete any entities or blobs your application no longer needs.
Remove any unnecessary indexes, as discussed in the Managing Datastore usage section below, to reduce index storage costs.

Managing Datastore usage

App Engine accounts for the number of operations performed in Datastore. Here are a few strategies that can result in reduced Datastore resource consumption, as well as lower latency for requests to Datastore:

The Google Cloud console dataviewer displays the number of write ops that were required to create every entity in your local Datastore. You can use this information to understand the cost of writing each entity. See Understanding Write Costs for information on how to interpret this data.
Remove any unnecessary indexes, which will reduce storage and entity write costs. Use the "Get Indexes" functionality to see what indexes are defined on your application. You can see what indexes are currently serving for your application in the Google Cloud console Search page.
When designing your data model, you might be able to write your queries in a way that avoids custom indexes altogether. Read the Queries and Indexes documentation for more information on how App Engine generates indexes.
Whenever possible, replace indexed properties (which are the default) with unindexed properties (Python), which reduces the number of Datastore write operations when you put an entity. Caution: if you later decide that you need to query on the unindexed property, you will need to modify your code to use an indexed property again and also run a MapReduce job over all entities to re-put them.
Due to Datastore query planner improvements in App Engine 1.5.2 and 1.5.3 releases, your queries may now require fewer indexes than they did previously. While you may still choose to keep certain custom indexes for performance reasons, you may be able to delete others, reducing storage and entity write costs.
Reconfigure your data model so that you can replace queries with fetches by key, which are cheaper and more efficient.
Use keys-only queries instead of entity queries when possible.
To decrease latency, replace multiple entity get()s with a batch get().
Use Datastore cursors for pagination rather than offset.
Parallelize multiple Datastore RPCs via the async Datastore API.
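
The advantage of cursors over offsets can be illustrated without Datastore itself. The sketch below uses a sorted in-memory list as a stand-in for an index; `ROWS`, `fetch_page_by_offset`, and `fetch_page_by_cursor` are hypothetical names, not Datastore API calls, and the "rows scanned" figures are a model rather than a measurement. The point it shows is that an offset still walks past every skipped row, while a cursor resumes directly after the last row returned.

```python
import bisect

# Stand-in for an index sorted by key. Real Datastore cursors are opaque
# strings; here the last key of a page plays that role.
ROWS = ["key%03d" % i for i in range(100)]


def fetch_page_by_offset(offset, page_size):
    """Return (page, rows_scanned); skipped rows are still scanned."""
    page = ROWS[offset:offset + page_size]
    return page, offset + len(page)


def fetch_page_by_cursor(cursor, page_size):
    """Return (page, next_cursor, rows_scanned), resuming after cursor."""
    start = 0 if cursor is None else bisect.bisect_right(ROWS, cursor)
    page = ROWS[start:start + page_size]
    next_cursor = page[-1] if page else cursor
    return page, next_cursor, len(page)


# Fetch the second page both ways and compare the modeled work.
first_page, cursor, _ = fetch_page_by_cursor(None, 10)
by_cursor, cursor, cursor_scanned = fetch_page_by_cursor(cursor, 10)
by_offset, offset_scanned = fetch_page_by_offset(10, 10)
```

Both calls return the same ten rows, but in this model the offset version scans 20 rows and the cursor version scans 10, and the gap grows with each later page.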

Note: Small Datastore operations include calls to allocate Datastore IDs or keys-only queries. See the pricing page for more information on costs.

Managing bandwidth

To reduce outgoing bandwidth, you can set the appropriate Cache-Control header on your responses and set reasonable expiration times for static files. Using public Cache-Control headers in this way allows proxy servers and your clients' browsers to cache responses for the designated period of time.
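
As a minimal sketch of such a header, the helper below builds a public Cache-Control value with a max-age. The function name `cache_control_header` is hypothetical, and how you attach the resulting pair to a response depends on the web framework your application uses.

```python
def cache_control_header(max_age_seconds, public=True):
    """Build a (name, value) pair for a Cache-Control response header."""
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must be non-negative")
    # "public" lets shared proxies cache the response; "private" limits
    # caching to the end user's browser.
    visibility = "public" if public else "private"
    return ("Cache-Control", "%s, max-age=%d" % (visibility, max_age_seconds))


# Allow proxies and browsers to reuse the response for one hour.
name, value = cache_control_header(3600)
```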

Incoming bandwidth is more difficult to control, since that's the amount of data your users are sending to your app. However, you can use App Engine firewall rules to allow or restrict ranges of IP addresses and subnets.

Managing other resources

One of the best strategies for auditing your usage of the Email API is to use Appstats to make sure you're not making more calls than necessary. Also check your error rates regularly and look out for any invalid calls you might be making. In some cases, it might be possible to catch those calls early.