App Engine generates usage reports about your application's performance and resource utilization. Listed below are potential strategies for managing your resources more efficiently. For more information, see the pricing page.
Viewing usage reports
When evaluating application performance, you should check the number of instances the application is running, and how the application consumes resources.
The following sections suggest some strategies for managing resources.
Managing dynamic instance scaling
Application latency impacts the number of instances that are required to handle your traffic. By decreasing latency, you can reduce the number of instances used to serve your application. Stackdriver Trace is a useful tool to view data about latency and understand potential changes to decrease it.
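As a rough illustration of why latency drives instance count, the sketch below applies Little's law (average requests in flight = arrival rate × latency). The function name `estimate_instances` and its parameters are hypothetical, and the App Engine scheduler weighs other factors as well, so treat the result as an order-of-magnitude estimate only.

```python
import math


def estimate_instances(requests_per_second, avg_latency_ms,
                       concurrent_requests_per_instance=1):
    """Rough estimate of the instances needed to carry a steady load.

    Little's law: average requests in flight = arrival rate * latency.
    Dividing by per-instance concurrency gives a lower bound on instances.
    This is an approximation, not the scheduler's actual algorithm.
    """
    in_flight = requests_per_second * avg_latency_ms / 1000.0
    return max(1, int(math.ceil(in_flight / concurrent_requests_per_instance)))


# Halving latency from 400 ms to 200 ms at 50 requests per second.
before = estimate_instances(50, 400)
after = estimate_instances(50, 200)
```

Under these assumptions, halving average latency halves the estimate, and serving several requests per instance reduces it further.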
After using Stackdriver Trace to view your latency, try some of the following strategies to reduce latency:
- Increase caching of frequently accessed shared data - In other words, use App Engine Memcache. Setting your application's cache-control headers can also significantly affect how efficiently your data is cached by servers and browsers. Even caching data for a few seconds can improve how efficiently your application serves traffic. Python applications should also make use of caching in the runtime.
- Use App Engine Memcache more efficiently - Use batch calls for get, set, delete, and so on, instead of a series of individual calls. Consider using the Memcache Async API.
- Use tasks for non-request bound functionality - If your application performs work that can be done beyond the scope of a user-facing request, put it in a task! Sending this work to Task Queue instead of waiting for it to complete before returning a response can significantly reduce user-facing latency. Task Queue also gives you much more control over execution rates and helps smooth out your load.
- Use Cloud Datastore more efficiently - See below for more detail.
- Execute multiple URL Fetch calls in parallel:
  - Batch together multiple URL Fetch calls instead of handling them individually inside individual user-facing requests, and handle them in an offline task in parallel via async URL Fetch.
  - Use the async URL Fetch API.
- For HTTP sessions, write asynchronously.
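The Memcache batching advice in the list above can be sketched as a read-through helper. `get_multi` and `set_multi` are the batch calls offered by App Engine's `memcache` module; the helper name `fetch_with_cache`, the `load_missing` callback, and the in-memory `FakeCache` (a stand-in so the sketch runs outside App Engine) are assumptions made for illustration.

```python
class FakeCache(object):
    """In-memory stand-in exposing get_multi/set_multi like memcache."""

    def __init__(self):
        self._data = {}
        self.calls = 0  # counts cache round trips for demonstration

    def get_multi(self, keys):
        self.calls += 1
        return {k: self._data[k] for k in keys if k in self._data}

    def set_multi(self, mapping, time=0):
        self.calls += 1
        self._data.update(mapping)
        return []  # memcache returns the keys that could not be set


def fetch_with_cache(cache, keys, load_missing, ttl_seconds=60):
    """Return {key: value} using one batch read and at most one batch write."""
    found = cache.get_multi(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_missing(missing)  # for example, one batch datastore read
        cache.set_multi(loaded, time=ttl_seconds)
        found.update(loaded)
    return found


cache = FakeCache()
loader = lambda ks: {k: k.upper() for k in ks}
first = fetch_with_cache(cache, ['a', 'b', 'c'], loader)   # miss: 2 cache calls
second = fetch_with_cache(cache, ['a', 'b', 'c'], loader)  # hit: 1 cache call
```

On App Engine, the `google.appengine.api.memcache` module itself could be passed as `cache`. However many keys are requested, the cache is contacted at most twice per call rather than once per key.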
Change auto-scaling performance settings
Your application's configuration file contains several settings you can use to adjust the trade-off between performance and resource load for a specific version of your app. For a list of the available auto-scaling settings, see the configuration file reference. Watch the App Engine New Scheduler Settings video to see the effects of these settings.
Enable concurrent requests in Python
Your application's instances can serve multiple requests concurrently in Python. Enabling this setting decreases the number of instances needed to serve traffic for your application, but your application must be threadsafe for this to work correctly. Read about how to use concurrent requests by enabling threadsafe in your application's configuration file.
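To make the threadsafe requirement concrete, here is a minimal sketch using only the standard library: any state shared between concurrently running requests has to be guarded. The `HitCounter` class and `simulate_requests` helper are hypothetical names, and the threads merely stand in for concurrent requests on one instance.

```python
import threading


class HitCounter(object):
    """Shared counter that stays correct when requests run concurrently."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # The read-modify-write must be atomic; without the lock, two
        # threads can read the same value and one update is lost.
        with self._lock:
            self._count += 1

    @property
    def value(self):
        with self._lock:
            return self._count


def simulate_requests(counter, threads=8, hits_per_thread=1000):
    """Call increment() from several threads, as concurrent requests would."""
    def handle():
        for _ in range(hits_per_thread):
            counter.increment()

    workers = [threading.Thread(target=handle) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value


total = simulate_requests(HitCounter())
```

Module-level caches, lazily initialized globals, and similar shared objects need the same treatment before concurrent requests are enabled.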
Configuring Task Queue settings
The default settings for Task Queue are tuned for performance. With these defaults, when you put several tasks into a queue simultaneously, they will likely cause new Frontend Instances to start. Here are some suggestions for how to tune Task Queue to conserve Instance Hours:
- Set the X-AppEngine-FailFast header on tasks that are not latency sensitive. This header instructs the scheduler to immediately fail the request if an existing instance is not available. Task Queue will retry and back off until an existing instance becomes available to service the request. Note, however, that when requests with X-AppEngine-FailFast set occupy existing instances, requests without that header set may still cause new instances to be started.
- Configure your Task Queue's settings.
  - If you set the "rate" parameter to a lower value, Task Queue will execute your tasks at a slower rate.
  - If you set the "max_concurrent_requests" parameter to a lower value, fewer tasks will be executed simultaneously.
- Use backends in order to completely control the number of instances used for task execution. You can use push queues with dynamic backends or pull queues with resident backends.
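The fail-fast suggestion above can be sketched as a small helper that decides, per task, whether to attach the header. The helper name `build_task_options`, the example URLs, and the header value `'true'` are assumptions; the text above names only the header itself.

```python
FAIL_FAST_HEADER = 'X-AppEngine-FailFast'


def build_task_options(url, latency_sensitive, params=None):
    """Build keyword arguments for enqueueing a push task.

    Tasks that are not latency sensitive get the fail-fast header, so the
    scheduler fails the request instead of starting a new instance and
    Task Queue retries with back-off.
    """
    options = {'url': url, 'params': dict(params or {})}
    if not latency_sensitive:
        # Assumption: the value 'true'; the source names only the header.
        options['headers'] = {FAIL_FAST_HEADER: 'true'}
    return options


# Hypothetical task URLs, used only for illustration.
report_task = build_task_options('/tasks/nightly-report',
                                 latency_sensitive=False)
receipt_task = build_task_options('/tasks/send-receipt',
                                  latency_sensitive=True,
                                  params={'order': '42'})
```

On App Engine, a dictionary like this would typically be expanded into the enqueue call, for example `taskqueue.add(**report_task)`.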
Serve static content where possible
Static content serving in Python is handled by specialized App Engine infrastructure, which does not consume Instance Hours. If you need to set custom headers, use the Blobstore API. The actual serving of the Blob response does not consume Instance Hours.
Managing application storage
App Engine calculates storage costs based on the size of entities in the Cloud Datastore, the size of Cloud Datastore indexes, the size of tasks in the task queue, and the amount of data stored in Blobstore. Here are some things you can do to make sure you don't store more data than necessary:
- Delete any entities or blobs your application no longer needs.
- Remove any unnecessary indexes, as discussed in the Managing Cloud Datastore usage section below, to reduce index storage costs.
Managing Cloud Datastore usage
App Engine accounts for the number of operations performed in Cloud Datastore. Here are a few strategies that can result in reduced Cloud Datastore resource consumption, as well as lower latency for requests to Cloud Datastore:
- The GCP Console dataviewer displays the number of write ops that were required to create every entity in your local Cloud Datastore. You can use this information to understand the cost of writing each entity. See Understanding Write Costs for information on how to interpret this data.
- Remove any unnecessary indexes, which will reduce storage and entity write costs. Use the "Get Indexes" functionality to see what indexes are defined on your application. You can see what indexes are currently serving for your application in the GCP Console Search page.
- When designing your data model, you might be able to write your queries in a way that avoids custom indexes altogether. Read the Queries and Indexes documentation for more information on how App Engine generates indexes.
- Whenever possible, replace indexed properties (which are the default) with unindexed properties, which reduces the number of Cloud Datastore write operations when you put an entity. Caution: if you later decide that you do need to query on the unindexed property, you will need not only to modify your code to use indexed properties again, but also to run a map reduce over all entities to reput them.
- Due to Cloud Datastore query planner improvements in App Engine 1.5.2 and 1.5.3 releases, your queries may now require fewer indexes than they did previously. While you may still choose to keep certain custom indexes for performance reasons, you may be able to delete others, reducing storage and entity write costs.
- Reconfigure your data model so that you can replace queries with fetches by key, which are cheaper and more efficient.
- Use keys-only queries instead of entity queries when possible.
- To decrease latency, replace multiple entity get()s with a batch get().
- Use Cloud Datastore cursors for pagination rather than offset.
- Parallelize multiple Cloud Datastore RPCs via the async datastore API.
Note: Small Cloud Datastore operations include calls to allocate Cloud Datastore ids or keys-only queries. See the pricing page for more information on costs.
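The advice to paginate with cursors rather than offsets can be illustrated with a plain-Python simulation. It does not call Cloud Datastore: `ROWS`, `page_by_offset`, and `page_by_cursor` are hypothetical stand-ins, and real cursors are opaque values returned by the query API (in ndb, for instance, by `fetch_page`). The sketch only shows how much work each approach repeats.

```python
import bisect

ROWS = list(range(1, 1001))  # stand-in for an index sorted by key


def page_by_offset(offset, limit):
    """Offset paging: the skipped rows are still scanned, then discarded."""
    scanned = ROWS[:offset + limit]
    return scanned[offset:], len(scanned)


def page_by_cursor(cursor, limit):
    """Cursor paging: resume right after the last key already returned."""
    start = 0 if cursor is None else bisect.bisect_right(ROWS, cursor)
    page = ROWS[start:start + limit]
    next_cursor = page[-1] if page else cursor
    return page, next_cursor, len(page)


# Fetch the 10th page of 20 rows both ways.
offset_page, offset_scanned = page_by_offset(offset=180, limit=20)

cursor, cursor_scanned = None, 0
for _ in range(10):
    cursor_page, cursor, cursor_scanned = page_by_cursor(cursor, 20)
```

Both approaches return the same page, but in this simulation the offset version touches every row before it, while each cursor step touches only the rows it returns.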
For Outgoing Bandwidth, one way to reduce usage is to set the appropriate Cache-Control header on your responses whenever possible and to set reasonable expiration times for static files. Using public Cache-Control headers in this way allows proxy servers and your clients' browsers to cache responses for the designated period of time.
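As a small sketch of the header itself, the helper below builds a Cache-Control value from a lifetime in seconds. The function name `cache_control_header` is hypothetical; the `public`, `private`, and `max-age` directives are standard HTTP.

```python
def cache_control_header(max_age_seconds, public=True):
    """Return a (name, value) pair for a Cache-Control response header."""
    if max_age_seconds < 0:
        raise ValueError('max_age_seconds must be non-negative')
    visibility = 'public' if public else 'private'
    return ('Cache-Control', '%s, max-age=%d' % (visibility, max_age_seconds))


# Allow shared caches and browsers to reuse the response for one hour.
name, value = cache_control_header(3600)
```

In a webapp2 handler, the pair could be applied with `self.response.headers[name] = value`. Responses containing user-specific data would normally use `public=False` so that shared proxies do not cache them.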
Incoming Bandwidth is more difficult to control, since it is the amount of data your users are sending to your app. However, this is a good opportunity to mention the DoS Protection Service, which allows you to block traffic from IPs that you consider abusive.