App Engine generates usage reports to help you understand your application's performance and the resources your application is using. Based on these reports, you can employ the strategies listed below to manage your resources. After making any changes, you will see that information reflected in subsequent usage reports. For more information, please see our pricing page.
Viewing usage reports
When you evaluate the performance of your application, you will be interested in how many instances it is running, and how it is consuming resources.
The following sections suggest some strategies you can use to manage resources, and explain what these strategies could mean for your application's performance.
Managing dynamic instance scaling
Application latency has an impact on how many instances are required to handle your traffic. By decreasing latency, you can reduce the number of instances we use to serve your application. Stackdriver Trace is a useful tool that you can use to view data about latency and understand what changes you can make to decrease it.
After you have used Stackdriver Trace to view your latency, you can try some of the following strategies to reduce it:
- Increase caching of frequently accessed shared data - In other words, use Memcache. Also, setting your application's cache-control headers can have a big impact on how efficiently your data is cached by servers and browsers. Even caching responses for a few seconds can have a big impact on how efficiently your application serves traffic. Python applications should also make use of caching in the runtime.
- Use Memcache more efficiently - Use batch calls for get, set, delete, etc instead of a series of individual calls. Where appropriate, consider using the Memcache Async API (Java, Python).
- Use Tasks for non-request bound functionality - If your application performs work that can be done outside of the scope of a user-facing request, put it in a task! Sending this work to the Task Queue instead of waiting for it to complete before returning a response can significantly reduce user-facing latency. The Task Queue also gives you much more control over execution rates and can help smooth out your load.
- Use the datastore more efficiently - See Managing datastore usage below for details.
- Parallelize multiple URL Fetch calls
- For Java HTTP sessions, write asynchronously - HTTP sessions let you configure your application to asynchronously write HTTP session data to the datastore by adding <async-session-persistence enabled="true"/> to your appengine-web.xml. Session data is always written synchronously to memcache, and if a request tries to read the session data when memcache is not available it will fail over to the datastore, which might not yet have the most recent update. This means there is a small risk your application will see stale session data, but for most applications the latency benefit far outweighs the risk.
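As a sketch of the batch-call advice above, a Python handler can fetch many memcache keys in a single RPC instead of looping over individual get() calls (the load_profiles_from_datastore helper and key format are hypothetical):

```python
# Sketch, assuming the App Engine Python runtime.
from google.appengine.api import memcache

def get_profiles(user_ids):
    keys = ['profile:%s' % uid for uid in user_ids]
    # One RPC for all keys instead of one get() per key.
    cached = memcache.get_multi(keys)
    missing = [uid for uid in user_ids
               if 'profile:%s' % uid not in cached]
    if missing:
        fetched = load_profiles_from_datastore(missing)  # hypothetical helper
        # One RPC to store everything that was missing.
        memcache.set_multi(
            dict(('profile:%s' % uid, p) for uid, p in fetched.items()),
            time=60)  # even a short TTL helps absorb bursts of traffic
        for uid, p in fetched.items():
            cached['profile:%s' % uid] = p
    return cached
```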
Change auto scaling performance settings
The module configuration file contains two settings you can use to adjust the tradeoff between performance and resource load:
- Max Idle Instances - Setting Max Idle Instances allows App Engine to shut down idle instances above the specified limit so they won't consume additional quota or create charges. However, fewer idle instances also means that the App Engine Scheduler may have to spin up new instances if you experience a spike in traffic -- potentially increasing user-visible latency for your app.
- Min Pending Latency - Raising Min Pending Latency instructs App Engine’s scheduler to not start a new instance unless a request has been pending for more than the specified time. If all instances are busy, user-facing requests may have to wait in the pending queue until this threshold is reached. Setting a high value for this setting will require fewer instances to be started, but may result in high user-visible latency during increased load.
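In a module configuration file, these two settings might look roughly like the following sketch (the values shown are illustrative, not recommendations):

```yaml
# Module .yaml file (illustrative values)
automatic_scaling:
  max_idle_instances: 2        # cap how many idle instances are kept warm
  min_pending_latency: 500ms   # wait this long before starting a new instance
```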
Enable concurrent requests in Java
Your application's instances can serve multiple requests concurrently in Java. Enabling this setting will decrease the number of instances needed to serve traffic for your application, but your application must be threadsafe in order for this to work correctly. Read about how to use concurrent requests by enabling threadsafe in your appengine-web.xml file.
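For Java, the flag is a single element in appengine-web.xml, roughly as in this sketch:

```xml
<!-- appengine-web.xml (sketch) -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <threadsafe>true</threadsafe>
</appengine-web-app>
```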
Enable concurrent requests in Python
Your application's instances can serve multiple requests concurrently in Python. Enabling this setting will decrease the number of instances needed to serve traffic for your application, but your application must be threadsafe in order for this to work correctly. Read about how to use concurrent requests by enabling threadsafe in your app.yaml file.
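For Python, the equivalent is a single line in app.yaml:

```yaml
# app.yaml (sketch)
threadsafe: true
```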
Configuring TaskQueue settings
The default settings for the Task Queue are tuned for performance. With these defaults, when you put several tasks into a queue simultaneously, they will likely cause new Frontend Instances to spin up. Here are some suggestions for how to tune the Task Queue to conserve Instance Hours:
- Set the X-AppEngine-FailFast header on tasks that are not latency sensitive. This header instructs the Scheduler to immediately fail the request if an existing instance is not available. The Task Queue will retry and back-off until an existing instance becomes available to service the request. However, it is important to note that when requests with X-AppEngine-FailFast set occupy existing instances, requests without that header set may still cause new instances to be started.
- Configure your Task Queue's settings (Java, Python).
- If you set the "rate" parameter to a lower value, Task Queue will execute your tasks at a slower rate.
- If you set the "max_concurrent_requests" parameter to a lower value, fewer tasks will be executed simultaneously.
- Use backends (Java, Python) to completely control the number of instances used for task execution. You can use push queues with dynamic backends, or pull queues with resident backends.
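The rate and concurrency settings above live in queue.yaml. A sketch with illustrative values (queue name and numbers are made up):

```yaml
# queue.yaml (illustrative values)
queue:
- name: background-work
  rate: 1/s                   # execute tasks at a slower rate
  max_concurrent_requests: 2  # run fewer tasks simultaneously
```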
Serve static content where possible
Static content serving (Java, Python) is handled by specialized App Engine infrastructure, which does not consume Instance Hours. If you need to set custom headers, use the Blobstore API (Java, Python, Go). The actual serving of the Blob response does not consume Instance Hours.
Managing application storage
App Engine calculates storage costs based on the size of entities in the datastore, the size of datastore indexes, the size of tasks in the task queue, and the amount of data stored in Blobstore. Here are some things you can do to make sure you don't store more data than necessary:
- Delete any entities or blobs your application no longer needs.
- Remove any unnecessary indexes, as discussed in the Managing Datastore Usage section below, to reduce index storage costs.
Managing datastore usage
App Engine accounts for the number of operations performed in the Datastore. Here are a few strategies that can result in reduced Datastore resource consumption, as well as lower latency for requests to the datastore:
- The GCP Console dataviewer displays the number of write ops that were required to create every entity in your local datastore. You can use this information to understand the cost of writing each entity. See Understanding Write Costs for information on how to interpret this data.
- Remove any unnecessary indexes, which will reduce storage and entity write costs. Use the "Get Indexes" functionality (Java, Python) to see what indexes are defined on your application. You can see what indexes are currently serving for your application in the GCP Console Search page.
- When designing your data model, you might be able to write your queries in such a way so as to avoid custom indexes altogether. Read our Queries and Indexes documentation for more information on how App Engine generates indexes.
- Whenever possible, replace indexed properties (which are the default) with unindexed properties (Java, Python), which reduces the number of datastore write operations when you put an entity. Caution: if you later decide that you do need to query on the unindexed property, you will need to not only modify your code to use indexed properties again, but also run a map reduce over all entities to re-put them.
- Due to the datastore query planner improvements in the App Engine 1.5.2 and 1.5.3 releases, your queries may now require fewer indexes than they did previously. While you may still choose to keep certain custom indexes for performance reasons, you may be able to delete others, reducing storage and entity write costs.
- Reconfigure your data model so that you can replace queries with fetch by key (Java, Python, Go), which is cheaper and more efficient.
- Use keys-only queries instead of entity queries when possible.
- To decrease latency, replace multiple entity get()s with a single batch get.
- Use datastore cursors for pagination rather than offset.
- Parallelize multiple datastore RPCs via the async datastore API (Java, Python), or goroutines (Go).
Note: Small datastore operations include calls to allocate datastore ids or keys-only queries. See the pricing page for more information on costs.
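Several of the strategies above can be sketched with the Python ndb client; the model and query shapes here are hypothetical, not a prescribed design:

```python
# Sketch, assuming the App Engine Python runtime with ndb.
from google.appengine.ext import ndb

class LogEntry(ndb.Model):
    # Unindexed properties cost fewer write ops, but cannot be queried on.
    payload = ndb.TextProperty()                       # always unindexed
    severity = ndb.StringProperty(indexed=False)       # explicitly unindexed
    created = ndb.DateTimeProperty(auto_now_add=True)  # indexed, for ordering

def recent_keys(limit=100):
    # Keys-only query: billed as a small operation, not a full entity read.
    return LogEntry.query().order(-LogEntry.created).fetch(
        limit, keys_only=True)

def recent_entries(limit=100):
    # Batch fetch by key instead of one get() per entity.
    return ndb.get_multi(recent_keys(limit))

def page_of_entries(cursor=None, size=20):
    # Cursor-based pagination avoids the cost of large offsets.
    entries, next_cursor, more = (
        LogEntry.query().order(-LogEntry.created)
        .fetch_page(size, start_cursor=cursor))
    return entries, next_cursor, more
```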
Managing bandwidth
For Outgoing Bandwidth, one way to reduce usage is to, whenever possible, set the Cache-Control header on your responses and set reasonable expiration times for static files. Using public Cache-Control headers in this way will allow proxy servers and your clients' browsers to cache responses for the designated period of time.
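For static files, caching can be configured declaratively in app.yaml; a sketch with an illustrative expiration value:

```yaml
# app.yaml (illustrative)
handlers:
- url: /static
  static_dir: static
  expiration: "1d"   # sets caching headers on static responses
```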
Incoming Bandwidth is more difficult to control, since that's the amount of data your users are sending to your app. However, this is a good opportunity to mention our DoS Protection Service for Python and Java, which allows you to block traffic from IPs that you consider abusive.
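As a sketch, a DoS Protection configuration file (dos.yaml) blocking abusive sources might look like this (the addresses are made up):

```yaml
# dos.yaml (illustrative)
blacklist:
- subnet: 192.0.2.0/24
  description: abusive crawler
- subnet: 198.51.100.1
  description: single abusive client
```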
Managing other resources
The last items on the report are the usages for the Email API. For this API, your best bet is to make sure you are using it effectively. One of the best strategies for auditing your usage of the API is to use Appstats (Python, Java) to make sure you're not making more calls than necessary. It's also always a good idea to check your error rates and look out for any invalid calls you might be making; in some cases it might be possible to catch those calls early.