Memcache Overview

This page provides an overview of the App Engine memcache service. High-performance, scalable web applications often use a distributed in-memory data cache in front of, or in place of, robust persistent storage for some tasks. App Engine includes a memory cache service for this purpose. To learn how to configure, monitor, and use the memcache service, read Using Memcache.

When to use a memory cache

One use of a memory cache is to speed up common datastore queries. If many requests make the same query with the same parameters, and changes to the results do not need to appear on the web site right away, the application can cache the results in the memcache. Subsequent requests can check the memcache, and only perform the datastore query if the results are absent or expired. Session data, user preferences, and other data returned by queries for web pages are good candidates for caching.
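The check-then-query flow described above can be sketched as follows. This is a minimal illustration, not App Engine code: the `HashMap` is a hypothetical stand-in for memcache, and `datastoreQuery` is a hypothetical stand-in for the real datastore query. In an application, the map would be replaced by the memcache service.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Cache-aside lookup: check the cache first, run the query only on a miss. */
class CacheAsideSketch {
    // Hypothetical stand-in for memcache; a real cache can drop entries at any time.
    private final Map<String, String> cache = new HashMap<>();
    // Hypothetical stand-in for the expensive datastore query.
    private final Function<String, String> datastoreQuery;
    // Counts how often the slow path ran (for demonstration only).
    int queryCount = 0;

    CacheAsideSketch(Function<String, String> datastoreQuery) {
        this.datastoreQuery = datastoreQuery;
    }

    String lookup(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit: no datastore query needed
        }
        queryCount++;
        String fresh = datastoreQuery.apply(key); // miss: run the query
        if (fresh != null) {
            cache.put(key, fresh); // populate the cache for subsequent requests
        }
        return fresh;
    }

    /** Simulates an entry being evicted or expiring. */
    void evict(String key) {
        cache.remove(key);
    }
}
```

Because the miss path always falls back to the query, the result is the same whether or not the entry is still cached; only the cost differs.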

Memcache can be useful for other temporary values. However, when considering whether to store a value solely in the memcache, without backing it with other persistent storage, be sure that your application behaves acceptably when the value is suddenly not available. Values can expire from the memcache at any time, including before the expiration deadline set for the value. For example, if the sudden absence of a user's session data would cause the session to malfunction, that data should probably be stored in the datastore in addition to the memcache.

Service levels

App Engine supports two levels of the memcache service:

  • Shared memcache is the free default for App Engine applications. It provides cache capacity on a best-effort basis and is subject to the overall demand of all the App Engine applications using the shared memcache service.

  • Dedicated memcache provides a fixed cache capacity assigned exclusively to your application. It's billed by the GB-hour of cache size and requires billing to be enabled. Having control over cache size means your app can perform more predictably and with fewer reads from more costly durable storage.

Both memcache service levels use the same API. To configure the memcache service for your application, see Using Memcache.

The following table summarizes the differences between the two memcache service levels:

Feature | Dedicated Memcache | Shared Memcache
Price | $0.06 per GB per hour | Free
Capacity | 1 to 100GB; other regions: 1 to 20GB | No guaranteed capacity
Performance | Up to 10k reads or 5k writes (exclusive) per second per GB (items < 1KB). For more details, see Cache statistics. | Not guaranteed
Durable store | No | No
SLA | None | None

Dedicated memcache is billed in 15-minute increments. When charging in local currency, Google converts the listed prices into the applicable local currency using the conversion rates published by leading financial institutions.

If your app needs more memcache capacity, contact


Limits

The following limits apply to the use of the memcache service:

  • The maximum size of a cached data value is 1 MiB (2^20 bytes) minus the size of the key minus an implementation-dependent overhead, which is approximately 73 bytes.
  • A key cannot be larger than 250 bytes. In the Java runtime, keys that are objects or strings longer than 250 bytes will be hashed. (Other runtimes behave differently.)
  • The "multi" batch operations can have any number of elements. The total size of the call and the total size of the data fetched must not exceed 32 megabytes.
  • A memcache key cannot contain a null byte.
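The size and key limits above can be checked before a value is stored. The following is a minimal sketch: the class and method names are hypothetical, the 73-byte overhead is the approximate figure quoted above, and key length is measured in UTF-8 bytes as an assumption. The 250-byte check is stricter than the Java runtime requires, because that runtime hashes longer keys.

```java
import java.nio.charset.StandardCharsets;

/** Pre-flight checks for the memcache limits listed above (hypothetical helper). */
class MemcacheLimitsSketch {
    static final int MAX_ENTRY_BYTES = 1 << 20;  // 1 MiB (2^20 bytes)
    static final int APPROX_OVERHEAD_BYTES = 73; // approximate, implementation-dependent
    static final int MAX_KEY_BYTES = 250;

    /** Largest value, in bytes, expected to fit alongside a key of the given size. */
    static int maxValueBytes(int keyBytes) {
        return MAX_ENTRY_BYTES - keyBytes - APPROX_OVERHEAD_BYTES;
    }

    /** True if the key is at most 250 bytes (UTF-8 assumed) and contains no null byte. */
    static boolean isValidKey(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > MAX_KEY_BYTES) {
            return false;
        }
        for (byte b : bytes) {
            if (b == 0) {
                return false;
            }
        }
        return true;
    }

    /** True if the key is valid and the value fits within the per-item limit. */
    static boolean fits(String key, byte[] value) {
        int keyBytes = key.getBytes(StandardCharsets.UTF_8).length;
        return isValidKey(key) && value.length <= maxValueBytes(keyBytes);
    }
}
```

For a 10-byte key, `maxValueBytes` returns 1,048,493 bytes. Because the overhead is approximate, values very close to that bound can still be rejected by the service.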

Available APIs

App Engine memcache supports two interfaces: a low-level Memcache API and the JCache specification. The following sections provide more information about each interface.

Low-level API

The low-level Memcache API supports more functionality than JCache. For example, it can:

  • Atomically increment and decrement integer counter values.
  • Expose more cache statistics, such as the amount of time since the least-recently-used entry was accessed and the total size of all items in the cache.
  • Conditionally store data using check-and-set operations.
  • Perform memcache operations asynchronously, using the AsyncMemcacheService.

The low-level API provides MemcacheService and AsyncMemcacheService for accessing the memcache service.
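The idea behind a check-and-set operation can be sketched with standard library classes. This is an illustration of the technique only: the `ConcurrentHashMap` is a hypothetical stand-in for the cache, and its `replace(key, oldValue, newValue)` call plays the role of the conditional store. In the low-level API, the equivalent operations are understood to be `getIdentifiable` and `putIfUntouched` on MemcacheService; check the API reference before relying on them.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.UnaryOperator;

/** Check-and-set update loop against a stand-in cache (hypothetical helper). */
class CheckAndSetSketch {
    // Hypothetical stand-in for the cache.
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    void put(String key, String value) {
        cache.put(key, value);
    }

    String get(String key) {
        return cache.get(key);
    }

    /**
     * Stores update(current) only if the entry is unchanged since it was read.
     * Retries up to maxAttempts times; returns true if the store succeeded.
     */
    boolean updateWithRetry(String key, UnaryOperator<String> update, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String seen = cache.get(key);
            if (seen == null) {
                return false; // entry is missing (never set or evicted): caller must rebuild it
            }
            String next = update.apply(seen);
            if (cache.replace(key, seen, next)) {
                return true; // nobody changed the entry in between
            }
            // Another writer changed the entry; read it again and retry.
        }
        return false;
    }
}
```

Bounding the number of attempts keeps a heavily contended key from stalling a request indefinitely.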

See Memcache Examples for synchronous and asynchronous usage examples of the low-level Memcache API.


JCache

The App Engine Java SDK supports the JCache API. JCache provides a map-like interface to cached data. You store and retrieve values in the cache using keys. Keys and values can be of any Serializable type or class. For more details, see Using Memcache.

JCache features not supported

The App Engine implementation of JCache has the following limitations:

  • The JCache listener API is only partially supported. Listeners that can execute during the processing of an app's API call, such as onPut and onRemove listeners, are supported. Listeners that require background processing, like onEvict, are not supported.
  • An app can test whether the cache contains a given key, but it cannot test whether the cache contains a given value (containsValue() is not supported).
  • An app cannot dump the contents of the cache's keys or values.
  • An app cannot manually reset cache statistics.
  • The put() method does not return the previous known value for a key. It always returns null.

How cached data expires

Memcache contains key/value pairs. The pairs in memory at any time change as items are written to and retrieved from the cache.

By default, values stored in memcache are retained as long as possible. Values can be evicted from the cache when a new value is added to the cache and the cache is low on memory. When values are evicted due to memory pressure, the least recently used values are evicted first.
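Least-recently-used eviction can be illustrated with a small standard-library sketch. This is not how the service is implemented: the class name is hypothetical, and the sketch evicts by entry count, whereas memcache evicts under memory pressure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** A tiny LRU map: when full, adding an entry evicts the least recently used one. */
class LruEvictionSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries; // assumption: capacity counted in entries, not bytes

    LruEvictionSketch(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true drops the eldest entry.
        return size() > maxEntries;
    }
}
```

With a capacity of two, storing "a" and "b", reading "a", and then storing "c" evicts "b", because "a" was used more recently.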

The app can provide an expiration time when a value is stored, as either a number of seconds relative to when the value is added, or as an absolute Unix epoch time in the future (a number of seconds from midnight January 1, 1970). The value is evicted no later than this time, though it can be evicted earlier for other reasons.

Under rare circumstances, values can also disappear from the cache prior to expiration for reasons other than memory pressure. While memcache is resilient to server failures, memcache values are not saved to disk, so a service failure can cause values to become unavailable.

In general, an application should not expect a cached value to always be available.

You can erase an application's entire cache via the API or in the memcache section of the Google Cloud Platform Console.

Cache statistics

Operations per second by item size

Dedicated memcache is rated in operations per second per GB, where an operation is defined as an individual cache item access, such as a get, set, or delete. The operation rate varies by item size approximately according to the following table. Exceeding these ratings might result in increased API latency or errors.

The following tables provide the maximum number of sustained, exclusive get-hit or set operations per GB of cache. Note that a get-hit operation is a get call that finds that there is a value stored with the specified key, and returns that value.

Item size (KB) | Maximum get-hit ops/s

Item size (KB) | Maximum set ops/s

An app configured for multiple GB of cache can in theory achieve an aggregate operation rate computed as the number of GB multiplied by the per-GB rate. For example, an app configured for 5GB of cache could reach 50,000 memcache operations/sec on 1KB items. Achieving this level requires a good distribution of load across the memcache keyspace, as described in Best Practices for App Engine Memcache.

The limits listed above apply to workloads that consist only of reads or only of writes. For simultaneous reads and writes, the limits are on a sliding scale: the more reads being performed, the fewer writes can be performed, and vice versa. The following are example IOPs limits for simultaneous reads and writes of 1KB values per 1GB of cache:

Read IOPs | Write IOPs
10000 | 0
8000 | 1000
5000 | 2500
1000 | 4500
0 | 5000
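Every row of the table above satisfies reads/10,000 + writes/5,000 = 1, so the sliding scale appears to be linear. The following sketch applies that relationship to estimate how much of the rated capacity a mixed workload of 1KB items uses. The class and method names are hypothetical, and treating points between the listed rows as linear is an assumption drawn from the table rather than a documented guarantee.

```java
/** Estimates rated-capacity use for mixed reads and writes of 1KB items. */
class MixedIopsSketch {
    static final double MAX_READS_PER_GB = 10_000.0; // read-only limit per GB
    static final double MAX_WRITES_PER_GB = 5_000.0; // write-only limit per GB

    /** Fraction of rated capacity in use; 1.0 means the workload is at the limit. */
    static double utilization(double readsPerSec, double writesPerSec, double cacheGb) {
        double perGb = readsPerSec / MAX_READS_PER_GB + writesPerSec / MAX_WRITES_PER_GB;
        return perGb / cacheGb;
    }

    /** Writes per second still available at the given read rate (assumes linearity). */
    static double maxWrites(double readsPerSec, double cacheGb) {
        double remaining = 1.0 - readsPerSec / (MAX_READS_PER_GB * cacheGb);
        return Math.max(0.0, remaining) * MAX_WRITES_PER_GB * cacheGb;
    }
}
```

For example, `maxWrites(5000, 1)` gives 2,500 writes per second, which matches the third row of the table, and `utilization(50000, 0, 5)` gives 1.0, which matches the 5GB example earlier in this section.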

Memcache compute units (MCU)

Memcache throughput can vary depending on the size of the item you are accessing and the operation you want to perform on the item. You can roughly associate a cost with operations and estimate the traffic capacity that you can expect from dedicated memcache using a unit called Memcache Compute Unit (MCU). MCU is defined such that you can expect 10,000 MCU per second per GB of dedicated memcache. The Google Cloud Platform Console shows how much MCU your app is currently using.

Note that MCU is a rough statistical estimate and is not a linear unit. Each cache operation that reads or writes a value has an MCU cost that depends on the size of the value. The MCU cost of a set is 2 times the cost of a successful get-hit on a value of the same size.

Value item size (KB) | MCU cost for get-hit | MCU cost for set
≤1 | 1.0 | 2.0
2 | 1.3 | 2.6
10 | 1.7 | 3.4
100 | 5.0 | 10.0
512 | 20.0 | 40.0
1024 | 50.0 | 100.0

Operations that do not read or write a value have a fixed MCU cost:

Operation | MCU
get-miss | 1.0
delete | 2.0
increment | 2.0
flush | 100.0
stats | 100.0

Note that a get-miss operation is a get that finds that there is no value stored with the specified key.
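The MCU tables can be turned into a rough capacity estimate. The sketch below looks up the get-hit cost for an item size, doubles it for a set, and divides the total by the 10,000 MCU per second per GB rating. The class and method names are hypothetical, and rounding item sizes up to the next table row is a conservative assumption, since the tables do not state costs for sizes in between. Because MCU is a rough, non-linear estimate, treat the result as a starting point only.

```java
/** Rough dedicated-memcache sizing from the MCU cost tables (hypothetical helper). */
class McuEstimateSketch {
    // Rows from the get-hit table: item size in KB and the matching MCU cost.
    private static final int[] SIZE_KB = {1, 2, 10, 100, 512, 1024};
    private static final double[] GET_HIT_MCU = {1.0, 1.3, 1.7, 5.0, 20.0, 50.0};
    static final double MCU_PER_SEC_PER_GB = 10_000.0;

    /** MCU cost of one get-hit; sizes between rows round up to the next row (assumption). */
    static double getHitMcu(int itemKb) {
        for (int i = 0; i < SIZE_KB.length; i++) {
            if (itemKb <= SIZE_KB[i]) {
                return GET_HIT_MCU[i];
            }
        }
        throw new IllegalArgumentException("item larger than 1024 KB: " + itemKb);
    }

    /** MCU cost of one set: twice the get-hit cost for the same item size. */
    static double setMcu(int itemKb) {
        return 2.0 * getHitMcu(itemKb);
    }

    /** Estimated GB of dedicated memcache needed for the given sustained traffic. */
    static double requiredGb(double getHitsPerSec, double setsPerSec, int itemKb) {
        double mcuPerSec = getHitsPerSec * getHitMcu(itemKb) + setsPerSec * setMcu(itemKb);
        return mcuPerSec / MCU_PER_SEC_PER_GB;
    }
}
```

For example, 20,000 get-hits and 5,000 sets per second on 1KB items cost 20,000 + 10,000 = 30,000 MCU per second, or about 3GB of dedicated memcache.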

Best practices

Following are some best practices for using memcache:

  • Handle memcache API failures gracefully. Memcache operations can fail for various reasons. Applications should be designed to catch failed operations without exposing these errors to end users. This guidance applies especially to Set operations.

  • Use the batching capability of the API when possible, especially for small items. Doing so increases the performance and efficiency of your app.

  • Distribute load across your memcache keyspace. If a single item or a small set of items accounts for a disproportionate share of traffic, your app will not scale well. This guidance applies to both operations/sec and bandwidth. You can often alleviate this problem by explicitly sharding your data.

    For example, you can split a frequently updated counter among several keys, reading them back and summing only when you need a total. Likewise, you can split a 500 KB piece of data that must be read on every HTTP request across multiple keys and read them back using a single batch API call. (Even better would be to cache the value in instance memory.) For dedicated memcache, the peak access rate on a single key should be 1-2 orders of magnitude less than the per-GB rating.
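The sharded counter described above can be sketched as follows. The `ConcurrentHashMap` is a hypothetical stand-in for memcache, and the class name and key format are illustrative. With the real service, each shard would be a separate memcache key updated with an atomic increment and read back with a batch get.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ThreadLocalRandom;

/** Spreads counter updates over several keys and sums them on read. */
class ShardedCounterSketch {
    // Hypothetical stand-in for memcache.
    private final ConcurrentMap<String, Long> cache = new ConcurrentHashMap<>();
    private final String name;
    private final int shards;

    ShardedCounterSketch(String name, int shards) {
        this.name = name;
        this.shards = shards;
    }

    // Illustrative key format; any scheme that yields distinct keys works.
    private String shardKey(int index) {
        return name + ":shard:" + index;
    }

    /** Increments one randomly chosen shard so that no single key takes every write. */
    void increment() {
        int shard = ThreadLocalRandom.current().nextInt(shards);
        cache.merge(shardKey(shard), 1L, Long::sum);
    }

    /** Reads every shard and sums them; missing shards count as zero. */
    long total() {
        long sum = 0;
        for (int i = 0; i < shards; i++) {
            sum += cache.getOrDefault(shardKey(i), 0L);
        }
        return sum;
    }
}
```

Because any shard can be evicted independently, a total read from memcache alone can undercount. A counter that must be exact needs a persistent copy as well.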

For more details and more best practices for concurrency, performance, and migration, including sharing memcache between different programming languages, read the article Best Practices for App Engine Memcache.

What's next

  • Learn how to configure, monitor, and use memcache in Using Memcache.