Cloud Datastore best practices

You can use the best practices listed here as a quick reference of what to keep in mind when building an application that uses Datastore. If you are just starting out with Datastore, this page might not be the best place to start, because it does not teach you the basics of how to use Datastore. If you are a new user, we suggest that you start with Getting Started with Datastore.

  • Always use UTF-8 characters for namespace names, kind names, property names, and custom key names. Non-UTF-8 characters used in these names can interfere with Datastore functionality. For example, a non-UTF-8 character in a property name can prevent creation of an index that uses the property.
  • Do not use a forward slash (/) in kind names or custom key names. Forward slashes in these names could interfere with future functionality.
  • Avoid storing sensitive information in a Cloud Project ID. A Cloud Project ID might be retained beyond the life of your project.
  • As a data compliance best practice, we recommend not storing sensitive information in Datastore entity names or entity property names.

API calls

  • Use batch operations for your reads, writes, and deletes instead of single operations. Batch operations are more efficient because they perform multiple operations with the same overhead as a single operation.
  • If a transaction fails, try to roll back the transaction. The rollback minimizes retry latency for a different request contending for the same resource(s) in a transaction. Note that a rollback itself might fail, so the rollback should be a best-effort attempt only.
  • Use asynchronous calls where available instead of synchronous calls. Asynchronous calls minimize latency impact. For example, consider an application that needs the result of a synchronous lookup() and the results of a query before it can render a response. If the lookup() and the query do not have a data dependency, there is no need to synchronously wait until the lookup() completes before initiating the query.
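The lookup-and-query example above can be sketched in Python with asyncio. This is a minimal illustration, not the Datastore client API: lookup() and run_query() are hypothetical stand-ins that only simulate latency, and the key and kind names are made up.

```python
import asyncio


async def lookup(key):
    # Hypothetical stand-in for an asynchronous Datastore lookup().
    await asyncio.sleep(0.05)
    return {"key": key, "name": "example"}


async def run_query(kind):
    # Hypothetical stand-in for an asynchronous Datastore query.
    await asyncio.sleep(0.05)
    return [{"kind": kind, "id": i} for i in range(1, 4)]


async def build_response(key, kind):
    # Neither call depends on the other's result, so start both at once
    # and wait for the pair instead of waiting for them one after another.
    entity, results = await asyncio.gather(lookup(key), run_query(kind))
    return {"entity": entity, "results": results}


response = asyncio.run(build_response("customer-key", "Order"))
print(response)
```

With the two calls overlapped, the total wait is roughly that of the slower call rather than the sum of both.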

Entities

  • Group highly related data in entity groups. Entity groups enable ancestor queries, which return strongly consistent results. Ancestor queries also rapidly scan an entity group with minimal I/O because the entities in an entity group are stored physically close together on Datastore servers.
  • Avoid writing to an entity group more than once per second. Writing at a sustained rate above that limit increases the delay before eventually consistent reads reflect your writes, leads to timeouts for strongly consistent reads, and results in slower overall performance of your application. A batch or transactional write to an entity group counts as only a single write against this limit.
  • Do not include the same entity (by key) multiple times in the same commit. Including the same entity multiple times in the same commit could impact Datastore latency.
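One way to avoid committing the same entity twice is to collapse pending writes by key before building the commit. The sketch below assumes pending writes are held as (key, properties) pairs, with plain tuples standing in for Datastore keys; dedupe_by_key is a hypothetical helper, not part of any client library.

```python
def dedupe_by_key(entities):
    """Keep only the last pending version of each entity, keyed by its key."""
    latest = {}
    for key, properties in entities:
        # A later write to the same key replaces the earlier one. Re-inserting
        # after a pop keeps the dict ordered by each key's last occurrence.
        latest.pop(key, None)
        latest[key] = properties
    return list(latest.items())


pending = [
    (("Task", 1), {"done": False}),
    (("Task", 2), {"done": False}),
    (("Task", 1), {"done": True}),
]
commit_batch = dedupe_by_key(pending)
print(commit_batch)
```

Here the two writes to ("Task", 1) collapse into the most recent one, so each key appears once in the batch.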

Keys

  • Key names are autogenerated if not provided at entity creation. They are allocated so as to be evenly distributed in the keyspace.
  • For a key that uses a custom name, always use UTF-8 characters except a forward slash (/). Non-UTF-8 characters interfere with various processes such as importing a Datastore backup into Google BigQuery. A forward slash could interfere with future functionality.
  • For a key that uses a numeric ID:
    • Do not use a negative number for the ID. A negative ID could interfere with sorting.
    • Do not use the value 0 (zero) for the ID. If you do, you will get an automatically allocated ID.
    • If you wish to manually assign your own numeric IDs to the entities you create, have your application obtain a block of IDs with the allocateIds() method. This will prevent Datastore from assigning one of your manual numeric IDs to another entity.
  • If you assign your own manual numeric ID or custom name to the entities you create, do not use monotonically increasing values such as:

    1, 2, 3, ...
    "Customer1", "Customer2", "Customer3", ...
    "Product 1", "Product 2", "Product 3", ...
    

    If an application generates large traffic, such sequential numbering could lead to hotspots that impact Datastore latency. To avoid the issue of sequential numeric IDs, obtain numeric IDs from the allocateIds() method. The allocateIds() method generates well-distributed sequences of numeric IDs.

  • By specifying a key or storing the generated name, you can later perform a consistent lookup() on that entity without needing to issue a query to find the entity.
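The naming rules in this list can be checked before an entity is written. The following is a minimal sketch with hypothetical helper names; it covers only the UTF-8, forward-slash, negative-ID, and zero-ID rules stated above and does not detect monotonically increasing sequences.

```python
def check_custom_key_name(name):
    """Return a list of problems with a custom key name (empty if none)."""
    problems = []
    try:
        # A Python str fails to encode only if it holds lone surrogates.
        name.encode("utf-8")
    except UnicodeEncodeError:
        problems.append("not encodable as UTF-8")
    if "/" in name:
        problems.append("contains a forward slash")
    return problems


def check_numeric_id(numeric_id):
    """Return a list of problems with a manually assigned numeric ID."""
    problems = []
    if numeric_id < 0:
        problems.append("negative ID could interfere with sorting")
    elif numeric_id == 0:
        problems.append("ID 0 is replaced by an automatically allocated ID")
    return problems


print(check_custom_key_name("orders/2024"))
print(check_numeric_id(0))
```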

Indexes

  • If a property will never be needed for a query, exclude the property from indexes. Unnecessarily indexing a property could result in increased latency to achieve consistency, and increased storage costs of index entries.
  • Avoid having too many composite indexes. Excessive use of composite indexes could result in increased latency to achieve consistency, and increased storage costs of index entries. If you need to execute ad hoc queries on large datasets without previously defined indexes, use Google BigQuery.
  • Do not index properties with monotonically increasing values (such as a NOW() timestamp). Maintaining such an index could lead to hotspots that impact Datastore latency for applications with high read and write rates. For further guidance on dealing with monotonic properties, see High read/write rates to a narrow key range below.

Properties

  • Always use UTF-8 characters for properties of type string. A non-UTF-8 character in a property of type string could interfere with queries. If you need to save data with non-UTF-8 characters, use a byte string.
  • Do not use dots in property names. Dots in property names interfere with indexing of embedded entity properties.

Queries

  • If you need to access only the key from query results, use a keys-only query. A keys-only query returns results at lower latency and cost than retrieving entire entities.
  • If you need to access only specific properties from an entity, use a projection query. A projection query returns results at lower latency and cost than retrieving entire entities.
  • Likewise, if you need to access only the properties that are included in the query filter (for example, those listed in an order by clause), use a projection query.
  • Do not use offsets. Instead use cursors. Using an offset only avoids returning the skipped entities to your application, but these entities are still retrieved internally. The skipped entities affect the latency of the query, and your application is billed for the read operations required to retrieve them.
  • If you need strong consistency for your queries, use an ancestor query. (To use ancestor queries, you first need to structure your data for strong consistency.) An ancestor query returns strongly consistent results. Note that a non-ancestor keys-only query followed by a lookup() does not return strongly consistent results, because the non-ancestor keys-only query could get results from an index that is not consistent at the time of the query.
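To see why offsets cost more than cursors, the arithmetic can be worked through for a paged scan. This sketch only counts entities retrieved internally under the behavior described above (skipped entities are still retrieved); the function names are hypothetical and the counts are not billing units.

```python
def entities_retrieved_with_offsets(total, page_size):
    """Entities retrieved internally when each page uses a larger offset."""
    retrieved = 0
    for offset in range(0, total, page_size):
        page = min(page_size, total - offset)
        # The skipped entities are still retrieved before the page itself.
        retrieved += offset + page
    return retrieved


def entities_retrieved_with_cursors(total, page_size):
    """Entities retrieved when each page resumes from the previous cursor."""
    retrieved = 0
    for start in range(0, total, page_size):
        retrieved += min(page_size, total - start)
    return retrieved


print(entities_retrieved_with_offsets(1000, 100))
print(entities_retrieved_with_cursors(1000, 100))
```

For 1,000 entities read in pages of 100, offset paging retrieves 5,500 entities in total, while cursor paging retrieves each entity once.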

Designing for scale

Updates to a single entity group

A single entity group in Datastore should not be updated too rapidly.

If you are using Datastore, Google recommends that you design your application so that it will not need to update an entity group more than once per second. Remember that an entity with no parent and no children is its own entity group. If you update an entity group too rapidly, your Datastore writes will see higher latency, timeouts, and other types of errors. This is known as contention.

Datastore write rates to a single entity group can sometimes exceed the one per second limit, so load tests might not show this problem.

High read/write rates to a narrow key range

Avoid high read or write rates to Datastore keys that are lexicographically close.

Datastore is built on top of Google's NoSQL database, Bigtable, and is subject to Bigtable's performance characteristics. Bigtable scales by sharding rows onto separate tablets, and these rows are lexicographically ordered by key.

If you are using Datastore, you can get slow writes due to a hot tablet if you have a sudden increase in the write rate to a small range of keys that exceeds the capacity of a single tablet server. Bigtable will eventually split the key space to support high load.

The limit for reads is typically much higher than for writes, unless you are reading from a single key at a high rate. Bigtable cannot split a single key onto more than one tablet.

Hot tablets can apply to key ranges used by both entity keys and indexes.

In some cases, a Datastore hotspot can have a wider impact on an application than preventing reads or writes to a small range of keys. For example, the hot keys might be read or written during instance startup, causing loading requests to fail.

By default, Datastore allocates keys using a scattered algorithm. Thus you will not normally encounter hotspotting on Datastore writes if you create new entities at a high write rate using the default ID allocation policy. There are some corner cases where you can hit this problem:

  • If you create new entities at a very high rate using the legacy sequential ID allocation policy.

  • If you create new entities at a very high rate and you are allocating your own IDs which are monotonically increasing.

  • If you create new entities at a very high rate for a kind which previously had very few existing entities. Bigtable will start off with all entities on the same tablet server and will take some time to split the range of keys onto separate tablet servers.

  • You will also see this problem if you create new entities at a high rate with a monotonically increasing indexed property like a timestamp, because these properties are the keys for rows in the index tables in Bigtable.

  • Datastore prepends the namespace and the kind of the root entity group to the Bigtable row key. You can hit a hotspot if you start to write to a new namespace or kind without gradually ramping up traffic.

If you do have a key or indexed property that will be monotonically increasing then you can prepend a random hash to ensure that the keys are sharded onto multiple tablets.
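A minimal sketch of the hash-prefix idea follows. It assumes a deterministic SHA-256 prefix rather than a truly random value, so that the application can recompute the full key name later for a lookup; the function name, the prefix length, and the "-" separator are illustrative choices, not Datastore requirements.

```python
import hashlib


def scattered_key_name(sequential_name, prefix_length=4):
    """Prefix a sequential name with a short hash so keys spread out."""
    digest = hashlib.sha256(sequential_name.encode("utf-8")).hexdigest()
    # The prefix depends only on the name, so it can be recomputed on read.
    return f"{digest[:prefix_length]}-{sequential_name}"


names = [scattered_key_name(f"Customer{i}") for i in range(1, 101)]
print(names[:3])
```

Because the prefix sorts first, consecutive names such as "Customer1" and "Customer2" no longer land next to each other in key order.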

Likewise, if you need to query on a monotonically increasing (or decreasing) property using a sort or filter, you could instead index on a new property, for which you prefix the monotonic value with a value that has high cardinality across the dataset, but is common to all the entities in the scope of the query you want to perform. For instance, if you want to query for entries by timestamp but only need to return results for a single user at a time, you could prefix the timestamp with the user id and index that new property instead. This would still permit queries and ordered results for that user, but the presence of the user id would ensure the index itself is well sharded.
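The user-prefixed timestamp described above could be built as follows. This is a sketch under stated assumptions: the "|" separator is assumed never to appear in a user ID, timestamps are formatted at a fixed width in UTC so that string order matches time order, and the in-memory list filter stands in for a range filter on the indexed property.

```python
from datetime import datetime, timezone

SEPARATOR = "|"  # assumed never to appear in a user ID


def user_time_value(user_id, when):
    """Build the indexed value: user ID first, then a fixed-width UTC timestamp."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S.%fZ")
    return f"{user_id}{SEPARATOR}{stamp}"


def user_bounds(user_id):
    """Lower and upper bounds that enclose every value for one user."""
    prefix = f"{user_id}{SEPARATOR}"
    return prefix, prefix + "\uffff"


values = [
    user_time_value("bob", datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)),
    user_time_value("alice", datetime(2024, 3, 2, 8, 30, tzinfo=timezone.utc)),
    user_time_value("alice", datetime(2024, 3, 1, 17, 45, tzinfo=timezone.utc)),
]
low, high = user_bounds("alice")
alice_entries = sorted(v for v in values if low <= v < high)
print(alice_entries)
```

Writes from different users fall in different parts of the index, while one user's entries still sort by time within that user's range.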

For a more detailed explanation of this issue, see Ikai Lan's blog posting on saving monotonically increasing values in Datastore.

Ramping up traffic

Gradually ramp up traffic to new Datastore kinds or portions of the keyspace.

You should ramp up traffic to new Datastore kinds gradually in order to give Bigtable sufficient time to split tablets as the traffic grows. We recommend a maximum of 500 operations per second to a new Datastore kind, then increasing traffic by 50% every 5 minutes. In theory, you can grow to 740K operations per second after 90 minutes using this ramp up schedule. Be sure that writes are distributed relatively evenly throughout the key range. Our SREs call this the "500/50/5" rule.
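The "500/50/5" schedule can be checked with a few lines of arithmetic. The function name and parameter names below are illustrative; the defaults are the figures quoted above.

```python
def max_ops_per_second(minutes_elapsed, base=500, growth=0.5, step_minutes=5):
    """Traffic ceiling after a number of minutes on the ramp-up schedule."""
    # One growth step is completed for every full five-minute interval.
    steps = minutes_elapsed // step_minutes
    return base * (1 + growth) ** steps


for minutes in (0, 5, 30, 60, 90):
    print(minutes, round(max_ops_per_second(minutes)))
```

After 90 minutes there have been 18 increases of 50%, which gives 500 × 1.5^18, or roughly 740K operations per second.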

This pattern of gradual ramp up is particularly important if you change your code to stop using kind A and instead use kind B. A naive way to handle this migration is to change your code to read kind B, and if it does not exist then read kind A. However, this could cause a sudden increase in traffic to a new kind with a very small portion of the keyspace. Bigtable might be unable to efficiently split tablets if the keyspace is sparse.

The same problem can also occur if you migrate your entities to use a different range of keys within the same kind.

The strategy that you use to migrate entities to a new kind or key will depend on your data model. Below is an example strategy, known as "Parallel Reads". You will need to determine whether or not this strategy is effective for your data. An important consideration will be the cost impact of parallel operations during the migration.

Read from the old entity or key first. If it is missing, then you could read from the new entity or key. A high rate of reads of non-existent entities can lead to hotspotting, so you need to be sure to gradually increase load. A better strategy is to copy the old entity to the new one and then delete the old one. Ramp up parallel reads gradually to ensure that the new key space is well split.

A possible strategy for gradually ramping up reads or writes to a new kind is to use a deterministic hash of the user ID to get a random percentage of users that write new entities. Be sure that the result of the user ID hash is not skewed either by your random function or by your user behavior.
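One way to select a stable percentage of users is to hash each user ID into a fixed number of buckets. This is a sketch, assuming SHA-256 as the hash and 100 buckets; rollout_bucket and uses_new_kind are hypothetical names.

```python
import hashlib


def rollout_bucket(user_id, buckets=100):
    """Map a user ID to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def uses_new_kind(user_id, percent):
    """True if this user falls inside the current rollout percentage."""
    return rollout_bucket(user_id) < percent


users = [f"user-{i}" for i in range(10_000)]
migrated = sum(uses_new_kind(u, 25) for u in users)
print(migrated)
```

Because the bucket depends only on the user ID, a user who is included at 10% stays included as the percentage is raised, which keeps the ramp-up gradual rather than reshuffling users at each step.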

Meanwhile, run a Dataflow job to copy all your data from the old entities or keys to the new ones. Your batch job should avoid writes to sequential keys in order to prevent Bigtable hotspots. When the batch job is complete, you can read only from the new location.

A refinement of this strategy is to migrate small batches of users at one time. Add a field to the user entity which tracks the migration status of that user. Select a batch of users to migrate based on a hash of the user ID. A MapReduce or Dataflow job will migrate the keys for that batch of users. The users that have an in-progress migration will use parallel reads.

Note that you cannot easily roll back unless you do dual writes of both the old and new entities during the migration phase. This would increase Datastore costs incurred.

Deletions

Avoid deleting large numbers of Datastore entities across a small range of keys.

Bigtable periodically rewrites its tables to remove deleted entries, and to reorganize your data so that reads and writes are more efficient. This process is known as a compaction.

If you delete a large number of Datastore entities across a small range of keys then queries across this part of the index will be slower until compaction has completed. In extreme cases, your queries might time out before returning results.

It is an anti-pattern to use a timestamp value for an indexed field to represent an entity's expiration time. In order to retrieve expired entities, you would need to query against this indexed field, which likely lies in an overlapping part of the keyspace with index entries for the most recently deleted entities.

You can improve performance with "sharded queries", which prepend a fixed length string to the expiration timestamp. The index is sorted on the full string, so that entities at the same timestamp will be located throughout the key range of the index. You run multiple queries in parallel to fetch results from each shard.
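The sharded-query idea can be illustrated with an in-memory sorted list standing in for the index. Everything here is an assumption made for the sketch: four shards, a two-digit shard prefix derived from a hash of the entity name, expiration times as zero-padded epoch seconds, and a thread pool standing in for parallel Datastore queries.

```python
import bisect
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # assumption for this sketch


def sharded_expiry(entity_name, expires_at):
    """Fixed-length shard prefix, then a fixed-width expiration timestamp."""
    shard = hashlib.sha256(entity_name.encode("utf-8")).digest()[0] % NUM_SHARDS
    return f"{shard:02d}|{expires_at:010d}"


def expired_in_shard(index, shard, cutoff):
    """Range scan of one shard: entries that expire before the cutoff."""
    low = bisect.bisect_left(index, (f"{shard:02d}|",))
    high = bisect.bisect_left(index, (f"{shard:02d}|{cutoff:010d}",))
    return [name for _, name in index[low:high]]


def expired_entities(index, cutoff):
    """Query every shard in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        per_shard = pool.map(
            lambda shard: expired_in_shard(index, shard, cutoff),
            range(NUM_SHARDS),
        )
        return sorted(name for names in per_shard for name in names)


expirations = {"a": 100, "b": 200, "c": 300, "d": 50}
# The index is sorted on the full prefixed string, as described above.
index = sorted((sharded_expiry(n, t), n) for n, t in expirations.items())
print(expired_entities(index, 250))
```

Each shard needs its own range query, so the number of shards trades index spread against the number of queries per scan.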

A fuller solution for the expiration timestamp issue is to use a "generation number" which is a global counter that is periodically updated. The generation number is prepended to the expiration timestamp so that queries are sorted by generation number, then shard, then timestamp. Deletion of old entities occurs at a prior generation. Any entity not deleted should have its generation number incremented. Once the deletion is complete, you move forward to the next generation. Queries against an older generation will perform poorly until compaction is complete. You might need to wait for several generations to complete before querying the index to get the list of entities to delete, in order to reduce the risk of missing results due to eventual consistency.

Sharding and replication

Use sharding or replication for hot Datastore keys.

You can use replication if you need to read a portion of the key range at a higher rate than Bigtable permits. Using this strategy, you would store N copies of the same entity, allowing an N times higher rate of reads than a single entity supports.
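A minimal sketch of replication follows, with a plain dict standing in for Datastore. The key naming scheme is an assumption: each copy gets a hash prefix so that the N copies are not lexicographically adjacent, since copies stored next to each other could still share a tablet.

```python
import hashlib
import random

NUM_REPLICAS = 5  # assumption for this sketch


def replica_key_name(name, replica):
    """Key name for one copy, prefixed with a hash to spread copies apart."""
    digest = hashlib.sha256(f"{replica}:{name}".encode("utf-8")).hexdigest()
    return f"{digest[:8]}-{name}-{replica}"


def write_replicated(store, name, value):
    # Every update must be written to all N copies.
    for replica in range(NUM_REPLICAS):
        store[replica_key_name(name, replica)] = value


def read_replicated(store, name, rng=random):
    # Each read picks one copy at random, spreading load over N keys.
    return store[replica_key_name(name, rng.randrange(NUM_REPLICAS))]


store = {}
write_replicated(store, "site-config", {"theme": "dark"})
print(read_replicated(store, "site-config"))
```

The trade-off is that each update costs N writes, so this pattern suits entities that are read far more often than they change.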

You can use sharding if you need to write to a portion of the key range at a higher rate than Bigtable permits. Sharding breaks up an entity into smaller pieces.
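A counter is a common case for sharding: instead of one entity that every write must update, the count is split across several shard entities and summed on read. The class below is an in-memory illustration only, with a list entry standing in for each shard entity; the class name and the default of 8 shards are assumptions.

```python
import random


class ShardedCounter:
    """Counter split into pieces so that writes spread across shards."""

    def __init__(self, num_shards=8, rng=None):
        # Each list entry stands in for one separately stored shard entity.
        self.shards = [0] * num_shards
        self.rng = rng or random.Random()

    def increment(self, amount=1):
        # Each write touches one randomly chosen shard.
        self.shards[self.rng.randrange(len(self.shards))] += amount

    def value(self):
        # A read has to fetch and sum every shard.
        return sum(self.shards)


counter = ShardedCounter(num_shards=8, rng=random.Random(0))
for _ in range(1000):
    counter.increment()
print(counter.value())
```

In a real deployment the shard entities would also need key names that are not lexicographically adjacent, for the reasons given in the list of common mistakes.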

Some common mistakes when sharding include:

  • Sharding using a time prefix. When the time rolls over to the next prefix, the new unsplit portion becomes a hotspot. Instead, you should gradually roll over a portion of your writes to the new prefix.

  • Sharding just the hottest entities. If you shard a small proportion of the total number of entities then there might not be sufficient rows between the hot entities to ensure that they stay on different splits.
