Note: Developers building new applications are strongly encouraged to use the NDB Client Library, which has several benefits compared to this client library, such as automatic entity caching via the Memcache API. If you are currently using the older DB Client Library, read the DB to NDB Migration Guide.
Data objects in Cloud Datastore are known as entities. An entity has one or more named properties, each of which can have one or more values. Entities of the same kind do not need to have the same properties, and an entity's values for a given property do not all need to be of the same data type. (If necessary, an application can establish and enforce such restrictions in its own data model.)
Cloud Datastore supports a variety of data types for property values. These include, among others:
- Floating-point numbers
- Binary data
For a full list of types, see Properties and value types.
Each entity in Cloud Datastore has a key that uniquely identifies it. The key consists of the following components:
- The namespace of the entity, which allows for multitenancy
- The kind of the entity, which categorizes it for the purpose of Cloud Datastore queries
- An identifier for the individual entity, which can be either
  - a key name string
  - an integer numeric ID
- An optional ancestor path locating the entity within the Cloud Datastore hierarchy
An application can fetch an individual entity from Cloud Datastore using the entity's key, or it can retrieve one or more entities by issuing a query based on the entities' keys or property values.
The Python App Engine SDK includes a data modeling library for representing Cloud Datastore entities as instances of Python classes, and for storing and retrieving those instances in Datastore.
Cloud Datastore itself does not enforce any restrictions on the structure of entities, such as whether a given property has a value of a particular type; this task is left to the application and the data modeling library.
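To make the idea of application-side enforcement concrete, here is a minimal sketch in plain Python. It is not the db library's implementation: the TypedProperty descriptor and this stand-in Employee class are hypothetical names invented for illustration, and the real library reports bad values with its own error type rather than TypeError.

```python
import datetime


class TypedProperty(object):
    """Hypothetical descriptor that rejects values of the wrong type."""

    def __init__(self, expected_type):
        self.expected_type = expected_type
        self.name = None

    def __set_name__(self, owner, name):
        # Remember the attribute name so values can be stored per instance.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        # None is allowed and stands for "no value yet".
        if value is not None and not isinstance(value, self.expected_type):
            raise TypeError(
                "%s must be %s, got %s"
                % (self.name, self.expected_type.__name__, type(value).__name__)
            )
        instance.__dict__[self.name] = value


class Employee(object):
    # Stand-in for a model class; this is NOT a db.Model subclass.
    first_name = TypedProperty(str)
    hire_date = TypedProperty(datetime.date)


employee = Employee()
employee.first_name = "Antonio"
employee.hire_date = datetime.date(2020, 1, 15)

try:
    employee.hire_date = "yesterday"  # wrong type: a string, not a date
    rejected = False
except TypeError:
    rejected = True
```

The check runs in the application process at assignment time, which mirrors the point above: the storage layer never sees the rejected value.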
Kinds and identifiers
Each Cloud Datastore entity is of a particular kind, which categorizes the entity for the purpose of queries: for instance, a human resources application might represent each employee at a company with an entity of kind Employee. In the Python Datastore API, an entity's kind is determined by its model class, which you define in your application as a subclass of the data modeling library class db.Model. The name of the model class becomes the kind of the entities belonging to it. All kind names that begin with two underscores (__) are reserved and may not be used.
The following example creates an entity of kind Employee, populates its property values, and saves it to Datastore:

    import datetime

    from google.appengine.ext import db


    class Employee(db.Model):
        first_name = db.StringProperty()
        last_name = db.StringProperty()
        hire_date = db.DateProperty()
        attended_hr_training = db.BooleanProperty()


    employee = Employee(first_name='Antonio', last_name='Salieri')
    employee.hire_date = datetime.datetime.now().date()
    employee.attended_hr_training = True
    employee.put()
The Employee class declares four properties for the data model: first_name, last_name, hire_date, and attended_hr_training. The Model superclass ensures that the attributes of Employee objects conform to this model: for example, an attempt to assign a string value to the hire_date attribute would result in a runtime error, since the data model for hire_date was declared as db.DateProperty.
In addition to a kind, each entity has an identifier, assigned when the entity is created. Because it is part of the entity's key, the identifier is associated permanently with the entity and cannot be changed. It can be assigned in either of two ways:
- Your application can specify its own key name string for the entity.
- You can have Cloud Datastore automatically assign the entity an integer numeric ID.
To assign an entity a key name, provide the named argument key_name to the model class constructor when you create the entity:

    # Create an entity with the key Employee:'asalieri'.
    employee = Employee(key_name='asalieri')

To have Cloud Datastore assign a numeric ID automatically, omit the key_name argument:

    # Create an entity with a key such as Employee:8261.
    employee = Employee()
Cloud Datastore can be configured to generate auto IDs using two different auto ID policies:
- The default policy generates a random sequence of unused IDs that are approximately uniformly distributed. Each ID can be up to 16 decimal digits long.
- The legacy policy creates a sequence of non-consecutive smaller integer IDs.
If you want to display the entity IDs to the user, and/or depend upon their order, the best thing to do is use manual allocation.
System-allocated ID values are guaranteed unique to the entity group. If you copy an entity from one entity group or namespace to another and wish to preserve the ID part of the key, be sure to allocate the ID first to prevent Cloud Datastore from selecting that ID for a future assignment.
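The behavior described above (random unused IDs of at most 16 decimal digits, plus reserving an ID before reusing it) can be sketched with a toy in-memory allocator. This is an illustration only: ToyIdAllocator and its methods are hypothetical names, uniqueness is tracked per parent as a simplification, and nothing here reflects how Cloud Datastore allocates IDs internally.

```python
import random


class ToyIdAllocator(object):
    """Hypothetical in-memory allocator; not part of any Datastore API."""

    def __init__(self, max_id=10 ** 16, seed=None):
        # IDs are drawn from 1 .. max_id - 1, so at most 16 decimal digits
        # with the default max_id.
        self._max_id = max_id
        self._rng = random.Random(seed)
        self._used = {}  # parent -> set of numeric IDs already taken

    def reserve(self, parent, numeric_id):
        """Mark an ID as taken, e.g. before copying an entity with its ID."""
        if not 1 <= numeric_id < self._max_id:
            raise ValueError("numeric_id out of range")
        self._used.setdefault(parent, set()).add(numeric_id)

    def allocate(self, parent=None):
        """Return a random numeric ID not yet used under this parent."""
        used = self._used.setdefault(parent, set())
        if len(used) >= self._max_id - 1:
            raise RuntimeError("no unused IDs left under this parent")
        while True:
            candidate = self._rng.randrange(1, self._max_id)
            if candidate not in used:
                used.add(candidate)
                return candidate
```

Reserving an ID first guarantees that a later allocate() call under the same parent cannot hand out that ID again, which is the reason for the advice about copied entities.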
Entities in Cloud Datastore form a hierarchically structured space similar to the directory structure of a file system. When you create an entity, you can optionally designate another entity as its parent; the new entity is a child of the parent entity (note that unlike in a file system, the parent entity need not actually exist). An entity without a parent is a root entity. The association between an entity and its parent is permanent, and cannot be changed once the entity is created. Cloud Datastore will never assign the same numeric ID to two entities with the same parent, or to two root entities (those without a parent).
An entity's parent, parent's parent, and so on recursively, are its ancestors; its children, children's children, and so on, are its descendants. A root entity and all of its descendants belong to the same entity group. The sequence of entities beginning with a root entity and proceeding from parent to child, leading to a given entity, constitute that entity's ancestor path. The complete key identifying the entity consists of a sequence of kind-identifier pairs specifying its ancestor path and terminating with those of the entity itself:
[Person:GreatGrandpa, Person:Grandpa, Person:Dad, Person:Me]
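As a rough illustration, a complete key like the one above can be modeled in plain Python as a list of (kind, identifier) pairs. The helper names below are hypothetical and are not part of the db library; they only show how the parent, the ancestors, and the root of the entity group follow from the path.

```python
def parent_path(path):
    """Return the path of the parent entity, or None for a root entity."""
    return path[:-1] if len(path) > 1 else None


def ancestors(path):
    """Return the (kind, identifier) pairs of all ancestors, root first."""
    return list(path[:-1])


def root_of(path):
    """Return the pair identifying the root entity of the entity group."""
    return path[0]


me = [
    ("Person", "GreatGrandpa"),
    ("Person", "Grandpa"),
    ("Person", "Dad"),
    ("Person", "Me"),
]
dad = parent_path(me)
```

Two entities belong to the same entity group exactly when root_of() returns the same pair for both paths.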
For a root entity, the ancestor path is empty and the key consists solely of the entity's own kind and identifier:
[Person:GreatGrandpa]
This concept is illustrated by the following diagram:
To designate an entity's parent, use the parent argument to the model class constructor when creating the child entity. The value of this argument can be the parent entity itself or its key; you can get the key by calling the parent entity's key() method. The following example creates an entity of kind Address and shows two ways of designating an Employee entity as its parent:

    # Create Employee entity
    employee = Employee()
    employee.put()

    # Set Employee as Address entity's parent directly...
    address = Address(parent=employee)

    # ...or using its key
    e_key = employee.key()
    address = Address(parent=e_key)

    # Save Address entity to datastore
    address.put()
Transactions and entity groups
Every attempt to create, update, or delete an entity takes place in the context of a transaction. A single transaction can include any number of such operations. To maintain the consistency of the data, the transaction ensures that all of the operations it contains are applied to Cloud Datastore as a unit or, if any of the operations fails, that none of them are applied. Furthermore, all strongly-consistent reads (ancestor queries or gets) performed within the same transaction observe a consistent snapshot of the data.
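The all-or-nothing guarantee can be illustrated with a small in-memory sketch. ToyStore, apply_atomically, put, and update are hypothetical names made up for this example; they do not model the db library's transaction API or Cloud Datastore's actual commit protocol.

```python
import copy


class ToyStore(object):
    """In-memory stand-in for a datastore: maps key -> dict of properties."""

    def __init__(self):
        self.entities = {}

    def apply_atomically(self, operations):
        """Apply every operation, or none of them if any operation fails."""
        working_copy = copy.deepcopy(self.entities)
        for operation in operations:
            operation(working_copy)  # may raise; self.entities is untouched
        self.entities = working_copy  # commit only after all succeeded


def put(key, properties):
    def operation(entities):
        entities[key] = dict(properties)
    return operation


def update(key, changes):
    def operation(entities):
        entities[key].update(changes)  # raises KeyError if key is missing
    return operation


store = ToyStore()
store.apply_atomically([put("Employee:asalieri", {"last_name": "Salieri"})])

try:
    store.apply_atomically([
        put("Employee:wmozart", {"last_name": "Mozart"}),
        update("Employee:missing", {"last_name": "Nobody"}),
    ])
    failed = False
except KeyError:
    failed = True
```

Because the second batch fails on its last operation, the put that preceded it is discarded as well, and the store still contains only the first entity.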
As mentioned above, an entity group is a set of entities connected through ancestry to a common root element. The organization of data into entity groups can limit what transactions can be performed:
- All the data accessed by a transaction must be contained in at most 25 entity groups.
- If you want to use queries within a transaction, your data must be organized into entity groups in such a way that you can specify ancestor filters that will match the right data.
- There is a write throughput limit of about one transaction per second within a single entity group. This limitation exists because Cloud Datastore performs masterless, synchronous replication of each entity group over a wide geographic area to provide high reliability and fault tolerance.
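One way to reason about the first limit is to count the distinct root entities among the keys a transaction touches. The sketch below assumes keys are represented as lists of (kind, identifier) pairs; the function names are hypothetical, and the constant 25 is simply the figure stated in the list above.

```python
MAX_ENTITY_GROUPS = 25  # limit stated in the text above


def entity_groups(paths):
    """Return the set of root (kind, identifier) pairs for the given paths."""
    return {path[0] for path in paths}


def fits_in_one_transaction(paths):
    """True if the keys span no more entity groups than the limit allows."""
    return len(entity_groups(paths)) <= MAX_ENTITY_GROUPS


# Two keys under the same root count as a single entity group.
same_group = [
    [("Employee", "asalieri")],
    [("Employee", "asalieri"), ("Address", 1)],
]
```

A check like this could run before starting a transaction, so that the application fails early instead of at commit time.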
In many applications, it is acceptable to use eventual consistency (i.e. a non-ancestor query spanning multiple entity groups, which may at times return slightly stale data) when obtaining a broad view of unrelated data, and then to use strong consistency (an ancestor query, or a get of a single entity) when viewing or editing a single set of highly related data. In such applications, it is usually a good approach to use a separate entity group for each set of highly related data.
For more information, see Structuring for Strong Consistency.
Properties and value types
The data values associated with an entity consist of one or more properties. Each property has a name and one or more values. A property can have values of more than one type, and two entities can have values of different types for the same property. Properties can be indexed or unindexed (queries that order or filter on a property P will ignore entities where P is unindexed). An entity can have at most 20,000 indexed properties.
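|Value type||Python type(s)||Sort order||Notes|
|Integer||int, long||Numeric||64-bit integer, signed|
|Floating-point number||float||Numeric||64-bit double precision, IEEE 754|
|Text string (short)||str, unicode||Unicode||Up to 1500 bytes|
|Text string (long)||db.Text||None||Up to 1 megabyte; not indexed|
|Byte string (short)||db.ByteString||Byte order||Up to 1500 bytes|
|Byte string (long)||db.Blob||None||Up to 1 megabyte; not indexed|
|Date and time||datetime.date, datetime.time, datetime.datetime||Chronological|| |
|Google Accounts user||users.User||Email address in Unicode order|| |
|Instant messaging handle||db.IM||Unicode|| |
|Cloud Datastore key||db.Key||By path elements|| |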
Important: We strongly recommend that you do not store a UserProperty, since it includes the email address and the user's unique ID. If a user changes their email address and you compare their old, stored User to the new User value, they won't match.
For text strings and unencoded binary data (byte strings), Cloud Datastore supports two value types:
- Short strings (up to 1500 bytes) are indexed and can be used in query filter conditions and sort orders.
- Long strings (up to 1 megabyte) are not indexed and cannot be used in query filters and sort orders.
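The size thresholds above can be expressed as a small helper. This is only a sketch: classify_string is a hypothetical function, text is measured here by its UTF-8 encoded length, and "1 megabyte" is taken as 2**20 bytes, both of which are assumptions rather than documented behavior.

```python
SHORT_LIMIT_BYTES = 1500
LONG_LIMIT_BYTES = 1024 * 1024  # assumption: "1 megabyte" read as 2**20 bytes


def classify_string(value):
    """Return 'short' (indexable) or 'long' (not indexed) for a str or bytes."""
    # Assumption: text is measured by its UTF-8 encoded length.
    data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
    size = len(data)
    if size <= SHORT_LIMIT_BYTES:
        return "short"
    if size <= LONG_LIMIT_BYTES:
        return "long"
    raise ValueError("value of %d bytes exceeds the long string limit" % size)
```

Note that the limit is in bytes, not characters, so a string of 751 two-byte characters already falls into the long category in this sketch.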
Note: The long byte string type is named Blob in the Cloud Datastore API. This type is unrelated to blobs as used in the Blobstore API.
When a query involves a property with values of mixed types, Cloud Datastore uses a deterministic ordering based on the internal representations:
- Null values
- Fixed-point numbers
  - Dates and times
- Boolean values
- Byte sequences
  - Byte string
  - Unicode string
  - Blobstore keys
- Floating-point numbers
- Geographical points
- Google Accounts users
- Cloud Datastore keys
Because long text strings and long byte strings are not indexed, they have no ordering defined.
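A simplified way to picture the mixed-type ordering is a sort key that ranks a value by its type first and by the value itself second. The sketch below covers only a few Python types and treats byte strings and Unicode strings as one group of byte sequences (an assumption of this sketch); type_rank and sort_key are hypothetical helpers, not Datastore APIs.

```python
def type_rank(value):
    """Rank a value by type, following the order listed above (subset only)."""
    if value is None:
        return 0
    if isinstance(value, bool):  # check before int: bool subclasses int
        return 2
    if isinstance(value, int):
        return 1
    if isinstance(value, (bytes, str)):
        return 3
    if isinstance(value, float):
        return 4
    raise TypeError("type not covered by this sketch: %r" % type(value))


def sort_key(value):
    """Sort by type rank first, then by the value within its group."""
    rank = type_rank(value)
    if value is None:
        return (rank, 0)
    if isinstance(value, str):
        # Compare text as bytes so it can be ordered against byte strings.
        return (rank, value.encode("utf-8"))
    return (rank, value)


mixed = [3.5, "b", None, True, 7, b"a", False, -2]
ordered = sorted(mixed, key=sort_key)
```

One visible consequence is that the integer 7 sorts before the float 3.5, because the type rank is compared before the numeric value.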
Working with entities
Applications can use the Cloud Datastore API to create, retrieve, update, and delete entities. If the application knows the complete key for an entity (or can derive it from its parent key, kind, and identifier), it can use the key to operate directly on the entity. An application can also obtain an entity's key as a result of a Cloud Datastore query; see the Datastore Queries page for more information.
Creating an entity
In Python, you create a new entity by constructing an instance of a model class, populating its properties if necessary, and calling its put() method to save it to Datastore. You can specify the entity's key name by passing a key_name argument to the constructor:

    employee = Employee(key_name='asalieri',
                        first_name='Antonio',
                        last_name='Salieri')
    employee.hire_date = datetime.datetime.now().date()
    employee.attended_hr_training = True
    employee.put()
If you don't provide a key name, Cloud Datastore will automatically generate a numeric ID for the entity's key:

    employee = Employee(first_name='Antonio',
                        last_name='Salieri')
    employee.hire_date = datetime.datetime.now().date()
    employee.attended_hr_training = True
    employee.put()
Retrieving an entity
To retrieve an entity identified by a given key, pass the Key object as an argument to the db.get() function. You can generate the Key object using the class method Key.from_path(). The complete path is a sequence of entities in the ancestor path, with each entity represented by its kind (a string) followed by its identifier (key name or numeric ID):

    address_k = db.Key.from_path('Employee', 'asalieri', 'Address', 1)
    address = db.get(address_k)
db.get() returns an instance of the appropriate model class. Be sure that you have imported the model class for the entity being retrieved.
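To show how a flat argument list such as the one passed to Key.from_path() pairs up into kinds and identifiers, here is a plain-Python sketch. pairs_from_flat_path is a hypothetical helper written for this illustration; it does not reproduce the library's validation rules or return a Key object.

```python
def pairs_from_flat_path(*args):
    """Group kind, identifier, kind, identifier, ... into (kind, id) pairs."""
    if not args or len(args) % 2 != 0:
        raise ValueError("path must be a non-empty list of kind, identifier pairs")
    pairs = []
    for i in range(0, len(args), 2):
        kind, identifier = args[i], args[i + 1]
        if not isinstance(kind, str):
            raise TypeError("kind must be a string")
        # An identifier is either a key name (str) or a numeric ID (int).
        if isinstance(identifier, bool) or not isinstance(identifier, (str, int)):
            raise TypeError("identifier must be a key name or an integer ID")
        pairs.append((kind, identifier))
    return pairs


address_path = pairs_from_flat_path('Employee', 'asalieri', 'Address', 1)
```

Here the last pair identifies the Address entity itself, and the pair before it identifies its Employee parent.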
Updating an entity
To update an existing entity, modify the attributes of the object, then call its put() method. The object data overwrites the existing entity. The entire object is sent to Cloud Datastore with every call to put().
To delete a property, delete the attribute from the Python object, then save the object.
Deleting an entity
Given an entity's key, you can delete the entity with the db.delete() function:

    address_k = db.Key.from_path('Employee', 'asalieri', 'Address', 1)
    db.delete(address_k)

or by calling the entity's own delete() method:

    employee_k = db.Key.from_path('Employee', 'asalieri')
    employee = db.get(employee_k)

    # ...

    employee.delete()
Batch operations
The db.put(), db.get(), and db.delete() functions (and their asynchronous counterparts db.put_async(), db.get_async(), and db.delete_async()) can accept a list argument to act on multiple entities in a single Cloud Datastore call:

    # A batch put.
    db.put([e1, e2, e3])

    # A batch get.
    entities = db.get([k1, k2, k3])

    # A batch delete.
    db.delete([k1, k2, k3])
Batch operations do not change your costs. You will be charged for every key in a batched operation, whether or not each key exists. The size of the entities involved in an operation does not affect the cost.
Deleting entities in bulk
If you need to delete a large number of entities, we recommend using Cloud Dataflow to delete entities in bulk.
Using an empty list
For the NDB interface, Cloud Datastore historically wrote an empty list as an omitted property for both static and dynamic properties. To maintain backward compatibility, this behavior continues to be the default. To override this either globally or on a per-ListProperty basis, set the write_empty_list argument to true in your Property class; the empty list is then written to Cloud Datastore and can be read as an empty list.
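The difference between omitting an empty list and writing it can be pictured with a toy serialization step. The function below is a hypothetical illustration in plain Python; it borrows the write_empty_list name from the text above but is not the NDB or DB implementation.

```python
def to_stored_properties(properties, write_empty_list=False):
    """Return the properties that would be written for an entity (toy model)."""
    stored = {}
    for name, value in properties.items():
        if isinstance(value, list) and not value and not write_empty_list:
            # Historical default: an empty list is written as an omitted
            # property, so nothing is stored under this name.
            continue
        stored[name] = value
    return stored


entity = {"first_name": "Antonio", "nicknames": []}
default_result = to_stored_properties(entity)
explicit_result = to_stored_properties(entity, write_empty_list=True)
```

With the default, a reader of the stored data cannot distinguish "empty list" from "property never set"; with the flag enabled, the empty list round-trips as an empty list.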
For the DB interface, empty list writes were historically not allowed at all if the property was dynamic: if you attempted this you got an error. This means that there is no default behavior needing to be preserved for backwards compatibility for DB dynamic properties, and so you can simply write and read the empty list in the dynamic model without any changes.
However, for DB static properties, the empty list was written as an omitted property, and this behavior continues by default for backward compatibility. If you want to turn on empty lists for DB static properties, set the write_empty_list argument to true in your Property class; the empty list is then written to