Entities, Properties, and Keys

Data objects in Cloud Datastore are known as entities. An entity has one or more named properties, each of which can have one or more values. Entities of the same kind do not need to have the same properties, and an entity's values for a given property do not all need to be of the same data type. (If necessary, an application can establish and enforce such restrictions in its own data model.)

Cloud Datastore supports a variety of data types for property values. These include, among others:

  • Integers
  • Floating-point numbers
  • Strings
  • Dates
  • Binary data

For a full list of types, see Properties and value types.

Each entity in Cloud Datastore has a key that uniquely identifies it. The key consists of the following components:

  • The namespace of the entity, which allows for multitenancy
  • The kind of the entity, which categorizes it for the purpose of Cloud Datastore queries
  • An identifier for the individual entity, which can be either
    • a key name string
    • an integer numeric ID
  • An optional ancestor path locating the entity within the Cloud Datastore hierarchy

An application can fetch an individual entity from Cloud Datastore using the entity's key, or it can retrieve one or more entities by issuing a query based on the entities' keys or property values.

The Go App Engine SDK includes a package for representing Cloud Datastore entities as Go structs, and for storing and retrieving them in Cloud Datastore.

Cloud Datastore itself does not enforce any restrictions on the structure of entities, such as whether a given property has a value of a particular type; this task is left to the application.

Kinds and identifiers

Each Cloud Datastore entity is of a particular kind, which categorizes the entity for the purpose of queries: for instance, a human resources application might represent each employee at a company with an entity of kind Employee. In the Go Datastore API, you specify an entity's kind when you create a datastore.Key. All kind names that begin with two underscores (__) are reserved and may not be used.

The following example creates an entity of kind Employee, populates its property values, and saves it to Datastore:

import (
	"time"

	"golang.org/x/net/context"

	"google.golang.org/appengine/datastore"
)

type Employee struct {
	FirstName          string
	LastName           string
	HireDate           time.Time
	AttendedHRTraining bool
}

func f(ctx context.Context) {
	// ...
	employee := &Employee{
		FirstName: "Antonio",
		LastName:  "Salieri",
		HireDate:  time.Now(),
	}
	employee.AttendedHRTraining = true

	key := datastore.NewIncompleteKey(ctx, "Employee", nil)
	if _, err := datastore.Put(ctx, key, employee); err != nil {
		// Handle err
	}
	// ...
}

The Employee type declares four fields for the data model: FirstName, LastName, HireDate, and AttendedHRTraining.

In addition to a kind, each entity has an identifier, assigned when the entity is created. Because it is part of the entity's key, the identifier is associated permanently with the entity and cannot be changed. It can be assigned in either of two ways:

  • Your application can specify its own key name string for the entity.
  • You can have Cloud Datastore automatically assign the entity an integer numeric ID.

To assign an entity a key name, provide a non-empty stringID argument to datastore.NewKey:

// Create a key with a key name "asalieri".
key := datastore.NewKey(
	ctx,        // context.Context
	"Employee", // Kind
	"asalieri", // String ID; empty means no string ID
	0,          // Integer ID; if 0, generate automatically. Ignored if string ID specified.
	nil,        // Parent Key; nil means no parent
)

To have Cloud Datastore assign a numeric ID automatically, use an empty stringID argument:

// Create a key such as Employee:8261.
key := datastore.NewKey(ctx, "Employee", "", 0, nil)
// This is equivalent:
key = datastore.NewIncompleteKey(ctx, "Employee", nil)

Assigning identifiers

Cloud Datastore can be configured to generate automatic IDs using one of two policies:

  • The default policy generates a random sequence of unused IDs that are approximately uniformly distributed. Each ID can be up to 16 decimal digits long.
  • The legacy policy creates a sequence of non-consecutive smaller integer IDs.

If you want to display entity IDs to users, or your application depends on their order, the best approach is manual allocation.

System-allocated ID values are guaranteed unique to the entity group. If you copy an entity from one entity group or namespace to another and wish to preserve the ID part of the key, be sure to allocate the ID first to prevent Cloud Datastore from selecting that ID for a future assignment.

Ancestor paths

Entities in Cloud Datastore form a hierarchically structured space similar to the directory structure of a file system. When you create an entity, you can optionally designate another entity as its parent; the new entity is a child of the parent entity (note that unlike in a file system, the parent entity need not actually exist). An entity without a parent is a root entity. The association between an entity and its parent is permanent, and cannot be changed once the entity is created. Cloud Datastore will never assign the same numeric ID to two entities with the same parent, or to two root entities (those without a parent).

An entity's parent, parent's parent, and so on recursively, are its ancestors; its children, children's children, and so on, are its descendants. A root entity and all of its descendants belong to the same entity group. The sequence of entities beginning with a root entity and proceeding from parent to child, leading to a given entity, constitutes that entity's ancestor path. The complete key identifying the entity consists of a sequence of kind-identifier pairs specifying its ancestor path and terminating with those of the entity itself:

[Person:GreatGrandpa, Person:Grandpa, Person:Dad, Person:Me]

For a root entity, the ancestor path is empty and the key consists solely of the entity's own kind and identifier:

[Person:GreatGrandpa]
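
The bracketed notation above can be reproduced with a small model of a key. This is a minimal sketch that uses only the standard library; the Key and PathElement types here are hypothetical stand-ins for illustration, not the datastore.Key type, and the namespace is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// PathElement is a hypothetical stand-in for one kind-identifier pair.
type PathElement struct {
	Kind string
	Name string // key name; a numeric ID would be handled the same way
}

// Key is a hypothetical, simplified model of a key: the entity's own
// kind and identifier plus an optional parent key.
type Key struct {
	Elem   PathElement
	Parent *Key // nil for a root entity
}

// Path returns the kind-identifier pairs from the root entity down to k.
func (k *Key) Path() []PathElement {
	var path []PathElement
	for cur := k; cur != nil; cur = cur.Parent {
		path = append([]PathElement{cur.Elem}, path...)
	}
	return path
}

// String formats the complete key in the bracketed notation used above.
func (k *Key) String() string {
	var parts []string
	for _, e := range k.Path() {
		parts = append(parts, e.Kind+":"+e.Name)
	}
	return "[" + strings.Join(parts, ", ") + "]"
}

func main() {
	root := &Key{Elem: PathElement{Kind: "Person", Name: "GreatGrandpa"}}
	grandpa := &Key{Elem: PathElement{Kind: "Person", Name: "Grandpa"}, Parent: root}
	dad := &Key{Elem: PathElement{Kind: "Person", Name: "Dad"}, Parent: grandpa}
	me := &Key{Elem: PathElement{Kind: "Person", Name: "Me"}, Parent: dad}

	fmt.Println(me)   // [Person:GreatGrandpa, Person:Grandpa, Person:Dad, Person:Me]
	fmt.Println(root) // [Person:GreatGrandpa]
}
```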

This concept is illustrated by the following diagram:

[Diagram: Entity group]

To designate an entity's parent, use the parent argument to datastore.NewKey. The value of this argument should be the parent entity's key. The following example creates an entity of kind Address and designates an Employee entity as its parent:

// Create Employee entity
employee := &Employee{ /* ... */ }
employeeKey, err := datastore.Put(ctx, datastore.NewIncompleteKey(ctx, "Employee", nil), employee)
if err != nil {
	// Handle err; employeeKey must not be used as a parent if the Put failed
}

// Use Employee as Address entity's parent
// and save Address entity to datastore
address := &Address{ /* ... */ }
addressKey := datastore.NewIncompleteKey(ctx, "Address", employeeKey)
if _, err = datastore.Put(ctx, addressKey, address); err != nil {
	// Handle err
}

Transactions and entity groups

Every attempt to create, update, or delete an entity takes place in the context of a transaction. A single transaction can include any number of such operations. To maintain the consistency of the data, the transaction ensures that all of the operations it contains are applied to Cloud Datastore as a unit or, if any of the operations fails, that none of them are applied. Furthermore, all strongly-consistent reads (ancestor queries or gets) performed within the same transaction observe a consistent snapshot of the data.

As mentioned above, an entity group is a set of entities connected through ancestry to a common root entity. The organization of data into entity groups can limit what transactions can be performed:

  • All the data accessed by a transaction must be contained in at most 25 entity groups.
  • If you want to use queries within a transaction, your data must be organized into entity groups in such a way that you can specify ancestor filters that will match the right data.
  • There is a write throughput limit of about one transaction per second within a single entity group. This limitation exists because Cloud Datastore performs masterless, synchronous replication of each entity group over a wide geographic area to provide high reliability and fault tolerance.
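
The first limit can be checked before a transaction starts by counting the distinct root entities among the keys it will touch. The following is a minimal sketch using only the standard library; the Key type, the helper names, and the maxEntityGroups constant are assumptions made for illustration (the value 25 is taken from the list above), and namespaces are ignored.

```go
package main

import "fmt"

// Key is a hypothetical, simplified key: kind, identifier, optional parent.
type Key struct {
	Kind, Name string
	Parent     *Key // nil for a root entity
}

// Root follows the parent links up to the root entity's key, which
// identifies the entity group that k belongs to.
func (k *Key) Root() *Key {
	for k.Parent != nil {
		k = k.Parent
	}
	return k
}

// maxEntityGroups is the per-transaction limit stated in the list above.
const maxEntityGroups = 25

// rootID identifies an entity group by its root's kind and identifier.
type rootID struct{ kind, name string }

// EntityGroups counts the distinct entity groups that keys belong to.
func EntityGroups(keys []*Key) int {
	seen := make(map[rootID]bool)
	for _, k := range keys {
		r := k.Root()
		seen[rootID{r.Kind, r.Name}] = true
	}
	return len(seen)
}

// FitsInOneTransaction reports whether keys stay within the limit.
func FitsInOneTransaction(keys []*Key) bool {
	return EntityGroups(keys) <= maxEntityGroups
}

func main() {
	salieri := &Key{Kind: "Employee", Name: "asalieri"}
	home := &Key{Kind: "Address", Name: "home", Parent: salieri}
	work := &Key{Kind: "Address", Name: "work", Parent: salieri}
	mozart := &Key{Kind: "Employee", Name: "wmozart"}

	keys := []*Key{salieri, home, work, mozart}
	fmt.Println(EntityGroups(keys), FitsInOneTransaction(keys)) // 2 true
}
```

Entities that share a root count once, so grouping highly related data under one root keeps a transaction well inside the limit, at the price of the per-group write throughput limit described above.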

In many applications, it is acceptable to use eventual consistency (i.e. a non-ancestor query spanning multiple entity groups, which may at times return slightly stale data) when obtaining a broad view of unrelated data, and then to use strong consistency (an ancestor query, or a get of a single entity) when viewing or editing a single set of highly related data. In such applications, it is usually a good approach to use a separate entity group for each set of highly related data. For more information, see Structuring for Strong Consistency.

Properties and value types

The data values associated with an entity consist of one or more properties. Each property has a name and one or more values. A property can have values of more than one type, and two entities can have values of different types for the same property. Properties can be indexed or unindexed (queries that order or filter on a property P will ignore entities where P is unindexed). An entity can have at most 20,000 indexed properties.

The following value types are supported:

Value type            | Go type(s)                     | Sort order                                               | Notes
----------------------|--------------------------------|----------------------------------------------------------|-----------------------------------
Integer               | int, int8, int16, int32, int64 | Numeric                                                  | 64-bit integer, signed
Floating-point number | float32, float64               | Numeric                                                  | 64-bit double precision, IEEE 754
Boolean               | bool                           | false<true                                               |
String (short)        | string                         | Unicode                                                  | Up to 1500 bytes
String (long)         | string (with noindex)          | None                                                     | Up to 1 megabyte; not indexed
Byte slice (short)    | datastore.ByteString           | Byte order                                               | Up to 1500 bytes
Byte slice (long)     | []byte                         | None                                                     | Up to 1 megabyte; not indexed
Date and time         | time.Time                      | Chronological                                            |
Geographical point    | appengine.GeoPoint             | By latitude, then longitude                              |
Cloud Datastore key   | *datastore.Key                 | By path elements (kind, identifier, kind, identifier...) |
Blobstore key         | appengine.BlobKey              | Byte order                                               |

You can also use a struct or slice to aggregate properties. See the Cloud Datastore reference for more details.

When a query involves a property with values of mixed types, Cloud Datastore uses a deterministic ordering based on the internal representations:

  1. Null values
  2. Fixed-point numbers
    • Integers
    • Dates and times
  3. Boolean values
  4. Byte sequences
    • Byte slices (short)
    • Unicode string
    • Blobstore keys
  5. Floating-point numbers
  6. Geographical points
  7. Cloud Datastore keys

Because long byte slices and long strings are not indexed, they have no ordering defined.
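
One consequence of this ordering is that every integer sorts before every floating-point number, regardless of numeric value. The sketch below illustrates the type ranking with standard-library Go values only; typeRank and sortByType are hypothetical helpers, geographical points and Cloud Datastore keys are omitted because they need the App Engine packages, and the ordering within a single type is not modelled.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// typeRank returns the position of v's value type in the mixed-type
// ordering listed above. Types not covered by this sketch rank last.
func typeRank(v interface{}) int {
	switch v.(type) {
	case nil:
		return 1 // null values
	case int, int8, int16, int32, int64, time.Time:
		return 2 // fixed-point numbers: integers, dates and times
	case bool:
		return 3
	case string:
		return 4 // byte sequences (short strings in this sketch)
	case float32, float64:
		return 5
	default:
		return 99
	}
}

// sortByType orders values by type rank only, keeping the original
// relative order of values that share a rank.
func sortByType(values []interface{}) {
	sort.SliceStable(values, func(i, j int) bool {
		return typeRank(values[i]) < typeRank(values[j])
	})
}

func main() {
	values := []interface{}{37.5, "hello", true, 38, nil}
	sortByType(values)
	fmt.Println(values) // [<nil> 38 true hello 37.5]
}
```

Here the integer 38 sorts before the float 37.5, which is why a property that mixes integers and floats does not sort in plain numeric order.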

Working with entities

Applications can use the Cloud Datastore API to create, retrieve, update, and delete entities. If the application knows the complete key for an entity (or can derive it from its parent key, kind, and identifier), it can use the key to operate directly on the entity. An application can also obtain an entity's key as a result of a Cloud Datastore query; see the Datastore Queries page for more information.

Creating an entity

In Go, you create a new entity by constructing an instance of a Go struct, populating its fields, and calling datastore.Put to save it to Datastore. Only exported fields (beginning with an upper case letter) will be saved to Cloud Datastore. You can specify the entity's key name by passing a non-empty stringID argument to datastore.NewKey:

employee := &Employee{
	FirstName: "Antonio",
	LastName:  "Salieri",
	HireDate:  time.Now(),
}
employee.AttendedHRTraining = true
key := datastore.NewKey(ctx, "Employee", "asalieri", 0, nil)
_, err = datastore.Put(ctx, key, employee)

If you provide an empty key name, or use datastore.NewIncompleteKey, Cloud Datastore will automatically generate a numeric ID for the entity's key:

employee := &Employee{
	FirstName: "Antonio",
	LastName:  "Salieri",
	HireDate:  time.Now(),
}
employee.AttendedHRTraining = true
key := datastore.NewIncompleteKey(ctx, "Employee", nil)
_, err = datastore.Put(ctx, key, employee)

Retrieving an entity

To retrieve an entity identified by a given key, pass the *datastore.Key as an argument to the datastore.Get function. You can generate the *datastore.Key using the datastore.NewKey function.

employeeKey := datastore.NewKey(ctx, "Employee", "asalieri", 0, nil)
addressKey := datastore.NewKey(ctx, "Address", "", 1, employeeKey)
var addr Address
err = datastore.Get(ctx, addressKey, &addr)

datastore.Get populates an instance of the appropriate Go struct.

Updating an entity

To update an existing entity, modify the fields of the struct, then call datastore.Put. The data overwrites the existing entity. The entire object is sent to Cloud Datastore with every call to datastore.Put.
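
Because a put replaces the whole entity, an update is normally a read-modify-write: get the entity, change the fields, and put the full struct back. The sketch below illustrates that behaviour with a hypothetical in-memory map, memStore, standing in for Cloud Datastore; it does not call the datastore package, and its Get and Put methods are simplified assumptions rather than the real API.

```go
package main

import "fmt"

// Employee is a trimmed-down version of the struct used earlier.
type Employee struct {
	FirstName          string
	LastName           string
	AttendedHRTraining bool
}

// memStore is a hypothetical in-memory stand-in for Cloud Datastore,
// keyed by key name. Put always replaces the whole stored value.
type memStore map[string]Employee

func (s memStore) Put(key string, e Employee) { s[key] = e }

func (s memStore) Get(key string) (Employee, bool) {
	e, ok := s[key]
	return e, ok
}

// markTrained updates one field with a read-modify-write, so the
// fields it does not touch keep their stored values.
func markTrained(s memStore, key string) bool {
	e, ok := s.Get(key)
	if !ok {
		return false
	}
	e.AttendedHRTraining = true
	s.Put(key, e)
	return true
}

func main() {
	store := memStore{}
	store.Put("asalieri", Employee{FirstName: "Antonio", LastName: "Salieri"})

	markTrained(store, "asalieri")
	e, _ := store.Get("asalieri")
	fmt.Println(e.LastName, e.AttendedHRTraining) // Salieri true

	// Putting a partially filled struct replaces the whole entity,
	// so the name fields are overwritten with empty strings.
	store.Put("asalieri", Employee{AttendedHRTraining: true})
	e, _ = store.Get("asalieri")
	fmt.Printf("%q\n", e.LastName) // ""
}
```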

Deleting an entity

Given an entity's key, you can delete the entity with the datastore.Delete function:

key := datastore.NewKey(ctx, "Employee", "asalieri", 0, nil)
err = datastore.Delete(ctx, key)

Batch operations

datastore.Put, datastore.Get and datastore.Delete have bulk variants called datastore.PutMulti, datastore.GetMulti and datastore.DeleteMulti. They permit acting on multiple entities in a single Cloud Datastore call:

// A batch put.
_, err = datastore.PutMulti(ctx, []*datastore.Key{k1, k2, k3}, []interface{}{e1, e2, e3})

// A batch get.
var entities = make([]*T, 3)
err = datastore.GetMulti(ctx, []*datastore.Key{k1, k2, k3}, entities)

// A batch delete.
err = datastore.DeleteMulti(ctx, []*datastore.Key{k1, k2, k3})

Batch operations do not change your costs. You will be charged for every key in a batched operation, whether or not each key exists. The size of the entities involved in an operation does not affect the cost.

Understanding write costs

When your application executes a Cloud Datastore put operation, Cloud Datastore must perform a number of writes to store the entity. Your application is charged for each of these writes. You can see how many writes will be required to store an entity by looking at the data viewer in the SDK Development Console. This section explains how these write costs are calculated.

Every entity requires a minimum of two writes to store: one for the entity itself and another for the built-in EntitiesByKind index, which is used by the query planner to service a variety of queries. In addition, Cloud Datastore maintains two other built-in indexes, EntitiesByProperty and EntitiesByPropertyDesc, which provide efficient scans of entities by single property values in ascending and descending order, respectively. Each of an entity's indexed property values must be written to each of these indexes.

As an example, consider an entity with properties A, B, and C:

Key: 'Foo:1' (kind = 'Foo', id = 1, no parent)
A: 1, 2
B: null
C: 'this', 'that', 'theOther'

Assuming there are no composite indexes (see below) for entities of this kind, this entity requires 14 writes to store:

  • 1 for the entity itself
  • 1 for the EntitiesByKind index
  • 4 for property A (2 for each of two values)
  • 2 for property B (a null value still needs to be written)
  • 6 for property C (2 for each of three values)
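
The arithmetic in this list can be written as a small function. This is a sketch of the counting rule described above, not an API; baseWrites is a hypothetical helper name, and a null value is counted as one value.

```go
package main

import "fmt"

// baseWrites returns the number of writes needed to store an entity when
// no composite indexes apply: one for the entity, one for EntitiesByKind,
// and two per indexed property value (EntitiesByProperty and
// EntitiesByPropertyDesc). valuesPerProperty holds the number of values
// of each indexed property; a null counts as one value.
func baseWrites(valuesPerProperty []int) int {
	writes := 2 // the entity itself + the EntitiesByKind index
	for _, n := range valuesPerProperty {
		writes += 2 * n
	}
	return writes
}

func main() {
	// A has 2 values, B has 1 (null), C has 3.
	fmt.Println(baseWrites([]int{2, 1, 3})) // 14
}
```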

Composite indexes (those referring to multiple properties) require additional writes to maintain. Suppose you define the following composite index:

Kind: 'Foo'
A ▲, B ▼

where the triangles indicate the sort order for the specified properties: ascending for property A and descending for property B. Storing the entity defined above now takes an additional write to the composite index for every combination of A and B values:

(1, null) (2, null)

This adds 2 writes for the composite index, for a total of 1 + 1 + 4 + 2 + 6 + 2 = 16. Now add property C to the index:

Kind: 'Foo'
A ▲, B ▼, C ▼

Storing the same entity now requires a write to the composite index for each possible combination of A, B, and C values:

(1, null, 'this') (1, null, 'that') (1, null, 'theOther')

(2, null, 'this') (2, null, 'that') (2, null, 'theOther')

This brings the total number of writes to 1 + 1 + 4 + 2 + 6 + 6 = 20.
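
The composite index entries are the Cartesian product of the indexed properties' values, so their number is the product of the value counts. The following sketch enumerates them for the example entity; combinations is a hypothetical helper, and values are represented as strings for simplicity.

```go
package main

import "fmt"

// combinations returns every tuple that takes one value from each
// property, in the order the properties are listed in the index.
func combinations(props [][]string) [][]string {
	result := [][]string{{}}
	for _, values := range props {
		var next [][]string
		for _, prefix := range result {
			for _, v := range values {
				// Copy the prefix so tuples do not share backing arrays.
				tuple := append(append([]string{}, prefix...), v)
				next = append(next, tuple)
			}
		}
		result = next
	}
	return result
}

func main() {
	a := []string{"1", "2"}
	b := []string{"null"}
	c := []string{"this", "that", "theOther"}

	fmt.Println(len(combinations([][]string{a, b})))    // 2
	fmt.Println(len(combinations([][]string{a, b, c}))) // 6
	for _, t := range combinations([][]string{a, b, c}) {
		fmt.Println(t)
	}
}
```

Adding a multi-valued property to an index multiplies the number of entries, which is the growth the following paragraph warns about.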

If a Cloud Datastore entity contains many multiple-valued properties, or if a single such property is referenced many times, the number of writes required to maintain the index can explode combinatorially. Such exploding indexes can be very expensive to maintain. For example, consider a composite index that includes ancestors:

Kind: 'Foo'
A ▲, B ▼, C ▼
Ancestor: True

Storing a simple entity with this index present takes the same number of writes as before. However, if the entity has ancestors, it requires a write for each possible combination of property values and ancestors, in addition to those for the entity itself. Thus an entity defined as

Key: 'GreatGrandpa:1/Grandpa:1/Dad:1/Foo:1' (kind = 'Foo', id = 1, parent = 'GreatGrandpa:1/Grandpa:1/Dad:1')
A: 1, 2
B: null
C: 'this', 'that', 'theOther'

would require a write to the composite index for each of the following combinations of properties and ancestors:

(1, null, 'this', 'GreatGrandpa') (1, null, 'this', 'Grandpa') (1, null, 'this', 'Dad') (1, null, 'this', 'Foo')

(1, null, 'that', 'GreatGrandpa') (1, null, 'that', 'Grandpa') (1, null, 'that', 'Dad') (1, null, 'that', 'Foo')

(1, null, 'theOther', 'GreatGrandpa') (1, null, 'theOther', 'Grandpa') (1, null, 'theOther', 'Dad') (1, null, 'theOther', 'Foo')

(2, null, 'this', 'GreatGrandpa') (2, null, 'this', 'Grandpa') (2, null, 'this', 'Dad') (2, null, 'this', 'Foo')

(2, null, 'that', 'GreatGrandpa') (2, null, 'that', 'Grandpa') (2, null, 'that', 'Dad') (2, null, 'that', 'Foo')

(2, null, 'theOther', 'GreatGrandpa') (2, null, 'theOther', 'Grandpa') (2, null, 'theOther', 'Dad') (2, null, 'theOther', 'Foo')

Storing this entity in Cloud Datastore now requires 1 + 1 + 4 + 2 + 6 + 24 = 38 writes.
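
With Ancestor: True, the count of composite index writes is the product of the property value counts multiplied by the length of the entity's key path (its ancestors plus the entity itself). A minimal sketch of that calculation follows; ancestorIndexWrites is a hypothetical helper name, and the base figure of 14 is taken from the earlier example.

```go
package main

import "fmt"

// ancestorIndexWrites returns the writes needed for one composite index
// with Ancestor: True. valueCounts holds the number of values of each
// indexed property; pathLength is the number of elements in the key path,
// counting the entity itself (1 for a root entity).
func ancestorIndexWrites(valueCounts []int, pathLength int) int {
	writes := pathLength
	for _, n := range valueCounts {
		writes *= n
	}
	return writes
}

func main() {
	const baseWrites = 14 // entity + EntitiesByKind + single-property indexes

	// Key path GreatGrandpa:1/Grandpa:1/Dad:1/Foo:1 has 4 elements.
	index := ancestorIndexWrites([]int{2, 1, 3}, 4)
	fmt.Println(index, baseWrites+index) // 24 38

	// A root entity (path length 1) costs the same as without ancestors.
	fmt.Println(ancestorIndexWrites([]int{2, 1, 3}, 1)) // 6
}
```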
