Google Cloud Platform
Cloud Datastore

Datastore Queries

A query retrieves entities from Datastore that meet a specified set of conditions.

The query operates on entities of a given kind; it can specify filters on the entities' property values, keys, and ancestors, and can return zero or more entities as results. A query can also specify sort orders to sequence the results by their property values. The results include all entities that have at least one value for every property named in the filters and sort orders, and whose property values meet all the specified filter criteria. The query can return entire entities, projected entities, or just entity keys.

A typical query includes the following:

  • An entity kind to which the query applies
  • Zero or more filters based on the entities' property values, keys, and ancestors
  • Zero or more sort orders to sequence the results

When executed, the query retrieves all entities of the given kind that satisfy all of the given filters, sorted in the specified order. Queries execute as read-only.

Note: To conserve memory and improve performance, a query should, whenever possible, specify a limit on the number of results returned.

Every Datastore query computes its results using one or more indexes, which contain entity keys in a sequence specified by the index's properties and, optionally, the entity's ancestors. The indexes are updated incrementally to reflect any changes the application makes to its entities, so that the correct results of all queries are available with no further computation needed.

The index-based query mechanism supports a wide range of queries and is suitable for most applications. However, it does not support some kinds of query common in other database technologies: in particular, joins and aggregate queries aren't supported within the Datastore query engine. See Restrictions on queries, below, for limitations on Datastore queries.

Contents

Datastore query interface

Here's a basic example of issuing a query against the Datastore. It retrieves all tasks that are not yet done with priorities greater than or equal to 4, sorted in descending order by priority:

Java

Node.js

Go

Python

Here's how to run a query:

Java

Node.js

Go

Python

Query structure

A query can specify an entity kind, zero or more filters, and zero or more sort orders.

Filters

A query's filters set constraints on the properties, keys, and ancestors of the entities to be retrieved.

Property filters

A property filter specifies

  • A property name
  • A comparison operator
  • A property value

This example returns Task entities that are marked not done:

Java

Node.js

Go

Python

The property value must be supplied by the application; it cannot refer to or be calculated in terms of other properties. An entity satisfies the filter if it has a property of the given name whose value compares to the value specified in the filter in the manner described by the comparison operator. If the property of the given name is array-valued, the entity satisfies the filter if any of the values compares to the value specified in the filter in the manner described by the comparison operator.

The comparison operator can be any of the following:

Operator Meaning
EQUAL Equal to
LESS_THAN Less than
LESS_THAN_OR_EQUAL Less than or equal to
GREATER_THAN Greater than
GREATER_THAN_OR_EQUAL Greater than or equal to

Composite filters

A composite filter consists of more than one property filter. This example returns Task entities that are marked not done and have a priority of 4:

Java

Node.js

Go

Python

The Datastore currently only natively supports combining filters with the AND operator.

Key filters

To filter on the value of an entity's key, use the special property __key__:

Java

Node.js

Go

Python

When comparing for inequality, keys are ordered by the following criteria, in order:

  1. Ancestor path
  2. Entity kind
  3. Identifier (key name or numeric ID)

Elements of the ancestor path are compared similarly: by kind (string), then by key name or numeric ID. Kinds and key names are strings and are ordered by byte value; numeric IDs are integers and are ordered numerically. If entities with the same parent and kind use a mix of key name strings and numeric IDs, those with numeric IDs precede those with key names.

Queries on keys use indexes just like queries on properties and require custom indexes in the same cases, with a couple of exceptions: inequality filters or an ascending sort order on the key do not require a custom index, but a descending sort order on the key does. As with all queries, the development server creates appropriate entries in the index configuration file when a query that needs a custom index is used in the development environment.

Sort orders

A query sort order specifies

  • A property name.
  • A sort direction (ascending or descending). By default the sort order is ascending.

This example sorts Task entities by creation time in ascending order:

Java

Node.js

Go

Python

This example sorts Task entities by creation time in descending order:

Java

Node.js

Go

Python

If a query includes multiple sort orders, they are applied in the sequence specified. The following example sorts first by descending priority and then by ascending creation time:

Java

Node.js

Go

Python

If no sort orders are specified, the results are returned in the order they are retrieved from the Datastore.

Note: Because of the way Datastore executes queries, if a query specifies inequality filters on a property and sort orders on other properties, the property used in the inequality filters must be ordered before the other properties.

Special query types

Some specific types of query deserve special mention:

Ancestor queries

An ancestor query limits its results to the specified entity and its descendants. This example returns all Task entities that have the specified TaskList entity as an ancestor:

Java

Node.js

Go

Python

Kindless queries

A query with no kind and no ancestor retrieves all of the entities of an application from the Datastore. Such kindless queries cannot include filters or sort orders on property values. They can, however, filter on entity keys and use ancestor filters. Key filters can be used by specifying __key__ as the property name:

Java

Node.js

Go

Python

Projection queries

Most Datastore queries return whole entities as their results, but often an application is actually interested in only a few of the entity's properties. Projection queries allow you to query the Datastore for just those specific properties of an entity that you actually need, at lower latency and cost than retrieving the entire entity.

Keys-only queries

A keys-only query (which is a type of projection query) returns just the keys of the result entities instead of the entities themselves, at lower latency and cost than retrieving entire entities.

It is often more economical to do a keys-only query first, and then fetch a subset of entities from the results, rather than executing a general query which may fetch more entities than you actually need.

Here's how to create a keys-only query:

Java

Node.js

Go

Python

Projections

Projection queries are similar to SQL queries of the form

SELECT priority, percent_complete FROM Task

You can use all of the filtering and sorting features available for standard entity queries, subject to the limitations described below. The query returns abridged results with only the specified properties (priority and percent_complete in the example) populated with values; all other properties are not populated.

Here's an example that demonstrates the construction of a projection query:

Java

Node.js

Go

Python

And here's how to run the projection query:

Java

Node.js

Go

Python

Grouping

Projection queries can use the distinct on clause to ensure that only the first result for each distinct combination of values for the specified properties will be returned. This will return only the first result for entities which have the same values for the properties that are being projected.

Java

Node.js

Go

Python

Array values

Consider the following when your query includes properties with array values.

Properties with array values can behave in surprising ways

Because of the way they're indexed, entities with multiple values for the same property can sometimes interact with query filters and sort orders in unexpected and surprising ways.

If a query has multiple inequality filters on a given property, an entity will match the query only if at least one of its individual values for the property satisfies all of the filters. For example, if an entity of kind Task has values fun and programming for property tag, it will not match the query:

Java

Node.js

Go

Python

Each of the entity's tag values satisfies one of the filters, but neither single value satisfies both. Note that this does not apply to equality filters. For example, the same entity will satisfy the query

Java

Node.js

Go

Python

even though neither of the entity's individual tag values satisfies both filter conditions.

Similarly, the sort order for multiple-valued properties is unusual. Because such properties appear once in the index for each unique value, the first value seen in the index determines an entity's sort order:

  • If the query results are sorted in ascending order, the smallest value of the property is used for ordering.
  • If the results are sorted in descending order, the greatest value is used for ordering.
  • Other values do not affect the sort order, nor does the number of values.

This has the unusual consequence that an entity with property values 1 and 9 precedes one with values 4, 5, 6, and 7 in both ascending and descending order.

Projections and array-valued properties

Projecting a property with array values will not populate all values for that property. Instead, a separate entity will be returned for each unique combination of projected values matching the query. For example, suppose you have an entity of kind Task with two multiple-valued properties, tags and collaborators:

Java

Node.js

Go

Python

Then the projection query

SELECT tags, collaborators FROM Task WHERE collaborators < 'charlie'

will return four entities with the following combinations of values:

tags = 'fun', collaborators = 'alice'
tags = 'fun', collaborators = 'bob'
tags = 'programming', collaborators = 'alice'
tags = 'programming', collaborators = 'bob'

Cursors, limits, and offsets

You can specify a limit for your query to control the maximum number of results returned in one batch. The following example retrieves at most five Task entities:

Java

Node.js

Go

Python

Query cursors allow an application to retrieve a query's results in convenient batches without incurring the overhead of a query offset. After performing a retrieval operation, the application can obtain a cursor, which is an opaque byte string marking the index position of the last result retrieved. The application can save this string (for instance in the Datastore, a cache, or embedded in a web page as a base-64 encoded HTTP GET or POST parameter), and can then use the cursor as the starting point for a subsequent retrieval operation to obtain the next batch of results from the point where the previous retrieval ended. A retrieval can also specify an end cursor, to limit the extent of the result set returned.

The following example demonstrates the use of cursors for pagination:

Java

Node.js

Go

Python

Although Datastore supports integer offsets, you should avoid using them. Instead, use cursors. Using an offset only avoids returning the skipped entities to your application, but these entities are still retrieved internally. The skipped entities do affect the latency of the query, and your application is billed for the read operations required to retrieve them. Using cursors instead of offsets lets you avoid all these costs.

Limitations of cursors

Cursors are subject to the following limitations:

  • A cursor can be used only by the same project that performed the original query, and only to continue the same query. To use the cursor in a subsequent retrieval operation, you must reconstitute the original query exactly, including the same entity kind, ancestor, property filters, and sort orders. It is not possible to retrieve results using a cursor without setting up the same query from which it was originally generated.
  • Cursors don't always work as expected with a query that uses an inequality filter or a sort order on a property with multiple values. The de-duplication logic for such multiple-valued properties does not persist between retrievals, possibly causing the same result to be returned more than once.
  • New Datastore releases may change internal implementation details, invalidating cursors that depend on them. If an application attempts to use a cursor that is no longer valid, the Datastore raises an exception.

Cursors and data updates

The cursor represents the location in the result list after the last result returned. A cursor is not a relative position in the list (it's not an offset); it's a marker to which the Datastore can jump when starting an index scan for results. If the results for a query change between uses of a cursor, the query notices only changes that occur in results after the cursor. If a new result appears before the cursor's position for the query, it will not be returned when the results after the cursor are fetched. Similarly, if an entity is no longer a result for a query but had appeared before the cursor, the results that appear after the cursor do not change. If the last result returned is removed from the result set, the cursor still knows how to locate the next result.

When retrieving query results, you can use both a start cursor and an end cursor to return a continuous group of results from the Datastore. When using a start and end cursor to retrieve the results, you are not guaranteed that the size of the results will be the same as when you generated the cursors. Entities may be added or deleted from the Datastore between the time the cursors are generated and when they are used in a query.

Restrictions on queries

The nature of the index query mechanism imposes certain restrictions on what a query can do. Datastore queries do not support substring matches, case-insensitive matches, or so-called full-text search. The NOT, OR, and != operators are not natively supported, but some client libraries may add support on top of the Datastore. Additionally:

Entities lacking a property named in the query are ignored

Entities of the same kind need not have the same properties. To be eligible as a query result, an entity must possess a value (possibly null) for every property named in the query's filters and sort orders. If not, the entity is omitted from the indexes used to execute the query and consequently will not be included in the query's results.

Filtering on unindexed properties returns no results

A query can't find property values that aren't indexed, nor can it sort on such properties. See the Unindexed properties section for a detailed discussion of unindexed properties.

Inequality filters are limited to at most one property

To avoid having to scan the entire index, the query mechanism relies on all of a query's potential results being adjacent to one another in the index. To satisfy this constraint, a single query may not use inequality comparisons (LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL) on more than one property across all of its filters. For example, the following query is valid, because both inequality filters apply to the same property:

Java

Node.js

Go

Python

However, this query is not valid, because it uses inequality filters on two different properties:

Java

Node.js

Go

Python

Note that a query can combine equality (EQUAL) filters for different properties, along with one or more inequality filters on a single property. Thus the following is a valid query:

Java

Node.js

Go

Python

Ordering of query results is undefined when no sort order is specified

When a query does not specify a sort order, the results are returned in the order they are retrieved. As the Datastore implementation evolves (or if a project's indexes change), this order may change. Therefore, if your application requires its query results in a particular order, be sure to specify that sort order explicitly in the query.

Sort orders are ignored on properties with equality filters

Queries that include an equality filter for a given property ignore any sort order specified for that property. This is a simple optimization to save needless processing for single-valued properties, since all results have the same value for the property and so no further sorting is needed. Multiple-valued properties, however, may have additional values besides the one matched by the equality filter. Because this use case is rare and applying the sort order would be expensive and require extra indexes, the Datastore query planner simply ignores the sort order even in the multiple-valued case. This may cause query results to be returned in a different order than the sort order appears to imply.

Properties used in inequality filters must be sorted first

To retrieve all results that match an inequality filter, a query scans the index for the first row matching the filter, then scans forward until it encounters a nonmatching row. For the consecutive rows to encompass the complete result set, they must be ordered by the property used in the inequality filter before any other properties. Thus if a query specifies one or more inequality filters along with one or more sort orders, the first sort order must refer to the same property named in the inequality filters. The following is a valid query:

Java

Node.js

Go

Python

This query is not valid, because it doesn't sort on the property used in the inequality filter:

Java

Node.js

Go

Python

Similarly, this query is not valid because the property used in the inequality filter is not the first one sorted:

Java

Node.js

Go

Python

Queries inside transactions must be ancestor queries

Datastore transactions operate only on entities belonging to the same entity group (descended from a common ancestor). To preserve this restriction, all queries performed within a transaction must be an ancestor query specifying an ancestor in the same entity group as the other operations in the transaction.

Limitations on projections

Projection queries are subject to the following limitations:

  • Only indexed properties can be projected.

    Projection is not supported for strings that are longer than 1500 bytes, byte arrays that have more than 1500 elements, and other properties explicitly marked as unindexed.

  • The same property cannot be projected more than once.

  • Properties referenced in an equality filter cannot be projected.

    For example,

    SELECT tag FROM Task WHERE priority = 1
    

    is valid (projected property not used in the equality filter), as is

    SELECT tag FROM Task WHERE tag > 'fun`
    

    (not an equality filter), but

    SELECT tag FROM Task WHERE tag = 'fun`
    

    (projected property used in equality filter) is not.

  • Results returned by a projection query should not be saved back to the Datastore.

    Because the query returns results that are only partially populated, you should not write them back to the Datastore.