Datastore Queries

Note: Developers building new applications are strongly encouraged to use the NDB Client Library, which has several benefits compared to this client library, such as automatic entity caching via the Memcache API. If you are currently using the older DB Client Library, read the DB to NDB Migration Guide.

A Datastore query retrieves entities from Cloud Datastore that meet a specified set of conditions.

A typical query includes the following:

  • An entity kind to which the query applies
  • Optional filters based on the entities' property values, keys, and ancestors
  • Optional sort orders to sequence the results

When executed, a query retrieves all entities of the given kind that satisfy all of the given filters, sorted in the specified order. Queries execute as read-only.

This page describes the structure and kinds of queries used within App Engine to retrieve data from Cloud Datastore.

Filters

A query's filters set constraints on the properties, keys, and ancestors of the entities to be retrieved.

Property filters

A property filter specifies:

  • A property name
  • A comparison operator
  • A property value

For example:

q = Person.all()
q.filter("height <=", max_height)

The property value must be supplied by the application; it cannot refer to or be calculated in terms of other properties. An entity satisfies the filter if it has a property of the given name whose value compares to the value specified in the filter in the manner described by the comparison operator.

The comparison operator can be any of the following:

Operator   Meaning
=          Equal to
<          Less than
<=         Less than or equal to
>          Greater than
>=         Greater than or equal to
!=         Not equal to
IN         Member of (equal to any of the values in a specified list)

The not-equal (!=) operator actually performs two queries: one in which all other filters are unchanged and the not-equal filter is replaced with a less-than (<) filter, and one where it is replaced with a greater-than (>) filter. The results are then merged, in order. A query can have no more than one not-equal filter, and a query that has one cannot have any other inequality filters.
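
The decomposition described above can be illustrated with a small in-memory sketch. This is not how Datastore is implemented: plain Python lists stand in for index scans, and `not_equal` and `people` are hypothetical names introduced only for this illustration.

```python
def not_equal(rows, prop, value):
    """Emulate `prop != value` as a `<` scan followed by a `>` scan."""
    ordered = sorted(rows, key=lambda r: r[prop])
    less = [r for r in ordered if r[prop] < value]
    greater = [r for r in ordered if r[prop] > value]
    # Both partial results are already sorted on prop, so concatenating
    # them gives the merged result in order.
    return less + greater


# Hypothetical sample data standing in for stored entities.
people = [
    {"name": "Ann", "height": 70},
    {"name": "Bob", "height": 64},
    {"name": "Cy", "height": 68},
    {"name": "Dee", "height": 64},
]

result = not_equal(people, "height", 68)
names = [r["name"] for r in result]
```

Because the result is the union of a less-than scan and a greater-than scan on the same property, it comes back ordered by that property.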

The IN operator also performs multiple queries: one for each item in the specified list, with all other filters unchanged and the IN filter replaced with an equality (=) filter. The results are merged in order of the items in the list. If a query has more than one IN filter, it is performed as multiple queries, one for each possible combination of values in the IN lists.

A single query containing not-equal (!=) or IN operators is limited to a maximum of 30 subqueries.
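
To see how quickly IN filters reach this limit, the following sketch counts the equality subqueries that a set of IN filters would expand into. It models only IN filters (not != filters), and `expand_in_filters` and `MAX_SUBQUERIES` are hypothetical names, not part of the Datastore API.

```python
import itertools

MAX_SUBQUERIES = 30  # the limit stated above


def expand_in_filters(in_filters):
    """Return one dict of equality filters per combination of IN values.

    in_filters maps a property name to the list of values in its IN filter.
    """
    props = sorted(in_filters)
    combos = itertools.product(*(in_filters[p] for p in props))
    subqueries = [dict(zip(props, combo)) for combo in combos]
    if len(subqueries) > MAX_SUBQUERIES:
        raise ValueError("query needs %d subqueries, limit is %d"
                         % (len(subqueries), MAX_SUBQUERIES))
    return subqueries


# Two IN filters with 3 and 2 values expand into 3 * 2 = 6 subqueries.
subqueries = expand_in_filters({
    "city": ["Paris", "Rome", "Oslo"],
    "birth_year": [1980, 1981],
})
```

Two IN filters with six values each would already need 36 subqueries and exceed the limit.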

Key filters

To filter on the value of an entity's key, use the special property __key__:

q = Person.all()
q.filter('__key__ >', last_seen_key)

When comparing for inequality, keys are ordered by the following criteria, in order:

  1. Ancestor path
  2. Entity kind
  3. Identifier (key name or numeric ID)

Elements of the ancestor path are compared similarly: by kind (string), then by key name or numeric ID. Kinds and key names are strings and are ordered by byte value; numeric IDs are integers and are ordered numerically. If entities with the same parent and kind use a mix of key name strings and numeric IDs, those with numeric IDs precede those with key names.
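
The ordering rules above can be approximated with a sort key over key paths. This is a sketch under stated assumptions: each key is modeled as a list of (kind, identifier) pairs from the root entity down to the entity itself, and `key_sort_key` is a hypothetical helper, not a function of the DB library.

```python
def key_sort_key(path):
    """Build a sortable tuple from a key path of (kind, identifier) pairs."""
    parts = []
    for kind, ident in path:
        if isinstance(ident, int):
            # Numeric IDs sort before key names and compare numerically.
            parts.append((kind.encode("utf-8"), 0, ident))
        else:
            # Key names compare by byte value.
            parts.append((kind.encode("utf-8"), 1, ident.encode("utf-8")))
    return tuple(parts)


# Hypothetical key paths, deliberately out of order.
keys = [
    [("Person", "Tom"), ("Photo", 7)],
    [("Person", "Tom")],
    [("Person", 42)],
    [("Person", "Tom"), ("Photo", "wedding")],
    [("Person", "Ann")],
    [("Person", "Tom"), ("Photo", 3)],
]

ordered = sorted(keys, key=key_sort_key)
```

In this model a parent key sorts immediately before its descendants, because its path is a prefix of theirs.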

Queries on keys use indexes just like queries on properties and require custom indexes in the same cases, with a couple of exceptions: inequality filters or an ascending sort order on the key do not require a custom index, but a descending sort order on the key does. As with all queries, the development web server creates appropriate entries in the index configuration file when a query that needs a custom index is tested.

Ancestor filters

You can filter your Datastore queries to a specified ancestor, so that the results returned will include only entities descended from that ancestor:

q = Person.all()
q.ancestor(ancestor_key)
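
Conceptually, an ancestor filter is a prefix test on the key path. The sketch below illustrates this with plain tuples; it ignores entity kind, treats the ancestor itself as a match, and uses hypothetical names (`has_ancestor`, `entities`) that are not part of the DB library.

```python
def has_ancestor(path, ancestor_path):
    """Return True if ancestor_path is a prefix of path."""
    return path[:len(ancestor_path)] == ancestor_path


# Key paths modeled as tuples of (kind, identifier) pairs.
tom = (("Person", "Tom"),)
entities = {
    "tom": tom,
    "wedding_photo": tom + (("Photo", 1),),
    "baby_photo": tom + (("Photo", 2),),
    "camping_photo": (("Photo", 3),),  # root entity, no parent
}

matches = sorted(name for name, path in entities.items()
                 if has_ancestor(path, tom))
```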

Special query types

Some specific types of query deserve special mention:

Kindless queries

A query with no kind and no ancestor filter retrieves all of the entities of an application from Datastore. This includes entities created and managed by other App Engine features, such as statistics entities and Blobstore metadata entities (if any). Such kindless queries cannot include filters or sort orders on property values. They can, however, filter on entity keys by specifying __key__ as the property name:

q = db.Query()
q.filter('__key__ >', last_seen_key)

In Python, every entity returned by the query must have a corresponding model class defined for the entity's kind. To define the model classes for the statistics entity kinds, you must import the stats package:

from google.appengine.ext.db import stats

If your application has a Blobstore value, you must add the following code to get the query API to recognize the __BlobInfo__ entity kind. (Importing the Blobstore API does not define this class.)

from google.appengine.ext import db

class BlobInfo(db.Expando):
  @classmethod
  def kind(cls):
    return '__BlobInfo__'

Ancestor queries

A query with an ancestor filter limits its results to the specified entity and its descendants:

tom = Person(key_name='Tom')

wedding_photo = Photo(parent=tom)
wedding_photo.image_url='http://domain.com/some/path/to/wedding_photo.jpg'
wedding_photo.put()

baby_photo = Photo(parent=tom)
baby_photo.image_url='http://domain.com/some/path/to/baby_photo.jpg'
baby_photo.put()

dance_photo = Photo(parent=tom)
dance_photo.image_url='http://domain.com/some/path/to/dance_photo.jpg'
dance_photo.put()

camping_photo = Photo()
camping_photo.image_url='http://domain.com/some/path/to/camping_photo.jpg'
camping_photo.put()


photo_query = Photo.all()
photo_query.ancestor(tom)


# This returns wedding_photo, baby_photo, and dance_photo,
# but not camping_photo, because tom is not its ancestor
for photo in photo_query.run(limit=5):
  # Do something with photo
  pass

Kindless ancestor queries

A kindless query that includes an ancestor filter will retrieve the specified ancestor and all of its descendants, regardless of kind. This type of query does not require custom indexes. Like all kindless queries, it cannot include filters or sort orders on property values, but can filter on the entity's key:

q = db.Query()
q.ancestor(ancestor_key)
q.filter('__key__ >', last_seen_key)

To perform a kindless ancestor query using GQL (either in the App Engine Administration Console or using the GqlQuery class), omit the FROM clause:

q = db.GqlQuery('SELECT * WHERE ANCESTOR IS :1 AND __key__ > :2',
                ancestor_key,
                last_seen_key)

The following example illustrates how to retrieve all entities descended from a given ancestor:

tom = Person(key_name='Tom')

wedding_photo = Photo(parent=tom)
wedding_photo.image_url='http://domain.com/some/path/to/wedding_photo.jpg'
wedding_photo.put()

wedding_video = Video(parent=tom)
wedding_video.video_url='http://domain.com/some/path/to/wedding_video.avi'
wedding_video.put()

# The following query returns both wedding_photo and wedding_video,
# even though they are of different entity kinds
media_query = db.query_descendants(tom)
for media in media_query.run(limit=5):
  # Do something with media
  pass

Keys-only queries

A keys-only query returns just the keys of the result entities instead of the entities themselves, at lower latency and cost than retrieving entire entities:

q = Person.all(keys_only=True)

It is often more economical to do a keys-only query first, and then fetch a subset of entities from the results, rather than executing a general query which may fetch more entities than you actually need.

Projection queries

Sometimes all you really need from the results of a query are the values of a few specific properties. In such cases, you can use a projection query to retrieve just the properties you're actually interested in, at lower latency and cost than retrieving the entire entity; see the Projection Queries page for details.

Sort orders

A query sort order specifies:

  • A property name
  • A sort direction (ascending or descending)

In Python, descending sort order is denoted by a hyphen (-) preceding the property name; omitting the hyphen specifies ascending order by default. For example:

# Order alphabetically by last name:
q = Person.all()
q.order('last_name')

# Order by height, tallest to shortest:
q = Person.all()
q.order('-height')

If a query includes multiple sort orders, they are applied in the sequence specified. The following example sorts first by ascending last name and then by descending height:

q = Person.all()
q.order('last_name')
q.order('-height')
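
The effect of multiple sort orders can be reproduced on an in-memory list, which may help when reasoning about expected result sequences. This is only an emulation in plain Python; `apply_orders` and the sample data are hypothetical.

```python
def apply_orders(rows, orders):
    """Sort rows by a list of order strings such as ['last_name', '-height']."""
    result = list(rows)
    # Apply the least significant order first. Python's sort is stable,
    # so orders that appear earlier in the list take precedence.
    for order in reversed(orders):
        descending = order.startswith("-")
        prop = order.lstrip("-")
        result.sort(key=lambda r: r[prop], reverse=descending)
    return result


people = [
    {"last_name": "Smith", "height": 66},
    {"last_name": "Jones", "height": 70},
    {"last_name": "Smith", "height": 72},
    {"last_name": "Jones", "height": 64},
]

ordered = apply_orders(people, ["last_name", "-height"])
pairs = [(p["last_name"], p["height"]) for p in ordered]
```

Entities are grouped by last name first, and height only decides the order within each group.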

If no sort orders are specified, the results are returned in the order they are retrieved from Datastore.

Note: Because of the way Datastore executes queries, if a query specifies inequality filters on a property and sort orders on other properties, the property used in the inequality filters must be ordered before the other properties.
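
An application could check this rule before building a query. The helper below is a hypothetical sketch, not part of the DB library, and it assumes a single inequality property and order strings in the hyphen-prefixed form shown above.

```python
def orders_are_valid(inequality_prop, orders):
    """Check that the inequality property, if any, is the first sort order."""
    if inequality_prop is None or not orders:
        # No inequality filter or no sort orders: nothing to check.
        return True
    first_prop = orders[0].lstrip("-")
    return first_prop == inequality_prop


# Inequality filter on height, sorted by height first: allowed.
ok = orders_are_valid("height", ["-height", "last_name"])

# Inequality filter on height, sorted by last_name first: not allowed.
bad = orders_are_valid("height", ["last_name", "-height"])
```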

Indexes

Every Datastore query computes its results using one or more indexes, which contain entity keys in a sequence specified by the index's properties and, optionally, the entity's ancestors. The indexes are updated incrementally to reflect any changes the application makes to its entities, so that the correct results of all queries are available with no further computation needed.
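
As a rough mental model only, a single-property index can be pictured as a sorted list of property values paired with entity keys, so that a filter such as `height <=` becomes a range scan. The toy class below is an assumption-laden illustration (insert-only, in memory, hypothetical names) and does not reflect Datastore's actual storage.

```python
import bisect


class PropertyIndex(object):
    """Toy single-property index: values kept sorted, with parallel keys."""

    def __init__(self):
        self._values = []
        self._keys = []

    def put(self, key, value):
        # Incremental update: insert the entry at its sorted position.
        pos = bisect.bisect_right(self._values, value)
        self._values.insert(pos, value)
        self._keys.insert(pos, key)

    def keys_less_or_equal(self, limit):
        # Range scan: every entry before this position has value <= limit.
        end = bisect.bisect_right(self._values, limit)
        return self._keys[:end]


index = PropertyIndex()
index.put("tom", 70)
index.put("ann", 64)
index.put("bob", 68)

short_keys = index.keys_less_or_equal(68)
```

The query reads a contiguous slice of the index and needs no sorting or filtering at query time, which is the property the paragraph above describes.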

App Engine predefines a simple index on each property of an entity. An App Engine application can define further custom indexes in an index configuration file named index.yaml. The development server automatically adds suggestions to this file as it encounters queries that cannot be executed with the existing indexes. You can tune indexes manually by editing the file before uploading the application.

Query interface example

The Python Datastore API provides two classes for preparing and executing queries:

  • Query uses method calls to prepare the query.
  • GqlQuery uses a SQL-like query language called GQL to prepare the query from a query string.

For example:

from google.appengine.ext import db


class Person(db.Model):
  first_name = db.StringProperty()
  last_name = db.StringProperty()
  city = db.StringProperty()
  birth_year = db.IntegerProperty()
  height = db.IntegerProperty()


# Upper bound for the height filter (hypothetical example value)
max_height = 72

# Query interface constructs a query using instance methods
q = Person.all()
q.filter("last_name =", "Smith")
q.filter("height <=", max_height)
q.order("-height")


# GqlQuery interface constructs a query using a GQL query string
q = db.GqlQuery("SELECT * FROM Person " +
                "WHERE last_name = :1 AND height <= :2 " +
                "ORDER BY height DESC",
                "Smith", max_height)


# Query is not executed until results are accessed
for p in q.run(limit=5):
  print("%s %s, %d inches tall" % (p.first_name, p.last_name, p.height))
