GQL Reference for Python NDB/DB

GQL is a SQL-like language for retrieving entities and keys. The syntax for GQL queries is similar to that of SQL. This page is a reference for using GQL with the Python NDB and DB client libraries.

GQL maps roughly to SQL: You can think of a GQL kind as a SQL table, a GQL entity as a SQL row, and a GQL property as a SQL column. However, a SQL row-column lookup is a single value, whereas in GQL a property value can be a list.

GQL Versions

You need different versions of GQL depending on where you run queries. There are two GQL references:

Syntax

The GQL syntax for Python NDB/DB can be summarized as follows:

SELECT [DISTINCT] [* | <property list> | __key__]
  [FROM <kind>]
  [WHERE <condition> [AND <condition> ...]]
  [ORDER BY <property> [ASC | DESC] [, <property> [ASC | DESC] ...]]
  [LIMIT [<offset>,]<count>]
  [OFFSET <offset>]

  <property list> := <property> [, <property> ...]
  <condition> := <property> {< | <= | > | >= | = | != } <value>
  <condition> := <property> IN <list>
  <condition> := ANCESTOR IS <entity or key>
  <list> := (<value> [, <value> ...]])

As with SQL, GQL keywords are case insensitive. Kind and property names are case sensitive.

GQL only supports SELECT statements.

A GQL query returns zero or more entire entities, projected entities, or keys of the requested kind. Every GQL query always begins with SELECT *, SELECT __key__ or SELECT <property list>, where property is a comma delimited list of one or more entity properties to be returned from the query. (A GQL query cannot perform a SQL-like "join" query.)

Tip: SELECT __key__ or SELECT <property list> queries are faster and use less CPU time than SELECT * queries.

The optional DISTINCT(experimental) clause specifies that only completely unique results will be returned in a result set. This will only return the first result for entities which have the same values for the properties that are being projected.

The optional FROM clause limits the result set to those entities of the given kind. A query without a FROM clause is called a kindless query and can only have a WHERE that specifies a __key__ property.

The optional WHERE clause limits the result set to those entities that meet one or more conditions. Each condition compares a property of the entity with a value using a comparison operator. If multiple conditions are given with the AND keyword, then an entity must meet all of the conditions to be returned by the query. GQL does not have an OR operator. However, it does have an IN operator, which provides a limited form of OR.

The IN operator compares value of a property to each item in a list. The IN operator is equivalent to many = queries, one for each value, that are ORed together. An entity whose value for the given property equals any of the values in the list can be returned for the query.

Note: The IN and != operators use multiple queries behind the scenes. For example, the IN operator executes a separate underlying datastore query for every item in the list. The entities returned are a result of the cross-product of all the underlying datastore queries and are de-duplicated. A maximum of 30 datastore queries are allowed for any single GQL query.

A condition can also test whether an entity has a given entity as an ancestor, using the ANCESTOR IS operator. The value is a model instance or key for the ancestor entity. For more information on ancestors, see Keys and Entity Groups.

The left-hand side of a comparison is always a property name. A typical property name consists of alphanumeric characters optionally mixed with underscores and dots. In other words, they match the regular expression [a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)*.

Caution: Property names containing other printable characters must be quoted with double-quotes. For example: "first-name". Spaces or non-printable characters in property names are not supported.

The right-hand side of a comparison can be one of the following (as appropriate for the property's data type):

  • a str literal, as a single-quoted string. Single-quote characters in the string must be escaped as ''. For example: 'Joe''s Diner'
  • an integer or floating point number literal. For example: 42.7
  • a Boolean literal, as TRUE or FALSE.
  • the NULL literal, which represents the null value (None in Python).
  • a datetime, date, or time literal, with either numeric values or a string representation, in the following forms:
    • DATETIME(year, month, day, hour, minute, second)
    • DATETIME('YYYY-MM-DD HH:MM:SS')
    • DATE(year, month, day)
    • DATE('YYYY-MM-DD')
    • TIME(hour, minute, second)
    • TIME('HH:MM:SS')
  • an entity key literal, with either a string-encoded key or a complete path of kinds and key names/IDs:

    • KEY('encoded key')
    • KEY('kind', 'name'/ID [, 'kind', 'name'/ID...])
  • a User object literal, with the user's email address:
    USER('email-address')
  • a GeoPt literal, with the latitude and longitude as floating point values:
    GEOPT(lat, long)
  • a bound parameter value. In the query string, positional parameters are referenced by number: title = :1. Keyword parameters are referenced by name: title = :mytitle

Note: conditions of the form property = NULL check to see whether a null value is explicitly stored in the datastore for that property. This is not the same as checking to see if the entity lacks any value for the property! Datastore queries which refer to a property never return entities which don't have some value for that property.

Bound parameters can be bound as positional arguments or keyword arguments passed to the GqlQuery constructor or a model class's gql() method. Property data types that do not have corresponding value literal syntax must be specified using parameter binding, including the list data type. Parameter bindings can be re-bound with new values during the lifetime of the GqlQuery instance (such as to efficiently reuse a query) using the bind() method.

The optional ORDER BY clause indicates that results should be returned sorted by the given properties, in either ascending (ASC) or descending (DESC) order. The ORDER BY clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right. If the direction is not specified, it defaults to ASC. If no ORDER BY clause is specified, the order of the results is undefined and may change over time.

An optional LIMIT clause causes the query to stop returning results after the first <count> entities. The LIMIT clause can also include an <offset>, to skip that many results to find the first result to return. An optional OFFSET clause can specify an <offset>, if no LIMIT clause is present.

Note: Like the offset parameter for the fetch() method, an OFFSET in a GQL query string does not reduce the number of entities fetched from the datastore. It only affects which results are returned by the fetch() method. A query with an offset has performance characteristics that correspond linearly with the offset size plus the limit size.

For information on executing GQL queries, binding parameters, and accessing results, see the GqlQuery class, and the Model.gql() class method.

Examples

from google.appengine.ext import db

class Person(db.Model):
  name = db.StringProperty()
  age = db.IntegerProperty()

# We use a unique username for the Entity's key.
amy = Person(key_name='amym', name='Amy', age=48)
amy.put()
Person(key_name='bettyd', name='Betty', age=42).put()
Person(key_name='charliec', name='Charlie', age=32).put()
Person(key_name='charliek', name='Charlie', age=29).put()
Person(key_name='eedna', name='Edna', age=20).put()
Person(key_name='fredm', name='Fred', age=16, parent=amy).put()
Person(key_name='georgemichael', name='George').put()

To find all of the entities of the Person kind whose ages are between 18 and 35 (i.e. both Charlies and Edna), use this query:

SELECT * FROM Person WHERE age >= 18 AND age <= 35

To find the three entities of the Person kind whose ages are the greatest (i.e. Amy, Betty and Charlie), use this query:

SELECT * FROM Person ORDER BY age DESC LIMIT 3

To find the entities of the Person kind whose names are one of "Betty" or "Charlie", use this query:

SELECT * FROM Person WHERE name IN ('Betty', 'Charlie')

To return only the name values for each Person, use this query:

SELECT name FROM Person

To return only the name values for each Person, ordered by age, use this query:

SELECT name FROM Person ORDER BY age

To find the keys of the entities of the Person kind that have an age of None (i.e. KEY('Person', 'georgemichael')), use this query:

SELECT __key__ FROM Person WHERE age = NULL

To find all the entities, regardless of kind, that are in Amy's entity group (i.e. Amy and Fred), use this query:

SELECT * WHERE __key__ HAS ANCESTOR KEY(Person, 'Amy')

To match by Key, we can use __key__ on the left hand side of a condition. For example, we can use this to get all Person entities that have a username that starts with "a".

SELECT * FROM Person WHERE __key__ >= KEY('Person', 'a') AND __key__ < KEY('Person', 'b')

Note: If you ever build a query with an equality on __key__, consider using get() instead to fetch the entity directly.