Metadata

Note: Developers building new applications are strongly encouraged to use the NDB Client Library, which has several benefits compared to this client library, such as automatic entity caching via the Memcache API. If you are currently using the older DB Client Library, read the DB to NDB Migration Guide.

Datastore provides programmatic access to some of its metadata to support metaprogramming, implement backend administrative functions, simplify consistent caching, and serve similar purposes; you can use it, for instance, to build a custom Datastore viewer for your application. The metadata available includes information about the entity groups, namespaces, entity kinds, and properties your application uses, as well as the property representations for each property.

The Datastore Dashboard in the Google Cloud console also provides some metadata about your application, but the data displayed there differs in some important respects from that returned by these functions.

  • Freshness. Reading metadata using the API gets current data, whereas data in the dashboard is updated only once daily.
  • Contents. Some metadata in the dashboard is not available via the APIs; the reverse is also true.
  • Speed. Metadata gets and queries are billed in the same way as Datastore gets and queries. Metadata queries that fetch information on namespaces, kinds, and properties are generally slow to execute. As a rule of thumb, expect a metadata query that returns N entities to take about the same time as N ordinary queries each returning a single entity. Furthermore, property representation queries (non-keys-only property queries) are slower than keys-only property queries. Metadata gets of entity group metadata are somewhat faster than getting a regular entity.

Helper functions

The following functions obtain metadata information:

  • get_entity_group_version() gets a version number for an entity group; this is useful for finding out if any entity in the group has changed since the last time you got the version number.
  • get_namespaces() returns a list containing the names of all of an application's namespaces or those in a specified range.
  • get_kinds() returns a list containing the names of all of an application's entity kinds or those in a specified range.
  • get_properties_of_kind() returns a list containing the names of all of an application's indexed properties (or those in a specified range) associated with a given entity kind. Unindexed properties are not included.
  • get_representations_of_kind() returns a dictionary containing the representations for all of an application's indexed properties or those in a specified range associated with a given entity kind. The dictionary maps the name of each property to a list of that property's representations. Unindexed properties are not included.

Entity group metadata

Cloud Datastore provides access to the "version" of an entity group, a strictly positive number that is guaranteed to increase on every change to the entity group.

The following example shows how to get an entity group's version:

from google.appengine.ext import db
from google.appengine.ext.db import metadata

class Simple(db.Model):
  x = db.IntegerProperty()

entity1 = Simple(x=11)
entity1.put()

# Print entity1's entity group version
print 'version', metadata.get_entity_group_version(entity1)

# Write to a different entity group
entity2 = Simple(x=22)
entity2.put()

# Will print the same version, as entity1's entity group has not changed
print 'version', metadata.get_entity_group_version(entity1)

# Change entity1's entity group by adding a new child entity
entity3 = Simple(x=33, parent=entity1.key())
entity3.put()

# Will print a higher version, as entity1's entity group has changed
print 'version', metadata.get_entity_group_version(entity1)

Legacy behavior

In the legacy entity group version behavior, the entity group version increases only on changes to the entity group. The legacy entity group metadata behavior could be used, for example, to keep a consistent cache of a complex ancestor query on an entity group.

This example caches query results (a count of matching results) and uses the legacy behavior of entity group versions to use the cached value if it's current:

from google.appengine.api import memcache
from google.appengine.ext import db
from google.appengine.ext.db import metadata

def count_entity_group(entity_group_key):
  """Count the entities in the specified entity group."""
  # Check if we have a cached version of the current entity group count
  cached = memcache.get(str(entity_group_key))
  if cached:
    (version, count) = cached
    # Is the cached value for the current version?
    if version == metadata.get_entity_group_version(entity_group_key):
      return count

  def tx():
    # Need to actually count entities. Using a transaction to get a consistent
    # count and entity group version.
    count = db.Query(keys_only=True).ancestor(entity_group_key).count(limit=5000)
    # Cache the count and the entity group version
    version = metadata.get_entity_group_version(entity_group_key)
    memcache.set(str(entity_group_key), (version, count))
    return count

  return db.run_in_transaction(tx)

get_entity_group_version() may return None for an entity group which has never been written to.
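The version check and the None case can be illustrated without Datastore. The following is a minimal sketch in plain Python, not the library's implementation: cached_count() is an illustrative name, and get_version and count_entities are hypothetical callables standing in for metadata.get_entity_group_version() and the ancestor query. It does not model the transaction used above, and it chooses not to cache anything while the version is None.

```python
def cached_count(group_id, cache, get_version, count_entities):
    """Return a count for group_id, reusing the cache while the version matches.

    cache is any dict-like object. get_version and count_entities are
    hypothetical stand-ins for metadata.get_entity_group_version() and the
    ancestor query; they are assumptions made for this sketch.
    """
    version = get_version(group_id)
    cached = cache.get(group_id)
    # Reuse the cached count only if a version exists and it is unchanged.
    if cached is not None and version is not None and cached[0] == version:
        return cached[1]

    count = count_entities(group_id)
    # A version of None means the group has never been written to, so there
    # is nothing reliable to compare against later; skip caching in that case.
    if version is not None:
        cache[group_id] = (version, count)
    return count
```

Skipping the cache for a None version is a conservative design choice for this sketch: it costs an extra count but avoids treating two None values as proof that nothing changed.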

Entity group versions are obtained by calling get() on a special pseudo-entity that contains a __version__ property. See the reference documentation on EntityGroup for details.

Metadata queries

If the helper functions described in the preceding section don't meet your needs, you can issue more elaborate or flexible metadata requests with an explicit metadata query. In Python, the model classes for such queries are defined in the google.appengine.ext.db.metadata package. These models provide special entity kinds that are reserved for metadata queries:

Model class Entity kind
Namespace __namespace__
Kind __kind__
Property __property__

These models and kinds will not conflict with others of the same names that may already exist in your application. By querying on these special kinds, you can retrieve entities containing the desired metadata.

The entities returned by metadata queries are generated dynamically, based on the current state of Datastore. While you can create local instances of the Namespace, Kind, or Property model classes, any attempt to store them in Datastore will fail with a BadRequestError exception.

You can issue metadata queries using a query object belonging to either of two classes:

  • A Query object returned by the class method Namespace.all(), Kind.all(), or Property.all() (inherited from the superclass method Model.all())
  • A GqlQuery object for GQL-style queries

The following example returns the names of all entity kinds in an application:

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Kind

for k in Kind.all():
  print "kind: '%s'" % k.kind_name

Namespace queries

If your application uses the Namespaces API, you can use a namespace query to find all namespaces used in the application's entities. This allows you to perform activities such as administrative functions across multiple namespaces.

Namespace queries return entities of the special kind __namespace__ whose key name is the name of a namespace. (An exception is the default namespace designated by the empty string "": since the empty string is not a valid key name, this namespace is keyed with the numeric ID 1 instead.) Queries of this type support filtering only for ranges over the special pseudoproperty __key__, whose value is the entity's key. The results can be sorted by ascending (but not descending) __key__ value. Because __namespace__ entities have no properties, both keys-only and non-keys-only queries return the same information.
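The special handling of the default namespace can be sketched in plain Python. This is an illustration of the mapping described above, not library code: EMPTY_NAMESPACE_ID, namespace_to_key_part(), and key_part_to_namespace() are names chosen for this sketch.

```python
# Numeric ID that stands in for the default namespace "", as described above.
EMPTY_NAMESPACE_ID = 1


def namespace_to_key_part(namespace):
    """Return the key name (a string) or numeric ID (an int) for a namespace."""
    # The empty string is not a valid key name, so it maps to the numeric ID.
    return namespace if namespace else EMPTY_NAMESPACE_ID


def key_part_to_namespace(id_or_name):
    """Inverse mapping: numeric ID 1 denotes the default namespace ""."""
    if id_or_name == EMPTY_NAMESPACE_ID:
        return ''
    return id_or_name
```

Because the comparison is against the integer 1, a namespace whose name is the string "1" is still treated as an ordinary key name.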

Namespace entities are instances of the model class google.appengine.ext.db.metadata.Namespace. The string property namespace_name, computed from the entity's key, returns the name of the corresponding namespace. (If the key has numeric ID 1, the property returns the empty string.) To facilitate querying, the Namespace model provides class methods for working with these keys, such as key_for_namespace(), which returns the __namespace__ key for a given namespace name.

As an example, here is the implementation of the helper function get_namespaces(), which returns a list containing the names of all of an application's namespaces (or those in the range between two specified names, start and end):

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Namespace

def get_namespaces(start=None, end=None):

  # Start with unrestricted namespace query
  q = Namespace.all()

  # Limit to specified range, if any
  if start is not None:
    q.filter('__key__ >=', Namespace.key_for_namespace(start))
  if end is not None:
    q.filter('__key__ <', Namespace.key_for_namespace(end))

  # Return list of query results
  return [ns.namespace_name for ns in q]

Kind queries

Kind queries return entities of kind __kind__ whose key name is the name of an entity kind. Queries of this type are implicitly restricted to the current namespace and support filtering only for ranges over the __key__ pseudoproperty. The results can be sorted by ascending (but not descending) __key__ value. Because __kind__ entities have no properties, both keys-only and non-keys-only queries return the same information.

Kind entities are instances of the model class google.appengine.ext.db.metadata.Kind. The string property kind_name, computed from the entity's key, returns the name of the corresponding entity kind. To facilitate querying, the Kind model provides class methods for working with these keys, such as key_for_kind(), which returns the __kind__ key for a given kind name.

As an example, here is the implementation of the helper function get_kinds(), which returns a list containing the names of all of an application's entity kinds (or those in the range between two specified names, start and end):

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Kind

def get_kinds(start=None, end=None):

  # Start with unrestricted kind query
  q = Kind.all()

  # Limit to specified range, if any
  if start is not None and start != '':
    q.filter('__key__ >=', Kind.key_for_kind(start))
  if end is not None:
    if end == '':
      return []        # Empty string is not a valid kind name, so can't filter
    q.filter('__key__ <', Kind.key_for_kind(end))

  # Return list of query results
  return [k.kind_name for k in q]

The following example prints all kinds whose names start with a lowercase letter:

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Kind

# Start with unrestricted kind query
q = Kind.all()

# Limit to lowercase initial letters
q.filter('__key__ >=', Kind.key_for_kind('a'))
end_char = chr(ord('z') + 1)                       # Character after 'z'
q.filter('__key__ <', Kind.key_for_kind(end_char))

# Print query results
for k in q:
  print k.kind_name
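The half-open range used by these filters (start inclusive, end exclusive) can be tried out on an ordinary sorted list. The following is a rough sketch in plain Python, assuming ASCII kind names so that Python's string ordering matches the ordering of key names; names_in_range() and the sample kind names are made up for illustration.

```python
import bisect


def names_in_range(sorted_names, start=None, end=None):
    """Return the names n with start <= n < end from an ascending list."""
    lo = 0 if start is None else bisect.bisect_left(sorted_names, start)
    hi = len(sorted_names) if end is None else bisect.bisect_left(sorted_names, end)
    return sorted_names[lo:hi]


# Hypothetical kind names; uppercase letters sort before lowercase in ASCII.
kinds = sorted(['Account', 'Employee', 'invoice', 'product', 'zone'])

# The character right after 'z' gives an exclusive upper bound that still
# admits every name beginning with 'z'.
end_char = chr(ord('z') + 1)
lowercase_kinds = names_in_range(kinds, 'a', end_char)
```

Using the character after 'z' as the exclusive bound, rather than 'z' itself, is what keeps names such as 'zone' inside the range.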

Property queries

Property queries return entities of kind __property__ denoting the properties associated with an entity kind (whether or not those properties are currently defined in the kind's model). The entity representing property P of kind K is built as follows:

  • The entity's key has kind __property__ and key name P.
  • The parent entity's key has kind __kind__ and key name K.
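One way to picture this structure is as a key path of alternating kinds and names. The sketch below models such paths as plain Python tuples; it is only an illustration of the parent/child layout described above, and the function names are hypothetical rather than part of the metadata module.

```python
def key_path_for_kind(kind):
    """Path of the __kind__ entity for kind K."""
    return ('__kind__', kind)


def key_path_for_property(kind, prop):
    """Path of the __property__ entity for property P, a child of kind K."""
    return key_path_for_kind(kind) + ('__property__', prop)


def path_to_kind(path):
    """Kind name encoded in either a kind path or a property path."""
    return path[1]


def path_to_property(path):
    """Property name in a property path; None for a kind-only path
    (a choice made for this sketch)."""
    return path[3] if len(path) == 4 else None
```

Since the property path starts with the kind path, every property of a kind shares its kind's path as a prefix.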

Property entities are instances of the model class google.appengine.ext.db.metadata.Property. The string properties kind_name and property_name, computed from the entity's key, return the names of the corresponding kind and property. The Property model provides four class methods to simplify building and examining __property__ keys: key_for_kind(), key_for_property(), key_to_kind(), and key_to_property().

The following example illustrates these methods:

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Property

class Employee(db.Model):
  name = db.StringProperty()
  ssn = db.IntegerProperty()

employee_key = Property.key_for_kind("Employee")
employee_name_key = Property.key_for_property("Employee", "name")

Property.key_to_kind(employee_key)           # Returns "Employee"
Property.key_to_property(employee_name_key)  # Returns "name"

The behavior of a property query depends on whether it is a keys-only or a non-keys-only (property representation) query, as detailed in the subsections below.

Property queries: keys-only

Keys-only property queries return a key for each indexed property of a specified entity kind. (Unindexed properties are not included.) The following example prints the names of all of an application's entity kinds and the properties associated with each:

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Property

# Create unrestricted keys-only property query
q = Property.all(keys_only=True)

# Print query results
for p in q:
  print "%s: %s" % (Property.key_to_kind(p), Property.key_to_property(p))

Queries of this type are implicitly restricted to the current namespace and support filtering only for ranges over the pseudoproperty __key__, where the keys denote either __kind__ or __property__ entities. The results can be sorted by ascending (but not descending) __key__ value. Filtering is applied to kind-property pairs, ordered first by kind and second by property. For instance, suppose your application has the following kinds and properties:

  • kind Account with properties
    • balance
    • company
  • kind Employee with properties
    • name
    • ssn
  • kind Invoice with properties
    • date
    • amount
  • kind Manager with properties
    • name
    • title
  • kind Product with properties
    • description
    • price

A query that returns the properties within a range of kind-property pairs would look like this:

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Property

# Start with unrestricted keys-only property query
q = Property.all(keys_only=True)

# Limit range
q.filter('__key__ >=', Property.key_for_property("Employee", "salary"))
q.filter('__key__ <=', Property.key_for_property("Manager", "salary"))

# Print query results
for p in q:
  print "%s: %s" % (Property.key_to_kind(p), Property.key_to_property(p))

The above query would return the following:

Employee: ssn
Invoice: amount
Invoice: date
Manager: name

Notice that the results do not include the name property of kind Employee and the title property of kind Manager, nor any properties of kinds Account and Product, because they fall outside the range specified for the query.
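The ordering rule can be checked without Datastore. The following is a minimal sketch in plain Python, assuming ASCII kind and property names so that Python's tuple comparison orders pairs the same way (first by kind, then by property); SCHEMA and property_pairs_in_range() are illustrative names, not part of the metadata API.

```python
# The kinds and properties listed above, as a plain dictionary.
SCHEMA = {
    'Account': ['balance', 'company'],
    'Employee': ['name', 'ssn'],
    'Invoice': ['date', 'amount'],
    'Manager': ['name', 'title'],
    'Product': ['description', 'price'],
}


def property_pairs_in_range(schema, low, high):
    """Return (kind, property) pairs with low <= pair <= high, ascending."""
    pairs = sorted(
        (kind, prop) for kind, props in schema.items() for prop in props
    )
    # Tuples compare element by element: kind first, then property name.
    return [pair for pair in pairs if low <= pair <= high]


result = property_pairs_in_range(
    SCHEMA, ('Employee', 'salary'), ('Manager', 'salary'))
# result == [('Employee', 'ssn'), ('Invoice', 'amount'),
#            ('Invoice', 'date'), ('Manager', 'name')]
```

Note that the bounds do not have to name existing properties: neither kind has a salary property here, yet the pairs still mark valid positions in the ordering.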

Property queries also support ancestor filtering on a __kind__ or __property__ key, to limit the query results to a single kind or property. You can use this, for instance, to get the properties associated with a given entity kind, as in the following implementation of the helper function get_properties_of_kind():

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Property

def get_properties_of_kind(kind, start=None, end=None):

  # Start with unrestricted keys-only property query
  q = Property.all(keys_only=True)

  # Limit to specified kind
  q.ancestor(Property.key_for_kind(kind))

  # Limit to specified range, if any
  if start is not None and start != '':
    q.filter('__key__ >=', Property.key_for_property(kind, start))
  if end is not None:
    if end == '':
      return []     # Empty string is not a valid property name, so can't filter
    q.filter('__key__ <', Property.key_for_property(kind, end))

  # Return list of query results
  return [Property.key_to_property(p) for p in q]

Property queries: non-keys-only (property representation)

Non-keys-only property queries, known as property representation queries, return additional information on the representations used by each kind-property pair. (Unindexed properties are not included.) The entity returned for property P of kind K has the same key as for a corresponding keys-only query, along with an additional property_representation property returning the property's representations. The value of this property is an instance of class StringListProperty containing one string for each representation of property P found in any entity of kind K.

Note that representations are not the same as property classes; multiple property classes can map to the same representation. (For example, StringProperty and PhoneNumberProperty both use the STRING representation.)

The following table maps from property classes to their representations:

Property class Representation
IntegerProperty INT64
FloatProperty DOUBLE
BooleanProperty BOOLEAN
StringProperty STRING
ByteStringProperty STRING
DateProperty INT64
TimeProperty INT64
DateTimeProperty INT64
GeoPtProperty POINT
PostalAddressProperty STRING
PhoneNumberProperty STRING
EmailProperty STRING
UserProperty USER
IMProperty STRING
LinkProperty STRING
CategoryProperty STRING
RatingProperty INT64
ReferenceProperty REFERENCE
SelfReferenceProperty REFERENCE
blobstore.BlobReferenceProperty STRING
ListProperty List element's representation
StringListProperty List element's representation
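To make the many-to-one nature of this mapping concrete, here is a small sketch in plain Python that encodes the scalar rows of the table as a dictionary. REPRESENTATION and representations_for() are names invented for this illustration; list properties are left out because their representation depends on the element type.

```python
# Scalar rows of the table above (list properties are omitted on purpose).
REPRESENTATION = {
    'IntegerProperty': 'INT64',
    'FloatProperty': 'DOUBLE',
    'BooleanProperty': 'BOOLEAN',
    'StringProperty': 'STRING',
    'ByteStringProperty': 'STRING',
    'DateProperty': 'INT64',
    'TimeProperty': 'INT64',
    'DateTimeProperty': 'INT64',
    'GeoPtProperty': 'POINT',
    'PostalAddressProperty': 'STRING',
    'PhoneNumberProperty': 'STRING',
    'EmailProperty': 'STRING',
    'UserProperty': 'USER',
    'IMProperty': 'STRING',
    'LinkProperty': 'STRING',
    'CategoryProperty': 'STRING',
    'RatingProperty': 'INT64',
    'ReferenceProperty': 'REFERENCE',
    'SelfReferenceProperty': 'REFERENCE',
    'blobstore.BlobReferenceProperty': 'STRING',
}


def representations_for(property_classes):
    """Return the sorted, de-duplicated representations for the given classes."""
    return sorted({REPRESENTATION[name] for name in property_classes})
```

For example, representations_for(['StringProperty', 'PhoneNumberProperty']) yields a single representation, which is why a representation alone cannot tell you which property class wrote a value.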

As an example, here is the implementation of the helper function get_representations_of_kind(), which returns a dictionary containing the representations for all of an application's indexed properties (or those in the range between two specified names, start and end) associated with a given entity kind. The dictionary maps the name of each property to a list of that property's representations:

from google.appengine.ext import db
from google.appengine.ext.db.metadata import Property

def get_representations_of_kind(kind, start=None, end=None):

  # Start with unrestricted non-keys-only property query
  q = Property.all()

  # Limit to specified kind
  q.ancestor(Property.key_for_kind(kind))

  # Limit to specified range, if any
  if start is not None and start != '':
    q.filter('__key__ >=', Property.key_for_property(kind, start))
  if end is not None:
    if end == '':
      return []     # Empty string is not a valid property name, so can't filter
    q.filter('__key__ <', Property.key_for_property(kind, end))

  # Initialize result dictionary
  result = {}

  # Add query results to dictionary
  for p in q:
    result[p.property_name] = p.property_representation

  # Return dictionary
  return result
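One possible use of such a dictionary is spotting properties whose values have been stored under more than one representation, which may point to inconsistent data. The sketch below works on a plain dictionary shaped like the return value described above; mixed_representation_properties() and the sample data are hypothetical.

```python
def mixed_representation_properties(representations):
    """Return the sorted names of properties with more than one representation."""
    return sorted(
        name for name, reps in representations.items() if len(set(reps)) > 1
    )


# Made-up data in the same shape as the dictionary built above.
sample = {
    'name': ['STRING'],
    'ssn': ['INT64', 'STRING'],
    'hired': ['INT64'],
}
mixed = mixed_representation_properties(sample)
```

With this sample, only ssn is reported, since it appears with both an integer and a string representation.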