Indexes

Every Firestore in Datastore mode query computes its results using one or more indexes which contain entity keys in a sequence specified by the index's properties and, optionally, the entity's ancestors. The indexes are updated to reflect any changes the application makes to its entities, so that the correct results of all queries are available with no further computation needed.

There are two types of indexes:

Built-in indexes
By default, a Datastore mode database automatically predefines an index for each property of each entity kind. These single property indexes are suitable for simple types of queries.
Composite indexes
Composite indexes index multiple property values per indexed entity. Composite indexes support complex queries and are defined in an index configuration file (index.yaml).

The types of indexes are discussed in more detail later in this topic.

Index definition and structure

An index is defined on a list of properties of a given entity kind, with a corresponding order (ascending or descending) for each property. For use with ancestor queries, the index may also optionally include an entity's ancestors.

An index table contains a column for every property named in the index's definition. Each row of the table represents an entity that is a potential result for queries based on the index. An entity is included in the index only if it has an indexed value set for every property used in the index; if the index definition refers to a property for which the entity has no value, that entity will not appear in the index and hence will never be returned as a result for any query based on the index.
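The inclusion rule above can be illustrated with a small in-memory model. This is only a sketch: the dictionary layout and the `build_index_rows` helper are hypothetical stand-ins for illustration, not part of any Datastore client library.

```python
def build_index_rows(entities, properties):
    """Return sorted (value..., key) rows for entities that have every indexed property."""
    rows = []
    for entity in entities:
        props = entity["properties"]
        # An entity missing any property named in the index gets no row at all.
        if all(name in props for name in properties):
            values = tuple(props[name] for name in properties)
            rows.append(values + (entity["key"],))
    return sorted(rows)


tasks = [
    {"key": "task1", "properties": {"category": "Personal", "priority": 4}},
    {"key": "task2", "properties": {"category": "Work"}},  # no priority value
    {"key": "task3", "properties": {"category": "Personal", "priority": 1}},
]

rows = build_index_rows(tasks, ["category", "priority"])
for row in rows:
    print(row)
```

In this model, task2 has no priority value, so it has no row in the (category, priority) index and no query based on that index can return it.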

The rows of an index table are sorted first by ancestor and then by property values, in the order specified in the index definition. The perfect index for a query, which allows the query to be executed most efficiently, is defined on the following properties, in order:

  1. Properties used in equality filters
  2. Property used in an inequality filter (of which there can be no more than one)
  3. Properties used in sort orders
  4. Properties used in projections (that are not already included in sort orders)

This ensures that all results for every possible execution of the query appear in consecutive rows of the table. A Datastore mode database executes a query using a perfect index in the following steps:

  1. Identifies the index corresponding to the query's kind, filter properties, filter operators, and sort orders
  2. Scans from the beginning of the index to the first entity that meets all of the query's filter conditions
  3. Continues scanning the index, returning each entity in turn, until it
    • encounters an entity that does not meet the filter conditions, or
    • reaches the end of the index, or
    • has collected the maximum number of results requested by the query

For example, consider the following query:

SELECT * FROM Task
WHERE category = 'Personal'
  AND priority < 3
ORDER BY priority DESC

The perfect index for this query is a table of keys for entities of kind Task, with columns for the values of the category and priority properties. The index is sorted first in ascending order by category and then in descending order by priority:

indexes:
- kind: Task
  properties:
  - name: category
    direction: asc
  - name: priority
    direction: desc
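The scan described in the steps above can be sketched against an in-memory copy of this index. The `IndexRow` type, the sample rows, and the `run_query` helper are assumptions made for illustration; a real database does not expose its index tables this way.

```python
from typing import NamedTuple


class IndexRow(NamedTuple):
    category: str
    priority: int
    key: str


def build_index(rows):
    # Sort by category ascending, then priority descending, as in the index definition.
    return sorted(rows, key=lambda r: (r.category, -r.priority))


def run_query(index, matches, limit=None):
    """Scan to the first matching row, then return keys until a row stops matching."""
    results = []
    for row in index:
        if matches(row):
            results.append(row.key)
            if limit is not None and len(results) >= limit:
                break  # collected the maximum number of results requested
        elif results:
            break  # first non-matching row after the run of consecutive matches
    return results


index = build_index([
    IndexRow("Personal", 1, "p1"),
    IndexRow("Work", 2, "w2"),
    IndexRow("Personal", 4, "p4"),
    IndexRow("Hobby", 1, "h1"),
    IndexRow("Personal", 2, "p2"),
])

personal = run_query(index, lambda r: r.category == "Personal" and r.priority < 3)
print(personal)
```

Because the matching rows are consecutive in the sorted table, the scan can stop at the first row that fails the filters instead of reading the rest of the index.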

Two queries of the same form but with different filter values use the same index. For example, the following query uses the same index as the one above:

SELECT * FROM Task
WHERE category = 'Work'
  AND priority < 5
ORDER BY priority DESC

For this index

indexes:
- kind: Task
  properties:
  - name: category
    direction: asc
  - name: priority
    direction: asc
  - name: created
    direction: asc

the following two queries also use the same index, despite their different forms:

SELECT * FROM Task
WHERE category = 'Personal'
  AND priority = 5
ORDER BY created ASC

and

SELECT * FROM Task
WHERE category = 'Work'
ORDER BY priority ASC, created ASC

The index created above can satisfy both of these queries.
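One way to see why both queries share an index is to derive the perfect index property order for each one from the rules listed earlier. The following is a minimal sketch; `perfect_index_properties` is a hypothetical helper, and it ignores sort directions and ancestors for brevity.

```python
def perfect_index_properties(equality=(), inequality=None, sort_orders=(), projection=()):
    """Return property names in perfect-index order, dropping repeated names."""
    ordered = list(equality)
    if inequality is not None:
        ordered.append(inequality)
    ordered.extend(sort_orders)
    ordered.extend(projection)
    # dict.fromkeys keeps the first occurrence of each name and preserves order.
    return list(dict.fromkeys(ordered))


# WHERE category = ... AND priority = ... ORDER BY created ASC
first = perfect_index_properties(equality=["category", "priority"], sort_orders=["created"])

# WHERE category = ... ORDER BY priority ASC, created ASC
second = perfect_index_properties(equality=["category"], sort_orders=["priority", "created"])

print(first)
print(second)
```

Both calls yield category, priority, created, which matches the property list of the index shown above.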

Index configuration

Firestore in Datastore mode provides built-in, or automatic, indexes for queries of the following forms:

  • Kindless queries using only ancestor and key filters
  • Queries using only ancestor and equality filters
  • Queries using only inequality filters (which are limited to a single property)
  • Queries using only ancestor filters, equality filters on properties, and inequality filters on keys
  • Queries with no filters and only one sort order on a property, either ascending or descending

As an example, by default, Datastore mode databases automatically predefine two single property indexes for each property of each entity kind, one in ascending order and one in descending order. If you do not want your database to maintain an index for a property, exclude the property from your indexes. Note that excluding a property removes it from any composite indexes.

Built-in indexes are sufficient to perform many simple queries, such as equality-only queries and simple inequality queries.

Built-in indexes do not appear in the Indexes page of the Google Cloud console.

For more complex queries, an application must define composite, or manual, indexes. Composite indexes are required for queries of the following forms:

  • Queries with ancestor and inequality filters
  • Queries with one or more inequality filters on a property and one or more equality filters on other properties
  • Queries with a sort order on keys in descending order
  • Queries with multiple sort orders
  • Queries with one or more filters and one or more sort orders

Composite indexes are defined in the application's index configuration file (index.yaml). (Built-in indexes are not contained in the index configuration file.)

Composite indexes are composed of multiple properties, and none of those properties can be excluded from your indexes.

Composite indexes are viewable in the Indexes page of the Google Cloud console. You cannot use the Google Cloud console to create or update composite indexes.

If the application tries to perform a query that cannot be executed with the available indexes (either built-in or specified in the index configuration file), the query will fail.

The Datastore mode API automatically suggests indexes that are appropriate for most applications. Depending on your application's use of your Datastore mode database and the size and shape of your data, manual adjustments to your indexes may be warranted. For example, writing entities with multiple property values may result in an exploding index with high storage costs and increased write latency.

The Datastore emulator can help make it easier to manage your index configuration file. Instead of failing to execute a query that requires an index and does not have one, the Datastore emulator can generate an index configuration that would allow the query to succeed. If your local testing of an application exercises every possible query the application will issue, using every combination of filter and sort order, the generated entries will represent a complete set of indexes. If your testing does not exercise every possible query form, you can review and adjust the index configuration file before updating indexes.

You can learn more about index.yaml at Index Configuration.

Deploying or deleting indexes

When you are done modifying your index configuration file, run the gcloud datastore indexes create command to place the indexes into service. Learn more at updating your indexes.

If you previously deployed indexes that are no longer needed, you can delete the unused indexes.

Storage costs and write latency

Indexes contribute to your storage costs. Index entry size describes how built-in and composite indexes contribute to your database's storage size. You can use Firestore in Datastore mode statistics to see more information about index entries and index storage size.

Indexes also contribute to write latency. When updating a property value, the database also updates every related index. The more indexes the database needs to update, the longer the operation takes.

You can reduce storage costs and improve write performance by deleting unused indexes and excluding properties from indexing. This also prevents operations from failing due to index limits.

Indexes and properties

Here are a few special considerations to keep in mind about indexes and how they relate to the properties of your entities:

Properties with mixed value types

When two entities have properties of the same name but different value types, an index of the property sorts the entities first by value type and then by a secondary ordering appropriate to each type. For example, if two entities each have a property named age, one with an integer value and one with a string value, the entity with the integer value always precedes the one with the string value when sorted by the age property, regardless of the property values themselves.

This is especially worth noting in the case of integers and floating-point numbers, which are treated as separate types by Datastore mode. Because all integers are sorted before all floats, a property with the integer value 38 is sorted before one with the floating-point value 37.5.
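The type-first ordering can be mimicked with a two-part sort key. This is a sketch only: the `TYPE_RANK` table is a hypothetical stand-in that covers three types, and while the text above states that integers sort before both strings and floats, the relative order of strings and floats used here is an assumption.

```python
# Hypothetical type ranks. Integers first is stated in the text above;
# placing strings before floats is an assumption made for this sketch.
TYPE_RANK = {int: 0, str: 1, float: 2}


def index_sort_key(value):
    # Compare by value type first, then by the value within that type.
    return (TYPE_RANK[type(value)], value)


ages = [37.5, "41", 38, 12, "7"]
print(sorted(ages, key=index_sort_key))
# prints [12, 38, '41', '7', 37.5]
```

Note that the integer 38 lands before the float 37.5, and the string '41' before '7', because values are only compared with other values of the same type.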

Excluded properties

If you know you will never have to filter or sort on a particular property, you can tell your Datastore mode database not to maintain index entries for that property by excluding it from indexes. This lowers the cost of running your application by reducing the storage size needed for index entries. This can also improve write latency. An entity with an excluded property behaves as if the property were not set: queries with a filter or sort order on the excluded property will never match that entity.

The description property in the following example is excluded from indexes:

C#

To learn how to install and use the client library for Cloud Datastore, see Cloud Datastore client libraries. For more information, see the Cloud Datastore C# API reference documentation.

To authenticate to Cloud Datastore, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

Entity task = new Entity()
{
    Key = _db.CreateKeyFactory("Task").CreateKey("sampleTask"),
    ["category"] = "Personal",
    ["created"] = new DateTime(1999, 01, 01, 0, 0, 0, DateTimeKind.Utc),
    ["done"] = false,
    ["priority"] = 4,
    ["percent_complete"] = 10.0,
    ["description"] = new Value()
    {
        StringValue = "Learn Cloud Datastore",
        ExcludeFromIndexes = true
    },
};

Go

To learn how to install and use the client library for Cloud Datastore, see Cloud Datastore client libraries. For more information, see the Cloud Datastore Go API reference documentation.

type Task struct {
	Category        string
	Done            bool
	Priority        int
	Description     string `datastore:",noindex"`
	PercentComplete float64
	Created         time.Time
}
task := &Task{
	Category:        "Personal",
	Done:            false,
	Priority:        4,
	Description:     "Learn Cloud Datastore",
	PercentComplete: 10.0,
	Created:         time.Now(),
}

Java

To learn how to install and use the client library for Cloud Datastore, see Cloud Datastore client libraries. For more information, see the Cloud Datastore Java API reference documentation.

Entity task =
    Entity.newBuilder(taskKey)
        .set("category", "Personal")
        .set("created", Timestamp.now())
        .set("done", false)
        .set("priority", 4)
        .set("percent_complete", 10.0)
        .set(
            "description",
            StringValue.newBuilder("Learn Cloud Datastore").setExcludeFromIndexes(true).build())
        .build();

Node.js

To learn how to install and use the client library for Cloud Datastore, see Cloud Datastore client libraries. For more information, see the Cloud Datastore Node.js API reference documentation.

const task = [
  {
    name: 'category',
    value: 'Personal',
  },
  {
    name: 'created',
    value: new Date(),
  },
  {
    name: 'done',
    value: false,
  },
  {
    name: 'priority',
    value: 4,
  },
  {
    name: 'percent_complete',
    value: 10.0,
  },
  {
    name: 'description',
    value: 'Learn Cloud Datastore',
    excludeFromIndexes: true,
  },
];

PHP

To learn how to install and use the client library for Cloud Datastore, see Cloud Datastore client libraries. For more information, see the Cloud Datastore PHP API reference documentation.

$task = $datastore->entity(
    $key,
    [
        'category' => 'Personal',
        'created' => new DateTime(),
        'done' => false,
        'priority' => 4,
        'percent_complete' => 10.0,
        'description' => 'Learn Cloud Datastore'
    ],
    ['excludeFromIndexes' => ['description']]
);

Python

To learn how to install and use the client library for Cloud Datastore, see Cloud Datastore client libraries. For more information, see the Cloud Datastore Python API reference documentation.

import datetime

from google.cloud import datastore

# For help authenticating your client, visit
# https://cloud.google.com/docs/authentication/getting-started
client = datastore.Client()

key = client.key("Task")
task = datastore.Entity(key, exclude_from_indexes=("description",))
task.update(
    {
        "category": "Personal",
        "description": "Learn Cloud Datastore",
        "created": datetime.datetime.now(tz=datetime.timezone.utc),
        "done": False,
        "priority": 4,
        "percent_complete": 10.5,
    }
)
client.put(task)

Ruby

To learn how to install and use the client library for Cloud Datastore, see Cloud Datastore client libraries. For more information, see the Cloud Datastore Ruby API reference documentation.

task = datastore.entity "Task" do |t|
  t["category"] = "Personal"
  t["created"] = Time.now
  t["done"] = false
  t["priority"] = 4
  t["percent_complete"] = 10.0
  t["description"] = "Learn Cloud Datastore"
  t.exclude_from_indexes! "description", true
end

GQL

Not Applicable

The query in the following example will not return any results if the description property was excluded:

C#

Query query = new Query("Task")
{
    Filter = Filter.Equal("description", "Learn Cloud Datastore")
};

Go

query := datastore.NewQuery("Task").
	FilterField("Description", "=", "A task description")

Java

Query<Entity> query =
    Query.newEntityQueryBuilder()
        .setKind("Task")
        .setFilter(PropertyFilter.eq("description", "A task description"))
        .build();

Node.js

const query = datastore
  .createQuery('Task')
  .filter(new PropertyFilter('description', '=', 'A task description.'));

PHP

$query = $datastore->query()
    ->kind('Task')
    ->filter('description', '=', 'A task description.');

Python

from google.cloud import datastore
from google.cloud.datastore.query import PropertyFilter

# For help authenticating your client, visit
# https://cloud.google.com/docs/authentication/getting-started
client = datastore.Client()

query = client.query(kind="Task")
query.add_filter(filter=PropertyFilter("description", "=", "Learn Cloud Datastore"))

Ruby

query = datastore.query("Task")
                 .where("description", "=", "A task description.")

GQL


# Will not return any results!
SELECT * FROM Task WHERE description = 'A task description.'

You can later change the property back to indexed.

Note, however, that changing a property from excluded to indexed does not affect any existing entities that may have been created before the change. Queries filtering on the property will not return such existing entities, because the entities weren't written to the query's index when they were created. To make the entities accessible by future queries, you must rewrite them to your database so that they will be entered in the appropriate indexes. That is, you must do the following for each such existing entity:

  1. Look up (get) the entity.
  2. Write (put) the entity back to your database.
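The two steps above can be sketched as a batch rewrite. This is a minimal sketch, assuming a `google.cloud.datastore.Client` is passed in as `client`; the `reindex_property` helper and the batch size of 500 are assumptions for illustration. It also assumes that entities fetched with the Python client library keep the excluded property names in `exclude_from_indexes`, so the sketch clears the property from that set before writing the entity back.

```python
def reindex_property(client, kind, prop, batch_size=500):
    """Rewrite every entity of `kind` so that `prop` is indexed again.

    `client` is expected to behave like google.cloud.datastore.Client.
    Returns the number of entities written back.
    """
    batch = []
    rewritten = 0
    # Step 1: look up the entities.
    for entity in client.query(kind=kind).fetch():
        # Assumption: fetched entities remember their excluded properties,
        # so remove the property from that set before the rewrite.
        entity.exclude_from_indexes.discard(prop)
        batch.append(entity)
        if len(batch) == batch_size:
            # Step 2: write the entities back.
            client.put_multi(batch)
            rewritten += len(batch)
            batch = []
    if batch:
        client.put_multi(batch)
        rewritten += len(batch)
    return rewritten


# Example call (requires credentials and a real or emulated database):
#   from google.cloud import datastore
#   reindex_property(datastore.Client(), "Task", "description")
```

Writing in batches keeps the number of round trips down; for a large kind, you might instead page through the query with cursors rather than holding one long-running iteration.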

Similarly, changing a property from indexed to excluded only affects entities subsequently written to your database. The index entries for any existing entities with that property will continue to exist until the entities are updated or deleted. To avoid unwanted results, you must purge your code of all queries that filter or sort by the (now excluded) property.

Index limits

Firestore in Datastore mode imposes limits on the number and overall size of index entries that can be associated with a single entity. These limits are large, and most applications are not affected. However, there are circumstances in which you might encounter the limits.

As described above, a Datastore mode database creates an entry in a predefined index for every property of every entity except those you have explicitly declared as excluded from your indexes. The property may also be included in additional, custom indexes declared in your index configuration file (index.yaml). Provided that an entity has no list properties, it will have at most one entry in each such custom index (for non-ancestor indexes) or one for each of the entity's ancestors (for ancestor indexes). Each of these index entries must be updated every time the value of the property changes.

For a property that has a single value for each entity, each possible value needs to be stored just once per entity in the property's predefined index. Even so, it is possible for an entity with a large number of such single-valued properties to exceed the index entry or size limit. Similarly, an entity that can have multiple values for the same property requires a separate index entry for each value; again, if the number of possible values is large, such an entity can exceed the entry limit.

The situation becomes worse in the case of entities with multiple properties, each of which can take on multiple values. To accommodate such an entity, the index must include an entry for every possible combination of property values. Custom indexes that refer to multiple properties, each with multiple values, can "explode" combinatorially, requiring large numbers of entries for an entity with only a relatively small number of possible property values. Such exploding indexes can dramatically increase the storage size of an entity, because of the large number of index entries that must be stored. Exploding indexes also can easily cause the entity to exceed the index entry count or size limit.

Consider the following code:

C#

Entity task = new Entity()
{
    Key = _db.CreateKeyFactory("Task").CreateKey("sampleTask"),
    ["tags"] = new ArrayValue() { Values = { "fun", "programming", "learn" } },
    ["collaborators"] = new ArrayValue() { Values = { "alice", "bob", "charlie" } },
    ["created"] = DateTime.UtcNow
};

Go

task := &Task{
	Tags:          []string{"fun", "programming", "learn"},
	Collaborators: []string{"alice", "bob", "charlie"},
	Created:       time.Now(),
}

Java

Entity task =
    Entity.newBuilder(taskKey)
        .set("tags", "fun", "programming", "learn")
        .set("collaborators", "alice", "bob", "charlie")
        .set("created", Timestamp.now())
        .build();

Node.js

const task = {
  method: 'insert',
  key: datastore.key('Task'),
  data: {
    tags: ['fun', 'programming', 'learn'],
    collaborators: ['alice', 'bob', 'charlie'],
    created: new Date(),
  },
};

PHP

$task = $datastore->entity(
    $datastore->key('Task'),
    [
        'tags' => ['fun', 'programming', 'learn'],
        'collaborators' => ['alice', 'bob', 'charlie'],
        'created' => new DateTime(),
    ]
);

Python

import datetime

from google.cloud import datastore

# For help authenticating your client, visit
# https://cloud.google.com/docs/authentication/getting-started
client = datastore.Client()

task = datastore.Entity(client.key("Task"))
task.update(
    {
        "tags": ["fun", "programming", "learn"],
        "collaborators": ["alice", "bob", "charlie"],
        "created": datetime.datetime.now(tz=datetime.timezone.utc),
    }
)

Ruby

task = datastore.entity "Task" do |t|
  t["tags"] = ["fun", "programming", "learn"]
  t["collaborators"] = ["alice", "bob", "charlie"]
  t["created"] = Time.now
end

GQL

Not Applicable

Each of these examples creates a Task entity with three values for the tags property, three values for the collaborators property, and created set to the current date. Indexing all three properties together requires 9 index entries, one for each possible combination of property values:

('fun', 'alice', NOW())
('fun', 'bob', NOW())
('fun', 'charlie', NOW())

('programming', 'alice', NOW())
('programming', 'bob', NOW())
('programming', 'charlie', NOW())

('learn', 'alice', NOW())
('learn', 'bob', NOW())
('learn', 'charlie', NOW())

When the same property is repeated multiple times, Firestore in Datastore mode can detect exploding indexes and suggest an alternative index. However, in all other circumstances (such as the query defined in this example), a Datastore mode database will generate an exploding index. In this case, you can circumvent the exploding index by manually configuring an index in your index configuration file:

indexes:
- kind: Task
  properties:
  - name: tags
  - name: created
- kind: Task
  properties:
  - name: collaborators
  - name: created

This reduces the number of entries needed to only (|tags| * |created| + |collaborators| * |created|), or 6 entries instead of 9:

('fun', NOW())
('programming', NOW())
('learn', NOW())

('alice', NOW())
('bob', NOW())
('charlie', NOW())
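The entry counts above can be reproduced with a short calculation. This is a sketch under simple assumptions: the entity is modeled as a plain dictionary, `composite_entries` is a hypothetical helper, and the string "NOW" stands in for the creation timestamp.

```python
from itertools import product


def composite_entries(entity, properties):
    """List the index entries one entity contributes to an index on `properties`."""
    value_lists = []
    for name in properties:
        value = entity[name]
        # A list property contributes one candidate per element;
        # a single-valued property contributes exactly one.
        value_lists.append(value if isinstance(value, list) else [value])
    return list(product(*value_lists))


task = {
    "tags": ["fun", "programming", "learn"],
    "collaborators": ["alice", "bob", "charlie"],
    "created": "NOW",
}

exploding = composite_entries(task, ["tags", "collaborators", "created"])
split = composite_entries(task, ["tags", "created"]) + composite_entries(
    task, ["collaborators", "created"]
)

print(len(exploding))  # 3 * 3 * 1 = 9
print(len(split))      # 3 * 1 + 3 * 1 = 6
```

The gap widens quickly as lists grow: with 10 tags and 10 collaborators, the single index would need 100 entries for the entity, while the two split indexes would need 20.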

Any commit operation that would cause an index to exceed the index entry or size limit will fail. The text of the error describes which limit was exceeded ("Too many indexed properties" or "Index entries too large") and which custom index was the cause. If you create a new index that would exceed the limits for any entity when built, queries against the index will fail and the index will appear in the Error state in the Google Cloud console. To handle an index in the Error state:

  1. Remove the index from your index configuration file (index.yaml).
  2. Using the Google Cloud CLI, remove the index from the database using the datastore indexes cleanup command, as described in Deleting unused indexes.
  3. Either
    • reformulate the index definition and corresponding queries, or
    • remove the entities that are causing the index to explode.
  4. Add the index back to index.yaml.
  5. Using the Google Cloud CLI, add the index to the database by running the datastore indexes create command, as described in Updating Indexes.

You can avoid exploding indexes by avoiding queries that would require a custom index using a list property. As described above, this includes queries with multiple sort orders or queries with a mix of equality and inequality filters.

Indexes for projections

Projection queries require all properties specified in the projection to be included in an index. The Datastore emulator automatically generates the needed indexes for you in the index configuration file, index.yaml, which is uploaded with your application.

One way to minimize the number of indexes required is to project the same properties consistently, even when not all of them are always needed. For example, these queries require two separate indexes:

SELECT priority, percent_complete FROM Task

SELECT priority, percent_complete, created FROM Task

However, if you always project properties priority, percent_complete, created, even when created is not required, only one index will be needed.

Converting an existing query into a projection query may require building a new index if the properties in the projection are not already included in another part of the query. For example, suppose you had an existing query like

SELECT * FROM Task
WHERE priority > 1
ORDER BY priority, percent_complete

which requires the index:

indexes:
- kind: Task
  properties:
  - name: priority
  - name: percent_complete

Converting this to either of the projection queries

SELECT created FROM Task
WHERE priority > 1
ORDER BY priority, percent_complete

SELECT priority, percent_complete, created FROM Task
WHERE priority > 1
ORDER BY priority, percent_complete

introduces a new property (created) and thus will require building a new index:

indexes:
- kind: Task
  properties:
  - name: priority
  - name: percent_complete
  - name: created

However,

SELECT priority, percent_complete FROM Task
WHERE priority > 1
ORDER BY priority, percent_complete

would not change the required index, since the projected properties priority and percent_complete were already included in the existing query.

Multiple databases

You can use gcloud firestore to manage a single index for Datastore mode or use gcloud datastore with an index.yaml file to manage all the indexes under a database.

gcloud firestore
gcloud firestore indexes composite create --api-scope=datastore-mode-api --query-scope=QUERY_SCOPE --database=DATABASE_ID
gcloud datastore
gcloud alpha datastore indexes create index.yaml --database=DATABASE_ID

Replace the following:

  • DATABASE_ID: a database ID.
  • QUERY_SCOPE: either collection-recursive for ancestor indexes or collection-group for non-ancestor indexes.