Firestore in Datastore mode uses indexes for every query your application makes.
These indexes are updated whenever an entity changes, so the results can be
returned quickly when the application makes a query. Datastore mode provides built-in indexes automatically, but needs to
know in advance which composite indexes the application will require. You specify which
composite indexes your application needs in a configuration file. The Datastore emulator can
generate the Datastore mode index configuration automatically as you test your
application. The gcloud command-line tool provides commands to update the
indexes that are available to your production Datastore mode database.
System requirements
To use the gcloud tool, you must have installed the Google Cloud SDK.
About index.yaml
Every Datastore mode query made by an application needs a corresponding index.
Indexes for simple queries, such as queries over a single property, are created
automatically. Indexes for complex queries must be defined in a configuration
file named index.yaml. This file is uploaded with the application to create
indexes in a Datastore mode database.
The Datastore emulator automatically adds items to this file when the
application tries to execute a query that needs an index that does not have an
appropriate entry in the configuration file. You can adjust indexes or create
new ones manually by editing the file. The index.yaml file is located in the
<project-directory>/WEB-INF/ folder. By default, the data directory that
contains WEB-INF/appengine-generated/index.yaml is
~/.config/gcloud/emulators/datastore/. See Datastore emulator project directories
for additional details.
The following is an example of an index.yaml file:
indexes:
- kind: Task
  ancestor: no
  properties:
  - name: done
  - name: priority
    direction: desc
- kind: Task
  properties:
  - name: collaborators
    direction: asc
  - name: created
    direction: desc
- kind: TaskList
  ancestor: yes
  properties:
  - name: percent_complete
    direction: asc
  - name: type
    direction: asc
The syntax of index.yaml is the YAML format. For more information about this
syntax, see the YAML website.
Index definitions
index.yaml has a single list element called indexes. Each element in the
list represents an index for the application.
An index element can have the following elements:
kind
- The kind of the entity for the query. This element is required.
properties
- A list of properties to include as columns of the index, in the order to be sorted: properties used in equality filters first, followed by the property used in inequality filters, then the sort orders and their directions.
  Each element in this list has the following elements:
  name
  - The Datastore mode name of the property.
  direction
  - The direction to sort, either asc for ascending or desc for descending. This is only required for properties used in sort orders of the query, and must match the direction used by the query. The default is asc.
ancestor
- yes if the query has an ancestor clause. The default is no.
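The ordering rule for the properties list can be sketched in code. The following Python is a minimal illustration and not part of any Datastore library: build_index and to_yaml are hypothetical helpers that place equality-filter properties first, then the inequality property, then any remaining sort orders, and render the result in the index.yaml layout.

```python
def build_index(kind, equality=(), inequality=None, orders=(), ancestor=False):
    """Build one index definition as a plain dict (hypothetical helper).

    `orders` is a sequence of (property_name, direction) pairs.
    """
    directions = dict(orders)
    for direction in directions.values():
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc', got {direction!r}")

    properties, seen = [], set()

    def add(name):
        # Each property appears once, in the first position that claims it.
        if name in seen:
            return
        seen.add(name)
        entry = {"name": name}
        if name in directions:
            entry["direction"] = directions[name]
        properties.append(entry)

    for name in equality:          # 1. equality filters
        add(name)
    if inequality is not None:     # 2. inequality filter
        add(inequality)
    for name, _ in orders:         # 3. sort orders
        add(name)

    index = {"kind": kind, "properties": properties}
    if ancestor:
        index["ancestor"] = "yes"
    return index


def to_yaml(indexes):
    """Render index dicts as index.yaml text without a YAML library."""
    lines = ["indexes:"]
    for index in indexes:
        lines.append(f"- kind: {index['kind']}")
        if "ancestor" in index:
            lines.append(f"  ancestor: {index['ancestor']}")
        lines.append("  properties:")
        for prop in index["properties"]:
            lines.append(f"  - name: {prop['name']}")
            if "direction" in prop:
                lines.append(f"    direction: {prop['direction']}")
    return "\n".join(lines) + "\n"


# A query filtering on done == <value> and sorting by priority descending.
index = build_index("Task", equality=["done"], orders=[("priority", "desc")])
print(to_yaml([index]), end="")
```

The helper only orders and formats the definition; whether a given query actually needs a composite index is still decided by Datastore mode or the emulator.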
Automatic and manual indexes
When the Datastore emulator adds a generated index definition to
index.yaml, it does so below the following line, inserting it if necessary:
# AUTOGENERATED
The emulator considers all index definitions below this line to be automatic, and it may update existing definitions below this line as the application makes queries.
All index definitions above this line are considered to be under manual control,
and are not updated by the emulator. The emulator will only
make changes below the line, and will only do so if the complete index.yaml
file does not describe an index that accounts for a query executed by the
application. To take control of an automatic index definition, move it above
this line.
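To make the manual/automatic split concrete, here is a small Python sketch that separates the contents of an index.yaml file at the marker line. The function name split_at_marker and the sample file contents are made up for illustration; the emulator's own handling of the file may differ in detail.

```python
MARKER = "# AUTOGENERATED"


def split_at_marker(text):
    """Return (manual_lines, automatic_lines) for index.yaml contents.

    Lines above the marker are under manual control; lines below it are
    the ones the emulator may rewrite. Without a marker, every line is
    treated as manual.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == MARKER:
            return lines[:i], lines[i + 1:]
    return lines, []


# Hypothetical file: one manual definition, one generated definition.
sample = """indexes:
- kind: Task
  properties:
  - name: done
  - name: priority
    direction: desc
# AUTOGENERATED
- kind: TaskList
  ancestor: yes
  properties:
  - name: percent_complete
  - name: type
"""

manual, automatic = split_at_marker(sample)
print(len(manual), "manual lines;", len(automatic), "automatic lines")
```

Moving a definition from the second list to the first corresponds to taking manual control of it.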
Updating indexes
The datastore indexes create command looks at your local Datastore index
configuration (the index.yaml file), and if the index configuration defines an
index that doesn't exist yet in your production Datastore mode database, your database
creates the new index. See the development workflow using the gcloud tool
for an example of how to use indexes create.
To create an index, the database must set up the index and then backfill the index with existing data. Index creation time is the sum of setup time and backfill time:
- Setting up an index takes a few minutes. The minimum creation time for an index is a few minutes, even for an empty database.
- Backfill time depends on how much existing data belongs in the new index. The more property values that belong in the index, the longer it takes to backfill the index.
If the application performs a query that requires an index that hasn't finished building yet, the query raises an exception. To prevent this, you must be careful about deploying a new version of your application that requires an index before the new index finishes building.
You can check the status of the indexes from the Indexes page in the Cloud Console.
Deleting unused indexes
When you change or remove an index from the index configuration, the original index is not deleted from your Datastore mode database automatically. This gives you the opportunity to leave an older version of the application running while new indexes are being built, or to revert to the older version immediately if a problem is discovered with a newer version.
When you are sure that old indexes are no longer needed, you can delete them
by using the datastore indexes cleanup command. This command
deletes all indexes for the production Datastore mode instance that are not mentioned
in the local version of index.yaml. See
the development workflow using the gcloud tool for an example of how to
use indexes cleanup.
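Conceptually, indexes create and indexes cleanup each act on one side of a set difference between the local configuration and the production database. The Python below is a sketch of that idea only; it does not call gcloud or any Datastore API, and the names index_key and plan, along with the sample data, are hypothetical.

```python
def index_key(index):
    """Normalize an index definition so that defaults compare equal.

    Uses the documented defaults: direction 'asc' and ancestor 'no'.
    """
    properties = tuple(
        (prop["name"], prop.get("direction", "asc")) for prop in index["properties"]
    )
    return (index["kind"], index.get("ancestor", "no"), properties)


def plan(local, production):
    """Return which indexes 'create' would build and 'cleanup' would delete."""
    local_keys = {index_key(i) for i in local}
    production_keys = {index_key(i) for i in production}
    return {
        "create": sorted(local_keys - production_keys),
        "cleanup": sorted(production_keys - local_keys),
    }


# Hypothetical local index.yaml contents, already parsed into dicts.
local = [
    {"kind": "Task", "properties": [{"name": "done"},
                                    {"name": "priority", "direction": "desc"}]},
    {"kind": "Task", "properties": [{"name": "collaborators", "direction": "asc"},
                                    {"name": "created", "direction": "desc"}]},
]
# Hypothetical indexes currently present in production.
production = [
    {"kind": "Task", "ancestor": "no",
     "properties": [{"name": "done", "direction": "asc"},
                    {"name": "priority", "direction": "desc"}]},
    {"kind": "TaskList", "ancestor": "yes",
     "properties": [{"name": "percent_complete"}, {"name": "type"}]},
]

result = plan(local, production)
print("would create:", result["create"])
print("would delete:", result["cleanup"])
```

Reviewing the second list before running cleanup is a cheap way to confirm that no running version of the application still depends on an index about to be deleted.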
Command-line arguments
For details on command-line arguments for creating and cleaning indexes, see
datastore indexes create and datastore indexes cleanup,
respectively. For details on command-line arguments for the gcloud tool, see
the gcloud tool reference.
Managing long-running operations
Index builds are long-running operations and can take a substantial amount of time to complete.
After you start an index build, Datastore mode assigns
the operation a unique name. Operation names are prefixed with
projects/[PROJECT_ID]/databases/(default)/operations/, for example:
projects/project-id/databases/(default)/operations/ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg
However, you can leave out the prefix when specifying an operation name for
the describe command.
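As an illustration of the naming scheme, the following Python sketch converts between the full and the short form of an operation name. Both functions are hypothetical helpers written for this example; they only manipulate strings and assume the prefix format described above.

```python
def short_operation_name(name):
    """Return the part of an operation name after the 'operations/' prefix.

    A name that has no prefix is returned unchanged.
    """
    _, separator, short = name.rpartition("/operations/")
    return short if separator else name


def full_operation_name(project_id, short_name):
    """Rebuild the prefixed form for the default database."""
    return f"projects/{project_id}/databases/(default)/operations/{short_name}"


full = ("projects/project-id/databases/(default)/operations/"
        "ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg")
short = short_operation_name(full)
print(short)
print(full_operation_name("project-id", short) == full)
```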
Listing all long-running operations
To list long-running operations, use the gcloud datastore operations list command. This command lists ongoing and recently completed operations. Operations are listed for a few days after completion:
gcloud
gcloud datastore operations list
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
HTTP method and URL:
GET https://datastore.googleapis.com/v1/projects/project-id/operations
See information about the response below.
For example, a recently completed index build shows the following information:
{
  "operations": [
    {
      "name": "projects/project-id/operations/S01vcFVpSmdBQ0lDDCoDIGRiNTdiZDQNmE4YS0yMTVmNWUzZSQadGx1YWZlZAcSMXRzYWVzdS1yZXhlZG5pLW5pbWRhFQpWEg",
      "done": true,
      "metadata": {
        "@type": "type.googleapis.com/google.datastore.admin.v1.IndexOperationMetadata",
        "common": {
          "endTime": "2020-06-23T16:55:29.923562Z",
          "operationType": "CREATE_INDEX",
          "startTime": "2020-06-23T16:55:10Z",
          "state": "SUCCESSFUL"
        },
        "indexId": "CICAJiUpoMK",
        "progressEntities": {
          "workCompleted": "2193027",
          "workEstimated": "2198182"
        }
      },
      "response": {
        "@type": "type.googleapis.com/google.datastore.admin.v1.Index",
        "ancestor": "NONE",
        "indexId": "CICAJiUpoMK",
        "kind": "Task",
        "projectId": "project-id",
        "properties": [
          { "direction": "ASCENDING", "name": "priority" },
          { "direction": "ASCENDING", "name": "done" },
          { "direction": "DESCENDING", "name": "created" }
        ],
        "state": "READY"
      }
    }
  ]
}
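The response field of a completed index build describes the same index that index.yaml defines, only with different spellings for direction and ancestor. The Python sketch below maps one onto the other. The helper name index_from_operation is hypothetical, the sample is an abbreviated copy of the response above, and the ALL_ANCESTORS value is an assumption about the API's ancestor mode for ancestor indexes, since the sample only shows NONE.

```python
import json

DIRECTIONS = {"ASCENDING": "asc", "DESCENDING": "desc"}
# "NONE" appears in the sample response; "ALL_ANCESTORS" is assumed here.
ANCESTOR = {"NONE": "no", "ALL_ANCESTORS": "yes"}


def index_from_operation(operation):
    """Return an index.yaml-style dict for a finished index build, else None."""
    response = operation.get("response")
    if not operation.get("done", False) or response is None:
        return None
    return {
        "kind": response["kind"],
        "ancestor": ANCESTOR[response["ancestor"]],
        "properties": [
            {"name": prop["name"], "direction": DIRECTIONS[prop["direction"]]}
            for prop in response["properties"]
        ],
    }


# Abbreviated copy of the list response shown above.
listing = json.loads("""
{
  "operations": [
    {
      "done": true,
      "response": {
        "ancestor": "NONE",
        "kind": "Task",
        "properties": [
          {"direction": "ASCENDING", "name": "priority"},
          {"direction": "ASCENDING", "name": "done"},
          {"direction": "DESCENDING", "name": "created"}
        ],
        "state": "READY"
      }
    }
  ]
}
""")

definitions = [index_from_operation(op) for op in listing["operations"]]
print(definitions)
```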
Describing a single operation
Instead of listing all long-running operations, you can list the details of a single operation:
gcloud
Use the operations describe command to show the status
of an index build.
gcloud datastore operations describe operation-name
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
HTTP method and URL:
GET https://datastore.googleapis.com/v1/projects/project-id/operations
See information about the response below.
Estimating the completion time
As your operation runs, see the value of the state field
for the overall status of the operation.
A request for the status of a long-running operation also returns the metrics
workEstimated and workCompleted. These metrics are returned for the number
of entities. workEstimated shows the estimated total number of entities an
operation will process, based on database statistics. workCompleted
shows the number of entities processed so far. After the operation completes,
workCompleted reflects the total number of entities that were
actually processed, which might be different than the value of workEstimated.
Divide workCompleted by workEstimated for a rough progress estimate. The
estimate might be inaccurate because it depends on delayed statistics
collection.
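The division can be sketched in a few lines of Python. The function name progress_fraction is hypothetical; the field names come from the operation metadata, where both counters arrive as strings. The sample numbers are the ones used in this page's in-progress example.

```python
def progress_fraction(metadata):
    """Return workCompleted / workEstimated, or None if it cannot be computed."""
    progress = metadata.get("progressEntities", {})
    try:
        completed = int(progress["workCompleted"])
        estimated = int(progress["workEstimated"])
    except (KeyError, ValueError):
        return None
    if estimated <= 0:
        # No usable estimate yet, so no meaningful ratio.
        return None
    return completed / estimated


metadata = {
    "progressEntities": {"workCompleted": "219327", "workEstimated": "2198182"}
}
fraction = progress_fraction(metadata)
print(f"roughly {fraction:.1%} of the estimated entities processed")
```

Because workEstimated is itself an estimate, the ratio can end slightly below or above 1 when the operation finishes, so treat it as a rough indicator rather than a completion test.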
For example, here is the progress status of an index build:
{
  "operations": [
    {
      "name": "projects/project-id/operations/AyAyMDBiM2U5NTgwZDAtZGIyYi0zYjc0LTIzYWEtZjg1ZGdWFmZWQHEjF0c2Flc3UtcmV4ZWRuaS1uaW1kYRUKSBI",
      "metadata": {
        "@type": "type.googleapis.com/google.datastore.admin.v1.IndexOperationMetadata",
        "common": {
          "operationType": "CREATE_INDEX",
          "startTime": "2020-06-23T16:52:25.697539Z",
          "state": "PROCESSING"
        },
        "progressEntities": {
          "workCompleted": "219327",
          "workEstimated": "2198182"
        }
      }
    },
    ...
When an operation is done, the operation description will contain "done": true.
See the value of the state field for the result of the operation. If the done
field is not set in the response, then its value is false. Do not depend on the
existence of the done value for in-progress operations.
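In client code, this means reading done with a default instead of indexing into the response. The Python sketch below shows one way to do that; operation_status is a hypothetical helper and the two dictionaries are abbreviated versions of the sample responses on this page.

```python
def operation_status(operation):
    """Return (done, state) for an operation parsed from JSON.

    A missing "done" field is read as False, as described above.
    """
    done = operation.get("done", False)
    state = operation.get("metadata", {}).get("common", {}).get("state")
    return done, state


in_progress = {
    "metadata": {"common": {"operationType": "CREATE_INDEX", "state": "PROCESSING"}}
}
finished = {
    "done": True,
    "metadata": {"common": {"operationType": "CREATE_INDEX", "state": "SUCCESSFUL"}},
}

for operation in (in_progress, finished):
    done, state = operation_status(operation)
    print("done" if done else "still running", "with state", state)
```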