This page describes how to export and import Firestore in Datastore mode entities using the managed export and import service. The managed export and import service is available through the Cloud Console, the gcloud command-line tool, and the Datastore Admin API (REST, RPC).
With the managed export and import service, you can recover from accidental deletion of data and export data for offline processing. You can export all entities or just specific kinds of entities. Likewise, you can import all data from an export or only specific kinds. As you use the managed export and import service, consider the following:
- The export service uses eventually consistent reads. You cannot assume an export happens at a single point in time. The export might include entities written after the export begins and exclude entities written before the export begins.
- An export does not contain any indexes. When you import data, the required indexes are automatically rebuilt using your database's current index definitions. Per-entity property value index settings are exported and honored during import.
- Imports do not assign new IDs to entities. Imports use the IDs that existed at the time of the export and overwrite any existing entity with the same ID. During an import, the IDs are reserved while the entities are being imported. This prevents ID collisions with new entities if writes are enabled while an import is running.
- If an entity in your database is not affected by an import, it remains in your database after the import.
- Data exported from one Datastore mode database can be imported into another Datastore mode database, even one in another project.
- The managed export and import service limits the number of concurrent exports and imports to 50 and allows a maximum of 20 export and import requests per minute for a project. For each request, the service limits the number of entity filter combinations to 100.
- The output of a managed export uses the LevelDB log format.
- To import only a subset of entities or to import data into BigQuery, you must specify an entity filter in your export.
Before you begin
Before you can use the managed export and import service, you must complete the following tasks:
- Enable billing for your Google Cloud project. Only Google Cloud projects with billing enabled can use the export and import functionality.
- Create a Cloud Storage bucket in the same location as your Firestore in Datastore mode database. You cannot use a Requester Pays bucket for export and import operations.
- Assign an IAM role to your user account that grants the datastore.databases.export permission, if you are exporting data, or the datastore.databases.import permission, if you are importing data. The Datastore Import Export Admin role, for example, grants both permissions. If the Cloud Storage bucket is in another project, give your project's default service account access to the bucket.
Set up gcloud for your project
If you plan to use gcloud to start your import and export operations, set up gcloud and connect to your project in one of the following ways:
- Access gcloud from the Google Cloud Platform console using Cloud Shell.
- Configure the gcloud tool to use your current project:
gcloud config set project project-id
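To confirm the setting took effect, you can read the active project back (an optional sanity check):

gcloud config get-value project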
Starting managed export and import operations
This section describes how to start a managed export or import operation.
Exporting all entities
Console
1. Go to the Datastore Entities Export page in the Google Cloud Console.
2. Set the Namespace field to All Namespaces, and set the Kind field to All Kinds.
3. Below Destination, enter the name of your Cloud Storage bucket.
4. Click Export.
The console opens the Entities page and reports the success or failure of your managed export request. The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command needed to see the status of your operation. Run this command whenever you want to view the status of the operation.
gcloud
Use the gcloud datastore export command to export all entities in your database.

gcloud datastore export gs://bucket-name --async

where bucket-name is the name of your Cloud Storage bucket and an optional prefix, for example, bucket-name/firestore-exports/export-name. You cannot re-use the same prefix for another export operation. If you do not provide a file prefix, the managed export service creates one based on the current time.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.
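For example, a concrete invocation might look like the following (the bucket name and prefix here are illustrative):

gcloud datastore export gs://my-datastore-exports/2024-01-15-export --async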
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
- bucket-name: your Cloud Storage bucket name
HTTP method and URL:
POST https://datastore.googleapis.com/v1/projects/project-id:export
Request JSON body:
{ "outputUrlPrefix": "gs://bucket-name", }
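One way to send the request is with curl, using gcloud to mint an access token (a sketch; it assumes the JSON body above is saved as request.json and that you are authenticated with gcloud):

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://datastore.googleapis.com/v1/projects/project-id:export"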
You should receive a JSON response similar to the following:
{ "name": "projects/project-id/operations/operation-id", "metadata": { "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata", "common": { "startTime": "2019-09-18T18:42:26.591949Z", "operationType": "EXPORT_ENTITIES", "state": "PROCESSING" }, "entityFilter": {}, "outputUrlPrefix": "gs://bucket-name/2019-09-18T18:42:26_85726" } }
Exporting specific kinds or namespaces
To export a specific subset of kinds and/or namespaces, provide an entity filter with values for kinds and namespace IDs. Each request is limited to 100 entity filter combinations, where each combination of filtered kind and namespace counts as a separate filter towards this limit. For example, a filter with two kinds and three namespaces specifies six combinations.
Console
In the console, you can select either all kinds or one specific kind. Similarly, you can select all namespaces or one specific namespace.
To specify a list of namespaces and kinds to export, use gcloud instead.
1. Go to the Datastore Export page in the Google Cloud Console.
2. Set the Namespace field to All Namespaces or to the name of one of your namespaces.
3. Set the Kind field to All Kinds or to the name of a kind.
4. Under Destination, enter the name of your Cloud Storage bucket.
5. Click Export.
The console opens the Entities page and reports the success or failure of your managed export request. The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command needed to see the status of your operation. Run this command whenever you want to view the status of the operation.
gcloud
gcloud datastore export --kinds="KIND1,KIND2" --namespaces="(default),NAMESPACE2" gs://bucket-name --async
where bucket-name is the name of your Cloud Storage bucket and an optional prefix, for example, bucket-name/firestore-exports/export-name. You cannot re-use the same prefix for another export operation. If you do not provide a file prefix, the managed export service creates one based on the current time.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
- bucket-name: your Cloud Storage bucket name
- kind: the entity kind
- namespace: the namespace ID (use "" for the default namespace ID)
HTTP method and URL:
POST https://datastore.googleapis.com/v1/projects/project-id:export
Request JSON body:
{ "outputUrlPrefix": "gs://bucket-name", "entityFilter": { "kinds": ["kind"], "namespaceIds": ["namespace"], }, }
You should receive a JSON response similar to the following:
{ "name": "projects/project-id/operations/operation-id", "metadata": { "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata", "common": { "startTime": "2019-09-18T21:17:36.232704Z", "operationType": "EXPORT_ENTITIES", "state": "PROCESSING" }, "entityFilter": { "kinds": [ "Task" ], "namespaceIds": [ "" ] }, "outputUrlPrefix": "gs://bucket-name/2019-09-18T21:17:36_82974" } }
Metadata files
An export operation creates a metadata file for each namespace-kind pair specified. Metadata files are typically named NAMESPACE_NAME_KIND_NAME.export_metadata. However, if a namespace or kind would create an invalid Cloud Storage object name, the file will be named export[NUM].export_metadata.

The metadata files are protocol buffers and can be decoded with the protoc protocol compiler. For example, you can decode a metadata file to determine the namespace and kinds the export files contain:
protoc --decode_raw < export0.export_metadata
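If the export lives in Cloud Storage, one way to fetch a metadata file before decoding it is with gsutil (a sketch; the object path is illustrative, and gsutil and protoc must be installed):

gsutil cp gs://bucket-name/export-prefix/export0.export_metadata .
protoc --decode_raw < export0.export_metadata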
Importing all entities
Console
1. Go to the Datastore Import page in the Google Cloud Console.
2. In the File field, click Browse and select an overall_export_metadata file.
3. Set the Namespace field to All Namespaces, and set the Kind field to All Kinds.
4. Click Import.
The console opens the Entities page and reports the success or failure of your managed import request. The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command needed to see the status of your operation. Run this command whenever you want to view the status of the operation.
gcloud
Use the gcloud datastore import command to import all entities that were previously exported with the managed export service.
gcloud datastore import gs://bucket-name/file-path/file-name.overall_export_metadata --async
where bucket-name/file-path/file-name is the path to your overall_export_metadata file within your Cloud Storage bucket.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
- bucket-name: your Cloud Storage bucket name
- object-name: your Cloud Storage object name (example: 2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata)
HTTP method and URL:
POST https://datastore.googleapis.com/v1/projects/project-id:import
Request JSON body:
{ "inputUrl": "gs://bucket-name/object-name", }
You should receive a JSON response similar to the following:
{ "name": "projects/project-id/operations/operation-id", "metadata": { "@type": "type.googleapis.com/google.datastore.admin.v1.ImportEntitiesMetadata", "common": { "startTime": "2019-09-18T21:25:02.863621Z", "operationType": "IMPORT_ENTITIES", "state": "PROCESSING" }, "entityFilter": {}, "inputUrl": "gs://bucket-name/2019-09-18T18:42:26_85726/2019-09-18T18:42:26_85726.overall_export_metadata" } }
Locating your overall_export_metadata file
You can determine the value to use for the import location by using the Cloud Storage browser in the Google Cloud Console.
You can also list and describe completed operations. The outputUrl field shows the name of the overall_export_metadata file:
"outputUrl": "gs://bucket-name/2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata",
Importing specific kinds or namespaces
To import a specific subset of kinds and/or namespaces, provide an entity filter with values for kinds and namespace IDs.
You can specify kinds and namespaces only if the export files were created with an entity filter. You cannot import a subset of kinds and namespaces from an export of all entities.
Console
In the console, you can select either all kinds or one specific kind. Similarly, you can select all namespaces or one specific namespace.
To specify a list of namespaces and kinds to import, use gcloud instead.
1. Go to the Datastore Import page in the Google Cloud Console.
2. In the File field, click Browse and select an overall_export_metadata file.
3. Set the Namespace field to All Namespaces or to a specific namespace.
4. Set the Kind field to All Kinds or to a specific kind.
5. Click Import.
The console opens the Entities page and reports the success or failure of your managed import request. The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command needed to see the status of your operation. Run this command whenever you want to view the status of the operation.
gcloud
gcloud datastore import --kinds="KIND1,KIND2" --namespaces="(default),NAMESPACE2" gs://bucket-name/file-path/file-name.overall_export_metadata --async
where bucket-name/file-path/file-name is the path to your overall_export_metadata file within your Cloud Storage bucket.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
- bucket-name: your Cloud Storage bucket name
- object-name: your Cloud Storage object name (example: 2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata)
- kind: the entity kind
- namespace: the namespace ID (use "" for the default namespace ID)
HTTP method and URL:
POST https://datastore.googleapis.com/v1/projects/project-id:import
Request JSON body:
{ "inputUrl": "gs://bucket-name/object-name", "entityFilter": { "kinds": ["kind"], "namespaceIds": ["namespace"], }, }
You should receive a JSON response similar to the following:
{ "name": "projects/project-id/operations/operation-id", "metadata": { "@type": "type.googleapis.com/google.datastore.admin.v1.ImportEntitiesMetadata", "common": { "startTime": "2019-09-18T21:51:02.830608Z", "operationType": "IMPORT_ENTITIES", "state": "PROCESSING" }, "entityFilter": { "kinds": [ "Task" ], "namespaceIds": [ "" ] }, "inputUrl": "gs://bucket-name/2019-09-18T21:49:25_96833/2019-09-18T21:49:25_96833.overall_export_metadata" } }
Import transformations
An import operation updates entity keys and key reference properties in the import data with the project ID of the destination project. If this update increases your entity sizes, it can cause "entity is too big" or "index entries too large" errors for import operations.
To avoid either error, import into a destination project with a shorter project ID.
Managing long-running operations
Managed import and export operations are long-running operations. These method calls can take a substantial amount of time to complete.
After you start an export or import operation, Datastore mode assigns the operation a unique name. You can use the operation name to delete or cancel the operation, or to check its status.
Operation names are prefixed with projects/[PROJECT_ID]/databases/(default)/operations/, for example:

projects/project-id/databases/(default)/operations/ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg

However, you can leave out the prefix when specifying an operation name for the describe, cancel, and delete commands.
Listing all long-running operations
To list long-running operations, use the gcloud datastore operations list command. This command lists ongoing and recently completed operations. Operations are listed for a few days after completion:
gcloud
gcloud datastore operations list
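To narrow the output, you can apply gcloud's standard --filter flag to fields of the operation metadata (a sketch; the field path follows the JSON shown below):

gcloud datastore operations list --filter="metadata.common.operationType:EXPORT_ENTITIES"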
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
HTTP method and URL:
GET https://datastore.googleapis.com/v1/projects/project-id/operations
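A GET request has no body, so with curl the call might look like this (a sketch, again using gcloud for the access token):

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://datastore.googleapis.com/v1/projects/project-id/operations"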
For example, a recently completed export operation shows the following information:
{ "operations": [ { "name": "projects/project-id/operations/ASAyMDAwOTEzBxp0bHVhZmVkBxJsYXJ0bmVjc3Utc2Jvai1uaW1kYRQKKhI", "metadata": { "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata", "common": { "startTime": "2017-12-05T23:01:39.583780Z", "endTime": "2017-12-05T23:54:58.474750Z", "operationType": "EXPORT_ENTITIES" }, "progressEntities": { "workCompleted": "21933027", "workEstimated": "21898182" }, "progressBytes": { "workCompleted": "12421451292", "workEstimated": "9759724245" }, "entityFilter": { "namespaceIds": [ "" ] }, "outputUrlPrefix": "gs://bucket-name" }, "done": true, "response": { "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesResponse", "outputUrl": "gs://bucket-name/2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata" } } ] }
Describing a single operation
Instead of listing all long-running operations, you can list the details of a single operation:
gcloud
Use the operations describe command to show the status of an export or import operation.
gcloud datastore operations describe operation-name
REST
Before using any of the request data below, make the following replacements:
- project-id: your project ID
- operation-name: the operation name
HTTP method and URL:
GET https://datastore.googleapis.com/v1/projects/project-id/operations/operation-name
You should receive a JSON response similar to the following:
{ "name": "projects/project-id/operations/ASA3ODAwMzQxNjIyChp0bHVhZmVkBxJsYXJ0bmVjc3Utc2Jvai1uaW1kYRQKLRI", "metadata": { "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata", "common": { "startTime": "2019-10-08T20:07:28.105236Z", "endTime": "2019-10-08T20:07:36.310653Z", "operationType": "EXPORT_ENTITIES", "state": "SUCCESSFUL" }, "progressEntities": { "workCompleted": "21", "workEstimated": "21" }, "progressBytes": { "workCompleted": "2272", "workEstimated": "2065" }, "entityFilter": {}, "outputUrlPrefix": "gs://bucket-name/2019-10-08T20:07:28_28481" }, "done": true, "response": { "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesResponse", "outputUrl": "gs://bucket-name/2019-10-08T20:07:28_28481/2019-10-08T20:07:28_28481.overall_export_metadata" } }
Estimating the completion time
As your operation runs, see the value of the state field for the overall status of the operation.
A request for the status of a long-running operation also returns the metrics workEstimated and workCompleted. Each of these metrics is returned in both number of bytes and number of entities. workEstimated shows the estimated total number of bytes and entities an operation will process, based on database statistics. workCompleted shows the number of bytes and entities processed so far. After the operation completes, workCompleted reflects the total number of bytes and entities that were actually processed, which might be larger than the value of workEstimated.

Divide workCompleted by workEstimated for a rough progress estimate. The estimate might be inaccurate because it depends on delayed statistics collection.
For example, here is the progress status of an export operation:
{ "operations": [ { "name": "projects/project-id/operations/ASAyMDAwOTEzBxp0bHVhZmVkBxJsYXJ0bmVjc3Utc2Jvai1uaW1kYRQKKhI", "metadata": { "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata", ... "progressEntities": { "workCompleted": "1", "workEstimated": "3" }, "progressBytes": { "workCompleted": "85", "workEstimated": "257" }, ...
When an operation is done, the operation description will contain "done": true. See the value of the state field for the result of the operation. If the done field is not set in the response, then its value is false. Do not depend on the existence of the done value for in-progress operations.
Cancel an operation
Use the operations cancel command to stop an operation in progress:
gcloud datastore operations cancel operation-name
Cancelling a running operation does not undo the operation. A cancelled export operation leaves documents already exported in Cloud Storage, and a cancelled import operation leaves in place updates already made to your database. You cannot import a partially completed export.
Delete an operation
Use the operations delete command to remove an operation from the output of operations list. This command will not delete export files from Cloud Storage.
gcloud datastore operations delete operation-name
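If you also want to remove the export files themselves, delete the objects from Cloud Storage separately (a sketch; the prefix is illustrative, and -m parallelizes the removal):

gsutil -m rm -r gs://bucket-name/export-prefix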
Billing and pricing for managed exports and imports
You are required to enable billing for your Google Cloud project before you use the managed export and import service. Export and import operations contribute to your Google Cloud costs in the following ways:
- Entity reads and writes performed by export and import operations count towards your Firestore in Datastore mode costs.
- Output files stored in Cloud Storage count towards your Cloud Storage data storage costs.
The costs of export and import operations do not count towards the App Engine spending limit. Export or import operations will not trigger any Google Cloud budget alerts until after completion. Similarly, reads and writes performed during an export or import operation are applied to your daily quota after the operation is complete.
Viewing export and import costs
Export and import operations apply the goog-firestoremanaged:exportimport label to billed operations. In the Cloud Billing reports page, you can use this label to view costs related to import and export operations.
Permissions
To run export and import operations, your user account and your project's default service account require the Identity and Access Management permissions described below.
User account permissions
The user account or service account initiating the operation requires the datastore.databases.export and datastore.databases.import IAM permissions. If you are the project owner, your account has the required permissions. Otherwise, the following IAM roles grant the necessary permissions:
- Datastore Owner
- Datastore Import Export Admin
A project owner can grant one of these roles by following the steps in Grant access.
Default service account permissions
Each Google Cloud project automatically creates a default service account named PROJECT_ID@appspot.gserviceaccount.com. Export and import operations use this service account to authorize Cloud Storage operations.
Your project's default service account requires access to the Cloud Storage bucket used in an export or import operation. If your Cloud Storage bucket is in the same project as your Datastore mode database, then the default service account has access to the bucket by default.
If the Cloud Storage bucket is in another project, then you must give the default service account access to the Cloud Storage bucket.
Assign roles to the default service account
You can use the gsutil command-line tool to assign one of the roles below. For example, to assign the Storage Admin role to the default service account run:
gsutil iam ch serviceAccount:[PROJECT_ID]@appspot.gserviceaccount.com:roles/storage.admin \
  gs://[BUCKET_NAME]
Alternatively, you can assign this role using the Cloud Console.
Export operations
For export operations involving a bucket in another project, modify the permissions of the bucket to assign one of the following Cloud Storage roles to the default service account of the project containing your Datastore mode database:
- Storage Admin
- Storage Object Admin
- Storage Legacy Bucket Writer
You can also create an IAM custom role with slightly different permissions than the ones contained in the roles listed above:
storage.buckets.get
storage.objects.create
storage.objects.list
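For example, you could define such a role with the gcloud iam roles create command (a sketch; the role ID datastoreExportWriter is illustrative):

gcloud iam roles create datastoreExportWriter \
  --project=project-id \
  --title="Datastore Export Writer" \
  --permissions=storage.buckets.get,storage.objects.create,storage.objects.list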
Import operations
For import operations involving a Cloud Storage bucket in another project, modify the permissions of the bucket to assign one of the following Cloud Storage roles to the default service account of the project containing your Datastore mode database:
- Storage Admin
- Both Storage Object Viewer and Storage Legacy Bucket Reader
You can also create an IAM custom role with the following permissions:
storage.buckets.get
storage.objects.get
Disabled or deleted default service account
If you disable or delete your App Engine default service account, your App Engine app will lose access to your Datastore mode database. If you disabled your App Engine service account, you can re-enable it; see enabling a service account. If you deleted your App Engine service account within the last 30 days, you can restore it; see undeleting a service account.
Differences from Datastore Admin backups
If you previously used the Datastore Admin console for backups, you should note the following differences:
- Exports created by a managed export do not appear in the Datastore Admin console. Managed exports and imports are a new service that does not share data with App Engine's backup and restore functionality, which is administered through the Cloud Console.
- The managed export and import service does not support the same metadata as the Datastore Admin backup and does not store progress status in your database. For information on checking the progress of export and import operations, see Managing long-running operations.
- You cannot view service logs of managed export and import operations.
- The managed import service is backwards compatible with Datastore Admin backup files. You can import a Datastore Admin backup file using the managed import service, but you cannot import the output of a managed export using the Datastore Admin console.
Audit logs
Firestore in Datastore mode writes Admin Activity audit logs for Cloud Audit Logs. Admin Activity audit logs include export operations, import operations, and indexing operations. To view Admin Activity audit logs for your Datastore mode database, see Viewing audit logs.
Datastore mode Admin Activity audit logs appear under the Cloud Datastore Database and Cloud Datastore Index resource types. Both Firestore and Datastore Admin Activity logs appear under these resource types. Firestore in Datastore mode logs the following operations:
Audit logs category | Datastore mode operations
---|---
Admin Activity | DatastoreAdmin.CreateIndex, DatastoreAdmin.DeleteIndex, DatastoreAdmin.ExportEntities, DatastoreAdmin.GetIndex, DatastoreAdmin.ImportEntities, DatastoreAdmin.ListIndexes
Importing into BigQuery
To import data from a managed export into BigQuery, see Loading Datastore export service data.
Data exported without specifying an entity filter cannot be loaded into BigQuery. If you want to import data into BigQuery, your export request must include one or more kind names in the entity filter.
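Once you have a filtered export, the export_metadata file for a kind can be loaded with the bq tool (a sketch; the dataset, table, and object path are illustrative, and --source_format=DATASTORE_BACKUP is the source format BigQuery uses for these files):

bq load --source_format=DATASTORE_BACKUP \
  my_dataset.task_table \
  gs://bucket-name/export-prefix/default_namespace/kind_Task/default_namespace_kind_Task.export_metadata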
BigQuery column limit
BigQuery imposes a limit of 10,000 columns per table. Export operations generate a BigQuery table schema for each kind. In this schema, each unique property within a kind's entities becomes a schema column.
If a kind's BigQuery schema surpasses 10,000 columns, the export operation attempts to stay under the column limit by treating embedded entities as blobs. If this conversion brings the number of columns in the schema under 10,000, you can load the data into BigQuery, but you cannot query the properties within embedded entities. If the number of columns still exceeds 10,000, the export operation does not generate a BigQuery schema for the kind and you cannot load its data into BigQuery.