Exporting and Importing Entities

This page describes how to export and import Cloud Firestore in Datastore mode entities using the managed export and import service. The managed export and import service is available through the Cloud Console, the gcloud command-line tool, and the Datastore Admin API (REST, RPC).

With the managed export and import service, you can recover from accidental deletion of data and export data for offline processing. You can export all entities or just specific kinds of entities. Likewise, you can import all data from an export or only specific kinds. As you use the managed export and import service, consider the following:

  • The export service uses eventually consistent reads. You cannot assume an export happens at a single point in time. The export might include entities written after the export begins and exclude entities written before the export begins.

  • An export does not contain any indexes. When you import data, the required indexes are automatically rebuilt using your database's current index definitions. Per-entity property value index settings are exported and honored during import.

  • Imports do not assign new IDs to entities. Imports use the IDs that existed at the time of the export and overwrite any existing entity with the same ID. While an import is running, the affected IDs are reserved, which prevents ID collisions with new entities if writes are enabled during the import.

  • If an entity in your database is not affected by an import, it will remain in your database after the import.

  • Data exported from one Datastore mode database can be imported into another Datastore mode database, even one in another project.

  • The managed export and import service limits the number of concurrent exports and imports to 50 and allows a maximum of 20 export and import requests per minute for a project.

  • The output of a managed export uses the LevelDB log format.

  • To import only a subset of entities or to import data into BigQuery, you must specify an entity filter in your export.

Before you begin

Before you can use the managed export and import service, you must complete the following tasks.

  1. Enable billing for your Google Cloud project. Only Google Cloud projects with billing enabled can use the export and import functionality.

  2. Create a Cloud Storage bucket in the same location as your Cloud Firestore in Datastore mode database. You cannot use a Requester Pays bucket for export and import operations.

  3. Assign your user account an IAM role that grants the datastore.databases.export permission if you are exporting data, or the datastore.databases.import permission if you are importing data. The Datastore Import Export Admin role, for example, grants both permissions.

  4. If the Cloud Storage bucket is in another project, give your project's default service account access to the bucket.

Set up gcloud for your project

If you plan to use gcloud to start your import and export operations, set up gcloud and connect to your project.
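For example, one minimal setup authenticates your user account and sets a default project (PROJECT_ID is a placeholder for your project ID):

gcloud auth login
gcloud config set project PROJECT_ID

Alternatively, gcloud init walks you through the same steps interactively.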

Starting managed export and import operations

This section describes how to start a managed export or import operation.

Exporting all entities

Console

  1. Go to the Datastore Entities Export page in the Google Cloud Console.


  2. Set the Namespace field to All Namespaces, and set the Kind field to All Kinds.

  3. Under Destination, enter the name of your Cloud Storage bucket.

  4. Click Export.

The console opens the Entities page and reports the success or failure of your managed export request.

The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command that is needed to see the status of your operation.

Run this command whenever you want to view the status of the operation.

gcloud

Use the gcloud datastore export command to export all entities in your database.

gcloud datastore export gs://bucket-name --async

where bucket-name is the name of your Cloud Storage bucket, optionally followed by a path prefix, for example, bucket-name/firestore-exports/export-name. You cannot reuse the same prefix for another export operation. If you do not provide a file prefix, the managed export service creates one based on the current time.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.

REST

Before using any of the request data below, make the following replacements:

  • project-id: your project ID
  • bucket-name: your Cloud Storage bucket name

HTTP method and URL:

POST https://datastore.googleapis.com/v1/projects/project-id:export

Request JSON body:

{
  "outputUrlPrefix": "gs://bucket-name"
}

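One way to send the request is with curl. The following is a minimal sketch that assumes you saved the request body above to a file named request.json and that you are authenticated with gcloud:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @request.json \
    "https://datastore.googleapis.com/v1/projects/project-id:export"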

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata",
    "common": {
      "startTime": "2019-09-18T18:42:26.591949Z",
      "operationType": "EXPORT_ENTITIES",
      "state": "PROCESSING"
    },
    "entityFilter": {},
    "outputUrlPrefix": "gs://bucket-name/2019-09-18T18:42:26_85726"
  }
}

The response is a long-running operation, which you can check for completion.

Exporting specific kinds or namespaces

To export a specific subset of kinds and/or namespaces, provide an entity filter with values for kinds and namespace IDs.

Console

In the console, you can select either all kinds or one specific kind. Similarly, you can select all namespaces or one specific namespace.

To specify a list of namespaces and kinds to export, use gcloud instead.

  1. Go to the Datastore Export page in the Google Cloud Console.


  2. Set the Namespace field to All Namespaces or to the name of one of your namespaces.

  3. Set the Kind field to All Kinds or to the name of a kind.

  4. Under Destination, enter the name of your Cloud Storage bucket.

  5. Click Export.

The console opens the Entities page and reports the success or failure of your managed export request.

The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command that is needed to see the status of your operation.

Run this command whenever you want to view the status of the operation.

gcloud

gcloud datastore export --kinds="KIND1,KIND2" --namespaces="(default),NAMESPACE2" gs://bucket-name --async

where bucket-name is the name of your Cloud Storage bucket, optionally followed by a path prefix, for example, bucket-name/firestore-exports/export-name. You cannot reuse the same prefix for another export operation. If you do not provide a file prefix, the managed export service creates one based on the current time.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.

REST

Before using any of the request data below, make the following replacements:

  • project-id: your project ID
  • bucket-name: your Cloud Storage bucket name
  • kind: the entity kind
  • namespace: the namespace ID (use "" for the default namespace ID)

HTTP method and URL:

POST https://datastore.googleapis.com/v1/projects/project-id:export

Request JSON body:

{
  "outputUrlPrefix": "gs://bucket-name",
  "entityFilter": {
    "kinds": ["kind"],
    "namespaceIds": ["namespace"]
  }
}

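As in the previous section, a minimal curl sketch, assuming the request body above is saved as request.json and you are authenticated with gcloud:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @request.json \
    "https://datastore.googleapis.com/v1/projects/project-id:export"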

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata",
    "common": {
      "startTime": "2019-09-18T21:17:36.232704Z",
      "operationType": "EXPORT_ENTITIES",
      "state": "PROCESSING"
    },
    "entityFilter": {
      "kinds": [
        "Task"
      ],
      "namespaceIds": [
        ""
      ]
    },
    "outputUrlPrefix": "gs://bucket-name/2019-09-18T21:17:36_82974"
  }
}

The response is a long-running operation, which you can check for completion.

Metadata files

An export operation creates a metadata file for each namespace-kind pair specified. Metadata files are typically named NAMESPACE_NAME_KIND_NAME.export_metadata. However, if a namespace or kind would create an invalid Cloud Storage object name, the file will be named export[NUM].export_metadata.

The metadata files are protocol buffers and can be decoded with the protoc protocol compiler. For example, you can decode a metadata file to determine the namespace and kinds the export files contain:

protoc --decode_raw < export0.export_metadata

Importing all entities

Console

  1. Go to the Datastore Import page in the Google Cloud Console.


  2. In the File field, click Browse and select an overall_export_metadata file.

  3. Set the Namespace field to All Namespaces, and set the Kind field to All Kinds.

  4. Click Import.

The console opens the Entities page and reports the success or failure of your managed import request.

The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command that is needed to see the status of your operation.

Run this command whenever you want to view the status of the operation.

gcloud

Use the gcloud datastore import command to import all entities that were previously exported with the managed export service.

gcloud datastore import gs://bucket-name/file-path/file-name.overall_export_metadata --async

where bucket-name/file-path/file-name is the path to your overall_export_metadata file within your Cloud Storage bucket.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.

REST

Before using any of the request data below, make the following replacements:

  • project-id: your project ID
  • bucket-name: your Cloud Storage bucket name
  • object-name: your Cloud Storage object name (for example: 2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata)

HTTP method and URL:

POST https://datastore.googleapis.com/v1/projects/project-id:import

Request JSON body:

{
  "inputUrl": "gs://bucket-name/object-name"
}

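Again, a minimal curl sketch, assuming the request body above is saved as request.json and you are authenticated with gcloud:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @request.json \
    "https://datastore.googleapis.com/v1/projects/project-id:import"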

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.datastore.admin.v1.ImportEntitiesMetadata",
    "common": {
      "startTime": "2019-09-18T21:25:02.863621Z",
      "operationType": "IMPORT_ENTITIES",
      "state": "PROCESSING"
    },
    "entityFilter": {},
    "inputUrl": "gs://bucket-name/2019-09-18T18:42:26_85726/2019-09-18T18:42:26_85726.overall_export_metadata"
  }
}

The response is a long-running operation, which you can check for completion.

Locating your overall_export_metadata file

You can determine the value to use for the import location by using the Cloud Storage browser in the Google Cloud Console.

You can also list and describe completed operations. The outputUrl field shows the name of the overall_export_metadata file:

"outputUrl": "gs://bucket-name/2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata",

Importing specific kinds or namespaces

To import a specific subset of kinds and/or namespaces, provide an entity filter with values for kinds and namespace IDs.

You can specify kinds and namespaces only if the export files were created with an entity filter. You cannot import a subset of kinds and namespaces from an export of all entities.

Console

In the console, you can select either all kinds or one specific kind. Similarly, you can select all namespaces or one specific namespace.

To specify a list of namespaces and kinds to import, use gcloud instead.

  1. Go to the Datastore Import page in the Google Cloud Console.


  2. In the File field, click Browse and select an overall_export_metadata file.

  3. Set the Namespace field to All Namespaces or to a specific namespace.

  4. Set the Kind field to All Kinds or to a specific kind.

  5. Click Import.

The console opens the Entities page and reports the success or failure of your managed import request.

The console also displays a View Status button. Click this button to open a Cloud Shell terminal that is pre-populated with the gcloud command that is needed to see the status of your operation.

Run this command whenever you want to view the status of the operation.

gcloud

gcloud datastore import --kinds="KIND1,KIND2" --namespaces="(default),NAMESPACE2" gs://bucket-name/file-path/file-name.overall_export_metadata --async

where bucket-name/file-path/file-name is the path to your overall_export_metadata file within your Cloud Storage bucket.

Use the --async flag to prevent gcloud from waiting for the operation to complete. If you omit the --async flag, you can type Ctrl+c to stop waiting for an operation. This will not cancel the operation.

REST

Before using any of the request data below, make the following replacements:

  • project-id: your project ID
  • bucket-name: your Cloud Storage bucket name
  • object-name: your Cloud Storage object name (for example: 2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata)
  • kind: the entity kind
  • namespace: the namespace ID (use "" for the default namespace ID)

HTTP method and URL:

POST https://datastore.googleapis.com/v1/projects/project-id:import

Request JSON body:

{
  "inputUrl": "gs://bucket-name/object-name",
  "entityFilter": {
    "kinds": ["kind"],
    "namespaceIds": ["namespace"]
  }
}

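As before, a minimal curl sketch, assuming the request body above is saved as request.json and you are authenticated with gcloud:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @request.json \
    "https://datastore.googleapis.com/v1/projects/project-id:import"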

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.datastore.admin.v1.ImportEntitiesMetadata",
    "common": {
      "startTime": "2019-09-18T21:51:02.830608Z",
      "operationType": "IMPORT_ENTITIES",
      "state": "PROCESSING"
    },
    "entityFilter": {
      "kinds": [
        "Task"
      ],
      "namespaceIds": [
        ""
      ]
    },
    "inputUrl": "gs://bucket-name/2019-09-18T21:49:25_96833/2019-09-18T21:49:25_96833.overall_export_metadata"
  }
}

The response is a long-running operation, which you can check for completion.

Managing long-running operations

Managed import and export operations are long-running operations. These method calls can take a substantial amount of time to complete.

After you start an export or import operation, Datastore mode assigns the operation a unique name. You can use the operation name to delete, cancel, or check the status of the operation.

Operation names are prefixed with projects/[PROJECT_ID]/databases/(default)/operations/, for example:

projects/project-id/databases/(default)/operations/ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg

However, you can leave out the prefix when specifying an operation name for the describe, cancel, and delete commands.

Listing all long-running operations

To list long-running operations, use the gcloud datastore operations list command. This command lists ongoing and recently completed operations. Operations are listed for a few days after completion:

gcloud

gcloud datastore operations list

REST

Before using any of the request data below, make the following replacements:

  • project-id: your project ID

HTTP method and URL:

GET https://datastore.googleapis.com/v1/projects/project-id/operations

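A minimal curl sketch for this request, assuming you are authenticated with gcloud:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://datastore.googleapis.com/v1/projects/project-id/operations"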

See information about the response below.

For example, a recently completed export operation shows the following information:

{
  "operations": [
    {
      "name": "projects/project-id/operations/ASAyMDAwOTEzBxp0bHVhZmVkBxJsYXJ0bmVjc3Utc2Jvai1uaW1kYRQKKhI",
      "metadata": {
        "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata",
        "common": {
          "startTime": "2017-12-05T23:01:39.583780Z",
          "endTime": "2017-12-05T23:54:58.474750Z",
          "operationType": "EXPORT_ENTITIES"
        },
        "progressEntities": {
          "workCompleted": "21933027",
          "workEstimated": "21898182"
        },
        "progressBytes": {
          "workCompleted": "12421451292",
          "workEstimated": "9759724245"
        },
        "entityFilter": {
          "namespaceIds": [
            ""
          ]
        },
        "outputUrlPrefix": "gs://bucket-name"
      },
      "done": true,
      "response": {
        "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesResponse",
        "outputUrl": "gs://bucket-name/2017-05-25T23:54:39_76544/2017-05-25T23:54:39_76544.overall_export_metadata"
      }
    }
  ]
}

Describe a single operation

Instead of listing all long-running operations, you can list the details of a single operation:

gcloud

Use the operations describe command to show the status of an export or import operation.

gcloud datastore operations describe operation-name

REST

Before using any of the request data below, make the following replacements:

  • project-id: your project ID
  • operation-name: the operation name

HTTP method and URL:

GET https://datastore.googleapis.com/v1/projects/project-id/operations/operation-name

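A minimal curl sketch for this request, assuming you are authenticated with gcloud:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://datastore.googleapis.com/v1/projects/project-id/operations/operation-name"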

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/operations/ASA3ODAwMzQxNjIyChp0bHVhZmVkBxJsYXJ0bmVjc3Utc2Jvai1uaW1kYRQKLRI",
  "metadata": {
    "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata",
    "common": {
      "startTime": "2019-10-08T20:07:28.105236Z",
      "endTime": "2019-10-08T20:07:36.310653Z",
      "operationType": "EXPORT_ENTITIES",
      "state": "SUCCESSFUL"
    },
    "progressEntities": {
      "workCompleted": "21",
      "workEstimated": "21"
    },
    "progressBytes": {
      "workCompleted": "2272",
      "workEstimated": "2065"
    },
    "entityFilter": {},
    "outputUrlPrefix": "gs://bucket-name/2019-10-08T20:07:28_28481"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesResponse",
    "outputUrl": "gs://bucket-name/2019-10-08T20:07:28_28481/2019-10-08T20:07:28_28481.overall_export_metadata"
  }
}

Estimating the completion time

As your operation runs, check the value of the state field for the overall status of the operation.

A request for the status of a long-running operation also returns the metrics workEstimated and workCompleted. Each of these metrics is returned in both number of bytes and number of entities. workEstimated shows the estimated total number of bytes and entities an operation will process, based on database statistics. workCompleted shows the number of bytes and entities processed so far. After the operation completes, workCompleted reflects the total number of bytes and entities that were actually processed, which might be larger than the value of workEstimated.

Divide workCompleted by workEstimated for a rough progress estimate. The estimate might be inaccurate because it depends on delayed statistics collection.

For example, here is the progress status of an export operation:

{
  "operations": [
    {
      "name": "projects/project-id/operations/ASAyMDAwOTEzBxp0bHVhZmVkBxJsYXJ0bmVjc3Utc2Jvai1uaW1kYRQKKhI",
      "metadata": {
        "@type": "type.googleapis.com/google.datastore.admin.v1.ExportEntitiesMetadata",
        ...
        "progressEntities": {
          "workCompleted": "1",
          "workEstimated": "3"
        },
        "progressBytes": {
          "workCompleted": "85",
          "workEstimated": "257"
        },
        ...

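If you are checking progress from a script, a sketch using gcloud's --format flag (operation-name is a placeholder, and the field paths assume the metadata shape shown above) extracts just these counters:

gcloud datastore operations describe operation-name \
    --format="value(metadata.progressEntities.workCompleted,metadata.progressEntities.workEstimated)"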
When an operation is done, the operation description will contain "done": true. See the value of the state field for the result of the operation. If the done field is not set in the response, then its value is false. Do not depend on the existence of the done value for in-progress operations.

Cancel an operation

Use the operations cancel command to stop an operation in progress:

gcloud datastore operations cancel operation-name

Canceling a running operation does not undo it. A canceled export operation leaves any data already exported in Cloud Storage, and a canceled import operation leaves in place any updates already made to your database. You cannot import a partially completed export.

Delete an operation

Use the operations delete command to remove an operation from the output of operations list. This command will not delete export files from Cloud Storage.

gcloud datastore operations delete operation-name

Billing and pricing for managed exports and imports

You must enable billing for your Google Cloud project before you use the managed export and import service. Export and import operations contribute to your Google Cloud costs in the following ways:

  • Entity reads and writes: export and import operations are charged for the entity reads and writes they perform, at the rates listed in Datastore pricing.

  • Cloud Storage: the data an export stores in Cloud Storage counts toward your Cloud Storage data storage costs.

The costs of export and import operations do not count towards the App Engine spending limit. Export or import operations will not trigger any Google Cloud budget alerts until after completion. Similarly, reads and writes performed during an export or import operation are applied to your daily quota after the operation is complete.

Permissions

To run export and import operations, your user account and your project's default service account require the Cloud Identity and Access Management permissions described below.

User account permissions

The user account or service account initiating the operation requires the datastore.databases.export and datastore.databases.import Cloud IAM permissions. If you are the project owner, your account has the required permissions. Otherwise, the following Cloud IAM roles grant the necessary permissions:

  • Datastore Owner
  • Datastore Import Export Admin

A project owner can grant one of these roles by following the steps in Grant access.
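For example, a project owner could grant the Datastore Import Export Admin role from the command line (PROJECT_ID and USER_EMAIL are placeholders):

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/datastore.importExportAdmin"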

Default service account permissions

Each Google Cloud project automatically creates a default service account named PROJECT_ID@appspot.gserviceaccount.com. Export and import operations use this service account to authorize Cloud Storage operations.

Your project's default service account requires access to the Cloud Storage bucket used in an export or import operation. If your Cloud Storage bucket is in the same project as your Datastore mode database, then the default service account has access to the bucket by default.

If the Cloud Storage bucket is in another project, then you must give the default service account access to the Cloud Storage bucket.

Assign roles to the default service account

You can use the gsutil command-line tool to assign one of the roles below. For example, to assign the Storage Admin role to the default service account, run:

gsutil iam ch serviceAccount:[PROJECT_ID]@appspot.gserviceaccount.com:roles/storage.admin \
    gs://[BUCKET_NAME]

Alternatively, you can assign this role using the Cloud Console.

Export operations

For export operations involving a bucket in another project, modify the permissions of the bucket to assign one of the following Cloud Storage roles to the default service account of the project containing your Datastore mode database:

  • Storage Admin
  • Storage Object Admin
  • Storage Legacy Bucket Writer

You can also create a Cloud IAM custom role with the following permissions:

  • storage.buckets.get
  • storage.objects.create
  • storage.objects.list
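As a sketch, you could create such a custom role with gcloud (datastoreExportWriter is an arbitrary role ID, and PROJECT_ID is a placeholder); the same approach works for the import permissions listed in the next section:

gcloud iam roles create datastoreExportWriter \
    --project=PROJECT_ID \
    --title="Datastore Export Writer" \
    --permissions=storage.buckets.get,storage.objects.create,storage.objects.list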

Import operations

For import operations involving a Cloud Storage bucket in another project, modify the permissions of the bucket to assign one of the following Cloud Storage roles to the default service account of the project containing your Datastore mode database:

  • Storage Admin
  • Both Storage Object Viewer and Storage Legacy Bucket Reader

You can also create a Cloud IAM custom role with the following permissions:

  • storage.buckets.get
  • storage.objects.get

Differences from Datastore Admin backups

If you previously used the Datastore Admin console for backups, you should note the following differences:

  • The managed export and import service is separate from the Datastore Admin console GUI; you start operations from the Cloud Console, the gcloud tool, or the Datastore Admin API.

  • Exports created by a managed export do not appear in the Datastore Admin console. Managed exports and imports are a new service that does not share data with App Engine's backup and restore functionality, which is administered through the Cloud Console.

  • The managed export and import service does not support the same metadata as the Datastore Admin backup and does not store progress status in your database. For information about checking the progress of export and import operations, see Managing long-running operations.

  • You cannot view service logs of managed export and import operations.

  • The managed import service is backward compatible with Datastore Admin backup files. You can import a Datastore Admin backup file using the managed import service, but you cannot import the output of a managed export using the Datastore Admin console.

Importing into BigQuery

To import data from a managed export into BigQuery, see Loading Datastore export service data.
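As a hedged illustration (the dataset name, table name, and export path are placeholders, and the exact path depends on your export), loading one exported kind into BigQuery with the bq tool might look like:

bq load --source_format=DATASTORE_BACKUP \
    mydataset.task_entities \
    gs://bucket-name/export-path/default_namespace/kind_Task/default_namespace_kind_Task.export_metadata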

Limitations

  • Data exported without specifying an entity filter cannot be loaded into BigQuery. If you want to import data into BigQuery, your export request must include one or more kind names in the entity filter.