This page describes how to schedule exports of your Firestore in Datastore mode data. To run exports on a schedule, we recommend using Cloud Run functions and Cloud Scheduler. Create a Cloud Function that initiates exports and use Cloud Scheduler to run your function.
Before you begin
Before you schedule data exports, you must complete the following tasks:
- Enable billing for your Google Cloud project. Only Google Cloud projects with billing enabled can use the export and import feature.
- Create a Cloud Storage bucket in a location near your Datastore mode database location. Export operations require a destination Cloud Storage bucket. You cannot use a Requester Pays bucket for export operations.
Create a Cloud Function and Cloud Scheduler job
Follow the steps below to create a Cloud Function that initiates data exports and a Cloud Scheduler job to call that function:
Create a datastore_export Cloud Function

- Go to the Cloud Functions page in the Google Cloud console.
- Click Create Function.
- Enter a function name such as datastoreExport.
- Under Trigger, select Cloud Pub/Sub. Cloud Scheduler uses your pub/sub topic to call your function.
- In the Topic field, select Create a topic. Enter a name for the pub/sub topic such as startDatastoreExport. Take note of the topic name; you need it to create your Cloud Scheduler job.
- Under Source code, select Inline editor.
- In the Runtime dropdown, select Python 3.7.
- In main.py, enter the code for your export function.
- In requirements.txt, add the dependencies that your function code imports.
- Under Entry point, enter datastore_export, the name of the function in main.py.
- Click Deploy to deploy the Cloud Function.
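The inline source for the function is not reproduced on this page, so the following is a minimal sketch of what a datastore_export handler could look like. It assumes the function calls the Datastore Admin REST API's projects:export method using the google-auth library (which would then be the requirements.txt dependency); the build_export_request helper is hypothetical and only illustrates how the Pub/Sub payload maps onto the request body.

```python
import base64
import json
import os


def build_export_request(event_data_b64):
    """Decode the base64 Pub/Sub payload and build the export request body.

    The payload must contain a 'bucket' value and may contain 'kinds'
    and/or 'namespaceIds' lists, which become the export's entity filter.
    """
    payload = json.loads(base64.b64decode(event_data_b64).decode("utf-8"))
    body = {"outputUrlPrefix": payload["bucket"]}
    entity_filter = {}
    if "kinds" in payload:
        entity_filter["kinds"] = payload["kinds"]
    if "namespaceIds" in payload:
        entity_filter["namespaceIds"] = payload["namespaceIds"]
    if entity_filter:
        body["entityFilter"] = entity_filter
    return body


def datastore_export(event, context):
    """Cloud Function entry point, triggered by a Pub/Sub message."""
    body = build_export_request(event["data"])
    project_id = os.environ["GCP_PROJECT"]

    # Deferred import so the payload helper above stays usable without the
    # Cloud dependencies; google-auth would be listed in requirements.txt.
    import google.auth
    from google.auth.transport.requests import AuthorizedSession

    credentials, _ = google.auth.default()
    session = AuthorizedSession(credentials)
    url = "https://datastore.googleapis.com/v1/projects/{}:export".format(project_id)
    response = session.post(url, data=json.dumps(body))
    print(response.json())
```

The bucket, kinds, and namespaceIds keys match the Cloud Scheduler payloads described later on this page; everything else here is an assumption about the implementation.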
Configure access permissions
Next, give the Cloud Function permission to start export operations and write to your Cloud Storage bucket.
This Cloud Function uses your project's default service account to authenticate and authorize its export operations. When you create a project, a default service account is created for you with the following name:
project_id@appspot.gserviceaccount.com
This service account needs permission to start export operations and to write to your Cloud Storage bucket. To grant these permissions, assign the following IAM roles to the default service account:
- Cloud Datastore Import Export Admin
- Storage Object User role on the bucket
You can use the Google Cloud CLI to assign these roles. You can access this
tool from Cloud Shell in the Google Cloud console:
Start Cloud Shell
- Assign the Cloud Datastore Import Export Admin role. Replace project_id and run the following command:

  gcloud projects add-iam-policy-binding project_id \
      --member serviceAccount:project_id@appspot.gserviceaccount.com \
      --role roles/datastore.importExportAdmin
- Assign the Storage Object User role on your bucket. Replace bucket_name and project_id and run the following command:

  gcloud storage buckets add-iam-policy-binding gs://bucket_name \
      --member=serviceAccount:project_id@appspot.gserviceaccount.com \
      --role=roles/storage.objectUser
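To confirm that the project-level binding took effect, you can inspect the project's IAM policy filtered to the default service account. This is a sketch using standard gcloud output flags; replace project_id as before:

```shell
# Show the roles currently granted to the default service account.
# Replace project_id with your Google Cloud project ID.
gcloud projects get-iam-policy project_id \
    --flatten="bindings[].members" \
    --filter="bindings.members:project_id@appspot.gserviceaccount.com" \
    --format="table(bindings.role)"
```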
Create a Cloud Scheduler job
Next, create a Cloud Scheduler job that calls the datastore_export Cloud Function:
- Go to the Cloud Scheduler page in the Google Cloud console.
- Click Create Job.
- Enter a Name for the job such as scheduledDatastoreExport.
- Enter a Frequency in unix-cron format.
- Select a Timezone.
- Under Target, select Pub/Sub. In the Topic field, enter the name of the pub/sub topic you defined alongside your Cloud Function, startDatastoreExport in the example above.
- In the Payload field, enter a JSON object to configure the export operation. The datastore_export Cloud Function requires a bucket value. You can optionally include kinds or namespaceIds values to set an entity filter. For example:

  Export all entities

  { "bucket": "gs://bucket_name" }

  Export with an entity filter

  Export entities of kind User or Task from all namespaces:

  { "bucket": "gs://bucket_name", "kinds": ["User", "Task"] }

  Export entities of kind User or Task from the default and Testers namespaces. Use an empty string ("") to specify the default namespace:

  { "bucket": "gs://bucket_name", "kinds": ["User", "Task"], "namespaceIds": ["", "Testers"] }

  Export entities of any kind from the default and Testers namespaces. Use an empty string ("") to specify the default namespace:

  { "bucket": "gs://bucket_name", "namespaceIds": ["", "Testers"] }

  Where bucket_name is the name of your Cloud Storage bucket.
- Click Create.
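Cloud Scheduler publishes the Payload text to the Pub/Sub topic, and Pub/Sub hands it to the Cloud Function base64-encoded in event["data"]. A quick stdlib-only sketch of that round trip, using one of the example payloads above:

```python
import base64
import json

# The JSON payload you enter in the Cloud Scheduler Payload field.
payload = {"bucket": "gs://bucket_name", "kinds": ["User", "Task"]}

# Pub/Sub delivers the message body to the function base64-encoded,
# so the function receives an event shaped roughly like this:
event = {"data": base64.b64encode(json.dumps(payload).encode("utf-8"))}

# Inside the function, the payload is recovered by reversing the encoding.
decoded = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
assert decoded == payload
```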
Test your scheduled exports
To test your Cloud Function and Cloud Scheduler job, run your Cloud Scheduler job in the Cloud Scheduler page of the Google Cloud console. If successful, this initiates a real export operation.
- Go to the Cloud Scheduler page in the Google Cloud console.
- In the row for your new Cloud Scheduler job, click Run now.
- After a few seconds, click Refresh. The Cloud Scheduler job should update the Result column to Success and Last run to the current time.
The Cloud Scheduler page confirms only that the job sent a message to the pub/sub topic. To see if your export request succeeded, view the logs of your Cloud Function.
View the Cloud Function logs
To see whether the Cloud Function successfully started an export operation, open the Logs Explorer page in the Google Cloud console.
The log for the Cloud Function reports errors and successful export initiations.
View export progress
You can use the gcloud datastore operations list command to view the progress of your export operations. For more information, see listing all long-running operations.
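For example, to check on export operations from the command line (operation_name below is a placeholder for a name taken from the list output):

```shell
# List long-running Datastore operations, including exports, for the
# current project.
gcloud datastore operations list

# Show the status of a single operation. Replace operation_name with a
# name from the list output.
gcloud datastore operations describe operation_name
```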
After an export operation completes, you can view the output files in your Cloud Storage bucket. The managed export service uses a timestamp to organize your export operations.
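As an illustration of that layout (the timestamped names below are hypothetical; the actual prefix and metadata file names are generated by the service):

```
gs://bucket_name/2024-01-15T10:00:00_12345/
    2024-01-15T10:00:00_12345.overall_export_metadata
    ...
```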