Scheduling an Export

This page describes how to schedule an automatic export of your Cloud Firestore in Datastore mode entities.

To schedule exports of your entities, we recommend you deploy an App Engine service that calls the Datastore mode managed export feature. Once deployed, you can run this service on a schedule with the App Engine Cron Service.

Before you begin

Before you can schedule data exports with App Engine and the managed export feature, you must complete the following tasks:

  1. Enable billing for your GCP project. Only GCP projects with billing enabled can use the export and import feature.

  2. Create a Cloud Storage bucket for your project. All managed exports and imports rely on Cloud Storage. You must use the same location for your Cloud Storage bucket and your Datastore mode database.

    To find your Datastore mode database location, see viewing the location of your project.

  3. Install the Google Cloud SDK to deploy the application.

Setting up scheduled exports

After completing the requirements above, set up scheduled exports by completing the following procedures.

Configuring access permissions

This app uses the App Engine default service account to authenticate and authorize its export requests. When you create a project, App Engine creates a default service account for you with the following format:

[PROJECT_ID]@appspot.gserviceaccount.com

The service account requires permission to start Datastore mode database export operations and to write to your Cloud Storage bucket. To grant these permissions, assign the following IAM roles to the default service account:

  • Cloud Datastore Import Export Admin
  • Storage Admin on the Cloud Storage bucket

You can use the gcloud and gsutil command-line tools from the Cloud SDK to assign these roles:

  1. Use the gcloud command-line tool to assign the Cloud Datastore Import Export Admin role:

    gcloud projects add-iam-policy-binding [PROJECT_ID] \
        --member serviceAccount:[PROJECT_ID]@appspot.gserviceaccount.com \
        --role roles/datastore.importExportAdmin
    

    Alternatively, you can assign this role using the GCP Console.

  2. Use the gsutil command-line tool to assign the Storage Admin role on your bucket:

    gsutil iam ch serviceAccount:[PROJECT_ID]@appspot.gserviceaccount.com:admin \
        gs://[BUCKET_NAME]
    

    Alternatively, you can assign this role using the GCP Console.
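
    To show how the [PROJECT_ID] and [BUCKET_NAME] placeholders fit into the two commands above, here is a minimal Python sketch that builds them as argument lists. The function name role_binding_commands and the sample values are hypothetical; the commands, roles, and service account format are the ones shown in this section.

```python
def role_binding_commands(project_id, bucket_name):
    """Return the gcloud and gsutil commands from this section as argument lists."""
    # App Engine default service account, in the format shown above.
    member = 'serviceAccount:%s@appspot.gserviceaccount.com' % project_id
    return [
        # Grants the Cloud Datastore Import Export Admin role on the project.
        ['gcloud', 'projects', 'add-iam-policy-binding', project_id,
         '--member', member,
         '--role', 'roles/datastore.importExportAdmin'],
        # Grants the Storage Admin role on the export bucket.
        ['gsutil', 'iam', 'ch', member + ':admin', 'gs://' + bucket_name],
    ]


if __name__ == '__main__':
    # Print the commands for review instead of running them.
    for command in role_binding_commands('my-project', 'my-export-bucket'):
        print(' '.join(command))
```

    Each list could be passed to subprocess.check_call on a machine where the Cloud SDK is installed and authenticated; printing the commands first makes it easier to review them before anything changes.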

Deploying the app

Deploy the following sample app in either Python or Java:

Python

Create the app files

In a new folder on your development machine, create the following files that provide the code for an App Engine app:

  • app.yaml
  • cloud_datastore_admin.py

Use the following code for the files.

app.yaml

The following file configures app deployment:

runtime: python27
api_version: 1
threadsafe: true
service: cloud-datastore-admin

libraries:
- name: webapp2
  version: "latest"

handlers:
- url: /cloud-datastore-export
  script: cloud_datastore_admin.app
  login: admin

The line service: cloud-datastore-admin deploys the app to the cloud-datastore-admin service. If this is the only App Engine service in your project, remove this line to instead deploy the app to the default service.

cloud_datastore_admin.py

import datetime
import httplib
import json
import logging
import webapp2

from google.appengine.api import app_identity
from google.appengine.api import urlfetch


class Export(webapp2.RequestHandler):

  def get(self):
    # Get an OAuth 2.0 access token for the App Engine default service account.
    access_token, _ = app_identity.get_access_token(
        'https://www.googleapis.com/auth/datastore')
    app_id = app_identity.get_application_id()
    # The timestamp keeps each export's output path distinct.
    timestamp = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')

    output_url_prefix = self.request.get('output_url_prefix')
    assert output_url_prefix and output_url_prefix.startswith('gs://')
    if '/' not in output_url_prefix[5:]:
      # Only a bucket name has been provided - no prefix or trailing slash
      output_url_prefix += '/' + timestamp
    else:
      # The prefix already contains a path or a trailing slash
      output_url_prefix += timestamp

    entity_filter = {
        'kinds': self.request.get_all('kind'),
        'namespace_ids': self.request.get_all('namespace_id')
    }
    request = {
        'project_id': app_id,
        'output_url_prefix': output_url_prefix,
        'entity_filter': entity_filter
    }
    headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer ' + access_token
    }
    url = 'https://datastore.googleapis.com/v1/projects/%s:export' % app_id
    try:
      result = urlfetch.fetch(
          url=url,
          payload=json.dumps(request),
          method=urlfetch.POST,
          deadline=60,
          headers=headers)
      if result.status_code == httplib.OK:
        logging.info(result.content)
      elif result.status_code >= 500:
        logging.error(result.content)
      else:
        logging.warning(result.content)
      self.response.status_int = result.status_code
    except urlfetch.Error:
      logging.exception('Failed to initiate export.')
      self.response.status_int = httplib.INTERNAL_SERVER_ERROR


app = webapp2.WSGIApplication(
    [
        ('/cloud-datastore-export', Export),
    ], debug=True)

Deploy the app

  1. Make sure gcloud is configured for the correct project:

    gcloud config set project [PROJECT_ID]
    
  2. From the same directory as your app.yaml file, deploy the app to your project:

    gcloud app deploy
    

Java

The following sample app assumes you have set up Maven with the App Engine plugin.

Download the app

Download the java-docs-samples repository and navigate to the datastore-schedule-export app directory:

  1. Clone the sample app repository to your local machine:

    git clone https://github.com/GoogleCloudPlatform/java-docs-samples.git
    

    Alternatively, download the sample as a zip file and extract it.

  2. Navigate to the directory that contains the sample code:

    cd java-docs-samples/appengine-java8/datastore-schedule-export/
    

The app sets up a servlet in the DatastoreExportServlet.java file.

Deploying to a different service

If you deploy the app as is, it deploys to the default service. If you already have a default service, you should instead deploy to a different service.

Modify the src/main/webapp/WEB-INF/appengine-web.xml file by adding <module>service_name</module>. For example:

   <?xml version="1.0" encoding="utf-8"?>
   <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
      <runtime>java8</runtime>
      <!-- ... -->
      <module>cloud-datastore-admin</module>
      <!-- ... -->
   </appengine-web-app>

For more on the app configuration file, see the appengine-web.xml reference.

Deploy the app

  1. Make sure gcloud is configured for the correct project:

    gcloud config set project [PROJECT_ID]
    
  2. Deploy the app to your project:

    mvn appengine:deploy
    

The service receives export requests at [SERVICE_URL]/cloud-datastore-export and sends an authenticated request to the Cloud Datastore Admin API to begin the export.

The service uses the following URL parameters to configure the export request:

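  • output_url_prefix (required): specifies where to save your Datastore mode database export. If the URL ends with a /, it's used as is. Otherwise, the app adds a timestamp to the URL. The Python sample above differs slightly: it always appends a timestamp, so gs://[BUCKET_NAME] becomes gs://[BUCKET_NAME]/[TIMESTAMP], and any prefix that already contains a / has the timestamp appended directly.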
  • kind (optional, multiple): restricts export to only these kinds.
  • namespace_id (optional, multiple): restricts export to only these namespaces.
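
To make the parameter handling concrete, the following sketch mirrors how the Python sample's handler turns these URL parameters into an export request body. It is a standalone illustration that runs outside App Engine, not part of the deployed app. The function name build_export_request and its now argument are hypothetical and exist only so the timestamp can be fixed in an example.

```python
import datetime


def build_export_request(project_id, output_url_prefix, kinds=(),
                         namespace_ids=(), now=None):
    """Build the export request body the way the Python sample does."""
    if not output_url_prefix or not output_url_prefix.startswith('gs://'):
        raise ValueError('output_url_prefix must start with gs://')
    now = now or datetime.datetime.now()
    timestamp = now.strftime('%Y%m%d-%H%M%S')
    if '/' not in output_url_prefix[5:]:
        # Bucket name only: add a slash, then the timestamp.
        output_url_prefix += '/' + timestamp
    else:
        # A path or trailing slash is present: append the timestamp directly.
        output_url_prefix += timestamp
    return {
        'project_id': project_id,
        'output_url_prefix': output_url_prefix,
        'entity_filter': {
            'kinds': list(kinds),
            'namespace_ids': list(namespace_ids),
        },
    }


if __name__ == '__main__':
    fixed = datetime.datetime(2019, 1, 2, 3, 4, 5)
    print(build_export_request('my-project', 'gs://my-bucket',
                               kinds=['Song'], now=fixed))
```

Empty kinds and namespace_ids lists correspond to a request with no kind or namespace_id parameters, which is how the example cron job later on this page exports every entity.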

Deploying the cron job

To set up a cron job that calls the export app you deployed, create and deploy a cron.yaml file.

  1. Create a cron.yaml file:

    Python

    cron:
    - description: "Daily Cloud Datastore Export"
      url: /cloud-datastore-export?output_url_prefix=gs://[BUCKET_NAME]
      target: cloud-datastore-admin
      schedule: every 24 hours
    

    If you deployed the app to the default service instead of to the cloud-datastore-admin service, remove target: cloud-datastore-admin.

    Java

    cron:
    - description: "Daily Cloud Datastore Export"
      url: /cloud-datastore-export?output_url_prefix=gs://[BUCKET_NAME]
      schedule: every 24 hours
    

    If you did not deploy the app to the default service, add a service target line. For example: target: cloud-datastore-admin.

    Replace [BUCKET_NAME] with the name of your Cloud Storage bucket.

  2. Configure the cron job. The example cron.yaml requests an export of every entity once every 24 hours. For more scheduling options, see Schedule format.

    To export entities of only specific kinds, add kind parameters to the url value. Similarly, add namespace_id parameters to export entities from specific namespaces. For example:

    • Export entities of kind Song:

      url: /cloud-datastore-export?output_url_prefix=gs://[BUCKET_NAME]&kind=Song
      
    • Export entities of kind Song and kind Album:

      url: /cloud-datastore-export?output_url_prefix=gs://[BUCKET_NAME]&kind=Song&kind=Album
      
    • Export entities of kind Song and kind Album if they are in either the Classical namespace or the Pop namespace:

      url: /cloud-datastore-export?output_url_prefix=gs://[BUCKET_NAME]&namespace_id=Classical&namespace_id=Pop&kind=Song&kind=Album
      
  3. Deploy the cron job. Run the following command in the same directory as your cron.yaml file:

      gcloud app deploy cron.yaml
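
The url values from step 2 follow a regular pattern: one output_url_prefix parameter, then one namespace_id parameter per namespace and one kind parameter per kind. The sketch below generates such a value with the Python 3 standard library. The helper name build_cron_url is hypothetical, and the script is meant to run on your development machine, not inside the App Engine app.

```python
from urllib.parse import urlencode


def build_cron_url(bucket_name, kinds=(), namespace_ids=()):
    """Build a cron.yaml url value with repeated kind and namespace_id parameters."""
    params = [('output_url_prefix', 'gs://' + bucket_name)]
    params += [('namespace_id', ns) for ns in namespace_ids]
    params += [('kind', kind) for kind in kinds]
    # safe=':/' leaves the gs:// prefix unescaped, matching the examples above.
    return '/cloud-datastore-export?' + urlencode(params, safe=':/')


if __name__ == '__main__':
    print(build_cron_url('my-bucket', kinds=['Song', 'Album'],
                         namespace_ids=['Classical', 'Pop']))
```

Building the query string with urlencode also escapes characters in kind or namespace names, such as spaces or ampersands, that would otherwise break the URL.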
    

Testing your cron app

You can test your deployed cron job by running it manually from the Cron Jobs page of the Google Cloud Platform Console:

  1. Open the Cron Jobs page in the GCP Console.

  2. Click the Run now button for your cron job.

  3. After the job completes, verify the status message under Status. To see the cron job's log file, click View under the Log column.

Viewing your exports

After a cron job successfully completes, you can view the exports in your Cloud Storage bucket:

  1. Open the Cloud Storage browser in the GCP Console.

  2. In the list of buckets, click on the bucket that you created for your exports.

  3. Verify exports are listed in the bucket.

What's next

  • To learn how to import data from a Datastore mode database export, see Importing Entities.