Scheduled Backups

Jacob Butcher, Doug Anderson
April 16, 2012


Note: Use of this feature is limited to backups started from the application's cron or task queue.

You can run scheduled backups for your application using the App Engine Cron service. To do this for Python or Go apps, specify backup cron jobs in cron.yaml. For Java apps, specify the backup cron job in cron.xml. Currently there is no way to specify a scheduled backup programmatically.

Setting Up a Scheduled Backup

To set up a scheduled backup for your app:

  1. If you haven't already done so, enable Datastore Admin for your app.
  2. If you haven't already, create and configure the Cloud Storage bucket you wish to use for backups.
  3. In your application directory, if you don't already have one, create a cron.yaml file for a Python or Go app or a cron.xml file for a Java app.
  4. Add the backup cron entries. These specify the backup schedule, the set of entities to back up, and the storage to be used for the backups, as described in Specifying Backups in a Cron File. Here are some examples:


    Sample Python cron.yaml

    cron:
    - description: My Daily Backup
      url: /_ah/datastore_admin/backup.create?name=BackupToCloud&kind=LogTitle&kind=EventLog&filesystem=gs&gs_bucket_name=whitsend
      schedule: every 12 hours
      target: ah-builtin-python-bundle


    Sample Java cron.xml (note the use of "&amp;", as a bare "&" is interpreted by XML)

    <?xml version="1.0" encoding="UTF-8"?>
    <cronentries>
      <cron>
        <url>/_ah/datastore_admin/backup.create?name=BackupToCloud&amp;kind=LogTitle&amp;kind=EventLog&amp;filesystem=gs&amp;gs_bucket_name=whitsend</url>
        <description>My Daily Backup</description>
        <schedule>every 12 hours</schedule>
        <target>ah-builtin-python-bundle</target>
      </cron>
    </cronentries>
  5. Deploy this file with your app. (You can verify the Cron job you just deployed by clicking Cron Jobs in the left nav pane.)
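The deployment step can be sketched from the command line. The commands below assume the 2012-era Python SDK's appcfg.py tool; myapp/ is a placeholder for your application directory (Java apps would use the SDK's appcfg.sh equivalents):

```shell
# Deploy the app together with its cron configuration.
appcfg.py update myapp/

# Or update only the cron configuration without redeploying the app:
appcfg.py update_cron myapp/
```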

Backups will occur on the schedule you specified. While a backup runs, it appears in the Pending Backups list. After the backup completes, you can view and use it in the list of available backups in the Datastore Admin tab.

Specifying Backups in a Cron File

These are the fields to include in your cron file to perform scheduled backups:

description
This is the title that appears in the Cron Jobs list. It can be anything you wish.
url
The url is required and must be in this format:

    /_ah/datastore_admin/backup.create?

These fields can appear in the url query string:

  • name is an optional prefix that is prepended to the backup name. It helps you identify your backups. If not supplied, the default "cron-" will be used.
  • The kind field can appear one or more times. Each value specifies an entity kind that you wish to back up. You must specify at least one entity kind. In the Google Cloud Platform Console, the default is that all entity kinds are backed up. With a cron backup, there is no such default: if you don't specify a kind, it doesn't get backed up.
  • queue is optional. It specifies the task queue to be used. If not supplied, the default task queue is used.
  • filesystem specifies the storage to be used for backups. Specify the value "gs", which means that Google Cloud Storage will be used.
  • gs_bucket_name is required. It specifies the Cloud Storage bucket name used for backup storage.
  • namespace is optional. When provided, only entities from the selected namespace are included in the backup.

Note: The url cannot be longer than 2000 characters. As shown in the cron.xml Java example above, you must use the entity "&amp;" to separate fields, rather than a bare ampersand ("&"), since "&" has special meaning in XML.
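The query-string rules above can be sketched in a few lines of Python. This is illustrative only: the field values (BackupToCloud, LogTitle, whitsend, and so on) are the sample values from this page, not anything your app requires.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Sample field values; "kind" may repeat once per entity kind to back up.
params = [
    ("name", "BackupToCloud"),
    ("kind", "LogTitle"),
    ("kind", "EventLog"),
    ("filesystem", "gs"),
    ("gs_bucket_name", "whitsend"),
]

# Assemble the backup url for cron.yaml.
url = "/_ah/datastore_admin/backup.create?" + urlencode(params)
assert len(url) <= 2000  # the url must not exceed 2000 characters

# For cron.xml, ampersands must be written as XML entities.
xml_url = escape(url)  # "&" becomes "&amp;"
print(url)
print(xml_url)
```

The same url string works in both cron files; only the XML escaping of "&" differs.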

schedule
This field is required: it defines the recurring schedule on which the backup runs. For complete details, see the Schedule Format documentation for Python or Java.
target
This field is required. It identifies the app version that the cron backup job runs on. You must use the value ah-builtin-python-bundle, because that version of your app contains the Datastore Admin features the cron job needs. Keep in mind that the cron backup job runs against this version of your app, so you incur costs while it runs. (The ah-builtin-python-bundle version of your app is enabled when you enable Datastore Admin for your app.)

Warning! Backup, restore, copy, and delete operations are executed within your application, and thus count against your quota.

Very frequent backups often lead to higher costs. When you run a Datastore Admin job, you are actually running underlying MapReduce jobs, which increase frontend instance hours on top of Cloud Storage operations and storage usage. To keep an eye on your resource usage, click the Dashboard link under Main in the left navigation, then select ah-builtin-python-bundle from the Version drop-down menu at the top of the page.


When the scheduled backup runs, App Engine performs a GET on the backup url. A successful backup results in HTTP status 200; a failure results in HTTP status 400. You can check the logs to determine whether a backup succeeded or failed:

  1. In the GCP Console, visit the logs for your project.
  2. In the pulldown menu of resources, select App Engine, and then select the appropriate module. For the version, select ah-builtin-python-bundle to display the logs.
  3. Locate your backup job in the log to determine whether it succeeded or failed. If there was a failure, in addition to the status code 400, there will be an error message to help you determine the cause of the error.
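If you export the request logs, locating backup jobs can also be done with a short script. The sketch below assumes Apache-style request-log lines (the LOG_LINES values are hypothetical excerpts, not real log output):

```python
import re

# Hypothetical request-log excerpts for two scheduled backup runs.
LOG_LINES = [
    '0.1.0.1 - - "GET /_ah/datastore_admin/backup.create?name=BackupToCloud HTTP/1.1" 200',
    '0.1.0.1 - - "GET /_ah/datastore_admin/backup.create?name=BackupToCloud HTTP/1.1" 400',
]

def backup_statuses(lines):
    """Yield (path, status) for each backup.create request found in the log."""
    pattern = re.compile(
        r'"GET (/_ah/datastore_admin/backup\.create\S*) HTTP/[\d.]+" (\d{3})')
    for line in lines:
        m = pattern.search(line)
        if m:
            yield m.group(1), int(m.group(2))

for path, status in backup_statuses(LOG_LINES):
    print(path, "succeeded" if status == 200 else "failed")
```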