App Engine

Backup/Restore, Copy, and Delete Data

Enabling Datastore Admin for an application

To use the features of the Datastore Admin tab, you may be asked to enable Datastore Admin for your application using the Application Settings page of the Admin Console. Under the Data heading in the left-hand navigation menu, click Datastore Admin, then click Enable Datastore Admin in the page that appears.

Caveats on using data admin features

  • For copy, delete, and backup operations, recent updates may not be included.
  • All Datastore Admin operations occur within your application, and thus count against your quota.
  • We strongly recommend that you set your application to read-only mode during a backup or restore.
  • When copying entities to a remote target or restoring from a backup, all entities with the same keys will be overwritten. Operations can be performed multiple times without the risk of creating duplicates. Be aware that copy and restore operations do not delete extra data.
  • If a non-default queue is chosen for backup/restore, it must not have any target other than ah-builtin-python-bundle specified in queue.yaml.
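
The queue caveat above can be checked mechanically before a job is started. The following is a minimal sketch, not part of Datastore Admin: the function name and the dict layout (one dict per queue.yaml entry) are assumptions made for illustration, and only the target name ah-builtin-python-bundle comes from the text.

```python
# Target name taken from the caveat above.
DATASTORE_ADMIN_TARGET = "ah-builtin-python-bundle"


def queue_usable_for_backup(queue):
    """Return True if the queue has no target, or only the Datastore Admin target.

    `queue` is a hypothetical dict mirroring one entry of queue.yaml,
    for example {"name": "backups", "rate": "5/s", "target": "..."}.
    """
    target = queue.get("target")
    return target is None or target == DATASTORE_ADMIN_TARGET


# Hypothetical queue definitions, as they might look after parsing queue.yaml.
queues = [
    {"name": "default", "rate": "5/s"},
    {"name": "backups", "rate": "5/s", "target": "ah-builtin-python-bundle"},
    {"name": "frontend-work", "rate": "1/s", "target": "version-2"},
]

usable = [q["name"] for q in queues if queue_usable_for_backup(q)]
print(usable)  # ['default', 'backups']
```

Here the queue named frontend-work is rejected because it routes tasks to a different target.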

Very frequent backups often lead to higher costs. When you run a Datastore Admin job, you are actually running an underlying MapReduce job. MapReduce jobs cause frontend instance hours to increase on top of Storage operations and Storage usage. To keep an eye on your resource usage:

  1. Visit the Admin Console and click the Dashboard link under Main in the left navigation.
  2. At the top of the page, select ah-builtin-python-bundle from the Version drop-down menu.

Backup and restore data

You can back up all entities or just the selected kinds of entities, and you can restore from one of these backups when you need to. The backup and restore feature is intended to help you recover from accidental deletes of data or to enable you to export data.

You can also use backup files to export your data to other Cloud Platform services, such as BigQuery. If you want to export your data, be sure to use Google Cloud Storage for your backups and not Blobstore.

When you restore from a backup, any new entities added since the backup are retained, and entities that existed at backup-time and that were modified after the backup are overwritten with values from the backup. You can restore all data from a backup or you can restore specific entity kinds from the backup. In addition, you can also use this feature to restore a backup of one app's data to some other app, provided that you use Google Cloud Storage for your backups.
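
The restore semantics described above can be pictured with a toy model. This sketch uses plain dictionaries rather than any Datastore API; the function name, the (kind, id) key layout, and the sample values are all assumptions made for illustration.

```python
def restore(current, backup, kinds=None):
    """Toy model of a restore. Both arguments map (kind, id) -> entity value.

    Entities present in the backup overwrite the current values; entities
    that exist only in `current` (added after the backup) are retained.
    If `kinds` is given, only those entity kinds are restored.
    """
    restored = dict(current)
    for key, value in backup.items():
        kind = key[0]
        if kinds is None or kind in kinds:
            restored[key] = value
    return restored


backup = {
    ("Greeting", 1): "original greeting",
    ("Author", 1): "original author",
}
current = {
    ("Greeting", 1): "greeting edited after backup",
    ("Greeting", 2): "greeting created after backup",
    ("Author", 1): "author edited after backup",
}

full = restore(current, backup)
only_greetings = restore(current, backup, kinds={"Greeting"})

print(full[("Greeting", 1)])            # original greeting
print(full[("Greeting", 2)])            # greeting created after backup
print(only_greetings[("Author", 1)])    # author edited after backup
```

The second call shows a restore limited to one entity kind: the Author entity keeps its post-backup value because that kind was not selected.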

Limits affecting backups and restores

By default, you will be limited to no more than 100 GB of backups and 100 GB of restores per day subject to retries.

Backing up data

To create a backup file for future data restores or for exporting:

  1. Go to the Datastore Admin screen in the Data section of the Admin Console.
  2. Select the entity kind(s) that you wish to back up.
  3. Click Backup Entities to display the backup form.
  4. Notice that the default queue is used for the backup job; you can use this in most cases. If you need to change this to another queue, make sure the queue used does not have any target specified in queue.yaml other than ah-builtin-python-bundle.
  5. Notice that a backup name is supplied and that it includes a datestamp. You must change this value if you make more than one backup per day because a backup will not be made if a backup of the same name already exists.
  6. Select the backup storage location by making the appropriate choice in the dropdown menu: Blobstore or Google Cloud Storage. If you are exporting data, use Google Cloud Storage.
  7. When you choose Google Cloud Storage, you are prompted for the bucket name where the backups are to be stored, in the format /gs/my_bucket.

    To find the bucket name you should use, locate the bucket name in the Admin Console under Application Settings > Basics. (The name is typically in the form <App ID>.appspot.com) If you don't see any bucket, create the default bucket using Application Settings > Cloud Integrations > Create.

    The default bucket is already fully configured for your app, has free quota, and does not require billing to be set up.

    If you don't use the default bucket, you must enable your app for billing, create your own bucket, and then assign write permissions to the bucket, as described in the Google Cloud Storage documents.

  8. Start the backup job by clicking Backup Entities. Notice that a job status page is displayed.

  9. If you disabled writes, re-enable Datastore writes for your application.
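
Two values typed into the backup form in the steps above, the backup name and the bucket path, are easy to get wrong by hand. The helpers below are a sketch under stated assumptions: the function names and the timestamp layout are hypothetical (the form's own default name format is not reproduced here), and the default bucket name follows the "typically <App ID>.appspot.com" wording, so confirm the real name in Application Settings.

```python
from datetime import datetime


def backup_name(prefix, now):
    """Build a backup name with a timestamp down to the second, so that
    two backups made on the same day do not share a name."""
    return "{}_{}".format(prefix, now.strftime("%Y_%m_%d_%H%M%S"))


def gcs_bucket_path(bucket):
    """Format a bucket name the way the backup form expects: /gs/<bucket>."""
    return "/gs/" + bucket.strip("/")


def default_bucket(app_id):
    # Assumption: the default bucket is *typically* <App ID>.appspot.com.
    return app_id + ".appspot.com"


name = backup_name("nightly", datetime(2014, 5, 1, 9, 30, 0))
path = gcs_bucket_path(default_bucket("my-app"))

print(name)  # nightly_2014_05_01_093000
print(path)  # /gs/my-app.appspot.com
```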

Aborting a backup

If backup jobs are currently running, they appear in a Pending Backups list in the Datastore Admin screen. You can stop these running backups by selecting the backup in the list and clicking Abort. When you abort a backup job, App Engine attempts to delete backup data that has been saved up to that point. However, in some cases, some files may remain after the abort. You can locate these files in the location you chose for your backups (Blobstore or Google Cloud Storage) and safely delete them after the abort completes. The names of such files will start with the following pattern: datastore_backup__your_backup_name_.
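
Once you have a listing of file names from your backup location, leftover files from an aborted backup can be picked out by prefix. This is a sketch only: the function name and sample file names are hypothetical, and it assumes the prefix is datastore_backup_ followed by the backup name. It does not call Blobstore or Google Cloud Storage, and it deletes nothing.

```python
def leftover_backup_files(file_names, backup_name):
    """Return the names that start with the aborted backup's prefix.

    Assumption: leftover files start with "datastore_backup_" followed by
    the backup name. A name that is a prefix of another backup's name
    (for example "daily" and "daily2") would match both, so review the
    result before deleting anything.
    """
    prefix = "datastore_backup_" + backup_name
    return [name for name in file_names if name.startswith(prefix)]


# Hypothetical listing of a backup location.
listing = [
    "datastore_backup_nightly_2014_05_01_Greeting-0001",
    "datastore_backup_weekly_2014_04_27_Greeting-0001",
    "unrelated-upload.png",
]

print(leftover_backup_files(listing, "nightly"))
# ['datastore_backup_nightly_2014_05_01_Greeting-0001']
```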

Finding information about a backup

You may want to find out details about a backup, such as which entity kinds it contains, where it was saved (e.g., Blobstore or Google Cloud Storage), and its starting and ending time. To display this backup information:

  1. Select one or more backups in the Backups or Pending Backups list.
  2. Click Info to display information for those backups.
  3. Click Back to return to the main Datastore Admin screen.

Scheduled backups

You can run scheduled backups using the App Engine Cron service. For details, see Scheduled Backups.

Restoring data

To restore from a backup:

  1. Optionally, disable Datastore writes for your app. (It's normally a good idea to do this to avoid conflicts between the restore and any new data written to the Datastore.)
  2. Go to the Datastore Admin screen in the Data section of the Administration Console.
  3. In the list of available backups, select the backup that you want to restore from.
  4. Click Restore.
  5. In the advisory page that is displayed, notice the list of entities with checkboxes. By default, all of the entities will be restored. Uncheck the checkbox next to each entity that you don't want to restore.
  6. Also in the advisory page, notice that the default queue, with its pre-configured performance settings, is used for the restore job. Change this to another queue that you have configured differently if you need different queue performance characteristics, making sure the queue chosen does not have any target specified in queue.yaml other than ah-builtin-python-bundle.
  7. Start the restore by clicking Restore. Notice that a job status page is displayed.
  8. If you disabled writes, re-enable Datastore writes for your application.

Restoring data to another app

If you back up your data using Google Cloud Storage, you can restore backups to apps other than the one used to create the backup.

To restore backup data from one app to a different app:

  1. Using the Google Developers console, locate the project that has the bucket used for your backups and add the target app (the app you are restoring to) to the project team with Edit permissions.
  2. Make a new backup in the application whose data is to be copied. The permissions set in the previous step are not retroactive to existing backups, so the target app will not be able to access those earlier backups. The target app can access only backups made after it was given permissions.
  3. Optionally, disable Datastore writes for your target app. (This is normally a good idea, to avoid conflicts between the restore and any new data written to the Datastore.)
  4. For the target app, go to the Datastore Admin screen in the Data section of the Administration Console.
  5. In the textbox next to the button labelled Import Backup Information specify the bucket containing the backup, in the format /gs/my_bucket. This will result in a displayed list of all the backups in that bucket. Alternatively, supply the file handle for a specific backup; the handle can be obtained from the source application by selecting the backup and clicking Info; the file handle appears next to the label Handle.
  6. Click Import Backup Information.
  7. The resulting selection page shows the available backups for the bucket you specified, unless you specified a backup by its handle. Select the desired backup and click one of the following:
    • Add to Backup List if you want this backup to be retained in the list of available backups for your app.
    • Restore From Backup if you want to restore from this backup but do not want the backup displayed in the list of available backups for your app.
  8. In the advisory page that is displayed, notice the list of entities with checkboxes. By default, all of the entities will be restored. Uncheck the checkbox next to each entity that you don't want to restore.
  9. Also in the advisory page, notice that the default queue, with its pre-configured performance settings, is used for the restore job. Change this to another queue that you have configured differently if you need different queue performance characteristics.
  10. Start the restore by clicking Restore. Notice that a job status page is displayed.
  11. If you disabled writes, re-enable Datastore writes for your application.

Copying entities to another application

You can use the Datastore Admin tab of the Admin Console to copy all entities of a kind, or all entities of all kinds, to another application. The Datastore copy functionality uses the Datastore Admin screen in the Data section of the source application's Admin Console. The default queue, with its pre-configured performance settings, is used for the copying job. Change this to another queue that you have configured differently if you need different queue performance characteristics.

From the Datastore Admin screen, you can select and copy entity kind(s) with the click of a button.

A note for Java developers

The Datastore copy feature is currently available only for Python applications. If your target app is built in Java, you'll need to create a non-default Python runtime for it and use that as the target application. The following steps describe how to create a non-default Python runtime for your target application using a sample application called Datastore_admin:

  1. Download Python 2.7.
  2. Download the Datastore_admin app.
  3. Download the Python SDK.
  4. Grant permission for the source application to write to the target application, as described in step 3 of Procedure for copying a Datastore.
  5. Run appcfg.py -A <your_app_id> update <directory-of-demo-app>

Procedure for copying a Datastore

To copy a Datastore:

  1. Make sure Datastore Admin is enabled for your application.
  2. Enable the remote_api builtin for the target application so it can receive data from the source. To do this, add the following to app.yaml:

    builtins:
    - remote_api: on
    
  3. Grant permission for the source app to write to the target app by doing the following:

    1. Add the following to the appengine_config.py file in the root directory of your application:

      # Replace the placeholder with the app ID of the source application.
      remoteapi_CUSTOM_ENVIRONMENT_AUTHENTICATION = ('HTTP_X_APPENGINE_INBOUND_APPID', ['source appid here'])
      

      If you do not have an appengine_config.py file, you can create a new one or copy the sample located in google/appengine/ext/appstats/sample_appengine_config.py.

    2. Upload the modified version of your application using appcfg.py update.

  4. Set the source application to read-only mode.

    Although this step is not required, it is strongly recommended. Copying entities into a new Datastore takes time and writes done during the copy might not be transferred during the copy. If the source application is not in read-only mode, you can still copy data, but you'll see a notreadonly warning in the Admin Console. The destination Datastore is not guaranteed to receive a complete copy of the new data unless writes are disabled.

  5. Select the entity kind(s) to copy individually or in bulk, and copy them using Copy To Other App. On the confirmation screen, enter the remote endpoint of the target app:

    • The typical remote endpoint is http://<your_target_app_id>.appspot.com/_ah/remote_api.
    • If you used the sample app to copy your data, the remote endpoint is http://datastore-admin.<your_target_app_id>.appspot.com/_ah/remote_api.
    • If you used an alternate major version, the remote endpoint of your target application is http://<app_version>.<your_target_app_id>.appspot.com/_ah/remote_api.

    After you confirm, the system validates the request. If the remote_api connection can be established, one or more mapreduce operations begin to copy the data. You can follow the link to see the status of the initial set of mapreduce operations. If the application uses the namespace feature, a mapreduce runs for each namespace for each entity kind.

    You can view a summary of the copy status from the Datastore Admin page. To view the individual mapreduce status, you can visit http://<your_target_app_id>.appspot.com/_ah/mapreduce/.

  6. If you disabled Datastore writes as recommended prior to the copy, re-enable writes.
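
The three remote endpoint forms listed in step 5 differ only in an optional version prefix on the host name. A minimal sketch of that rule follows; the function name and the app ID my-target are hypothetical, while the URL shapes are the ones given in the step.

```python
def remote_api_endpoint(target_app_id, version=None):
    """Build the remote_api endpoint to enter on the copy confirmation screen.

    version=None               -> default version of the target app
    version="datastore-admin"  -> the sample app used for Java targets
    any other value            -> an alternate major version
    """
    host = "{}.appspot.com".format(target_app_id)
    if version:
        host = "{}.{}".format(version, host)
    return "http://{}/_ah/remote_api".format(host)


typical = remote_api_endpoint("my-target")
sample_app = remote_api_endpoint("my-target", version="datastore-admin")

print(typical)     # http://my-target.appspot.com/_ah/remote_api
print(sample_app)  # http://datastore-admin.my-target.appspot.com/_ah/remote_api
```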

Deleting entities in bulk

You can use the Datastore Admin tab of the Admin Console to delete all entities of a kind, or all entities of all kinds, in all namespaces. To enable this feature, simply enable Datastore Admin for your application in the Administration Console.

Enabling Datastore Admin adds the Datastore Admin screen to the Data section of the Administration Console. From this screen, you can select the entity kind(s) to delete individually or in bulk, and delete them using the Delete Entities button.