Scheduling backups


This tutorial shows you how to schedule backups for Filestore instances using Cloud Scheduler and Cloud Run functions.

Objectives

  • Create a client service account for Cloud Scheduler that has the credentials needed to invoke a Cloud Run functions function.
  • Create a client service account for use by Cloud Run functions that has the credentials to call the Filestore endpoint.
  • Create a Cloud Run functions function that creates (or deletes) a backup of a file share.
  • Create a Cloud Scheduler job that runs the create backups (or delete backups) function at regular intervals.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the Artifact Registry, Cloud Build, Filestore, Cloud Run functions, Cloud Logging, Pub/Sub, Cloud Run, and Cloud Scheduler APIs.

    Enable the APIs

  5. Install the Google Cloud CLI.
  6. To initialize the gcloud CLI, run the following command:

    gcloud init
  7. If you don't have a Filestore instance in your project, you must first create one.

Create client service accounts for Cloud Scheduler and Cloud Run functions

  1. If you haven't already done so, from the Google Cloud console, click Activate Cloud Shell.

  2. Create a client service account that Cloud Scheduler runs as to invoke a Cloud Run functions function. For this example, use the iam service-accounts create command to name the account schedulerunner and set the display name to "Service Account for FS Backups-Scheduler":

    gcloud iam service-accounts create schedulerunner \
        --display-name="Service Account for FS Backups-Scheduler"
    
  3. Create a client service account that Cloud Run functions runs as to call the Filestore endpoint. For this example, we name the account backupagent and set the display name to "Service Account for FS Backups-GCF":

    gcloud iam service-accounts create backupagent \
        --display-name="Service Account for FS Backups-GCF"
    

    To confirm that both service accounts were created, run the iam service-accounts list command:

    gcloud iam service-accounts list
    

    The command returns something like this:

    NAME                                         EMAIL                                                   DISABLED
    Service Account for FS Backups-GCF           backupagent@$PROJECT_ID.iam.gserviceaccount.com         False
    Service Account for FS Backups-Scheduler     schedulerunner@$PROJECT_ID.iam.gserviceaccount.com      False
    

Set up environment variables

Set up the following environment variables in your local environment:

  • Google Cloud project ID and project number:

    export PROJECT_ID=`gcloud config get-value core/project`
    export PROJECT_NUMBER=`gcloud projects describe $PROJECT_ID --format='value(projectNumber)'`
    
  • The Cloud Scheduler service agent and the client service accounts for Cloud Scheduler and Cloud Run functions:

    export SCHEDULER_SA=service-$PROJECT_NUMBER@gcp-sa-cloudscheduler.iam.gserviceaccount.com
    export SCHEDULER_CLIENT_SA=schedulerunner@$PROJECT_ID.iam.gserviceaccount.com
    export GCF_CLIENT_SA=backupagent@$PROJECT_ID.iam.gserviceaccount.com
    
  • Your Filestore instance:

    export FS_LOCATION=fs-location
    export INSTANCE_NAME=instance-id
    export SHARE_NAME=file-share-name
    

    Replace the following:

    • fs-location with the region or zone where the Filestore instance resides.
    • instance-id with the instance ID of the Filestore instance.
    • file-share-name with the name you specify for the NFS file share that is served from the instance.
  • Set up environment variables for your Filestore backup:

    export FS_BACKUP_LOCATION=region
    

    Replace region with the region where you want to store the backup.
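
The same variables can also feed Python code instead of hard-coded constants. The following is a minimal sketch, assuming the variable names above are visible to the Python process (for example as runtime environment variables, which this tutorial does not configure). load_backup_settings is a hypothetical helper, not part of any Google library.

```python
import os

REQUIRED_VARS = ("PROJECT_ID", "FS_LOCATION", "INSTANCE_NAME",
                 "SHARE_NAME", "FS_BACKUP_LOCATION")


def load_backup_settings(env=None):
    """Collect the tutorial's variables and fail early if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Full resource name of the source instance, in the same form that the
    # create backup sample in this tutorial sends as "source_instance".
    settings["SOURCE_INSTANCE"] = "projects/{}/locations/{}/instances/{}".format(
        env["PROJECT_ID"], env["FS_LOCATION"], env["INSTANCE_NAME"])
    return settings
```

Passing a plain dictionary as env makes the helper easy to check locally before any function is deployed.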

Create a function that creates a backup

  1. In the Google Cloud console, go to the Cloud Run functions page.

    Go to the Cloud Run functions page

  2. Click Create Function and configure the function as follows:

    • Basics:
      • Environment: For this example, select 2nd gen, which is the default.
      • Function Name: For this example, we name the function fsbackup.
      • Region: For this example, select us-central1.
    • Trigger:
      • Trigger type: Select HTTPS from the menu.
      • Authentication: Select Require authentication.
    • Runtime, build, connections and security settings:
      • Runtime > Runtime service account > Service account: Select Service Account for FS Backups-GCF (backupagent@$PROJECT_ID.iam.gserviceaccount.com) from the menu.
      • Connections > Ingress settings: Select Allow all traffic.
  3. Click Next and continue the configuration as follows:

    • Runtime: From the menu, select Python 3.8 or a later version that Cloud Run functions fully supports.
    • Source code: Inline editor.
    • Entry point: Enter create_backup.
    • Add the following dependencies to your requirements.txt file:

      google-auth==2.29.0
      requests==2.31.0
      

      Depending on your use case, you may need to specify other dependencies along with their corresponding version numbers. For more information, see Pre-installed packages.

    • Copy the following Python code sample into the main.py file using the inline editor:

      Create backups

      This code sample creates a backup whose name is mybackup- followed by the creation time.

      PROJECT_ID = 'project-id'
      SOURCE_INSTANCE_ZONE = 'filestore-zone'
      SOURCE_INSTANCE_NAME = 'filestore-name'
      SOURCE_FILE_SHARE_NAME = 'file-share-name'
      BACKUP_REGION = 'backup-region'
      
      import google.auth
      import google.auth.transport.requests
      from google.auth.transport.requests import AuthorizedSession
      import time
      import requests
      import json
      
      credentials, project = google.auth.default()
      request = google.auth.transport.requests.Request()
      credentials.refresh(request)
      authed_session = AuthorizedSession(credentials)
      
      def get_backup_id():
          return "mybackup-" + time.strftime("%Y%m%d-%H%M%S")
      
      def create_backup(request):
          trigger_run_url = "https://file.googleapis.com/v1/projects/{}/locations/{}/backups?backupId={}".format(PROJECT_ID, BACKUP_REGION, get_backup_id())
          headers = {
            'Content-Type': 'application/json'
          }
          post_data = {
            "description": "my new backup",
            "source_instance": "projects/{}/locations/{}/instances/{}".format(PROJECT_ID, SOURCE_INSTANCE_ZONE, SOURCE_INSTANCE_NAME),
            "source_file_share": "{}".format(SOURCE_FILE_SHARE_NAME)
          }
          print("Making a request to " + trigger_run_url)
          r = authed_session.post(url=trigger_run_url, headers=headers, data=json.dumps(post_data))
          data = r.json()
          print(data)
          if r.status_code == requests.codes.ok:
            print(str(r.status_code) + ": The backup is uploading in the background.")
          else:
            raise RuntimeError(data['error'])
          return "Backup creation has begun!"
      

      Replace the following:

      • project-id with the Google Cloud project ID of the source Filestore instance.
      • filestore-zone with the zone of the source Filestore instance.
      • filestore-name with the name of the source Filestore instance.
      • file-share-name with the name of the file share.
      • backup-region with the region to store the backup.
      1. Click Test function.

        A new tab session opens in Cloud Shell. In it, the following message is returned if successful:

        Function is ready to test.
        
      2. Click Deploy and wait for the deployment to finish.

      3. Switch back to the previous Cloud Shell tab.
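
To see what the create backups sample sends, the request can be assembled without calling the API. The following is a sketch under stated assumptions: build_create_backup_request is a hypothetical helper, the project and instance values are placeholders, and the backup ID is URL-encoded with the standard library as a precaution that the original sample does not take.

```python
import json
import time
from urllib.parse import urlencode


def build_create_backup_request(project_id, backup_region, instance_zone,
                                instance_name, share_name, now=None):
    """Return (url, body) for the create request without sending it."""
    now = time.gmtime() if now is None else now
    backup_id = "mybackup-" + time.strftime("%Y%m%d-%H%M%S", now)
    url = "https://file.googleapis.com/v1/projects/{}/locations/{}/backups?{}".format(
        project_id, backup_region, urlencode({"backupId": backup_id}))
    body = json.dumps({
        "description": "my new backup",
        "source_instance": "projects/{}/locations/{}/instances/{}".format(
            project_id, instance_zone, instance_name),
        "source_file_share": share_name,
    })
    return url, body


# Placeholder values; a fixed timestamp keeps the output reproducible.
url, body = build_create_backup_request(
    "my-project", "us-central1", "us-central1-c", "nfs-server", "vol1",
    now=time.strptime("2020-11-23 18:45:00", "%Y-%m-%d %H:%M:%S"))
print(url)
print(body)
```

Printing the URL and body this way is a quick check that the zone, region, and share name ended up in the right fields before you deploy.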

      Delete backups

      This code sample deletes backups older than a predefined period.

      Configure this function in the same way as the create backups function, except for the following:

      • Function name: deletefsbackups.
      • Entry point: delete_backup.
      PROJECT_ID = 'project-id'
      BACKUP_REGION = 'region'
      BACKUP_RETENTION_TIME_HRS = hours
      
      import calendar
      import google.auth
      import google.auth.transport.requests
      from google.auth.transport.requests import AuthorizedSession
      import time
      import requests
      
      credentials, project = google.auth.default()
      request = google.auth.transport.requests.Request()
      credentials.refresh(request)
      authed_session = AuthorizedSession(credentials)
      
      retention_seconds = BACKUP_RETENTION_TIME_HRS * 60 * 60
      
      def delete_backup(request):
          now = time.time()
          backups = []
          list_url = "https://file.googleapis.com/v1beta1/projects/{}/locations/{}/backups".format(PROJECT_ID, BACKUP_REGION)
          # Follow nextPageToken until every page of backups has been collected.
          params = {}
          while True:
              r = authed_session.get(list_url, params=params)
              data = r.json()
              if r.status_code != requests.codes.ok:
                  raise RuntimeError(data['error'])
              # A page with no backups omits the 'backups' key.
              backups.extend(data.get('backups', []))
              if 'nextPageToken' not in data:
                  break
              params = {'pageToken': data['nextPageToken']}
          if not backups:
              print("No backups to delete.")
              return "No backups to delete."
          for backup in backups:
              # createTime is a UTC timestamp. The first 19 characters
              # (YYYY-MM-DDTHH:MM:SS) are precise enough for a retention
              # period measured in hours.
              created = calendar.timegm(time.strptime(backup['createTime'][:19], "%Y-%m-%dT%H:%M:%S"))
              if now - created > retention_seconds:
                  r = authed_session.delete("https://file.googleapis.com/v1beta1/{}".format(backup['name']))
                  data = r.json()
                  print(data)
                  if r.status_code == requests.codes.ok:
                      print(str(r.status_code) + ": Deleting " + backup['name'] + " in the background.")
                  else:
                      raise RuntimeError(data['error'])
          return "Backup deletion has begun!"
      

      Replace the following:

      • project-id with the Google Cloud project ID of the backup.
      • region with the region the backups reside in.
      • hours with the number of hours to retain backups. For example, if you want to retain backups for 10 days, input 240.
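
The retention comparison in the delete sample can be isolated into a small function that is easy to test without calling the API. This is a sketch rather than the tutorial's code: is_expired is a hypothetical helper, and it assumes createTime is an RFC 3339 UTC timestamp such as 2020-11-23T18:45:00.123456789Z.

```python
import calendar
import time


def is_expired(create_time, retention_hours, now=None):
    """Return True if a UTC createTime is older than the retention window."""
    now = time.time() if now is None else now
    # Keep only YYYY-MM-DDTHH:MM:SS; fractional seconds do not matter here.
    parsed = time.strptime(create_time[:19], "%Y-%m-%dT%H:%M:%S")
    created = calendar.timegm(parsed)  # interpret the timestamp as UTC
    return now - created > retention_hours * 60 * 60


# Fixed reference time (2020-12-04 00:00:00 UTC) so the result is reproducible.
reference = calendar.timegm((2020, 12, 4, 0, 0, 0))
print(is_expired("2020-11-23T18:45:00.123456789Z", 240, now=reference))  # True
print(is_expired("2020-11-30T18:45:00Z", 240, now=reference))            # False
```

With a 240-hour (10-day) retention period, the first backup is a little over 10 days old at the reference time and is treated as expired; the second is about 3 days old and is kept.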

Assign IAM roles to the client service accounts

  1. Add the Cloud Scheduler service agent to the IAM policy of the Cloud Scheduler client service account with the role of roles/cloudscheduler.serviceAgent. This allows the service agent to impersonate the client service account in order to invoke the function that creates a backup. Run the iam service-accounts add-iam-policy-binding command:

    gcloud iam service-accounts add-iam-policy-binding $SCHEDULER_CLIENT_SA \
        --member=serviceAccount:$SCHEDULER_SA \
        --role=roles/cloudscheduler.serviceAgent
    
  2. Give the client service account of Cloud Run functions the roles/file.editor role so that it can make calls to the Filestore endpoint. Run the projects add-iam-policy-binding command:

    gcloud projects add-iam-policy-binding $PROJECT_ID \
        --member=serviceAccount:$GCF_CLIENT_SA \
        --role=roles/file.editor
    
  3. Grant the client service account of Cloud Scheduler the role of roles/cloudfunctions.invoker for the function you want to use. Run the following functions add-iam-policy-binding command:

    Create backups

    gcloud functions add-iam-policy-binding fsbackup \
        --member serviceAccount:$SCHEDULER_CLIENT_SA \
        --role roles/cloudfunctions.invoker
    

    Now, only the client service account of Cloud Scheduler can invoke fsbackup.

    Delete backups

    gcloud functions add-iam-policy-binding deletefsbackups \
        --member serviceAccount:$SCHEDULER_CLIENT_SA \
        --role roles/cloudfunctions.invoker
    

    Now, only the client service account of Cloud Scheduler can invoke deletefsbackups.

Create a Cloud Scheduler job that triggers the fsbackup function on a specified schedule

  1. Create the job with the scheduler jobs create http command. This example schedules a backup every weekday at 10 pm:

    gcloud scheduler jobs create http fsbackupschedule \
        --schedule "0 22 * * 1-5" \
        --http-method=GET \
        --uri=https://us-central1-$PROJECT_ID.cloudfunctions.net/fsbackup \
        --oidc-service-account-email=$SCHEDULER_CLIENT_SA    \
        --oidc-token-audience=https://us-central1-$PROJECT_ID.cloudfunctions.net/fsbackup
    

    The --schedule flag specifies how often the job runs, in unix-cron format. For details, see Configuring cron job schedules.

  2. Start the Cloud Scheduler job created in the previous step. In our example, use the scheduler jobs run command to run it immediately:

    gcloud scheduler jobs run fsbackupschedule
    

    The fsbackupschedule job invokes the fsbackup function as soon as you execute the command, and then invokes it again every weekday at 10 pm until the job is paused.

  3. Check the logs for the fsbackup function to confirm that it executes properly and returns a status 200.

  4. Check if the backup is created using the backups list command:

    gcloud filestore backups list
    

    The command returns something similar to:

    NAME                      LOCATION     SRC_INSTANCE                        SRC_FILE_SHARE  STATE
    mybackup-20201123-184500  us-central1  us-central1-c/instances/nfs-server  vol1            READY
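
To make the schedule string 0 22 * * 1-5 concrete, the following sketch checks whether a given moment falls on a five-field cron schedule. It is an illustration only, not how Cloud Scheduler evaluates schedules: cron_matches is a hypothetical helper that handles *, single numbers, lists, and ranges, ignores step values and names, and requires every field to match, which is simpler than full cron semantics when both day fields are restricted.

```python
from datetime import datetime


def _field_matches(field, value):
    """Match one cron field that uses only *, numbers, lists, and ranges."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(schedule, moment):
    """Return True if moment falls on the schedule (simplified semantics)."""
    minute, hour, day, month, weekday = schedule.split()
    cron_weekday = moment.isoweekday() % 7  # cron counts Sunday as 0
    return (_field_matches(minute, moment.minute)
            and _field_matches(hour, moment.hour)
            and _field_matches(day, moment.day)
            and _field_matches(month, moment.month)
            and _field_matches(weekday, cron_weekday))


print(cron_matches("0 22 * * 1-5", datetime(2020, 11, 23, 22, 0)))  # Monday
print(cron_matches("0 22 * * 1-5", datetime(2020, 11, 21, 22, 0)))  # Saturday
```

The fields are, in order, minute, hour, day of month, month, and day of week, so the example schedule reads as minute 0 of hour 22 on days 1 through 5 (Monday through Friday).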
    

Low quota alerts for backups

If your implementation of scheduling backups puts you at risk of running out of backups quota, we recommend that you set up low backups quota alerts. This way, you are notified when backups quota runs low.

Clean up

After you finish the tutorial, you can clean up the resources that you created so that they stop using quota and incurring charges. The following sections describe how to delete or turn off these resources.

Delete the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

What's next