Configure disaster recovery for a cluster

This page describes how to configure disaster recovery for cluster workloads in Google Distributed Cloud (GDC) air-gapped.

As a Platform Administrator (PA), you must create a bucket, a backup repository, and a backup plan for a specified cluster.

After you create these resources, you must ask an Infrastructure Operator (IO) to perform the restore.

Before you begin

To configure disaster recovery for a cluster, you must have the following:

  • Access to the Kubernetes cluster that you want to configure disaster recovery for. For more information, see Kubernetes cluster overview.
  • The necessary identity and access roles:
    • DR Backup Admin MP: performs disaster recovery backups. Ask your Organization IAM Admin to grant you the DR Backup Admin MP (dr-backup-admin-mp) cluster role.
    • DR System Admin MP: manages objects in the dr-system namespace to set up management cluster backups. Ask your Organization IAM Admin to grant you the DR System Admin MP (dr-system-admin-mp) role.

Configure disaster recovery for a cluster

For each YAML file you configure in these steps, apply it using kubectl apply -f.
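
As a minimal sketch of that workflow, assume you save the three manifests from the steps in this section under the hypothetical file names bucket.yaml, backup-repository.yaml, and backup-plan.yaml, and that your kubeconfig already points at the target cluster. The DRY_RUN variable is an illustration only and is not a kubectl feature; with its default value the script prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical file names; use the names you saved the manifests under.
# The order matters: the bucket first, then the repository, then the plan.
MANIFESTS="bucket.yaml backup-repository.yaml backup-plan.yaml"

# DRY_RUN=1 (the default in this sketch) only prints each command.
# Set DRY_RUN=0 to run kubectl against the cluster.
DRY_RUN="${DRY_RUN:-1}"

apply_manifests() {
  for manifest in $MANIFESTS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "kubectl apply -f $manifest"
    else
      # Stop at the first failure so later resources are not applied
      # on top of a missing dependency.
      kubectl apply -f "$manifest" || return 1
    fi
  done
}

apply_manifests
```

Applying the files one at a time, in the order of the steps, makes it easier to see which resource failed if kubectl reports an error.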

To configure disaster recovery for a cluster, follow these steps:

  1. Create a bucket in the cluster:

    apiVersion: object.gdc.goog/v1
    kind: Bucket
    metadata:
      name: BUCKET_NAME
      namespace: dr-system
    spec:
      description: BUCKET_DESCRIPTION
      storageClass: Standard
    

    Replace the following:

    • BUCKET_NAME: the name of the bucket where backups are stored. The name must be globally unique, such as backup-bucket-1.
    • BUCKET_DESCRIPTION: a brief description of the bucket's purpose.
  2. Create a BackupRepository resource in the cluster that points to the bucket you created in the previous step. The repository uses a secret to authenticate with the bucket and is configured to allow both reading and writing of backups:

    apiVersion: backup.gdc.goog/v1
    kind: BackupRepository
    metadata:
      name: BACKUP_REPO_NAME
    spec:
      secretReference:
        namespace: NAMESPACE
        name: SECRET_NAME
      endpoint: ENDPOINT
      type: "S3"
      s3Options:
        bucket: BUCKET_FQDN
        region: BUCKET_REGION
        forcePathStyle: true
      importPolicy: "ReadWrite"
      force: true
    

    Replace the following:

    • BACKUP_REPO_NAME: the name you choose for the backup repository resource in Kubernetes.
    • NAMESPACE: the namespace that contains the secret.
    • SECRET_NAME: the name of the Kubernetes secret that holds the credentials (the access key and secret key) for accessing the bucket. For more information on getting the secret attached to a bucket, see Obtain user access.
    • ENDPOINT: the endpoint URL for accessing the bucket.
    • BUCKET_FQDN: the fully qualified domain name (FQDN) of the bucket. This field includes the bucket name and the endpoint.
    • BUCKET_REGION: the region where the bucket is located, such as us-east-1 or europe-west2.
  3. Create a BackupPlan resource to schedule regular backups:

    apiVersion: backup.gdc.goog/v1
    kind: BackupPlan
    metadata:
      name: BACKUP_PLAN_NAME
      namespace: dr-system
    spec:
      clusterName: CLUSTER_NAME
      backupSchedule:
        cronSchedule: "*/10 * * * *"
        paused: false
      backupConfig:
        backupScope:
          selectedNamespaces:
            namespaces:
            - NAMESPACE
        backupRepository: BACKUP_REPO_NAME
        includeVolumeData: true
        volumeStrategy: LocalSnapshotOnly
      retentionPolicy:
        backupDeleteLockDays: 10
        backupRetainDays: 10
    

    Replace the following:

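    • BACKUP_PLAN_NAME: the name you choose for the backup plan resource.
    • CLUSTER_NAME: the name of the Kubernetes cluster being backed up.
    • NAMESPACE: the namespace to back up.
    • BACKUP_REPO_NAME: the name of the backup repository you created in the previous step.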

    This backup plan does the following:

    • Backs up the selected namespace every 10 minutes.
    • Includes volume data and uses a local snapshot strategy.
    • Retains backups for 10 days.
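
To make the schedule and retention settings concrete, the following sketch works through the arithmetic for the values in the example plan. It assumes that every scheduled backup succeeds and that each backup is kept for exactly backupRetainDays. The variable and function names are illustrative and are not part of any GDC API.

```shell
#!/bin/sh
# Values taken from the example BackupPlan in step 3.
INTERVAL_MINUTES=10   # from cronSchedule "*/10 * * * *"
RETAIN_DAYS=10        # from backupRetainDays

# List the minutes of each hour at which "*/10" in the cron minute field fires.
fire_minutes() {
  minute=0
  minutes=""
  while [ "$minute" -lt 60 ]; do
    minutes="${minutes:+$minutes }$minute"
    minute=$((minute + INTERVAL_MINUTES))
  done
  echo "$minutes"
}

# Approximate number of backups held at steady state
# (assumption: no failed or manually deleted backups).
retained_backups() {
  echo $(( (60 / INTERVAL_MINUTES) * 24 * RETAIN_DAYS ))
}

fire_minutes       # prints: 0 10 20 30 40 50
retained_backups   # prints: 1440
```

Under these assumptions, a 10-minute schedule with 10 days of retention keeps roughly 1440 backups at a time, which is worth checking against the capacity of the bucket before you choose a schedule.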

Perform the restoration

You must escalate to an Infrastructure Operator (IO) to perform the restore on your behalf. Provide the necessary information, such as the names of the BackupRepository and BackupPlan resources. For more information on personas in GDC, see Personas.
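
Before you escalate, it can help to confirm that both resources exist so that you pass the correct names to the IO. The following is a minimal sketch, not an official procedure. The resource names backuprepositories.backup.gdc.goog and backupplans.backup.gdc.goog are assumptions derived from the kind and apiVersion fields in the manifests on this page, the DRY_RUN switch and the run helper are hypothetical, and the BackupRepository lookup omits a namespace only because the example manifest does not set one.

```shell
#!/bin/sh
# Replace these placeholders with the names you used in the manifests.
BACKUP_REPO_NAME="${BACKUP_REPO_NAME:-BACKUP_REPO_NAME}"
BACKUP_PLAN_NAME="${BACKUP_PLAN_NAME:-BACKUP_PLAN_NAME}"

# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Assumed resource names (plural kind plus API group); adjust if your cluster differs.
check_resources() {
  run kubectl get backuprepositories.backup.gdc.goog "$BACKUP_REPO_NAME" &&
  run kubectl get backupplans.backup.gdc.goog "$BACKUP_PLAN_NAME" -n dr-system
}

check_resources
```

With DRY_RUN=0, a non-zero exit status from either lookup indicates that the corresponding resource was not found under the name you intend to give the IO.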