About Persistent Disk Asynchronous Replication


Persistent Disk Asynchronous Replication (PD Async Replication) provides low recovery point objective (RPO) and low recovery time objective (RTO) block storage replication for cross-region active-passive disaster recovery (DR).

PD Async Replication is a storage option that provides asynchronous replication of data between two regions. In the unlikely event of a regional outage, PD Async Replication enables you to fail over your data to a secondary region and restart your workload in that region.

You can use PD Async Replication to manage replication for Compute Engine workloads at the infrastructure level instead of the workload level.

Overview

Persistent Disk Asynchronous Replication replicates data from a disk that is attached to a running workload, the primary disk, to a separate disk located in another region. The disk receiving replicated data is referred to as the secondary disk.

The region that the primary disk is located in is referred to as the primary region and the region that the secondary disk is located in is referred to as the secondary region. The primary and secondary regions are referred to as a region pair.

Any disk that meets the disk requirements can be used as a primary disk. After you have a primary disk, you can create a secondary disk that references the primary disk and start replication from the primary disk to the secondary disk.
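
For example, a minimal sketch of these two steps with the gcloud CLI might look like the following. The disk names, zones, and size are placeholder assumptions; the secondary disk must match the size of your primary disk.

    # Create a secondary disk that references the primary disk.
    gcloud compute disks create secondary-disk-1 \
        --zone=us-west1-a \
        --size=100GB \
        --primary-disk=primary-disk-1 \
        --primary-disk-zone=us-central1-a

    # Start replication from the primary disk to the secondary disk.
    gcloud compute disks start-async-replication primary-disk-1 \
        --zone=us-central1-a \
        --secondary-disk=secondary-disk-1 \
        --secondary-disk-zone=us-west1-a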

If you stop replication from the primary disk at any point and want to restart replication later, you must create a new secondary disk.

Consistency groups

Consistency groups enable you to perform disaster recovery (DR) and DR testing across multiple disks. A consistency group is a resource policy that does the following:

  • Aligns replication across primary disks and ensures that all disks contain replication data from a common point in time, which is used for DR.
  • Aligns disk clones from secondary disks and ensures that all disk clones contain data from a common point in time, which is used for DR drills.

If you want to align the replication period across multiple disks, add primary disks to a consistency group. If you want to clone multiple disks and ensure those clones have data from a common point in time, add secondary disks to a consistency group. A consistency group can be used for replication or cloning, but not both simultaneously.

If you want to add primary disks to a consistency group, you must add disks to the consistency group before you start replication. You can add secondary disks to a consistency group at any time.
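
As a sketch, you might create a consistency group and add a primary disk to it with the gcloud CLI before starting replication. The policy name, disk name, and locations are placeholder assumptions; the consistency group is a regional resource policy and must be in the same region as the disks you add to it.

    # Create a consistency group resource policy in the primary region.
    gcloud compute resource-policies create disk-consistency-group my-consistency-group \
        --region=us-central1

    # Add a primary disk to the consistency group before starting replication.
    gcloud compute disks add-resource-policies primary-disk-1 \
        --zone=us-central1-a \
        --resource-policies=my-consistency-group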

Failover and failback

In the event of an outage in the primary region, it is your responsibility to identify the outage, fail over, and restart your workload using the secondary disks in the secondary region. PD Async Replication doesn't offer outage monitoring. You can identify an outage by using RPO metrics, health checks, application-specific metrics, and by contacting Cloud Customer Care.

The failover process involves the following tasks, sketched with example commands after the list:

  1. Stop replication.
  2. Attach the secondary disks to VMs in the secondary region.
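
The following is a minimal sketch of these tasks with the gcloud CLI; the disk, VM, and zone names are placeholder assumptions. During a primary-region outage, you typically stop replication by operating on the secondary disk, because the primary region might be unreachable.

    # Stop replication, running the command against the secondary disk.
    gcloud compute disks stop-async-replication secondary-disk-1 \
        --zone=us-west1-a

    # Attach the secondary disk to a VM in the secondary region.
    gcloud compute instances attach-disk failover-vm-1 \
        --disk=secondary-disk-1 \
        --zone=us-west1-a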

After you fail over disks, it is your responsibility to validate and restart your application workload in the secondary region, and to reconfigure the network addresses that are used to access your application so that they point to the secondary region.

Following a failover from the primary region to the secondary region, the secondary region becomes the acting primary region. After the outage or disaster is resolved, you can initiate failback to start replication from the original secondary region (the acting primary region) to the original primary region. You can then optionally repeat the failover process to move the workload back to the original primary region.

The failback process involves the following tasks, with a sketch of the first task after the list:

  1. Configure replication between the new primary region and the original primary region.

    • The original secondary disk is now the new primary disk, and you configure it to replicate to a new secondary disk in the original primary region.
    • You can create a new consistency group resource policy in the new primary region so that the new primary disks (the original secondary disks) can replicate consistently to a new set of secondary disks in the original primary region.
  2. (Optional) After the initial replication has occurred, you can repeat the failover process to return the workload to the original primary region.
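
A sketch of the first task with the gcloud CLI follows; the names and locations are placeholder assumptions, with the acting primary disk in us-west1 replicating back to a new secondary disk in the original primary region, us-central1.

    # Create a new secondary disk in the original primary region that
    # references the acting primary disk (the original secondary disk).
    gcloud compute disks create failback-disk-1 \
        --zone=us-central1-a \
        --size=100GB \
        --primary-disk=secondary-disk-1 \
        --primary-disk-zone=us-west1-a

    # Start replication back to the original primary region.
    gcloud compute disks start-async-replication secondary-disk-1 \
        --zone=us-west1-a \
        --secondary-disk=failback-disk-1 \
        --secondary-disk-zone=us-central1-a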

Disk encryption

Primary and secondary disks don't support customer-supplied encryption keys (CSEK). Use Google-managed encryption keys or customer-managed encryption keys (CMEK) instead. If you use CMEK on the primary disk, you must also use CMEK on the secondary disk, but you can use a different CMEK for each disk.
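
For example, if the primary disk uses CMEK, you might create the secondary disk with its own key by using the --kms-key flag. This is a sketch; the project, key ring, and key names are placeholder assumptions, and the key is assumed to be in the secondary disk's location.

    # Create a CMEK-protected secondary disk. The key can differ from the
    # primary disk's key, but both disks must use CMEK.
    gcloud compute disks create secondary-disk-1 \
        --zone=us-west1-a \
        --size=100GB \
        --primary-disk=primary-disk-1 \
        --primary-disk-zone=us-central1-a \
        --kms-key=projects/my-project/locations/us-west1/keyRings/my-ring/cryptoKeys/my-key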

Secondary disk customization

When you create a secondary disk, it inherits the properties of the primary disk, such as the description, disk type, and labels. If the primary disk is a boot disk, the secondary disk inherits the boot configuration of the primary disk. The boot configuration includes information about the operating system (OS) architecture, OS licenses, and guest OS features.

You can change certain properties of the secondary disk so that they differ from the primary disk. For example, the primary and secondary disks must be the same size and use the same type of encryption key, but you might assign additional labels to the secondary disk.

For boot disks, you can enable additional security or networking options on the secondary disk by specifying additional guest OS features. However, you can't remove any of the primary disk's guest OS features. Compute Engine merges the new features you specify with the existing guest OS features of the primary disk.

Example

Suppose you have a boot disk called disk-1, with the following guest OS features: [GVNIC, UEFI_COMPATIBLE].

If you create a secondary disk from disk-1, you can only specify additional features; you can't remove the GVNIC and UEFI_COMPATIBLE features. For example, if you specify MULTI_IP_SUBNET when you create the secondary disk, Compute Engine merges the new feature with those of the primary disk, and the resulting guest OS features for the secondary disk are GVNIC, UEFI_COMPATIBLE, and MULTI_IP_SUBNET.
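
As a sketch with the gcloud CLI (the zone and size are placeholder assumptions), the merge in this example looks like the following:

    # Create a secondary disk from disk-1 and specify an additional guest
    # OS feature. Compute Engine merges MULTI_IP_SUBNET with the inherited
    # GVNIC and UEFI_COMPATIBLE features.
    gcloud compute disks create secondary-disk-1 \
        --zone=us-west1-a \
        --size=100GB \
        --primary-disk=disk-1 \
        --primary-disk-zone=us-central1-a \
        --guest-os-features=MULTI_IP_SUBNET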

To learn how to customize a secondary disk, see Create a custom secondary disk.

PD Async Replication and regional persistent disks

You can use PD Async Replication with regional persistent disks to achieve high availability (HA) and disaster recovery (DR).

Regional persistent disks can be used as the primary or secondary disk in a PD Async Replication disk pair. A disk pair is a primary disk that replicates to a secondary disk.

When a regional disk is used as the primary disk, replication isn't disrupted if one of the primary disk's zones experiences an outage. The regional primary disk continues to replicate from the healthy zone to the secondary disk.

When a regional disk is used as a secondary disk, replication is paused if one of the secondary disk's zones experiences an outage. Replication doesn't continue to the secondary disk's healthy zone in this case. However, using regional disks as secondary disks can prepare your workload for cross-zone HA in the event of a failover when the secondary disk becomes the new primary disk.
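
For example, you might create a regional secondary disk and start replication to it with the gcloud CLI. This is a sketch; the names, regions, and zones are placeholder assumptions.

    # Create a regional secondary disk that is replicated across two zones
    # in the secondary region.
    gcloud compute disks create regional-secondary-disk-1 \
        --region=us-west1 \
        --replica-zones=us-west1-a,us-west1-b \
        --size=100GB \
        --primary-disk=primary-disk-1 \
        --primary-disk-zone=us-central1-a

    # Start replication, referencing the secondary disk by region.
    gcloud compute disks start-async-replication primary-disk-1 \
        --zone=us-central1-a \
        --secondary-disk=regional-secondary-disk-1 \
        --secondary-disk-region=us-west1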

Limitations

  • PD Async Replication is only supported for balanced (pd-balanced) and performance (pd-ssd) Persistent Disk volumes.
  • Read-only disks and multi-writer disks are not supported.
  • Each disk can have a maximum size of 32 TiB.
  • Each project can have at most 1000 disk pairs in each region pair.

    For example, a given project, project-1, can have up to 1000 disk pairs in the Iowa-Oregon region pair. project-1 can also have up to 1000 disk pairs in the Belgium-Frankfurt region pair.

Supported regions

PD Async Replication is available in all regions in the following continents:

  • Asia, except Indonesia
  • Europe
  • North America
  • Oceania

You can replicate a primary disk in a given region to a secondary disk in any available region within the same continent. This means that you can create a region pair from any two regions within the same continent.

For example, suppose you have a primary disk in Frankfurt (europe-west3). You can replicate that disk to a secondary disk anywhere in Europe, but you can't replicate it to a region in North America.

For a full list of all regions in Compute Engine, see Available zones and regions.

Performance

The recovery point objective (RPO), or the time delay before data is available at the secondary site, depends on the disk change rate. PD Async Replication typically replicates data with a target RPO of one minute for up to 2 GB of compressed changed blocks per minute; disk blocks are replicated at 4 KB granularity. If a given block is changed multiple times between replication events, only the most recent change is replicated to the secondary disk. At higher disk change rates, RPO might exceed one minute and typically increases as the disk change rate grows. RPO is not configurable.

RPO might exceed one minute in the following scenarios:

  • When disk replication starts. During the initial replication, PD Async Replication replicates all the used blocks on the primary disk to the secondary disk. The initial replication is complete when the disk/async_replication/time_since_last_replication metric is available in Cloud Monitoring.
  • If the disk change rate is greater than 2 GB of compressed changed blocks per minute. After a spike in disk changes, the RPO for later replication cycles might exceed one minute while replication catches up.
  • If you detach a disk from a VM or restart a VM while the disk is replicating. If a disk that is undergoing replication is detached from a VM, its RPO might increase to up to five minutes for a short period of time.

To learn how to view the RPO for your disks, see Persistent Disk Asynchronous Replication performance metrics.
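
As a hedged sketch, you can also read the replication lag metric directly through the Cloud Monitoring API. The project ID and time window below are placeholders, and the full metric type is assumed to be compute.googleapis.com/disk/async_replication/time_since_last_replication.

    # List recent values of the replication lag metric for a project.
    curl -sG \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        --data-urlencode 'filter=metric.type="compute.googleapis.com/disk/async_replication/time_since_last_replication"' \
        --data-urlencode 'interval.startTime=2024-05-01T00:00:00Z' \
        --data-urlencode 'interval.endTime=2024-05-01T01:00:00Z' \
        "https://monitoring.googleapis.com/v3/projects/my-project/timeSeries"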

The recovery time objective (RTO) during failover depends on the time it takes to complete the various tasks involved in failing over your workload to a new region. Tasks such as stopping replication and attaching disks to VMs in the secondary region should take only a few minutes to complete. You can reduce RTO by keeping VMs running in the secondary region so that, if failover occurs, you don't have to wait for VMs to start up.

What's next