
Cloud storage data protection that fits your business

October 30, 2019

David Seidman

Group Product Manager

As you’re moving apps and workloads to the cloud, you’ll have new options for data storage. It’s just as important to keep your data safe in the cloud as it was on-premises, though. We’ve recently released several enhancements to Persistent Disk that offer more options for keeping your disks available and protected. There are many dimensions to consider when thinking about where and how your data is protected. For example, how locally available do you need your data to be? And what data residency or privacy rules should you be aware of?

There’s not necessarily a one-size-fits-all answer, but there are some common scenarios, use cases, and tradeoffs to consider when you’re managing block storage data protection in the cloud. Tradeoffs can involve total cost of ownership (TCO), workload performance impact, and data locality. For example, one typical use case is protecting applications that require regularly scheduled maintenance downtime. These apps need short-term protection with very fast rollback in case a maintenance event fails. For these workloads, the data does not need to be backed up offsite, and the backup can often be kept in the same location as the source data to optimize for performance.

In addition, you may need backups to be stored only in specific regions to meet regulations or compliance requirements. At the same time, you still need a robust disaster recovery plan, which may include using multiple regions for backup or failover.

Another tradeoff we see is between synchronous replication across physically separate locations on one hand and write latency on the other. Plenty of mission-critical enterprise applications, such as databases, may require synchronous data replication with physical separation to achieve a zero recovery-point objective (RPO). To meet this requirement, you may be willing to tolerate higher write latencies.

Google Cloud meets these needs with a variety of Persistent Disk features. Persistent Disk is our high-performance block storage option that you can use with either Compute Engine or Google Kubernetes Engine (GKE). Note that disks and snapshots are always encrypted, and data is replicated multiple times to provide extraordinarily high durability. Here, we’ll dive into three generally available features that help you meet backup and recovery needs in the way that works best for your business data.

Snapshot locality for Persistent Disk gives you more control
There are a number of scenarios that require precise location control of snapshots. Persistent Disk now offers granular control so you can select the snapshot location. This feature became available earlier this year, and we’ve heard great feedback from customers, including global creative commerce platform Etsy. “It was incredibly helpful for us to have regional snapshotting available because of restore times,” says Keith Wells, senior operations engineer at Etsy. “If a whole region went down, we could bring everything up again in a short window. This aligns well with our disaster recovery strategy. Because of the success we've had with this functionality, we have other internal product teams adopting our approach for their use cases.”

By default, a snapshot is stored in the multi-region that is geographically closest to the location of the Persistent Disk. This provides geo-redundancy to maximize resilience. For workloads with more specific needs, it may not be the right choice. If you have data residency requirements to keep your snapshot in a specific geography, you can choose a specific Google Cloud region.

Another common storage requirement is streamlining disaster recovery and cross-region failover. You can use snapshots to run a primary site in one region and have options for secondary failover sites in different regions. Storing snapshots in the secondary region(s) means that data can be restored there as quickly as possible, should it be needed. Snapshot restore times are shortest when the snapshot and the target disk are in the same region, which keeps the recovery-time objective (RTO) to a minimum. Learn how to select the storage location for a snapshot in this documentation.
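As a minimal sketch of the idea, the Python helper below assembles a `gcloud compute disks snapshot` command that pins a snapshot to a chosen storage location. The disk name, zone, snapshot name, and region are placeholder assumptions, and the helper only builds the argument list without running it. Check the flags against the current gcloud reference before using the command.

```python
from typing import List, Optional


def build_snapshot_command(
    disk: str,
    zone: str,
    snapshot_name: str,
    storage_location: Optional[str] = None,
) -> List[str]:
    """Return the argument list for a gcloud disk snapshot request.

    When storage_location is None the flag is omitted, which leaves the
    default storage location in effect.
    """
    cmd = [
        "gcloud", "compute", "disks", "snapshot", disk,
        f"--zone={zone}",
        f"--snapshot-names={snapshot_name}",
    ]
    if storage_location is not None:
        # Pin the snapshot to one region or multi-region, for example the
        # secondary region that would host a failover site.
        cmd.append(f"--storage-location={storage_location}")
    return cmd


# Hypothetical names: primary disk in us-central1-a, failover site in us-east1.
command = build_snapshot_command(
    "app-data-disk", "us-central1-a", "app-data-dr", "us-east1"
)
print(" ".join(command))
```

Keeping the location as an explicit parameter makes it easy to create one snapshot per candidate failover region, so that a restore can always read from a snapshot in the same region as the target disk.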

Scheduled snapshot for Persistent Disk makes snapshotting easier
Earlier this year, we launched scheduled snapshots for Persistent Disk to general availability, and we’ve heard some great success stories from beta and early access customers. The scheduled snapshot feature lets you initiate automated snapshots and manage snapshot retention. Previously, taking snapshots on a fixed schedule, such as hourly or daily, required custom automation. This tool makes it fast and simple to configure snapshots on the schedule you need.

Scheduled snapshot retention policies also help minimize snapshot storage costs by ensuring that snapshots are automatically deleted when they are no longer needed. You can apply one snapshot resource policy to multiple disks, making it simple to set up backup and disaster recovery solutions for Compute Engine workloads.
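To make the retention idea concrete, here is a small Python sketch of how a retention window decides which snapshots are due for deletion. It is an illustrative model only, not the service's implementation; the function name, the snapshot names, and the seven-day window are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_snapshots(
    created_at: Dict[str, datetime],
    max_retention_days: int,
    now: datetime,
) -> List[str]:
    """Return names of snapshots older than the retention window, oldest first."""
    cutoff = now - timedelta(days=max_retention_days)
    expired = [name for name, ts in created_at.items() if ts < cutoff]
    return sorted(expired, key=lambda name: created_at[name])


# Hypothetical daily snapshots of one disk.
now = datetime(2019, 10, 30, 12, 0)
snapshots = {
    "daily-1028": datetime(2019, 10, 28, 4, 0),
    "daily-1020": datetime(2019, 10, 20, 4, 0),
    "daily-1015": datetime(2019, 10, 15, 4, 0),
}
to_delete = expired_snapshots(snapshots, max_retention_days=7, now=now)
print(to_delete)  # ['daily-1015', 'daily-1020']
```

With a seven-day window, only the snapshot from October 28 is kept. A longer window keeps more restore points at a higher storage cost, which is the tradeoff a retention policy lets you set once and apply to many disks.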

To see this in action, check out the GCP Developer Console or the scheduled snapshot documentation.

Regional Persistent Disk automatically replicates between zones
Regional Persistent Disk, now generally available, provides block-level synchronous replication between two zones in the same region. This approach maximizes application availability without sacrificing consistency, which can add performance and peace of mind.

Regional Persistent Disk is designed for workloads that have no tolerance for data loss and need high availability in the event of a single zone outage. Since its launch, Regional Persistent Disk has been deployed for workloads including VoIP servers, SaaS collaboration tools, design automation services, and protecting data in SQL Server, SAP HANA, PostgreSQL, and MySQL.

Regional Persistent Disk automatically handles transient storage unavailability in a zone, and provides an API to facilitate cross-zone failover (learn more about this in the documentation). For example, a stateful workload might be running in a VM with a Regional Persistent Disk in zone A, and if zone A suddenly fails, the workload can get restarted with the same disk in zone B with minimal disruption.
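The zone A to zone B example can be sketched in a few lines of Python. This is a hypothetical model of the decision only: the `RegionalDisk` class and `failover_zone` function are invented for illustration and are not part of any Google Cloud client library. The actual reattachment would go through the failover API described in the documentation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RegionalDisk:
    """Illustrative stand-in for a disk replicated across two zones."""
    name: str
    region: str
    replica_zones: Tuple[str, str]  # exactly two zones in the same region


def failover_zone(disk: RegionalDisk, failed_zone: str) -> str:
    """Return the surviving replica zone where the workload can restart."""
    if failed_zone not in disk.replica_zones:
        raise ValueError(f"{failed_zone} does not hold a replica of {disk.name}")
    first, second = disk.replica_zones
    return second if failed_zone == first else first


# Hypothetical disk replicated between two zones of us-central1.
disk = RegionalDisk("db-disk", "us-central1", ("us-central1-a", "us-central1-b"))
target = failover_zone(disk, "us-central1-a")
print(f"Restart the workload in {target} and attach {disk.name} there")
```

Because the disk is replicated synchronously, the workload restarted in the surviving zone sees the same data that was written before the outage, which is why the only decision left is where to bring up compute.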

At the storage level, a replicated disk can typically be attached to a new VM within several seconds, supporting very low recovery-time objectives (RTO). For workloads adopting Regional Persistent Disk, you can choose whether to keep a hot-standby VM instance (with higher cost and lower RTO), or to only replicate data and bring up compute on demand (with lower cost and higher RTO). For additional planning for RTO and failure state management, see our guide about high-availability options.

We’re always working to meet your unique data protection needs. To give some of these features a try, log in to the Google Cloud console or check out our free trial. Or visit the Cloud Storage page to learn about all of Google’s cloud storage options.
