Get cost-effective protection for SAP HANA with Backup and DR Service
Ghanshyam Patel
Cloud Architect
Jeff O'Connor
Solutions Engineer
For many businesses, the SAP HANA database is the heart of their SAP business applications, a repository of mission-critical data that drives operations. But what happens when disaster strikes?
Protecting an SAP HANA system involves choices. Common methods include HANA System Replication (HSR) for high availability and Backint for backups. HSR offers rapid recovery, but it requires a significant investment. And while having a disaster recovery (DR) strategy is crucial, it doesn't need to be overly complex or expensive. For many SAP deployments, a cold DR strategy strikes the right balance between cost-effectiveness and recovery time objectives (RTOs).
What is cold DR? Think of it as your backup plan's backup plan. It minimizes costs by maintaining a non-running environment that's only activated when disaster strikes. This traditionally means longer RTOs than hot or warm DR, but significantly lower costs. That trade-off is often deemed sufficient, yet businesses still look for any improvement in RTO and any further reduction in cost.
Backint, when paired with storage (e.g., Persistent Disk and Cloud Storage), enables data transfer to a secondary location and can be an effective cold DR solution. However, using Backint for DR can mean longer restore times and high storage costs, especially for large databases. Google Cloud offers a solution that addresses both the cost-effectiveness of cold DR and the rapid recovery of a full DR solution: Backup and DR Service with Persistent Disk (PD) snapshot integration. This approach uses incremental forever backups and HANA Savepoints to protect your SAP HANA environment.
Rethinking SAP disaster recovery in Google Cloud
Backup and DR is an enterprise backup and recovery solution that integrates directly with cloud-based workloads that run in Google Compute Engine. Backup and DR provides backup and recovery capabilities for virtual machines (VMs), file systems, multiple SAP databases (HANA, ASE, MaxDB, IQ) as well as Oracle, Microsoft SQL Server, and Db2. You can elect to create backup plans to configure the time of backup, how long to retain backups, where to store the backups (regional/multi-regional) and in what tier of storage, along with specifying database log backup intervals to help ensure a low recovery point objective (RPO).
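To make the moving parts of a backup plan concrete, here is a minimal sketch that models one as plain data. The `BackupPlan` class and its field names are hypothetical and are not the Backup and DR API; the sketch only shows how the log backup interval bounds the RPO and how retention determines the number of recovery images, assuming one backup per day.

```python
from dataclasses import dataclass
from datetime import time, timedelta


@dataclass(frozen=True)
class BackupPlan:
    """Hypothetical model of a backup plan; not the Backup and DR API."""
    snapshot_start: time            # when the daily backup is scheduled
    retention: timedelta            # how long each backup is kept
    location: str                   # illustrative regional/multi-regional label
    log_backup_interval: timedelta  # how often database logs are backed up

    def worst_case_rpo(self) -> timedelta:
        # With regular log backups, the most data that can be lost is roughly
        # what accumulated since the last log backup.
        return self.log_backup_interval

    def retained_daily_backups(self) -> int:
        # Assumes one backup per day (an assumption of this sketch).
        return self.retention // timedelta(days=1)


plan = BackupPlan(
    snapshot_start=time(hour=1),
    retention=timedelta(days=14),
    location="us",
    log_backup_interval=timedelta(minutes=15),
)
print(plan.worst_case_rpo())          # 0:15:00
print(plan.retained_daily_backups())  # 14
```

Shortening `log_backup_interval` lowers the worst-case RPO at the cost of more frequent log backups.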
A recent Backup and DR feature offers Persistent Disk (PD) snapshot integration for SAP HANA databases. This is a significant advancement because these PD snapshots are integrated with SAP HANA Savepoints to help ensure database consistency. When the database is scheduled to be backed up, the Backup and DR agent running in the SAP HANA node instructs the database to trigger a Savepoint image, where all changed data is written to storage in the form of pages. Another benefit of this integration is that the data copy process occurs on the storage side. You no longer copy the backup data through the same network interfaces that the database or operating system is using. As a result, production workloads retain their compute and networking resources, even during an active backup.
Once the Savepoint is complete, Backup and DR triggers the PD snapshots through the Google Cloud storage APIs so that the image is captured on disk; logs can also be truncated if desired. All of these snapshots are “incremental forever” and database-consistent backups. You can recover to the snapshot image itself or, alternatively, use logs to roll forward from the HANA PD snapshot image to a point in time.
Integration with SAP HANA Savepoints is critical to this process. Savepoints are SAP HANA API calls whose primary use is to help speed up recovery restart times, to provide a low RTO. They achieve this because when the system is starting up, logs don’t need to be processed from the beginning, but only from the last Savepoint position. Savepoints are coordinated across all processes (called SAP HANA services) and instances of the database to ensure transaction consistency.
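The restart benefit can be illustrated with a small sketch. It is a simplified model, not SAP HANA's actual log format: `LogEntry` and `entries_to_replay` are hypothetical names, and log positions are assumed to be plain increasing integers.

```python
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    position: int  # monotonically increasing log position (assumed)
    change: str


def entries_to_replay(log: List[LogEntry], last_savepoint_position: int) -> List[LogEntry]:
    """Return only the entries written after the last Savepoint."""
    return [e for e in log if e.position > last_savepoint_position]


log = [LogEntry(p, f"change-{p}") for p in range(1, 11)]
to_replay = entries_to_replay(log, last_savepoint_position=7)
print([e.position for e in to_replay])  # [8, 9, 10]
```

In this model, ten entries were logged but only three need to be processed at startup, because everything up to position 7 is already persisted by the Savepoint.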
The HANA Savepoint Backup sequence using PD snapshots can be summarized as:
- Tell agent to initiate HANA Savepoint
- Initiate PD snapshot, wait for ‘Uploading’ state (seconds)
- Tell agent to close HANA Savepoint
- Wait for PD snapshot ‘Ready’ state (minutes)
- Expire any logs on disk that have passed expiration time
- Catalog backup for reporting, auditing
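The ordering of these steps can be sketched as a small simulation. `StubAgent`, `StubSnapshot`, and `run_backup` are hypothetical stand-ins that only record events; they do not call the Backup and DR agent or any Google Cloud API. The point of the sketch is that the Savepoint is held open only until the snapshot reaches ‘Uploading’, not for the minutes it takes to become ‘Ready’.

```python
from typing import List


class StubAgent:
    """Stand-in for the agent on the HANA node (hypothetical)."""

    def __init__(self, events: List[str]) -> None:
        self.events = events

    def open_savepoint(self) -> None:
        self.events.append("savepoint opened")

    def close_savepoint(self) -> None:
        self.events.append("savepoint closed")


class StubSnapshot:
    """Stand-in for a PD snapshot moving through its states (hypothetical)."""

    def __init__(self, events: List[str]) -> None:
        self.events = events

    def start(self) -> None:
        self.events.append("snapshot started")

    def wait_for(self, state: str) -> None:
        # A real implementation would poll; the stub just records the state.
        self.events.append(f"snapshot {state}")


def run_backup(agent: StubAgent, snapshot: StubSnapshot, events: List[str]) -> List[str]:
    agent.open_savepoint()
    snapshot.start()
    snapshot.wait_for("Uploading")   # seconds
    agent.close_savepoint()          # database released before the long wait
    snapshot.wait_for("Ready")       # minutes
    events.append("expired logs removed")
    events.append("backup cataloged")
    return events


events: List[str] = []
run_backup(StubAgent(events), StubSnapshot(events), events)
for step in events:
    print(step)
```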
In addition, you can configure log backups to occur regularly, independent of Savepoint snapshots. These logs are stored on a separate disk and also backed up via PD snapshots, allowing for point-in-time recovery.
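As a rough illustration of point-in-time recovery, the sketch below picks the newest snapshot at or before a target time and the log backups needed to roll forward from it. `plan_recovery` is a hypothetical helper, and it assumes each log backup timestamp marks the end of the interval that backup covers; the real service handles this selection for you.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Tuple


def plan_recovery(
    snapshots: List[datetime], log_backups: List[datetime], target: datetime
) -> Tuple[datetime, List[datetime]]:
    """Pick the base snapshot and the log backups to replay for a target time."""
    ordered = sorted(snapshots)
    idx = bisect_right(ordered, target)
    if idx == 0:
        raise ValueError("no snapshot exists at or before the target time")
    base = ordered[idx - 1]

    needed: List[datetime] = []
    if target > base:
        for t in sorted(t for t in log_backups if t > base):
            needed.append(t)
            if t >= target:  # this log backup covers the target time
                break
    return base, needed


day = datetime(2024, 1, 1)
snapshots = [day.replace(hour=h) for h in (0, 6, 12)]
log_backups = [day.replace(hour=h) for h in (7, 8, 9, 10)]
base, logs = plan_recovery(snapshots, log_backups, day.replace(hour=8, minute=30))
print(base.hour, [t.hour for t in logs])  # 6 [7, 8, 9]
```

More frequent Savepoint snapshots shrink the list of logs to replay, which shortens the roll-forward phase of a restore.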
Operating system backups
What about the operating system backups? Good news: Backup and DR lets you take PD snapshots of the bootable OS disk and, selectively, of any other disk attached directly to your Compute Engine VMs. These backup images can also be stored in the same regional or multi-regional location for cold DR purposes.
You can then restore HANA databases to a local VM or your disaster recovery (DR) region. This flexibility allows you to use your DR region for a variety of purposes, such as development and testing, or maintaining a true cold DR region for cost efficiency.
Backup and DR helps simplify DR setup by allowing you to pre-configure networks, firewall rules, and other dependencies. It can then quickly provision a backup appliance in your DR region and restore your entire environment, including VMs, databases, and logs.
This approach gives you the freedom to choose the best DR strategy for your needs: hot, warm, or cold, each with its own cost, RPO, and RTO implications.
One of the key advantages of using Backup and DR with PD snapshots is the significant cost savings it offers compared to traditional DR methods. In our testing, eliminating the need for full backups and using incremental forever snapshots reduced storage costs by up to 50%. Additionally, we found that using a cold DR region with Backup and DR can reduce storage consumption by 30% or more compared to a traditional backup-to-file methodology.
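To see why dropping regular full backups reduces storage, here is a back-of-the-envelope sketch. The numbers are purely illustrative assumptions (a 1,000 GB database, 30 GB of daily change, 14 days of retention, weekly fulls in the traditional scheme) and are not the test results quoted above. The model also ignores compression and snapshot-specific storage behavior.

```python
def periodic_full_storage_gb(
    size_gb: int, daily_change_gb: int, retention_days: int, full_every_days: int
) -> int:
    """Storage held when a full backup is taken every `full_every_days` days."""
    fulls = -(-retention_days // full_every_days)  # ceiling division
    incrementals = retention_days - fulls
    return fulls * size_gb + incrementals * daily_change_gb


def incremental_forever_storage_gb(
    size_gb: int, daily_change_gb: int, retention_days: int
) -> int:
    """Storage held with one initial full image and incrementals thereafter."""
    return size_gb + (retention_days - 1) * daily_change_gb


traditional = periodic_full_storage_gb(1000, 30, retention_days=14, full_every_days=7)
forever = incremental_forever_storage_gb(1000, 30, retention_days=14)
savings = 1 - forever / traditional

print(traditional)        # 2360
print(forever)            # 1390
print(f"{savings:.0%}")   # 41%
```

The gap widens as the database grows relative to its daily change rate, or as full backups become more frequent in the traditional scheme.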
Why this matters
Using Google Cloud’s Backup and DR to protect your SAP HANA environment brings a lot of benefits:
- Better backup performance (throughput) - the storage layer handles data transfer rather than an agent on the HANA server
- Reduced TCO through elimination of regular full backups
- Reduced I/O on the SAP HANA server - it avoids the database reads and writes of a regular Backint full backup, whose backup window can be very long by comparison
- Operational simplicity with an onboarding wizard, and no need to manage additional storage provisioning on the source host
- Faster recovery times (local or DR) as PD snapshots recover natively to the VM storage subsystem (not copied over customer networks). Recovery to a point in time is possible with logs from the HANA PD snapshot. You can even take more frequent Savepoints by scheduling them every few hours, to further reduce the log recovery time for restores
- Data resiliency - HANA PD snapshots are stored in regional or multi-regional locations
- Low-cost DR - since backup images for VMs and databases are already replicated to your DR region (via regional or multi-regional PD snapshots), recovery is just a matter of bringing up your VM, choosing your recovery point in time for the SAP HANA database, and waiting for a short period of time
When to choose Persistent Disk Asynchronous Replication
While Backup and DR offers a comprehensive solution for many, some customers may have specific needs or preferences that require a different approach. For example, if your SAP application lacks built-in replication, or you need to replicate your data at the disk level, Persistent Disk Asynchronous Replication is a valuable alternative. This approach allows you to spin up new VMs in your DR region using replicated disks, speeding up the recovery process.
PD Async’s infrastructure-level replication is application agnostic, making it ideal for applications without built-in replication. It's also cost-effective, as you only pay for the storage used by the replicated data. Plus, it offers flexibility, allowing you to customize the replication frequency to balance cost and RPOs.
If you are interested in setting up PD Async and would like to configure it with Terraform, take a look at this Terraform example, created by one of our colleagues, which shows how to test failover and failback for a number of Compute Engine VMs.
Take control of your SAP disaster recovery
With Google Cloud’s Backup and DR and PD Async, you can build a robust, cost-effective cold DR solution for your SAP deployments on Google Cloud without compromising on data protection, giving you peace of mind in the face of unexpected disruptions.
Learn more about running SAP on Google Cloud and discover how Google Cloud Consulting can help you learn, build, operate and succeed in your cloud journey.