By Ashok Ramu, Cloud Engineering, Actifio, and Chandra Reddy, Product Marketing, Actifio
This article describes how to use Actifio Sky software to manage critical applications running on Google Cloud, including enterprise applications such as SAP HANA and databases such as Microsoft SQL Server, PostgreSQL, and MySQL.
Used with Compute Engine and Persistent Disk, Actifio Sky software provides:
- Scalable and efficient incremental forever data protection.
- Instant recovery for all data, including 50+ TB databases and file systems.
- Long-term data retention in Cloud Storage Nearline object storage, for compliance and other reasons.
- Instant rewind and recovery capability from backups stored in Cloud Storage for Persistent Disk and Cloud Storage Nearline or Cloud Storage Coldline object storage.
Actifio Sky is infrastructure-agnostic and can protect applications on-premises and in the cloud.
Backup and replication in multiple regions
This section shows how Actifio Sky simplifies backing up applications running in one Google Cloud region and replicating to another Google Cloud region for disaster recovery.
Provisioning and policies to enforce governance
Actifio Sky can be provisioned as a virtual machine (VM) in Compute Engine. Actifio Sky stores backup data in a highly compressible format on persistent disk (block storage HDD or SSD), as shown in the following diagram.
Actifio Sky running in Compute Engine using Persistent Disk.
Protection policies defined by a backup administrator dictate the application data lifecycle inside Actifio. Policies can specify where data resides and the frequency of replication. For instance, one policy could leverage Actifio Sky's StreamSnap replication capability to move data from one Google Cloud region to another, while a second policy using Actifio's Incremental OnVault capability could replicate data to Nearline or Coldline storage for long-term retention. These policies can be combined to implement an enterprise's corporate data governance policy.
Actifio Sky in Compute Engine replicating in another Google Cloud region.
Actifio Sky replicating to Cloud Storage Nearline storage in another Google Cloud region.
Start small and grow: on-demand scale-up or scale-out
Actifio Sky running on Compute Engine leverages Google's infrastructure to scale up with your data growth. Enterprises can start with a small CPU and disk footprint and then grow CPU, memory, disk, and network resources as data-protection needs increase. For example, you can start with Actifio Sky on an n1-standard-2 instance, then switch to n1-standard-4 (4 vCPUs and 15 GB memory), then to n1-standard-8 (8 vCPUs and 30 GB memory), and so on. You can move the same virtual machine to a larger machine type, with more CPU and network bandwidth, without rebuilding it from scratch: you stop the Actifio Sky VM, change its machine type, and restart the VM. This operation has no impact on the data being managed. You can also provision multiple Actifio Sky VMs and manage all of them through a single pane of glass using Actifio Global Manager.
On-demand scale-up or scale-out or both.
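The stop, resize, restart sequence described above can be scripted. The following sketch builds the standard gcloud commands for changing a machine type; the instance and zone names are hypothetical.

```python
# Sketch: gcloud command sequence to resize the VM running Actifio Sky.
# Instance and zone names are illustrative placeholders.

def resize_commands(instance, zone, machine_type):
    """Return the stop / set-machine-type / start command sequence."""
    base = ["gcloud", "compute", "instances"]
    return [
        base + ["stop", instance, "--zone", zone],
        base + ["set-machine-type", instance, "--zone", zone,
                "--machine-type", machine_type],
        base + ["start", instance, "--zone", zone],
    ]

# Example: grow a hypothetical Actifio Sky VM to n1-standard-8.
cmds = resize_commands("actifio-sky-1", "us-central1-a", "n1-standard-8")
for cmd in cmds:
    print(" ".join(cmd))
```

Because the backup data lives on Persistent Disk rather than inside the VM image, the resize leaves the managed backup data untouched.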
Starting small and then growing to larger Compute Engine instances helps you pay as you grow, instead of pre-allocating a large amount of compute up-front just to satisfy capacity needs 18 months from today.
Actifio Sky uses the same "grow-as-needed" approach for Persistent Disk storage as well. As you need more storage, you can add more Persistent Disk storage to existing Actifio Sky VMs without disturbing your existing backup images. This keeps storage costs as low as possible.
With Nearline or Coldline storage, there is virtually no limit to the amount of backup data that an Actifio Sky VM can manage.
Fast backup and instant recovery from images
Actifio Sky can deliver instant recovery from backup images stored in Persistent Disk as well as Nearline and Coldline storage, from any point-in-time backup, whether it is days or decades old.
This section walks through the high-level concepts of incremental forever backup, application consistency, point-in-time synthetic virtual full backups, and scalable instant recovery.
Automating service level agreements
You can use Actifio Sky to configure SLAs, or policies, that specify which VMs to protect, how often to protect them (for example, once an hour or once a day), how long to retain daily, weekly, monthly, and yearly images, and when and where to replicate protected VMs.
Actifio Sky automatically protects the Compute Engine VMs and applies the configured SLAs. No complex scripts or manual processes are needed. After SLAs are enabled, Actifio takes care of backup and replication. It even lets you schedule automated recoveries using its workflow engine.
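As an illustration only, an SLA of this kind can be thought of as a small piece of structured data. The field names below are hypothetical, not Actifio's actual policy schema.

```python
# Hypothetical sketch: an SLA/policy expressed as data.
# Field names are illustrative, not Actifio's real schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SlaPolicy:
    backup_interval_hours: int    # how often to protect
    retain_daily_days: int        # how long to keep daily images
    retain_weekly_weeks: int
    retain_monthly_months: int
    retain_yearly_years: int
    replicate_to: Optional[str]   # e.g. a DR region or object storage class

# A tier-1 policy: hourly backups, 14-day local retention, DR replication.
tier1 = SlaPolicy(backup_interval_hours=1, retain_daily_days=14,
                  retain_weekly_weeks=4, retain_monthly_months=12,
                  retain_yearly_years=7, replicate_to="us-east1")
print(tier1)
```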
Efficient change-block tracking
The most efficient way to protect information is to copy only the blocks that have changed. Ideally, the changed blocks should be instantly available, without having to rely on CPU- and I/O-intensive file-system operations or computationally complex hash calculations.
Actifio provides a lightweight connector that incorporates change-block tracking (CBT). The connector automatically provides a list of changed blocks when backups are initiated, minimizing the amount of CPU, storage, and memory required.
The Actifio connector is deployed in Compute Engine VMs running applications and databases.
Actifio CBT keeps track of changed blocks in an efficient bitmap.
Actifio maintains the CBT metadata in memory and does not perform copy-on-write operations, split writes, or compute-intensive hash calculations, so it has no impact on the I/O, CPU, or memory performance of the protected Compute Engine VM.
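A minimal sketch of the bitmap idea, assuming a fixed tracking granularity (the real connector works at the driver level, and its actual block size is not documented here):

```python
# Sketch: change-block tracking with an in-memory bitmap, one bit per block.
# The 64 KiB granularity is an assumption for illustration.

BLOCK_SIZE = 64 * 1024  # assumed tracking granularity

class ChangeTracker:
    def __init__(self, volume_bytes):
        self.nblocks = -(-volume_bytes // BLOCK_SIZE)   # ceiling division
        self.bitmap = bytearray((self.nblocks + 7) // 8)

    def record_write(self, offset, length):
        """Mark every block touched by a write as changed."""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for b in range(first, last + 1):
            self.bitmap[b // 8] |= 1 << (b % 8)

    def changed_blocks(self):
        """Yield indices of changed blocks; a backup copies only these."""
        for b in range(self.nblocks):
            if self.bitmap[b // 8] & (1 << (b % 8)):
                yield b

tracker = ChangeTracker(volume_bytes=1 << 30)          # 1 GiB volume
tracker.record_write(offset=128 * 1024, length=200 * 1024)
print(list(tracker.changed_blocks()))                  # → [2, 3, 4, 5]
```

The bitmap for a 1 GiB volume at this granularity is only 2 KiB, which is why tracking adds negligible memory overhead.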
Incremental forever backup
Actifio performs only one full backup. Because of Actifio's unique CBT feature, all subsequent backups rely on incremental changes only. Unlike traditional products that mandate full backups, Actifio never requires a full backup after the initial one, and this applies to all data types.
For example, if a 1 TB Compute Engine data set has a daily change rate of 5%, the Actifio connector can track and back up only the 50 GB of changed blocks instead of the entire 1 TB. This reduces the storage I/O, CPU, and memory load on the VM that hosts the data set by up to 20x.
Actifio connector uses the CBT bitmap to back up just the changed blocks.
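The arithmetic behind that example, as a quick sanity check:

```python
# Changed-block backup size for the 1 TB / 5% daily change example above.
dataset_gb = 1024            # 1 TB data set (1024 GB)
daily_change_rate = 0.05     # 5% of blocks change per day

incremental_gb = dataset_gb * daily_change_rate   # 51.2 GB, ~50 GB per backup
reduction = dataset_gb / incremental_gb           # ~20x less data moved
print(incremental_gb, reduction)
```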
Point-in-time synthetic virtual full backups
Actifio Sky synthesizes a point-in-time virtual full backup from the incremental backups to ensure the fastest recoveries.
Point-in-time synthetic virtual full created from incremental forever backups
In the preceding example, after the incremental backup of 50 GB of data, you could still recover the 1 TB database instantly.
Storing the backups
Actifio Sky stores backup data in Persistent Disk in a compressed format to reduce storage utilization. Actifio calls this logical storage area the snapshot pool. You can configure multiple snapshot pools using different Persistent Disk options, such as HDD for lower-performance-tier applications and SSD for higher-performance-tier applications.
Configure multiple snapshot pools with HDD and SSD persistent disks.
The amount of storage required in the Actifio snapshot pool in Persistent Disk depends on the amount of data protected, the change rate, the retention period, and the compressibility of the data, as discussed later in this article.
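A simplified back-of-the-envelope model, an assumption for illustration rather than Actifio's sizing tool, ties those factors together:

```python
# Toy sizing model for the snapshot pool: one compressed full image plus
# compressed daily changed blocks for each retained day. Assumption only,
# not Actifio's actual sizing formula.

def snapshot_pool_tb(protected_tb, daily_change_rate,
                     retention_days, compression_ratio):
    full = protected_tb / compression_ratio
    incrementals = (protected_tb * daily_change_rate
                    * retention_days / compression_ratio)
    return full + incrementals

# Example: 30 TB protected, 5% daily change, 14-day retention, 3:1 compression.
print(round(snapshot_pool_tb(30, 0.05, 14, 3.0), 1))  # → 17.0
```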
The Actifio connector running in the guest VM operating system invokes the appropriate APIs for MS SQL Server, SAP HANA, PostgreSQL, MySQL, Oracle, and other databases so that every point-in-time backup is application-consistent.
Scalable instant recovery
Traditional backup products restore by re-hydrating data from backup images back to the recovery server, making recovery time objective (RTO) directly proportional to the size of the data being recovered.
Very large RTO in traditional "restore" approach.
Recoveries with Actifio Sky are near-instant. Actifio Sky behaves like a storage controller and presents a virtual disk representing a point-in-time copy from its snapshot pool to the target. These images are presented instantly and do not require any data movement, enabling near-instant recoveries for any size data set.
Instant recovery of multi-TB data sets.
Actifio's connector further simplifies recovery by automatically bringing recovered databases online. When a recovery is initiated, Actifio Sky presents the restored SAP HANA environment or database instance, such as MS SQL Server, PostgreSQL, MySQL, or Oracle, to the recovery server and, by integrating with the application or database binaries, automatically brings the recovered instances online. Post-recovery, all reads on the recovered volumes are delivered from the point-in-time backup image stored in the Actifio snapshot pool.
Read operations are served from backup images in the Actifio snapshot pool.
When changes are made to the recovered server, the original backup image is not modified—it's immutable. All changes are stored in the Actifio Sky snapshot pool without impacting any of the immutable backup images.
Writes are sent to Actifio Snapshot pool where write blocks are stored.
The Actifio backup images are stored in the snapshot pool in native application format instead of a deduplicated format. You have the peace of mind of knowing that the performance after recovery will match the underlying storage (HDD or SSD). This is different from systems that store data in a proprietary deduplicated format, where I/O performance after recovery suffers significantly.
Replication, vault, and disaster recovery
When it comes to protecting information, speed matters. Actifio provides unique technology that accelerates both backup and recovery.
Disaster recovery (DR) is a related and equally important concept to backup. Just as enterprises must keep local copies of information, you also need to replicate backup images to another geographic region to protect against a regional disaster. In some cases, you might also do this for compliance and audit requirements.
Actifio Sky provides two options for replication:
- Replication to Actifio Sky in a DR Google Cloud region.
- Replication directly to Nearline or Coldline object storage.
You can use these two technologies together and choose a combination based on your business requirements.
Replication to Actifio Sky in another Google Cloud region
Actifio Sky can replicate data to another instance in a different Google Cloud region with a recovery point objective (RPO) between 1 and 24 hours. Including database transaction logs further reduces RPOs to as little as 15 minutes. This high-performance replication, called StreamSnap, can saturate 10 Gbps networks. Here is how it works.
As a backup writes its changed blocks to the Actifio snapshot pool, Actifio Sky replicates the newly written data to another Actifio Sky instance in a Compute Engine VM in another region.
Actifio Sky replication to another Actifio Sky in a second region.
The target Actifio Sky instance stores the backup images in its own snapshot pool using HDD or SSD Persistent Disk. All data is compressed, and data in transit is encrypted. For additional security, data at rest in the snapshot pool is always encrypted by Persistent Disk.
You can retain daily, weekly, monthly, and yearly backup images in an Actifio Sky snapshot pool as per the SLAs you define. Images that are older than the specified retention period are removed to reclaim disk space.
Replication to Cloud Storage
After each incremental backup writes changed blocks to the Actifio snapshot pool, Actifio Sky compresses and replicates the changed blocks to Cloud Storage such as Nearline or Coldline using Google's native API.
Incremental forever replication to Nearline or Coldline object storage.
Data can be replicated at any frequency you want, such as every 1, 4, or 24 hours. Data is written to Nearline or Coldline storage in an incremental forever manner and compressed for additional efficiency. These techniques significantly reduce the bandwidth required and the amount of object storage consumed. After replication completes, a virtual full backup image is created based on the most recent data received. Images from any point in time can be recovered from object storage instantly as a full read/write-enabled copy.
Data sent to Nearline or Coldline storage is self-describing; that is, Actifio Sky sends data and its associated metadata in the same transaction. This enables an alternate or new Actifio Sky appliance to read and import these images.
Even though backups are performed in an incremental forever manner, they can be accessed as if they were full images. For example, a weekend incremental backup is marked as a weekly virtual full, a month-end backup as a monthly virtual full, and a year-end backup as a yearly virtual full, and the appropriate daily, weekly, monthly, and yearly retention is applied to those images.
Synthesis of weekly, monthly, and yearly virtual fulls from daily incrementals.
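The promotion rule described above can be sketched as follows. The exact boundaries, such as Sunday closing the week and calendar month and year ends, are assumptions for illustration:

```python
# Sketch: promote a daily incremental to the highest virtual-full label its
# date qualifies for. Boundary choices here are illustrative assumptions.
from datetime import date, timedelta

def virtual_full_label(d):
    if d.month == 12 and d.day == 31:        # year end
        return "yearly virtual full"
    if (d + timedelta(days=1)).day == 1:     # last day of the month
        return "monthly virtual full"
    if d.weekday() == 6:                     # assume Sunday closes the week
        return "weekly virtual full"
    return "daily incremental"

print(virtual_full_label(date(2023, 12, 31)))  # → yearly virtual full
print(virtual_full_label(date(2023, 6, 30)))   # → monthly virtual full
print(virtual_full_label(date(2023, 6, 25)))   # a Sunday → weekly virtual full
```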
On-demand disaster recovery
Actifio Sky can import high-level metadata within minutes from the object storage where backup images were stored. As a result, recovery volumes can be mounted instantly.
"Instant Access" from backups in object storage.
Note that the virtual disk mounted on the recovery server is rewritable. All read operations fetch data, through Actifio Sky, from object storage, so performance depends on the latency and throughput from Nearline or Coldline. All write operations end up in the Actifio Sky snapshot pool, ensuring that the original backup image in object storage remains immutable.
Another approach is to restore data from object storage to a recovery server. Although this increases the RTO, it is better suited for applications that need high I/O performance post-recovery.
"Instant Access" from backups stored in Persistent Disk HDD/SSD storage.
This table summarizes recommendations for replicating to Actifio Sky and Nearline or Coldline storage.
| Requirement | Replicate to Actifio Sky + Persistent Disk | Replicate to Nearline or Coldline object storage |
|---|---|---|
| Instant recovery with high I/O performance needs post-recovery | Yes. Low latency. High costs. | Yes. High latency. Low costs. |
| Instant recovery with low I/O performance needs post-recovery | Low latency. High costs. | High latency. Low costs. |
| Short-term retention for a few days or weeks | Medium costs. | Low costs. |
| Long-term data retention for many months, years, or decades | Yes. Very high costs. | Yes. Low costs. |
| Data needs to be accessed rarely, such as once a year | Yes. High costs. | Yes. Low costs. |
Cost estimates for 4 use cases
The critical components of data protection include flexibility, fast backup and recovery, and efficient disaster recovery. Important data is perpetually growing, and budgets are often limited. For many customers, the most important question is the cost of the solution.
The benefit of cloud solutions is that the costs are highly flexible. This flexibility also presents a challenge when you are estimating and planning costs. The amount of cloud infrastructure consumed for data protection depends on many factors, but real-world costs are directly proportional to these four:
- Amount of data protected.
- The change rate of data per backup period.
- The time period in days, weeks, months, or years for which backups need to be retained.
- Data compression ratio.
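As an illustration of how these four factors interact, here is a toy cost model for the object-storage component. It assumes incremental-forever replication and the $10 per TB per month Nearline list price cited later in this article, and it ignores API-call and network costs:

```python
# Toy monthly-cost model for backup data held in object storage.
# Assumptions: incremental-forever (one full plus daily changed blocks),
# Nearline at the $10/TB/month list price mentioned in this article.

NEARLINE_USD_PER_TB_MONTH = 10.0

def object_storage_mrc(protected_tb, daily_change_rate,
                       retention_days, compression_ratio):
    stored_tb = (protected_tb                                   # initial full
                 + protected_tb * daily_change_rate * retention_days
                 ) / compression_ratio
    return stored_tb * NEARLINE_USD_PER_TB_MONTH

# Example: 30 TB, 5% daily change, 90 days retained, 3:1 compression.
print(round(object_storage_mrc(30, 0.05, 90, 3.0)))  # → 550
```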
To illustrate the consumption and costs of Compute Engine VMs, Persistent Disk storage, and object storage, this section considers the costs of four different use cases. Each use case examines three different data points for each variable mentioned, keeping the other variables constant.
Use Case 1: Data protection size: 10 TB, 30 TB, 50 TB
Percentage change rate of source production data is shown in the following table.
|Daily Change Rate||Weekly Change Rate||Monthly Change Rate||Yearly Change Rate|
Data retention assumptions are shown in the following table.
Compression ratio is assumed to be 3:1 for full and incremental forever backup data stored in Persistent Disk and object storage.
The following graph shows cloud infrastructure costs for 3 different data protection sizes: 10 TB, 30 TB, and 50 TB.
All costs are Monthly Recurring Costs (MRC) at list pricing. Based on usage and region, actual costs could be lower.
Note that as the volume of protected data increases, the Compute Engine VM cost to run Actifio Sky stays flat. A single Compute Engine instance with just 4 cores and 15 GB of memory scales from 10 TB to 55 TB of protected data. As data grows, there are two strategies for expansion:
- Additional Actifio Sky appliances can be provisioned. Actifio Global Manager provides a single pane of glass to monitor multiple Actifio Sky appliances.
- The Compute Engine instance running Actifio Sky can also be upgraded to one with more compute and memory. This requires a reboot of Actifio Sky and keeps all the existing backup images intact.
Use Case 2: Daily change rate: 3%, 5%, 10%
In this use case, the source data change rate varies while the other parameters remain unchanged.
Source data protected is assumed to be 30 TB, and the compression ratio is 3:1. Data retention assumptions are shown in the following table.
The following table shows 3 different change rate scenarios:
|% Change Rate at Source Data||% Daily Change Rate||% Weekly Change Rate||% Monthly Change Rate||% Yearly Change Rate|
This chart shows the MRC for the scenarios previously listed:
Use Case 3: Varying data retention needs
This use case assumes a fixed 30 TB of data, a data change rate of 5%, and a compression ratio of 3:1.
Industry statistics suggest that 95% of restores are from data that is less than 14 days old, and so data retention at the primary site is kept constant at 14 days.
Actifio can deliver instant mounts off any point in time from all storage targets including object-based platforms. Restores from the DR site using object storage from any point in time after 14 days will still be fast. Enterprises have varying retention needs at the DR site for different application tiers. We consider 3 scenarios:
|Retention in Cloud DR region||Daily
This chart shows various cloud infrastructure costs for the above 3 retention scenarios and all costs are based on MRC:
Note that increasing data retention from 3 months to 12 months raised the object storage costs by just $400 per month. Similarly, increasing data retention from 0 years to 7 years raised the cost by just $616 per month. This minimal increase is possible because of 1) incremental forever backup, 2) compression of data stored in object storage, and 3) the low cost of Nearline object storage at $10 per TB per month.
Use Case 4: Data compression ratio: 2:1, 3:1, 4:1
This use case considers source data protected to be 30 TB. The percentage change rate of source production data is shown in the following table:
|Daily Change Rate||Weekly Change Rate||Monthly Change Rate||Yearly Change Rate|
Data retention assumptions are shown in the following table:
This chart shows cloud infrastructure costs for 3 different compression ratios; all costs are MRC.
As expected, if data is compressible, the amount of Persistent Disk, Cloud Storage Nearline storage, and API calls to insert data to Nearline storage decreases, reducing the overall cost.
Every environment will have differing amounts of source data to be protected, change rates, and data retention needs. Compression is also unique to a data set.
When considering total cost of ownership in the cloud, it is best to start with the following questions:
- How much data needs to be protected?
- What % of restore requests happen beyond 1 or 2 weeks? If the number is low, just use 1 to 2 weeks of data retention in Persistent Disk storage to keep your Persistent Disk storage infrastructure costs low.
- What % of data is important enough that it needs to be replicated to Cloud Storage Nearline object storage and retained for more than 2 weeks? A lower number will reduce the amount of object storage and the number of API calls, and hence the MRC.
- For data that has to be stored in Cloud Storage Nearline object storage, how long do you need to retain data? Many organizations need to store data for many months or years. But many users just apply the same retention policies for all their data. Instead, the best practice is to tier applications into various data retention categories and apply those retention SLAs accordingly. This approach will minimize Cloud Storage Nearline storage costs.
- Do you have data that is not compressible? Some organizations do, such as images, videos, or encrypted data. Users should expect more Persistent Disk and Cloud Storage Nearline object storage to be used for these data types. A thoughtful retention policy is even more important for these environments.