Hybrid Protection for Enterprise Data with Cohesity and Google Cloud Storage

By Sai Mukundan, Product Management at Cohesity

This article describes how Cohesity works with Google Cloud Storage.

Cohesity is a hyperconverged secondary storage system for consolidating backup, test/dev, file services, and analytic datasets onto a scalable data platform.

Cohesity's product integration with Cloud Storage provides you with an enterprise hybrid cloud data-protection strategy, and with end-to-end data protection that spans on-premises and the cloud.

The following image shows how Cohesity CloudArchive works with Google Cloud Storage:

Diagram of Cohesity DataPlatform and DataProtect with Google Cloud Platform.

Cohesity DataPlatform and DataProtect offer you robust on-premises solutions for enterprise data protection. Cohesity’s CloudArchive enhances the offering by adding seamless connectivity to Cloud Storage as an extension of the data center infrastructure. You see the following benefits:

  • Connect to Cloud Storage without the need for cloud gateways and disparate point solutions — Cohesity consolidates multiple solutions across backup, replication, disaster recovery (DR), and archival, and connects directly from on-premises to Google’s cloud services.

  • Achieve additional data protection by maintaining copies of your data in the cloud, which also offers cost-effective long-term data retention. This helps you to implement your compliance controls and improve your DR strategies with remote copies of your data in Cloud Storage.

  • Transition from capacity planning and large capex investments to pay-as-you-go operational budgeting — Cohesity reduces the total cost of ownership for data protection, because you no longer have to purchase as many physical disks to maintain capacity. Cloud Storage automatically scales when you need more space and you pay only for the space you use.

Cohesity protects physical and virtual workloads on-premises and also has seamless integration with Cloud Storage:

Diagram of physical and virtual workloads integrated with Cloud Storage and protected with Cohesity.

Cohesity enables you to manage increasingly complex storage environments through a hyperconverged secondary storage infrastructure called Cohesity DataPlatform. You can start with a single 2U device housing 96 terabytes of raw storage capacity, which allows you to converge scale-out backup with global deduplication, snapshots, indexing, near-instant recovery, replication, and archival to cloud. Cohesity also provides Cohesity DataProtect, a backup and recovery solution with the following features:

  1. Policy-based management for end-to-end data protection: Data protection is managed through a set of policies that specify application SLA requirements including RPO, retention policies, off-site replication, and cloud archival.
  2. Fast recovery points and near-instantaneous recovery times: Cohesity supports highly scalable snapshots and clones. Data protection is further enhanced through an indexing engine that indexes everything that you back up. You can use a simple text-based search to easily mine your on-premises and cloud-based backup data, and quickly restore only the data you need.
  3. Storage efficiency with scale-out, globally deduplicated storage: Cohesity provides a scale-out, globally deduplicated storage platform. With the ability to add or remove individual nodes at any time, the Cohesity cluster automatically scales up or down by rebalancing the data and its associated metadata to ensure redundancy.
  4. Seamless integration with Cloud Storage: Cohesity CloudArchive supports integration with Cloud Storage, allowing you to choose from Multi-Regional, Regional, Nearline, or Coldline buckets to manage costs based on how frequently you intend to access your archives. You can use Cloud Storage as an extension of your on-premises infrastructure, using CloudArchive to copy older local snapshots to Cloud Storage for long-term retention.
  5. Virtual machines, physical servers and database backups: Cohesity’s native data protection software enables you to easily protect data from both virtual and physical environments, which can dramatically reduce cost and complexity. This means you can use Cohesity to schedule, report on, replicate, and archive datasets across an incredible range of heterogeneous applications.

The rest of this solution explains the process of sending data from on-premises Cohesity to Cloud Storage.

Understanding the backup process

Here's the workflow for Cohesity CloudArchive:

Diagram that illustrates the workflow of Cohesity CloudArchive with backup and archive schedules.

The preceding illustration shows the workflow: VMs or physical servers are first backed up on-premises to Cohesity, and the data is then pushed to Cloud Storage according to the archive schedule. The archived objects are divided into smaller segments, which are compressed and encrypted before they are transferred to the cloud. Each segment is looked up in a fingerprint database using its metadata hash. If the fingerprint is found, the segment's data is already in the archive, so only the metadata is archived. If the fingerprint is not found, the data is archived as well. This reduces the amount of data that needs to be transferred to the cloud.
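
The fingerprint lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cohesity's implementation: the segment size, the use of SHA-256 over the segment contents as the fingerprint, the in-memory set standing in for the fingerprint database, and the function name archive are all hypothetical. Encryption is left out to keep the sketch self-contained.

```python
import hashlib
import zlib

SEGMENT_SIZE = 4  # bytes; deliberately tiny for illustration (hypothetical value)


def archive(data: bytes, fingerprints: set, segment_size: int = SEGMENT_SIZE):
    """Split data into segments and decide what to send for each one.

    Returns a list of (fingerprint, payload) pairs. The payload is None when
    the fingerprint is already known, meaning only metadata is archived.
    """
    uploads = []
    for start in range(0, len(data), segment_size):
        segment = data[start:start + segment_size]
        # Assumption: the fingerprint is a SHA-256 hash of the segment.
        fingerprint = hashlib.sha256(segment).hexdigest()
        if fingerprint in fingerprints:
            # Known segment: archive the metadata (fingerprint) only.
            uploads.append((fingerprint, None))
        else:
            # New segment: record it and archive the compressed data too.
            # A real system would also encrypt the payload at this point.
            fingerprints.add(fingerprint)
            uploads.append((fingerprint, zlib.compress(segment)))
    return uploads


if __name__ == "__main__":
    known = set()  # stands in for the fingerprint database
    for fingerprint, payload in archive(b"AAAABBBBAAAA", known):
        action = "metadata only" if payload is None else "data + metadata"
        print(fingerprint[:8], action)
```

In this example the third segment repeats the first, so it contributes only its fingerprint to the archive.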

You can purchase Cohesity CloudArchive in conjunction with DataPlatform and DataProtect. You pay for Cloud Storage usage independently, directly to Google.

Configuring Cohesity to connect to Cloud Storage

The following sections walk you through registering Cloud Storage with Cohesity DataPlatform, creating a Protection Policy, and then recovering data from Cloud Storage.

Step 1: Register Cloud Storage with Cohesity DataPlatform

In order to use Cloud Storage, you must first register it as an external target with your on-premises Cohesity cluster. Registering an external target allows you to archive data outside of your on-premises Cohesity environment.

  1. Access the UI for Cohesity DataPlatform by pointing your browser to the IP address of any one of the nodes in your on-premises Cohesity cluster.
  2. Click External Targets under the Platform tab to access the external target registration page.

    Registering Cloud Storage as an external target.

  3. Register Google Nearline as a target by filling in the following fields: Bucket Name, Project ID, Client Email Address, and Client Private Key.

  4. Enable Encryption to send and store data in an encrypted format.

  5. Enable Compression to send and store the data in a compressed format.

  6. Click Register to create the new external target.
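
If the Client Email Address and Client Private Key come from a Google Cloud service account, the values for step 3 can be read from that account's JSON key file, which contains project_id, client_email, and private_key fields. The following Python sketch assumes that setup; the function name registration_fields, the bucket name, and the sample key values are hypothetical.

```python
import json


def registration_fields(key_json: str, bucket_name: str) -> dict:
    """Map a service account JSON key onto the external target form fields."""
    key = json.loads(key_json)
    return {
        "Bucket Name": bucket_name,
        "Project ID": key["project_id"],
        "Client Email Address": key["client_email"],
        "Client Private Key": key["private_key"],
    }


if __name__ == "__main__":
    # Hypothetical key contents; a real key file holds more fields than this.
    sample_key = json.dumps({
        "project_id": "example-project",
        "client_email": "archiver@example-project.iam.gserviceaccount.com",
        "private_key": "-----BEGIN PRIVATE KEY-----\n(redacted)\n-----END PRIVATE KEY-----\n",
    })
    fields = registration_fields(sample_key, "example-archive-bucket")
    for name, value in fields.items():
        # Avoid printing the private key itself.
        shown = "(hidden)" if name == "Client Private Key" else value
        print(f"{name}: {shown}")
```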

Step 2: Create Protection Policy to establish archival to Cloud Storage

Next you create a Protection Policy to define how your virtual/physical servers, databases, and unstructured data will be protected; how frequently they will be backed up; and how long the backups will be retained. The Protection Policy allows you to incorporate the Cloud Storage External Target you created as an archive target with your chosen retention period, which is typically several years.

  1. To create a new Protection Policy, on the Protection tab, click Policy Manager. The following example creates a policy to archive data to Google Nearline once a month and retain the data in the cloud for 730 days (2 years).

    Protection drop-down list with Policy Manager selected.
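
To see what the example policy implies over time, the following Python sketch computes monthly archive dates and the date each archive leaves its 730-day retention window. It is a model for reasoning about the schedule, not a Cohesity API: the function names are hypothetical, and running the archive on the first day of each month is an assumption.

```python
from datetime import date, timedelta

RETENTION_DAYS = 730  # retention period from the example policy


def monthly_archive_dates(year: int, month: int, count: int) -> list:
    """Return `count` archive dates, one on the first day of each month."""
    dates = []
    for i in range(count):
        index = (month - 1) + i  # zero-based month offset from January of `year`
        dates.append(date(year + index // 12, index % 12 + 1, 1))
    return dates


def retention_end(archive_date: date, retention_days: int = RETENTION_DAYS) -> date:
    """Return the date on which an archive's retention period ends."""
    return archive_date + timedelta(days=retention_days)


if __name__ == "__main__":
    for archived in monthly_archive_dates(2024, 11, 3):
        print(archived, "retained until", retention_end(archived))
```

Because the retention period is counted in days, 730 days does not always match two calendar years exactly when a leap day falls inside the window.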

Step 3: Recover data from Cloud Storage

Cohesity DataPlatform includes an indexing engine that enables rapid search and recovery of your files and virtual machines from backups stored both on-premises and in Cloud Storage. As virtual machines and physical servers are backed up, Cohesity’s indexing engine opens the underlying files and indexes the metadata, enabling fast wildcard searches whose results are used for near-instantaneous granular restores. This can greatly improve the RTO and RPO compared to traditional data-protection architectures.

  1. To search and recover a file, on the Protection tab, click Recovery to open the search screen.

    Protection drop-down list with Recovery selected.

  2. Enter the search term *SQL* to list the virtual machines and jobs that match. In the figure below, SQL is selected and added to the cart by clicking Add To Cart.

    Search results of virtual machines and jobs that match the query

  3. Click Continue. The image below shows a relevant snapshot from Cloud Storage, indicated by the cloud icon on the right.

    Snapshot of Cloud Storage.

  4. Click Save to select the specific snapshot to recover from. You can recover to the original VM location or to a new location. Additional naming, networking, and power-on options are also available.

    Recovery options for recovered VMs.
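
The wildcard search in step 2 can be pictured with a short Python sketch that matches a pattern such as *SQL* against a list of indexed names. This illustrates wildcard matching in general, using Python's standard fnmatch module, and is not how Cohesity's indexing engine works internally. The object names are hypothetical, and treating the search as case-insensitive is an assumption.

```python
from fnmatch import fnmatchcase


def search(names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    # Assumption: matching ignores case, so both sides are lowercased.
    lowered = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), lowered)]


if __name__ == "__main__":
    # Hypothetical names of indexed virtual machines and jobs.
    indexed = ["SQL-Prod-01", "web-01", "mysql-backup-job", "fileserver"]
    for match in search(indexed, "*SQL*"):
        print(match)
```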

In conclusion, organizations can use the integrated data protection capabilities of Cohesity DataPlatform for physical and virtual environments to consolidate disparate hardware and software elements. Using Cohesity DataPlatform with Cloud Storage can provide a robust data protection strategy that delivers faster backup and recovery and better economics than incumbent solutions.

What’s next

See additional documentation on the Cohesity website.

Try out other Google Cloud Platform features for yourself. Have a look at our tutorials.
