Infrastructure Modernization

How to set up Google Cloud VMware Engine regional disaster recovery with VMware Site Recovery Manager

March 12, 2024
Jake Wells

Cloud Consultant

Raj Jethnani

Solutions Engineer

As a VMware admin, you understand the importance of business continuity and minimizing downtime. Google Cloud VMware Engine (GCVE) customers can choose among several disaster recovery tools, such as GCVE Protected, Zerto, Veeam replication, VMware Site Recovery Manager (SRM), and other third-party products, based on their recovery time objective (RTO) and recovery point objective (RPO) needs. Of these, VMware SRM is a popular way to enable disaster recovery (DR) in GCVE multi-region deployments. In this blog post, we present a guide to setting up SRM within GCVE, enabling failover and failback of your VMs between Google Cloud regions for DR purposes.

Architecture

Before we dive into implementation, let’s take a moment to review the architecture of GCVE and SRM. The following diagram shows an overview of GCVE Private Clouds configured with an SRM deployment:

https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_Design.max-1100x1100.png

This architecture deploys a primary Private Cloud (PC) in one region and a DR PC in a separate region, with the two PCs connected over a Standard VMware Engine Network. The SRM and vSphere Replication appliances are deployed to a service subnet within each PC, which provides faster networking than deploying the appliances on a network segment of the NSX-T Tier-1 router. Cloud DNS resolves the names of the SRM and vSphere Replication appliances deployed within the PCs.

Prerequisites

Before deploying the solution there are a few prerequisite steps we need to ensure are completed:

Set up disaster recovery for SRM

Now let's jump in and deploy SRM to enable regional site recovery for GCVE.

1. Assign address ranges for the Service-1 subnet (or your preferred service subnet).

https://storage.googleapis.com/gweb-cloudblog-publish/images/2_-_Subnets.max-1700x1700.png
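Before assigning a range to Service-1, it can help to confirm that the range does not overlap with ranges already in use in your environment. The following is a minimal sketch using Python's standard `ipaddress` module. All CIDR values are hypothetical placeholders for illustration, not ranges taken from this deployment.

```python
import ipaddress


def find_overlaps(candidate, existing):
    """Return the CIDR ranges in `existing` that overlap `candidate`."""
    cand = ipaddress.ip_network(candidate)
    return [
        cidr for cidr in existing
        if cand.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical values: replace with the ranges used in your own environment.
service_1_range = "10.50.10.0/24"
ranges_in_use = ["10.50.0.0/24", "10.50.10.128/25", "192.168.0.0/16"]

conflicts = find_overlaps(service_1_range, ranges_in_use)
if conflicts:
    print(f"{service_1_range} overlaps with: {', '.join(conflicts)}")
else:
    print(f"{service_1_range} does not overlap any listed range")
```

An empty result only means the candidate is clear of the ranges you listed, so the check is only as complete as that list.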

2. Create a port group in vCenter for the subnet; this requires a solution user account. (You’ll need the VLAN ID of the Service-1 subnet to create the port group.)

https://storage.googleapis.com/gweb-cloudblog-publish/images/3_-_Port_Group.max-1100x1100.png

3. Repeat the service subnet creation in the DR PC, following the instructions above.

4. Create private DNS zones within Cloud DNS for record lookups of the SRM and vSphere Replication appliances. Attach the DNS zones to the customer-controlled VPC. SRM requires both forward and reverse lookups between the vCenter, SRM, and vSphere Replication appliances at both sites.

https://storage.googleapis.com/gweb-cloudblog-publish/images/4_-_PrivateZone.max-1400x1400.png

5. Create forward and reverse DNS records for the appliances within each zone.

Forward-lookup record:

https://storage.googleapis.com/gweb-cloudblog-publish/images/5_-_Fwd_Lookup.max-1300x1300.png

Reverse-lookup record:

https://storage.googleapis.com/gweb-cloudblog-publish/images/6_-_Rvrse_Lookup.max-1300x1300.png
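Reverse-lookup record names are easy to mistype because the octets are written in reverse order. As an illustration, the sketch below derives the PTR record name and the matching reverse zone name from an appliance IP address using Python's standard `ipaddress` module. The IP address, hostname, and domain are hypothetical, and the zone helper assumes an IPv4 range with a /8, /16, or /24 prefix.

```python
import ipaddress


def ptr_record(ip, fqdn):
    """Return (PTR record name, PTR target) for an appliance IP and FQDN."""
    name = ipaddress.ip_address(ip).reverse_pointer + "."
    target = fqdn.rstrip(".") + "."
    return name, target


def reverse_zone_for(cidr):
    """Return the reverse-lookup zone name for a /8, /16, or /24 IPv4 range."""
    net = ipaddress.ip_network(cidr)
    if net.version != 4 or net.prefixlen not in (8, 16, 24):
        raise ValueError("this sketch only handles IPv4 /8, /16, and /24 ranges")
    octets = str(net.network_address).split(".")[: net.prefixlen // 8]
    return ".".join(reversed(octets)) + ".in-addr.arpa."


# Hypothetical appliance address and name, for illustration only.
name, target = ptr_record("10.50.10.10", "srm-a.gcve.example.internal")
print(f"zone:   {reverse_zone_for('10.50.10.0/24')}")
print(f"record: {name} PTR {target}")
```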

6. Update the binding for the private zones to the intranet VPC of the VMware Engine network. This enables DNS resolution for the appliances in the PCs. Once the bindings have been created, you can use the DNS server address within each private cloud as the DNS address when configuring the SRM and vSphere Replication appliances.

https://storage.googleapis.com/gweb-cloudblog-publish/images/7_-_PC.max-1800x1800.png
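Since SRM depends on both forward and reverse lookups, it may be worth confirming that each name resolves in both directions before deploying the appliances. The following is a minimal sketch using Python's standard `socket` module, intended to be run from a machine that uses the same DNS servers as the private cloud. The function name `check_dns` is our own, and the resolver functions are passed in as parameters so the logic can be exercised without a live DNS server.

```python
import socket


def check_dns(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Check that fqdn resolves to an IP and that the IP resolves back to fqdn.

    Returns a tuple (ok, detail).
    """
    try:
        ip = forward(fqdn)
    except OSError as exc:
        return False, f"forward lookup of {fqdn} failed: {exc}"
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        name = reverse(ip)[0]
    except OSError as exc:
        return False, f"reverse lookup of {ip} failed: {exc}"
    if name.rstrip(".").lower() != fqdn.rstrip(".").lower():
        return False, f"{ip} resolves back to {name}, expected {fqdn}"
    return True, f"{fqdn} <-> {ip}"
```

For example, calling `check_dns("srm-a.gcve.example.internal")` (a hypothetical name) for the vCenter, SRM, and vSphere Replication hostnames at each site would flag any record that is missing or mismatched.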

7. Log in to vCenter as a user with permissions similar to the CloudOwner role and deploy the vSphere Replication appliance OVA. Be sure to select the distributed port group that you created on the Service-1 subnet as the network. Repeat this at the DR site.

8. After successful deployment of the vSphere Replication appliances, log in and perform the configuration. If you used the short name for the appliance hostname during deployment, make sure to update the network settings in the configuration with the FQDN of the appliance. You will also need to use a solution user account for connecting to vCenter in the configuration. Perform the configuration at both sites.

9. After configuration, log in to vCenter and navigate to Menu -> Site Recovery Manager and validate that vSphere Replication status is OK at both sites.

10. Log in to vCenter as a user with permissions similar to the CloudOwner role and deploy the SRM appliance OVA. Be sure to select the distributed port group that you created on the Service-1 subnet as the network. Repeat this at the DR site.

11. After successful deployment of the SRM appliances, log in and perform the configuration. If you used the short name for the appliance hostname during deployment, make sure to update the network settings in the configuration with the FQDN of the appliance. You will also need to use a solution user account for connecting to vCenter in the configuration. Perform the configuration at both sites.

12. After configuration, log in to vCenter and navigate to Menu -> Site Recovery Manager and validate that Site Recovery Manager status is OK at both sites.
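If an appliance does not report an OK status, a basic network reachability test between sites can help narrow down the cause. The sketch below uses Python's standard `socket` module to test whether a TCP port accepts connections. The hostnames are hypothetical, and port 443 is used only as an assumed example; check the VMware documentation for the ports your SRM and vSphere Replication versions actually require.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(targets, port=443):
    """Print reachability for each host; `port` is an assumption, adjust as needed."""
    for host in targets:
        state = "reachable" if port_open(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

For instance, `report(["srm-a.gcve.example.internal", "srm-b.gcve.example.internal"])` would test one appliance at each site. A successful connection only shows that the port is reachable; it does not confirm that the service behind it is healthy.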

Now that the vSphere Replication and SRM appliances have been deployed and their status has been verified at both sites, you should be able to go into SRM to perform the site pairing between the two regions and configure the replication of VMs.

By following this guide, you should now be able to leverage the combined power of VMware SRM and Google Cloud VMware Engine to build a robust and reliable disaster recovery solution for your Google Cloud vSphere environments. Don't wait for a disaster to strike. Mitigate risk and protect your critical workloads by implementing disaster recovery on Google Cloud VMware Engine with SRM today.
