Go green: Sustainable disaster recovery using Google Cloud
Grace Mollison
Head Cloud Solutions Architects EMEA
At Google, we’re dedicated to building technology that helps people do more for the planet, and to fostering sustainability at scale. We continue to be the world’s largest corporate purchaser of renewable energy, and in September 2020 we made a commitment to operate on 24/7 carbon-free energy in all our data centers and campuses worldwide by 2030.
As we’ve shared previously, our commitment to a sustainable future for the earth takes many forms. This includes empowering our partners and customers to establish a disaster recovery (DR) strategy with zero net operational carbon emissions, regardless of where their production workload is.
In this post, we’ll explore carbon considerations for your disaster recovery strategy, how you can take advantage of Google Cloud to reduce net carbon emissions, and three basic scenarios that can help optimize the design of your DR failover site.
Balancing your DR plan with carbon emissions considerations: It’s easier than you think
A DR strategy entails the policies, tools, and procedures that enable your organization to support business-critical functions following a major disaster, and recover from an unexpected regional failure. Sustainable DR, then, means running your failover site (a standby computer server or system) with the lowest possible carbon footprint.
From a sustainability perspective, we frequently hear that organizations have trouble balancing a robust DR approach with carbon emissions considerations. In order to be prepared for a crisis, they purchase extra power, cooling, and backup servers, and staff an entire facility, all of which sit idle during normal operations.
In contrast, Google Cloud customers can lower their carbon footprint by running their applications and workloads on a cloud provider that has procured enough renewable energy to offset the operational emissions of its usage. In terms of traditional DR planning, Google Cloud customers don’t have to worry about capacity (securing enough resources to scale as needed) or the facilities and energy expenditure associated with running equipment that may only be needed in the event of a disaster.
When it comes to implementing a DR strategy using Google Cloud, there are three basic scenarios. To help guide your DR strategy, here's a look at what those scenarios are, plus resources and important questions to ask along the way.
1. Production on-premises, with Google Cloud as the DR site
If you operate your own data centers or use a non-hyperscale data center, like many operated by hosting providers, some of the energy efficiency advantages that can be achieved at scale might not be available to you. For example, an average data center uses almost as much non-computing or "overhead" energy (such as cooling and power conversion) as it does to power its servers.
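The overhead comparison above is commonly expressed as power usage effectiveness (PUE): total facility energy divided by the energy delivered to computing equipment. Here is a minimal Python sketch. The readings are made-up values for illustration only, not measurements from any real facility. It shows that overhead roughly equal to server energy corresponds to a PUE close to 2.

```python
def pue(it_energy_kwh: float, overhead_energy_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT energy."""
    if it_energy_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return (it_energy_kwh + overhead_energy_kwh) / it_energy_kwh


# Hypothetical readings, chosen only to illustrate the ratio.
server_kwh = 1000.0
overhead_kwh = 950.0  # "almost as much" overhead as server energy

print(round(pue(server_kwh, overhead_kwh), 2))  # prints 1.95
```

A PUE of 1.0 would mean no overhead at all, so the closer a facility gets to 1.0, the less energy is spent on anything other than computing.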
Creating a failover site on-premises means not only are you running data centers that are not optimized for energy efficiency, but you are operating idle servers in a backup location that is consuming electricity with associated carbon emissions that are likely not offset. When designing your DR strategy, you can avoid increasing your carbon footprint by using Google Cloud as the target for your failover site.
You could create your DR site on Google Cloud by replicating your on-prem environment. Replicating environments means that your DR failover site can directly take advantage of Google Cloud's carbon-neutral data centers, which offset the energy consumption and costs of running a DR site on-prem. However, if you simply replicate your on-prem environment, you miss an opportunity to optimize how your DR site consumes electricity. Google Cloud will offset all of the emissions of a DR site running on our infrastructure, but to truly operate at the lowest possible carbon footprint, you should optimize the way you configure your DR failover environment on Google Cloud.
To do that, there are three patterns—cold, warm, and hot—that can be implemented when your application runs on-prem and your DR solution is on Google Cloud. Get an in-depth look at those patterns here.
The graph below illustrates how the pattern chosen relates to your "personal" energy use. In this context, we define "personal" energy use as the energy wasted on idle resources.
Optimizing your personal energy use consists of more than offsetting where you run your DR site. It involves thinking about your DR strategy carefully beyond taking the simplest “let's just replicate everything” approach. Some of the important questions you need to ask include:
Are there some parts of your application that can withstand a longer recovery time objective (RTO) than others?
Can you make use of Google Cloud storage as part of your DR configuration?
Can you get closer to a cold DR pattern, and thus optimize your personal energy consumption?
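One way to work through these questions is to list each part of your application with its RTO and pick the least resource-hungry pattern that still meets it. The Python sketch below is only an illustration: the threshold values and component names are hypothetical, and your own cutoffs will depend on your recovery objectives.

```python
from dataclasses import dataclass

# Hypothetical thresholds; choose values that match your own recovery objectives.
HOT_MAX_RTO_MINUTES = 5
WARM_MAX_RTO_MINUTES = 240


@dataclass(frozen=True)
class Component:
    name: str
    rto_minutes: int  # longest tolerable time to recover this component


def choose_pattern(component: Component) -> str:
    """Return the least resource-hungry DR pattern that still meets the RTO."""
    if component.rto_minutes <= HOT_MAX_RTO_MINUTES:
        return "hot"
    if component.rto_minutes <= WARM_MAX_RTO_MINUTES:
        return "warm"
    return "cold"


# Hypothetical application components.
components = [
    Component("checkout-api", rto_minutes=2),
    Component("reporting-db", rto_minutes=120),
    Component("batch-archive", rto_minutes=1440),
]

for c in components:
    print(f"{c.name}: {choose_pattern(c)}")
```

Classifying components individually, rather than the application as a whole, tends to reveal parts that can move toward the cold pattern and so keep fewer idle resources running.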
The elephant in the room, though, is “What if I absolutely need to have resources when I need them? How do I know the resources will be there when I need them? How will this work if I optimize the design of my DR failover site on Google Cloud such that I have minimal resources running until I need them?”
In this situation, you should look into the ability to reserve Compute Engine zonal resources. This ensures resources are available for your DR workloads when you need them. Using reservations for virtual machines also means you can take advantage of discounting options (which we discuss later in this post).
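As a rough sketch of what a zonal reservation request might look like, the Python below builds the URL and JSON body for the Compute Engine API's reservations insert method without sending anything. The field names reflect our understanding of that REST resource and should be checked against the current API reference. The project, zone, reservation name, machine type, and VM count are hypothetical placeholders.

```python
import json


def reservation_request(project: str, zone: str, name: str,
                        machine_type: str, vm_count: int) -> tuple[str, dict]:
    """Build the URL and JSON body for a reservations insert call (no network I/O)."""
    if vm_count < 1:
        raise ValueError("vm_count must be at least 1")
    url = (
        "https://compute.googleapis.com/compute/v1/"
        f"projects/{project}/zones/{zone}/reservations"
    )
    body = {
        "name": name,
        "specificReservation": {
            # int64 fields are conventionally encoded as strings in JSON.
            "count": str(vm_count),
            "instanceProperties": {"machineType": machine_type},
        },
        # False lets VMs with matching properties consume the reservation
        # without targeting it by name.
        "specificReservationRequired": False,
    }
    return url, body


# Hypothetical values for illustration only.
url, body = reservation_request(
    project="my-dr-project",
    zone="europe-west1-b",
    name="dr-failover-reservation",
    machine_type="n2-standard-4",
    vm_count=10,
)
print(url)
print(json.dumps(body, indent=2))
```

The body would be sent as an authenticated POST to the printed URL. Keeping the request construction separate from the call makes it easy to review the reserved capacity alongside the rest of your DR plan.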
In summary, using Google Cloud as the target for your failover site can help immediately lower your net carbon emissions, and it's also important to optimize your DR configuration by asking the right questions and implementing the right pattern. Lastly, if your particular use case permits, consider migrating your on-prem workloads to Google Cloud altogether. This will enable your organization to really move the needle in terms of reducing its carbon footprint as much as possible.
2. Production on Google Cloud, with Google Cloud as the DR site
Running your applications and DR failover site on Google Cloud means there are zero net operational emissions to operate both your production application and the DR configuration.
From here, you want to focus on optimizing the design of your DR failover site on Google Cloud. The most optimal pattern depends on your use case.
For example, a full high availability (HA) configuration, or hot pattern, means you are using all your resources. There are no standby resources idling, and you are using what you need, when you need it, all the time. Alternatively, if your RTO does not require a full HA configuration, you can adopt a warm or cold pattern, in which you scale or spin up resources only as needed in the event of a disaster or major event.
Adopting a warm or cold pattern means all or some of the resources needed for DR are not in use until you need them. This may lead to the exact same questions we mentioned in scenario #1: What if I absolutely need to have resources when I need them in case of a disaster or major event? How do I know the resources will be there when I need them? How will this work?
As in the previous scenario, a simple solution is to reserve Compute Engine zonal resources so that capacity is available for your workloads when you need it. And since you’re running your production on Google Cloud, you can work with your Google Cloud sales representative to forecast your usage and take advantage of committed use discounts. With these, you purchase compute resources (vCPUs, memory, GPUs, and local SSDs) at a discounted price in return for committing to pay for those resources for one or three years. Committed use discounts are ideal for workloads with predictable resource needs.
Taking advantage of committed use discounts enables Google Cloud to use your forecasting to help ensure our data centers are optimized for what you need, when you need it—rather than Google Cloud over-provisioning and essentially running servers that are not optimally used. Sustainability is a balancing act between the power that is being consumed, what sort of power is in use, and the usage of the resources that are being powered by the data centers.
3. Production on another cloud, with Google Cloud as the DR site
As with running production on-prem, your overall carbon footprint is a combination of what you use outside of Google Cloud and what you’re running on Google Cloud (which is carbon neutral). If you’re running production on another cloud, you should investigate the sustainability characteristics of its infrastructure relative to your own sustainability goals. There are multiple ways to achieve carbon neutrality, and many providers are on different journeys towards their own sustainability goals. For the past three years, Google has focused on matching its electricity consumption with renewable energy, and in September 2020 it set a target to source carbon-free energy 24/7 for every data center. We believe these commitments will help our cloud customers meet their own sustainability targets.
Regardless of which scenario applies to your organization, using Google Cloud for DR is an easy way to lower your energy consumption. When Google Cloud says we partner with our customers, we really mean it. We meet our customers where they are, and we are grateful for our customers who work with us by forecasting their resource consumption so we know where to focus our data center expansion. Our data centers are designed to achieve net-zero emissions and are optimized for maximum utilization. The resulting benefits get passed to our customers, who in turn can lower their carbon footprint. When it comes to sustainability, we get more done when we work together.
Keep reading: Get more insights that can guide your journey toward 24x7 carbon-free energy. Download the free whitepaper, “Moving toward 24x7 Carbon-Free Energy at Google Data Centers: Progress and Insights.”