
Managing carbon emissions across clouds: a practical example

June 9, 2023
Cameron Casher

Senior Software Engineer & Clean Tech Strategist, Thoughtworks

Mark Richter

Cloud Specialist, Thoughtworks

As climate change continues to be a growing global concern, it is essential for organizations to take steps towards sustainability. One significant area where businesses can make a positive impact is by reducing their cloud carbon emissions. At Thoughtworks, we estimate our carbon footprint across clouds using the open source Cloud Carbon Footprint (CCF) software, and we use Google Cloud services to build the supporting infrastructure efficiently. CCF complements Google Cloud Carbon Footprint, which is the preferred tool for more accurate and granular emissions data within Google Cloud.

What are we doing?

Providing real-time energy and emissions metrics internally to help create a culture of sustainability and expedite a remediation plan.

Our approach is built on the Learn, Measure, and Reduce phases that the Green Software Foundation (GSF) identifies in its Green Software Fundamentals. These phases cover understanding the emissions drivers and their relationships to each other; measuring and accessing the right data; and implementing the necessary optimization strategies to reduce an organization’s carbon footprint.

To make sense of emissions metrics and why certain cloud services may be more or less carbon-intensive, it is important to have a fundamental understanding of green software principles, emissions drivers and the broad domain of sustainability in tech. Principles ranging from carbon efficiency to energy proportionality are essential for our platform engineers to understand when looking to optimize their Google Cloud infrastructure.

We believe it is essential for organizations to have access to real-time energy and emissions metrics. The key here is to ensure that data on cloud cost, usage, and carbon footprint is available historically and is refreshed daily, so that our platform teams have both context and short feedback loops for the decisions they make.

Being able to view carbon emissions data in custom charts and dashboards not only allows for casual monitoring, but can also provide clarity for a number of different personas ranging from practitioners who have direct control over their cloud infrastructure, to executive level decision makers who can use the visualizations to tell a compelling story of opportunities to cut cloud costs and emissions.

After obtaining access to real-time data and enabling casual monitoring via dashboards, the natural next step is to analyze spikes and trends more deeply and, ultimately, to identify opportunities to mitigate emissions and lower costs. We have found that tagging or labeling cloud resources lets practitioners assess cost and emissions impact at a more granular, resource level, and connect carbon to meaningful groupings in order to identify exact pain points or opportunity areas.

Through sufficient analysis, our Thoughtworks teams are able to consider ways to reduce their cloud carbon footprint, and identify remediation strategies that would be most fitting for their objectives, requirements, and infrastructure needs.

https://storage.googleapis.com/gweb-cloudblog-publish/images/Image_1_HBE3o7n.max-1000x1000.jpg

How are we doing it?

Using Google's carbon reporting data, Google Cloud services to ingest that data, and open-source software to estimate and visualize meaningful insights.

CCF is an open source tool that helps you measure and analyze the carbon emissions and energy usage associated with your use of Google Cloud and other cloud service providers. It does this by estimating the energy use and emissions of your Google Cloud resources, such as Compute Engine instances, Cloud Storage buckets, and Cloud SQL databases.

To integrate CCF with Google Cloud, you configure the billing project to export billing and usage data to BigQuery. CCF then runs queries on that data to filter for specific time frames and other relevant information. You can reference the CCF documentation for a more detailed explanation.
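To make the "filter for specific time frames" step more concrete, here is a minimal sketch of the kind of query that could be run against a billing export table. It is not the query CCF itself runs. The column names follow the standard Cloud Billing export schema, while the function name and the table name in the usage note are hypothetical.

```python
from datetime import date


def build_usage_query(table: str, start: date, end: date) -> str:
    """Return a Standard SQL query that summarizes billing-export usage per day.

    `table` is a fully qualified BigQuery table name. It is interpolated
    directly, so it should come from trusted configuration, never user input.
    """
    if end <= start:
        raise ValueError("end must be after start")
    # Half-open window [start, end): a single day is start=day, end=day+1.
    return f"""
SELECT
  DATE(usage_start_time) AS usage_date,
  project.id AS project_id,
  service.description AS service_name,
  location.region AS region,
  usage.unit AS usage_unit,
  SUM(usage.amount) AS usage_amount,
  SUM(cost) AS cost
FROM `{table}`
WHERE usage_start_time >= TIMESTAMP('{start.isoformat()}')
  AND usage_start_time < TIMESTAMP('{end.isoformat()}')
GROUP BY usage_date, project_id, service_name, region, usage_unit
ORDER BY usage_date
""".strip()
```

For example, `build_usage_query("my-billing-project.billing.export_table", date(2023, 3, 4), date(2023, 3, 5))` (a made-up table name) produces a query covering March 4 only.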

We use several Google Cloud services to pipeline and visualize our carbon emissions data. These include:

  • App Engine is a fully managed platform that makes it easy to deploy and scale web applications. We use App Engine to run our CCF application.

  • Cloud Scheduler is a cron-like service that allows you to schedule jobs to run on a recurring basis. We use Cloud Scheduler to schedule a job to run every day at midnight. This job triggers a Pub/Sub message.

  • Pub/Sub is a messaging service that allows you to send and receive messages between applications. We use Pub/Sub to send the message from Cloud Scheduler to a Cloud Function.

  • Cloud Functions is a serverless platform that makes it easy to run code without having to provision or manage servers, and whose scale-to-zero feature mitigates cost. We use Cloud Functions to automate the daily requests that are used to collect our carbon emissions data for Google Cloud and Amazon Web Services.

  • BigQuery is a serverless, highly scalable data warehouse that makes it easy to store and analyze large amounts of data. We use BigQuery to store our carbon emissions data. 

  • Looker Studio is a business intelligence platform that makes it easy to create interactive dashboards and reports. We use Looker Studio to visualize our carbon emissions data.

https://storage.googleapis.com/gweb-cloudblog-publish/images/Image_2_9mif26G.max-700x700.jpg

Cloud Scheduler triggers a Pub/Sub message daily. That message is consumed by a Cloud Function that hits the CCF API with a query for one day’s worth of data. This function pushes this data upstream, appending it to a BigQuery table. That table is then consumed by Looker Studio and joined with billing export data for many different visualizations.
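The core of that daily job can be sketched roughly as follows. This is an illustration under stated assumptions, not our production function: the `/api/footprint` path, its `start`, `end` and `groupBy` parameters, and the response field names reflect our reading of the CCF API and should be checked against the CCF documentation. All function names are hypothetical, and the BigQuery write is passed in as a callable so the sketch needs only the standard library.

```python
import json
from datetime import date, timedelta
from typing import Callable, Dict, Iterable, List, Tuple
from urllib.parse import urlencode
from urllib.request import urlopen


def previous_day_window(today: date) -> Tuple[date, date]:
    """Return (start, end) dates for the day before `today`."""
    return today - timedelta(days=1), today


def build_footprint_url(base_url: str, start: date, end: date) -> str:
    """Build the request URL. Path and parameter names are assumptions."""
    query = urlencode(
        {"start": start.isoformat(), "end": end.isoformat(), "groupBy": "day"}
    )
    return f"{base_url.rstrip('/')}/api/footprint?{query}"


def flatten_estimates(payload: Iterable[Dict]) -> List[Dict]:
    """Turn the nested response (assumed shape) into one flat row per service."""
    rows = []
    for result in payload:
        for estimate in result.get("serviceEstimates", []):
            rows.append(
                {
                    "timestamp": result.get("timestamp"),
                    "cloud_provider": estimate.get("cloudProvider"),
                    "account_name": estimate.get("accountName"),
                    "service_name": estimate.get("serviceName"),
                    "region": estimate.get("region"),
                    "kilowatt_hours": estimate.get("kilowattHours"),
                    "co2e": estimate.get("co2e"),
                    "cost": estimate.get("cost"),
                }
            )
    return rows


def fetch_json(url: str) -> List[Dict]:
    """Fetch and decode a JSON response using only the standard library."""
    with urlopen(url, timeout=60) as response:
        return json.load(response)


def run_daily_import(
    base_url: str,
    today: date,
    append_rows: Callable[[List[Dict]], None],
    fetch: Callable[[str], List[Dict]] = fetch_json,
) -> int:
    """Fetch one day of estimates and hand the flat rows to `append_rows`."""
    start, end = previous_day_window(today)
    rows = flatten_estimates(fetch(build_footprint_url(base_url, start, end)))
    if rows:
        # In a deployed function, append_rows could wrap a BigQuery client
        # call such as Client.insert_rows_json(table, rows).
        append_rows(rows)
    return len(rows)
```

A Pub/Sub-triggered function's entry point would call `run_daily_import` with today's date. A Cloud Scheduler job with the cron expression `0 0 * * *` matches the "every day at midnight" schedule described above.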

What are the results of what we’re doing? 

Grabbing cloud carbon estimations through CCF open source software and using Google Cloud services

After enabling a centralized team at Thoughtworks to consume CCF data using cloud billing data, dashboards were made broadly available for internal teams so they could view and monitor their historic and daily cloud emissions metrics. This newfound transparency and access to real-time data enabled teams to identify spikes and trends, and to automate recurring emissions reports to notify teams of any remedial actions they needed to take.

Let’s look at an example. In the following chart we can see three usage spikes and four emissions spikes. We can conclude at least two things from this. First, cost of usage is not a perfect proxy for emissions: there is a big spike in emissions in early March with no corresponding rise in cost. Second, what happened in February and March deserves a closer look. We took that closer look, which is how we were able to resolve the spike.
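Spotting an emissions spike that has no matching cost spike can also be automated. The sketch below is one simple heuristic, not the method behind our dashboards: it flags a day when its value exceeds a multiple of the median of the preceding days. The function names, window size and threshold are illustrative assumptions.

```python
from statistics import median
from typing import List, Sequence


def find_spikes(
    values: Sequence[float], window: int = 7, factor: float = 2.0
) -> List[int]:
    """Return indexes whose value exceeds `factor` times the median
    of the `window` values immediately before them."""
    spikes = []
    for i in range(window, len(values)):
        baseline = median(values[i - window:i])
        if baseline > 0 and values[i] > factor * baseline:
            spikes.append(i)
    return spikes


def spikes_without_cost(
    emissions: Sequence[float],
    cost: Sequence[float],
    window: int = 7,
    factor: float = 2.0,
) -> List[int]:
    """Return days that spike in emissions but not in cost."""
    emission_spikes = set(find_spikes(emissions, window, factor))
    cost_spikes = set(find_spikes(cost, window, factor))
    return sorted(emission_spikes - cost_spikes)
```

Run over daily emissions and cost series of equal length, `spikes_without_cost` returns the day indexes worth drilling into first, because cost alone would not have surfaced them.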

https://storage.googleapis.com/gweb-cloudblog-publish/images/image_3_CffVZHG.max-1400x1400.jpg

Let’s use a chart that shows us emissions by cloud product over time. Narrowing the scope to Mar 4 through Mar 12 gives us this:

https://storage.googleapis.com/gweb-cloudblog-publish/images/image_4_6AKRdGV.max-1400x1400.jpg

This is the time span covering the tallest point of our largest spike in emissions. We can see the product contributing most of our emissions bulge is Cloud Composer. Looking deeper we learn that three Google Cloud projects are contributing 90% of this. We reached out to the teams responsible for these projects. Tuning commenced.
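The "which projects account for most of this" question is a cumulative-share calculation. As a minimal sketch, assuming emissions have already been summed per project (the function name and input shape are hypothetical):

```python
from typing import Dict, List, Tuple


def top_contributors(
    emissions_by_project: Dict[str, float], share: float = 0.9
) -> List[Tuple[str, float]]:
    """Return the largest projects, in descending order, whose combined
    emissions reach `share` of the total, with each project's own fraction."""
    total = sum(emissions_by_project.values())
    if total <= 0:
        return []
    ranked = sorted(
        emissions_by_project.items(), key=lambda item: item[1], reverse=True
    )
    selected = []
    running = 0.0
    for name, value in ranked:
        selected.append((name, value / total))
        running += value
        # Stop as soon as the selected projects cover the requested share.
        if running >= share * total:
            break
    return selected
```

The result is a short list of projects to contact first, which mirrors how we narrowed the Cloud Composer spike down to the responsible teams.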

Looking at the situation four weeks later, we find a much narrower gap between Composer and Compute Engine. Estimated emissions from the use of Composer have been reduced substantially.

https://storage.googleapis.com/gweb-cloudblog-publish/images/image_5_pGa9WcG.max-1400x1400.jpg

Picture a more sustainable organization

We believe that Google Cloud can be a powerful resource for organizations looking to reduce their carbon footprint. When choosing a measurement tool, we believe that the open-source CCF is great for comparing results across clouds, and that it allows access to real-time data by leveraging average values from open data sources. The Google Cloud Carbon Footprint tool, in turn, has the best accuracy for Google’s carbon estimations, with access to true machine-level power consumption data. We view the tools as fundamentally different but complementary, and we encourage you to learn more about Cloud Carbon Footprint and to explore the other sustainability solutions that Google Cloud offers.
