How Google Does It: How we secure our own cloud
Seth Vargo
Distinguished Software Engineer
Seth Rosenblatt
Security Editor, Google Cloud
Ever wondered how Google does security? As part of our new "How Google Does It" series, we'll share insights, observations, and top tips about how Google approaches some of today's most pressing security topics, challenges, and concerns, straight from Google experts. In this edition, Seth Vargo, distinguished software engineer responsible for Google's use of the public cloud, shares a peek under the hood at how Google uses and secures its own cloud environments.
It probably comes as no surprise that we use our own public cloud at Google — a lot. With many millions of Google Cloud projects, Google is one of the largest users of Google Cloud.
Our goal is to enable our developers with best-in-class technology, regardless of whether it runs in our data centers or on a public cloud. That means we really have it all, from purpose-built infrastructure and private cloud to the full gamut of highly scalable modern cloud technologies such as containers and serverless cloud computing.
We want to move quickly to deliver value, but we also have a responsibility to do it safely and securely to protect our customers. So, how do we do it? Let's take a look at some of the security practices and controls that help us secure our use of the public cloud.
Controlling access
Access control plays a central role in helping us secure our cloud environments and resources. We have an integration layer that sits on top of Google Cloud, providing governance, abstraction layers, and API compatibility. For our internal systems, we use the built-in organization policies and write our own custom ones with the Organization Policy Service. These policies let us make and enforce decisions programmatically about what is permitted and disallowed, which helps us govern access and enforce compliance at the scale needed to support today's cloud users.
Because we can combine our own custom organization policies with our identity and access management (IAM) system, we have a lot of flexibility. We have policies governing how identities, for both people and workloads, make their way into cloud systems.
IAM policies provide both granular and coarse-grained access control over resources and services. In particular, IAM Deny policies have made it easier for us to set guardrails and define broad access rules at different levels in our technology stack without requiring reviews or changes to existing Allow rules.
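To make the guardrail idea concrete, here is a minimal sketch of deny-before-allow evaluation. It is a toy model, not the IAM API: the `Rule` class, the `is_allowed` function, the principals, and the permission names are all hypothetical, and real IAM policies carry far more detail (conditions, exceptions, inheritance through the resource hierarchy).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A simplified rule: the principals it covers and the permissions it names."""
    principals: frozenset
    permissions: frozenset

    def matches(self, principal: str, permission: str) -> bool:
        return principal in self.principals and permission in self.permissions


def is_allowed(principal, permission, deny_rules, allow_rules) -> bool:
    """Check deny rules first; allow rules only matter if no deny rule matched."""
    if any(rule.matches(principal, permission) for rule in deny_rules):
        return False
    return any(rule.matches(principal, permission) for rule in allow_rules)


# Hypothetical principals and permission names, for illustration only.
allow_rules = [
    Rule(frozenset({"dev@example.com", "admin@example.com"}),
         frozenset({"projects.get", "projects.delete"})),
]
# A guardrail added separately; the allow rule above is left untouched.
deny_rules = [
    Rule(frozenset({"dev@example.com"}), frozenset({"projects.delete"})),
]

print(is_allowed("dev@example.com", "projects.get", deny_rules, allow_rules))
print(is_allowed("dev@example.com", "projects.delete", deny_rules, allow_rules))
print(is_allowed("admin@example.com", "projects.delete", deny_rules, allow_rules))
```

The point of the sketch is that the deny rule narrows what the developer can do without anyone having to review or edit the existing allow rule.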
Understanding security threats and limiting our attack surface
When using the public cloud, we always start with threat modeling: we review use cases, identify specific threats, and then curate controls in each environment to match that threat model. We also benefit from Google Threat Intelligence, which comprises Mandiant, Threat Analysis Group (TAG), and VirusTotal, and which defends Google, our users, and our customers.
Our threat models vary by workload type. These range from experimental to production workloads that interact with sensitive data, such as health information, personally identifiable information (PII), and more.
We also lean heavily on our resource hierarchy, enforcing different organization policies at different levels of the hierarchy, depending on the threat model. This segmentation enables us to appropriately restrict access to data and minimize risks while still leaving room to experiment.
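As a rough illustration of policies set at different levels of a hierarchy, the sketch below resolves a boolean constraint by walking from a project up through its ancestors and taking the nearest setting. The `Node` class, the `effective_policy` function, the node names, and the `block_public_ips` constraint are all hypothetical, and the real Organization Policy Service has richer inheritance rules than this nearest-ancestor model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A node in a simplified resource hierarchy (organization, folder, or project)."""
    name: str
    parent: Optional["Node"] = None
    # Constraint name -> enforced? Only set where this level overrides its parent.
    policies: Dict[str, bool] = field(default_factory=dict)


def effective_policy(node: Node, constraint: str, default: bool = False) -> bool:
    """Walk up from the node; the nearest level that sets the constraint wins."""
    current = node
    while current is not None:
        if constraint in current.policies:
            return current.policies[constraint]
        current = current.parent
    return default


# Hypothetical hierarchy: strict at the top, relaxed only for experiments.
org = Node("org", policies={"block_public_ips": True})
experimental = Node("experimental", parent=org, policies={"block_public_ips": False})
production = Node("production", parent=org)
sandbox_project = Node("sandbox-project", parent=experimental)
prod_project = Node("prod-project", parent=production)

print(effective_policy(sandbox_project, "block_public_ips"))
print(effective_policy(prod_project, "block_public_ips"))
```

In this model the production project inherits the organization-wide setting, while the sandbox project picks up the relaxed setting from its folder.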
As projects move closer to real-world production, we increase governance and limit the technologies available. For example, developers can experiment with an extensive range of solutions in our lower-level environments, but are much more governed and restricted in production environments. This helps balance rapid iteration and experimentation against maintaining a homogeneous set of production infrastructure solutions.
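One way to picture this narrowing is as a set of allowlists, one per environment tier, where each stricter tier permits only a subset of the tier before it. The tier names, the technology names, and both functions below are assumptions made for illustration; they do not describe Google's actual environments.

```python
# Hypothetical tiers and technology names, ordered from least to most governed.
ALLOWED_BY_TIER = {
    "experimental": {"containers", "serverless", "vms", "preview-services"},
    "staging": {"containers", "serverless", "vms"},
    "production": {"containers", "serverless"},
}
TIER_ORDER = ["experimental", "staging", "production"]


def can_use(tier: str, technology: str) -> bool:
    """Return True if the technology is on the allowlist for the given tier."""
    return technology in ALLOWED_BY_TIER[tier]


def tiers_narrow_monotonically() -> bool:
    """Each stricter tier may only allow a subset of the tier before it."""
    return all(
        ALLOWED_BY_TIER[stricter] <= ALLOWED_BY_TIER[looser]
        for looser, stricter in zip(TIER_ORDER, TIER_ORDER[1:])
    )


print(can_use("experimental", "preview-services"))
print(can_use("production", "preview-services"))
print(tiers_narrow_monotonically())
```

A check like `tiers_narrow_monotonically` could run whenever the allowlists change, so that nothing becomes available in production that was never available in a lower tier.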
Driving continuous innovation
Because Google is one of the biggest users of Google Cloud services, we are constantly testing our products. This additional testing doesn’t stay siloed, of course: We use it to improve our software and services, and ultimately our customers reap the benefits from our use.
We strive to go beyond secure by default and build security in such a way that it's extremely difficult for our engineers to take an action that could be considered insecure or non-compliant. One way we're working toward this is by building more agile security that helps us move fast and deliver features faster.
For example, we leverage Infrastructure as Code (IaC) tooling as much as possible, from configuring organization policies to deploying and managing infrastructure.
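The core idea behind most IaC tools is declarative: describe the desired state, compare it with what actually exists, and derive the changes. The sketch below shows that comparison in its simplest form. The `plan` function and the resource names are hypothetical and stand in for a generic pattern, not for any tooling Google uses internally.

```python
from typing import Dict, List, Tuple


def plan(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Return (action, resource_name) pairs that move the actual state toward the desired one."""
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical resources: the desired state lives in version control,
# the actual state is what is currently deployed.
desired = {
    "policy/block-public-ips": {"enforced": True},
    "bucket/audit-logs": {"location": "us"},
}
actual = {
    "policy/block-public-ips": {"enforced": False},
    "vm/forgotten-test": {"size": "small"},
}

for action, name in plan(desired, actual):
    print(action, name)
```

Because the desired state is plain data, a change to an organization policy can go through the same review and history as any other code change before it is applied.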
As both the provider and a customer of Google Cloud, we are in the unique position to test the breadth and depth of our own products and services in real-time. Given the sheer number of our use cases and users, we often hit barriers and limitations that have yet to impact other organizations.
What we learn from our experiences as users of our products has driven us to create solutions to challenges ahead of the curve, enabling us to build a secure future that works for Google scale — and everyone else as well.
This article includes insights from the Cloud Security Podcast episode, "Google on Google Cloud: How Google Secures Its Own Cloud."