How Google Does It: Security programs at global scale

Royal Hansen
VP, Engineering for Privacy, Safety, and Security
Ever wondered how Google does security? As part of our “How Google Does It” series, we share insights, observations, and top tips about how Google approaches some of today’s most pressing security topics, challenges, and concerns, straight from Google experts. In this edition, Royal Hansen, vice-president of engineering, shares insights into Google’s internal security culture and how Google uses Secure by Design principles to grow security at enterprise scale.
When it comes to securing Google’s vast and complex infrastructure, there’s no way to scale a people-based operation. Given the size and scope of our systems, we simply don’t have the resources — let alone the organizational capacity or skill sets — to keep pace.
Attempting to grow security operations teams at the same rate as your assets and threats will eventually lead to a dead end. In some ways, our massive scale has encouraged us to avoid falling into the trap of throwing people at our problems, pushing us to invest in learning how to operate and make meaningful security improvements at scale.
So, what does this actually look like in practice? Here are the three core principles that influence how we approach scaling security at Google.
Embrace secure by design
One of the main pillars of our approach is making our technology secure by design. We solve many of our problems through design, baking security directly into our technical infrastructure and platform from inception to make technology as safe as possible before it reaches people. Notably, this has been true for us since the very early days of Google.
Emphasizing secure by design has helped us incorporate security mitigations earlier in the lifecycle of a technology, rather than bolting them on afterwards. In turn, this has helped reduce vulnerabilities and produce products and services that can automatically defend users against threats.
In particular, we find applying this approach to the developer ecosystem — tools, systems, and processes developers use to develop products — is one of the most effective ways to achieve high levels of safety and security.
One of the first things I noticed after joining Google in 2018 was that security mechanisms were built directly into our libraries, web and application frameworks, developer tooling, and production platforms, so that every developer had to use them.
At Google, we view security as an emergent property of software development. Every stage of the software development lifecycle, from design to implementation to deployment, has the potential to introduce risk.
Developers are users, too, and the reality is that many defects are unintentionally introduced in the underlying developer ecosystem of a product. To scale security successfully, it’s imperative to prioritize designing systems, tools, and processes that provide as little opportunity as possible for developers to make mistakes that could lead to future incidents and errors.
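As an illustration of tooling that leaves little room for mistakes, here is a minimal Python sketch of a library type that can only be built through escaping helpers. The names `SafeHtml`, `from_text`, `element`, and `render` are hypothetical and chosen for this example; they are not Google's actual APIs.

```python
import html


class SafeHtml:
    """HTML that is safe to render because of how it was constructed."""

    _key = object()  # private token: only the helpers below can construct instances

    def __init__(self, value: str, key: object = None):
        if key is not SafeHtml._key:
            raise TypeError("Build SafeHtml with from_text() or element()")
        self._value = value

    @classmethod
    def from_text(cls, text: str) -> "SafeHtml":
        # Untrusted text is always escaped; there is no unescaped path.
        return cls(html.escape(text), cls._key)

    @classmethod
    def element(cls, tag: str, child: "SafeHtml") -> "SafeHtml":
        if not tag.isalnum():
            raise ValueError("tag must be alphanumeric")
        if not isinstance(child, SafeHtml):
            raise TypeError("child must be SafeHtml")
        return cls(f"<{tag}>{child._value}</{tag}>", cls._key)

    def __str__(self) -> str:
        return self._value


def render(fragment: SafeHtml) -> str:
    """The only output path: raw strings are rejected outright."""
    if not isinstance(fragment, SafeHtml):
        raise TypeError("render() accepts only SafeHtml")
    return str(fragment)


page = SafeHtml.element("p", SafeHtml.from_text("<script>alert(1)</script>"))
print(render(page))
```

A developer who passes a raw string to `render` gets an immediate error instead of shipping an injection bug, so the safe path is also the only path.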
Eliminate toil
Many enterprises don’t have a name for toil. At Google, it’s a core part of our vocabulary, guiding the way we design, develop, and deploy security solutions. Much of our security work draws on key practices from Site Reliability Engineering (SRE), such as managing toil: the manual, repetitive, automatable work that grows linearly with a service.
In modern environments, the concept of toil is particularly relevant: security operations teams face a growing number of assets, more complexity, and a nearly endless deluge of alerts. We’re always looking for points where we can offload the cognitive demands of security and introduce AI and automation to reduce the manual processes that tend to overburden teams.
One of the ways we do this is to ground our designs in security invariants: properties we want to guarantee will hold for a system, even in the event of an attack or a user mistake. Our scale prevents us from examining and understanding every piece of every system, so this pattern of thinking lets us proactively focus on enforcing foundational rules in our overall architecture that strengthen our security.
Doing so leads us to provide consistent control points that everyone can rely on, which improves security, benefits operations, and enhances the overall reliability of our systems.
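To make the idea of invariants enforced at a single control point concrete, here is a small Python sketch. The `Service` fields, the two invariants, and the `deploy` function are assumptions invented for illustration, not a description of Google's internal systems.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Service:
    # Hypothetical service description, for illustration only.
    name: str
    encrypts_in_transit: bool
    public: bool
    requires_auth: bool


Invariant = Callable[[Service], bool]

# Each invariant is a property that must hold for every service, whoever wrote it.
INVARIANTS: dict[str, Invariant] = {
    "traffic is encrypted in transit": lambda s: s.encrypts_in_transit,
    "public services require authentication": lambda s: (not s.public) or s.requires_auth,
}


def violations(service: Service) -> list[str]:
    """Return the names of all invariants the service breaks."""
    return [name for name, holds in INVARIANTS.items() if not holds(service)]


def deploy(service: Service) -> str:
    """Single control point: nothing ships unless every invariant holds."""
    failed = violations(service)
    if failed:
        raise PermissionError(f"{service.name} blocked: " + "; ".join(failed))
    return f"{service.name} deployed"


print(deploy(Service("billing", True, True, True)))
```

Because every deployment passes through `deploy`, reviewers only need to trust the invariants and the control point, not inspect each service individually.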
We also adopt a security as code (SaC) approach, which uses automation and scripting to create, manage, and enforce security controls and policies. SaC helps to eliminate many of the risks, inconsistencies, and inefficiencies that often occur in manual activities, advancing a more consistent, repeatable, and scalable approach to security.
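A security-as-code workflow can be sketched in a few lines of Python: the policy lives as reviewable data, and scripts detect and correct drift from it. The `POLICY` settings and the `drift` and `enforce` helpers are hypothetical examples, assuming resources are described as plain dictionaries.

```python
# Policy expressed as data, so it can be versioned and reviewed like any other code.
POLICY = {
    "public_access": False,
    "encryption": "enabled",
    "logging": "enabled",
}


def drift(resource: dict) -> dict:
    """Return the current values of settings that differ from POLICY."""
    return {
        key: resource.get(key)
        for key, wanted in POLICY.items()
        if resource.get(key) != wanted
    }


def enforce(resource: dict) -> dict:
    """Return a copy of the resource with every policy setting applied."""
    return {**resource, **POLICY}


bucket = {"name": "logs", "public_access": True, "encryption": "enabled"}
print(drift(bucket))
print(drift(enforce(bucket)))
```

Running the same check against every resource on a schedule gives the consistency and repeatability that manual reviews struggle to provide.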
With the recent developments in generative AI, we see many opportunities to transform the way we “do” security, making it easier for all practitioners — even those who aren’t security specialists — to understand and manage security. Already, we are investing in gen AI capabilities that have helped us accelerate many general security tasks, including incident summaries, basic analysis across multiple tools, and code analysis.
Take security culture seriously
All teams, even well-funded ones, suffer from some amount of degradation and burnout, especially as they scale. To help us grow, we have done our best to make security a first-class engineering discipline, not a dark corner of the organization that only certain people can enter.
We believe that security is everybody’s responsibility, a commitment that is upheld and driven by our security, engineering, and executive leadership. This mindset has given rise to many incredible teams that people want to join, such as the Google Threat Intelligence Group (GTIG), and it has helped us attract talent from across the organization and the industry.
Another important aspect of our security culture is that we maintain the same bottom-up engineering approach that helped put Google on the map. Good ideas have always had currency at Google, and they find their way to the top, even if they aren’t a management mandate.
This core tenet remains true in the way our security teams work and approach solving problems. Ultimately, getting the right people into these roles with the right skills and allowing them to freely share their ideas has enabled us to flourish.
This article includes insights from the Cloud Security Podcast episode “How Google Does Security Programs at Scale: CISO Insights.”