
3 key lessons from 25 years of warehouse scale computing

March 12, 2025
Parthasarathy Ranganathan

VP, Engineering Fellow

Urs Hölzle

SVP, Cloud Infrastructure

These lessons remain crucial for technology leaders as AI and cloud computing demands continue to grow.

Editor's note: In 1998, Google already faced significant scaling challenges. Successful web search would require enormous amounts of computing power and storage, far beyond what even the most powerful single computers of that era could provide. Google’s solution was to invent massive data centers housing thousands of interconnected servers that effectively function as one supercomputer.
 
Today, we know this approach as "warehouse-scale computing" (WSC). Now, 25 years later, WSCs form the backbone of all hyperscale and cloud computing, powering everything from Gmail and YouTube to the AI models that are driving innovation across industries around the world today.
 
To commemorate this milestone, Google leaders Parthasarathy Ranganathan and Urs Hölzle have published a retrospective on Google's WSC journey. Their in-depth article chronicles the significant technological challenges Google faced over two and a half decades of dramatically increasing scale and identifies ten enduring lessons for the future. Here are three key lessons they share that can support technology and business leaders in 2025.

3 key lessons

Landings over launches

One of the most powerful WSC lessons has nothing to do with servers or software; it is about how to measure success. Too often, organizations focus on launches - splashy product releases or announcements. But what matters most are landings - the concrete, measurable impact on users and customers.

It's far more important to focus on what it means to land a new product or technology. While picking landing metrics may not be easy, forcing that decision to be made early is essential to success. The landing is the 'why' of the project.

The relentless growth of WSC's complexity and scale has highlighted the critical importance of choosing the right goals and measuring them well. Successive iterations of WSC design have required leaders to integrate the needs for performance, cost effectiveness, reliability, manageability, and security. With many priorities, a team can easily be distracted by their launches while deprioritizing their key results. Orienting teams around clearly defined outcomes, or landings, enables teams to continue adapting to a changing technological environment.

The power of roofshots

Ask a technologist how to achieve 10X gains, and they may describe an ambitious moonshot - a radical reimagining that unlocks a quantum leap in performance. Google has a celebrated history of ambitious moonshot projects, from self-driving cars (Waymo) to internet-beaming balloons (Loon), delivery drones (Wing), and more. These efforts reflect our commitment to innovation and pushing technological boundaries. But while moonshots have their place, there’s another powerful approach that cannot be neglected: roofshots.

Over the past 25 years, we have achieved several orders of magnitude of improvement, but many of those gains have been the result of roofshots: the relentless, sustained pursuit of smaller (1.3-2X) opportunities. A sequence of roofshots can produce both quick returns and sustained transformative results.

It's a philosophy of persistent, incremental progress - climbing towards towering goals, one rung at a time. Roofshots powered Google's breakthroughs in energy efficiency, server utilization, cost reduction, and more. Compounded over time, these advances completely redefined the capabilities of WSCs. As the late Luiz Barroso said, “We choose to go to the roof not because it is glamorous, but because it is right there!”
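The compounding behind roofshots can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes each roofshot delivers the same multiplicative gain and that gains compound independently, which real projects will not do exactly.

```python
import math


def compounded_gain(step_gain: float, count: int) -> float:
    """Total improvement after `count` roofshots of `step_gain` each (assumes gains multiply)."""
    return step_gain ** count


def roofshots_needed(target_gain: float, step_gain: float) -> int:
    """Smallest number of equal roofshots whose compounded gain reaches `target_gain`."""
    if step_gain <= 1:
        raise ValueError("step_gain must be greater than 1")
    if target_gain <= 1:
        return 0
    # Solve step_gain ** n >= target_gain for n.
    n = math.ceil(math.log(target_gain) / math.log(step_gain))
    # Correct for floating-point rounding in either direction.
    while compounded_gain(step_gain, n) < target_gain:
        n += 1
    while n > 0 and compounded_gain(step_gain, n - 1) >= target_gain:
        n -= 1
    return n


if __name__ == "__main__":
    # The 1.3X and 2X step sizes come from the range quoted above; 10X is the moonshot-sized goal.
    for step in (1.3, 2.0):
        n = roofshots_needed(10, step)
        print(f"{step}X roofshots needed for 10X: {n} (compounds to {compounded_gain(step, n):.2f}X)")
```

Under these assumptions, a 10X gain takes nine roofshots at the low end of the range (1.3X each) and four at the high end (2X each).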

The primacy of security

Perhaps the most urgent lesson from Google's WSC journey is the absolute criticality of security. As WSC has become a foundational element of global technology infrastructure, it has come under unrelenting attack from ever more sophisticated adversaries.

To defend against nation-state actors, much deeper defenses are required. Servers must have a secure, separate silicon root of trust to validate and protect firmware and operating systems; all data must be encrypted; systems must assume zero trust; important actions or access must require multi-party authorization; all production code must be reviewed and have verifiable provenance; and defenses (including physical security) must be regularly tested by highly skilled Red Teams. In the modern world, ignore security at your own peril. Only the paranoid survive.
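As one illustration of the practices listed above, multi-party authorization can be reduced to a simple rule: a sensitive action proceeds only when enough people other than the requester have approved it. The following is a minimal sketch under that assumption. The function name, parameters, and the default threshold of two approvers are hypothetical, and it does not describe Google's actual systems.

```python
from typing import Iterable


def is_authorized(requester: str, approvers: Iterable[str], required_approvals: int = 2) -> bool:
    """Return True only if enough distinct people other than the requester approved.

    Hypothetical policy check: duplicate approvals count once, and the
    requester cannot approve their own request.
    """
    independent_approvers = set(approvers) - {requester}
    return len(independent_approvers) >= required_approvals


if __name__ == "__main__":
    # Self-approval does not count, so one independent approver is not enough here.
    print(is_authorized("alice", ["alice", "bob"]))
    # Two distinct approvers other than the requester satisfy the default threshold.
    print(is_authorized("alice", ["bob", "carol"]))
```

A real deployment would also need to authenticate each approver and log the decision; the sketch covers only the counting rule.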

As WSCs become ever more enmeshed in societal infrastructure, the transparency of open-source hardware and software will be an essential contributor to security. By enabling broad community inspection and collaboration, open-source architectures help surface vulnerabilities and propagate best practices. This is a major reason that Google supports open-source silicon root of trust designs like Titan and organizations like the Open Compute Project.

Looking ahead, security will only grow in importance. Geopolitical frictions are driving new requirements around data sovereignty and residency. Trust in the integrity of foundational systems will be paramount. And as physical and digital worlds intertwine, securing everything from factory robots to autonomous vehicles will be critical.

Looking forward

As AI/ML becomes an increasingly important part of life, the scale of needed computing is likely to continue growing. The challenges and opportunities for innovation in WSC are similarly huge. Advancing AI may soon be able to help design WSCs themselves, opening whole new technical frontiers.

But amidst all this technological progress, Google's 25 years of experience in WSC illuminates three timeless lessons: (1) success demands clear goals pursued to their conclusion (landings, not launches); (2) relentless pursuit of operational excellence and incremental improvement can deliver big results; and (3) without deep focus on security, nothing else will work.

After 25 years, warehouse-scale computing has traveled a remarkable path from concept to societal bedrock. The best of WSC is yet to come. If the next 25 years are anything like the last, we are in for an exhilarating ride.
