Google Cloud

A look inside Google’s Data Center Networks

June 17, 2015
Amin Vahdat

VP/GM, ML, Systems, and Cloud AI

Google has long been a pioneer in distributed computing and data processing, from Google File System to MapReduce, Bigtable, and Borg. From the beginning, we’ve known that great computing infrastructure like this requires great datacenter networking technology. But when Google was getting started, no one made a datacenter network that could meet our distributed computing requirements.

So, for the past decade, we have been building our own network hardware and software to connect all of the servers in our datacenters together, powering our distributed computing and storage systems. Now, we have opened up this powerful and transformative infrastructure for use by external developers through Google Cloud Platform.

At the 2015 Open Network Summit, we revealed for the first time the details of five generations of our in-house network technology. From Firehose, our first in-house datacenter network, ten years ago to our latest-generation Jupiter network, we’ve increased the capacity of a single datacenter network more than 100x. Our current generation, Jupiter fabrics, can deliver more than 1 Petabit/sec of total bisection bandwidth. To put this in perspective, such capacity would be enough for 100,000 servers to exchange information at 10Gb/s each, or enough to read the entire scanned contents of the Library of Congress in less than 1/10th of a second.
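The arithmetic behind that comparison is easy to check. The short Python sketch below multiplies out the figures quoted above (100,000 servers at 10 Gb/s each) and, as a rough illustration, works out how much data a 1 Petabit/sec fabric moves in a tenth of a second. The function names are made up for this example, and the 12.5 TB result is just what the bandwidth implies, not a statement about the actual size of any collection.

```python
# Back-of-the-envelope check of the bandwidth figures quoted in the post.
# Function names here are illustrative, not part of any Google tooling.

BITS_PER_GIGABIT = 10**9
BITS_PER_PETABIT = 10**15
BYTES_PER_TERABYTE = 10**12


def aggregate_bandwidth_pbps(servers: int, per_server_gbps: int) -> float:
    """Total bandwidth in Petabit/sec if every server sends at full rate."""
    total_bps = servers * per_server_gbps * BITS_PER_GIGABIT
    return total_bps / BITS_PER_PETABIT


def terabytes_moved(total_bps: int, milliseconds: int) -> float:
    """Terabytes transferred at total_bps over the given number of milliseconds."""
    bits = total_bps * milliseconds // 1000
    return bits / 8 / BYTES_PER_TERABYTE


if __name__ == "__main__":
    # 100,000 servers x 10 Gb/s each
    print(aggregate_bandwidth_pbps(100_000, 10), "Pb/s")
    # Data moved by a 1 Pb/s fabric in 1/10th of a second
    print(terabytes_moved(BITS_PER_PETABIT, 100), "TB")
```

In other words, 100,000 servers at 10 Gb/s add up to exactly 1 Petabit/sec, and at that rate roughly 12.5 TB crosses the fabric every tenth of a second.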

https://storage.googleapis.com/gweb-cloudblog-publish/images/Jupiter_rackam6r.max-400x400.JPEG

We used three key principles in designing our datacenter networks:

  • We arrange our network around a Clos topology, a network configuration in which a collection of smaller (cheaper) switches is arranged to provide the properties of a much larger logical switch.

  • We use a centralized software control stack to manage thousands of switches within the data center, making them effectively act as one large fabric.

  • We build our own software and hardware using silicon from vendors, relying less on standard Internet protocols and more on custom protocols tailored to the data center.
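To make the first principle concrete, here is a minimal Python sketch of how a Clos-style network scales when built from identical small switches. It uses the textbook three-tier "fat-tree" arrangement of k-port switches as an assumption; it is not a description of Firehose, Jupiter, or any other Google fabric, and the `FatTree` and `size_fat_tree` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FatTree:
    """Switch and host counts for a textbook three-tier fat-tree (a folded Clos)."""
    ports_per_switch: int
    pods: int
    edge_switches: int
    aggregation_switches: int
    core_switches: int
    hosts: int


def size_fat_tree(k: int) -> FatTree:
    """Size a fat-tree built only from identical k-port switches.

    Assumed layout (textbook model, not Google's design):
      - k pods, each with k/2 edge and k/2 aggregation switches
      - each edge switch uses k/2 ports for hosts and k/2 for uplinks
      - (k/2)^2 core switches connect the pods
    """
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    return FatTree(
        ports_per_switch=k,
        pods=k,
        edge_switches=k * half,
        aggregation_switches=k * half,
        core_switches=half * half,
        hosts=k * half * half,  # k pods * k/2 edge switches * k/2 hosts each
    )


if __name__ == "__main__":
    for k in (4, 48):
        print(size_fat_tree(k))
```

Under these assumptions, 48-port switches are enough to connect 27,648 hosts at full bisection bandwidth, which illustrates how many small, inexpensive switches can stand in for one very large logical switch.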

Taken together, these principles give our network control stack more in common with Google’s distributed computing architectures than with traditional router-centric Internet protocols. Some might even say that we’ve been deploying and enjoying the benefits of Software Defined Networking (SDN) at Google for a decade. A few years ago, we revealed how SDN has been powering Google’s datacenter WAN, B4, one of the world’s biggest WANs. Last year, we showed the details of GCP’s SDN network virtualization stack, Andromeda. In fact, the architectural ideas for both of these systems come from our early work in datacenter networking.

Building great data center networks is not just about building great hardware and software. It’s about partnering with the world’s best network engineering and operations team from day one. Our approach to networking fundamentally changes the organization of the network’s data, control, and management planes. Such a fundamental shift does not come without some bumps, but our operations team has more than met the challenge. We’ve deployed and redeployed multiple generations of our network across our planetary-scale infrastructure to keep up with the bandwidth needs of our distributed systems.

Putting all of this together, our datacenter networks deliver unprecedented speed at the scale of entire buildings. They are built for modularity, constantly upgraded to meet the insatiable bandwidth demands of the latest generation of our servers. They are managed for availability, meeting the uptime requirements of some of the most demanding Internet services and customers. Most importantly, our datacenter networks are shared infrastructure. This means that the same networks that power all of Google’s internal infrastructure and services also power Google Cloud Platform. We are most excited about opening this capability up to developers across the world so that the next great Internet service or platform can leverage world-class network infrastructure without having to invent it.
