
The past, present and future of custom compute at Google

March 22, 2021

Amin Vahdat

VP/GM, ML, Systems, and Cloud AI


Since our inception more than two decades ago, we have designed and built some of the world’s largest and most efficient computing systems to meet the needs of our customers and users. Custom chips are one way to boost performance and efficiency now that Moore’s Law no longer provides rapid improvements for everyone. Today, we are doubling down on this approach.

To put our future vision for computing in context, let’s briefly look back at history. In 2015, we introduced the Tensor Processing Unit (TPU) to customers. Without TPUs, offering many of our services, such as real-time voice search, photo object recognition, and interactive language translation, simply would not be possible. In 2018, we launched Video Coding Units (VCUs) to enable video distribution across a range of formats and client requirements, supporting the rapidly growing demand for real-time video communication scalably and effectively. In 2019, we unveiled OpenTitan, the first open-source silicon root-of-trust project. We’ve also developed custom hardware ranging from SSDs and hard drives to network switches and network interface cards, often in deep collaboration with external partners.

The future of cloud infrastructure is bright, and it’s changing fast. As we continue to work to meet computing demands from around the world, today we are thrilled to welcome Uri Frank as our VP of Engineering for server chip design. Uri brings nearly 25 years of custom CPU design and delivery experience, and will help us build a world-class team in Israel. We’ve long looked to Israel for novel technologies including Waze, Call Screen, flood forecasting, high-impact features in Search, and Velostrata’s cloud migration tools, and we look forward to growing our presence in this global innovation hub.

Compute at Google is at an important inflection point. To date, the motherboard has been our integration point, where we compose CPUs, networking, storage devices, custom accelerators, and memory, all from different vendors, into an optimized system. But that’s no longer sufficient: to gain higher performance and to use less power, our workloads demand even deeper integration into the underlying hardware.

Instead of integrating components on a motherboard where they are separated by inches of wires, we are turning to “Systems on Chip” (SoC) designs where multiple functions sit on the same chip, or on multiple chips inside one package. In other words, the SoC is the new motherboard.

On an SoC, the latency and bandwidth between different components can be orders of magnitude better, with greatly reduced power and cost compared to composing individual ASICs on a motherboard. Just like on a motherboard, individual functional units (such as CPUs, TPUs, video transcoding, encryption, compression, remote communication, secure data summarization, and more) come from different sources. We buy where it makes sense, build it ourselves where we have to, and aim to build ecosystems that benefit the entire industry.

Together with our global ecosystem of partners, we look forward to continuing to innovate at the leading edge of compute infrastructure, delivering the next generation of capabilities that are not available elsewhere, and creating fertile ground for the next wave of yet-to-be-imagined applications and services.
