Not sure if TPUs are the right fit? Learn about when to use GPUs or CPUs on Compute Engine instances to run your machine learning workloads.
Overview
A GPU is a specialized processor originally designed for manipulating computer graphics. Its parallel structure makes it ideal for algorithms that process the large blocks of data commonly found in AI workloads.
A TPU is an application-specific integrated circuit (ASIC) designed by Google for neural networks. TPUs possess specialized features, such as the matrix multiply unit (MXU) and a proprietary interconnect topology, that make them ideal for accelerating AI training and inference.
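To see why a dedicated matrix multiply unit matters, note that the core operation of a neural network layer is a multiply-accumulate over matrices. The sketch below is a naive pure-Python illustration of that pattern, not TPU code; an MXU performs the same multiply-accumulate steps in hardware over whole tiles at once.

```python
def matmul(a, b):
    """Naive matrix multiply: the multiply-accumulate pattern
    that an MXU executes in bulk, in hardware."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # one multiply-accumulate step
            out[i][j] = acc
    return out

# A dense layer is essentially activations @ weights:
activations = [[1, 2], [3, 4]]
weights = [[5, 6], [7, 8]]
print(matmul(activations, weights))  # [[19, 22], [43, 50]]
```

Because every output element is an independent sum of products, the work parallelizes naturally, which is what both GPUs and TPUs exploit.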
Cloud TPU versions
Cloud TPU version | Description | Availability
---|---|---
Cloud TPU v5e | The most efficient, versatile, and scalable Cloud TPU. | Generally available in North America (us-east5 and us-west4 regions)
Cloud TPU v4 | The most powerful Cloud TPU for training AI models. | Available in the us-central2 region
How It Works
Get an inside look at the magic of Google Cloud TPUs, including a rare view inside the data centers where it all happens. Customers use Cloud TPUs to run some of the world's largest AI workloads, and that power comes from much more than just a chip. In this video, take a look at the components of the TPU system, including data center networking, optical circuit switches, water cooling systems, biometric security verification, and more.
Common Uses
Cloud TPU Multislice training is a full-stack technology that enables fast, easy, and reliable large-scale AI model training across tens of thousands of TPU chips.
Cloud TPU v5e enables high-performance and cost-effective inference for a wide range of AI workloads, including the latest LLMs and Gen AI models. TPU v5e delivers up to 2.5x more throughput performance per dollar and up to 1.7x speedup over Cloud TPU v4. Each TPU v5e chip provides up to 393 trillion int8 operations per second, allowing complex models to make fast predictions. A TPU v5e pod delivers up to 100 quadrillion int8 operations per second, or 100 petaOps of compute power.
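The pod-level figure follows directly from the per-chip figure. As a back-of-the-envelope check, assuming a full v5e pod contains 256 chips (the pod size stated in Google's v5e documentation, noted here as an assumption), the per-chip peak multiplies out to roughly 100 petaOps:

```python
# Back-of-the-envelope check of the pod-level compute figure.
# Assumption: a full Cloud TPU v5e pod has 256 chips.
chips_per_pod = 256
int8_ops_per_chip = 393e12  # 393 trillion int8 operations per second

pod_ops = chips_per_pod * int8_ops_per_chip
print(f"{pod_ops / 1e15:.1f} petaOps")  # ~100.6 petaOps
```

256 × 393 trillion ≈ 100.6 quadrillion int8 operations per second, matching the quoted "100 petaOps" pod figure.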