Cloud Tensor Processing Units (TPUs)
Accelerate AI development with Google Cloud TPUs
Not sure if TPUs are the right fit? Learn about when to use GPUs or CPUs on Compute Engine instances to run your machine learning workloads.
Overview
What is a Tensor Processing Unit (TPU)?
What are the advantages of Cloud TPUs?
When to use Cloud TPUs?
How are Cloud TPUs different from GPUs?
A GPU is a specialized processor originally designed for manipulating computer graphics. Its parallel structure makes it well suited to algorithms that process the large blocks of data commonly found in AI workloads.
A TPU is an application-specific integrated circuit (ASIC) designed by Google for neural networks. TPUs have specialized features, such as the matrix multiply unit (MXU) and a proprietary interconnect topology, that make them ideal for accelerating AI training and inference.
Cloud TPU versions
Cloud TPU version | Description | Availability |
---|---|---|
Cloud TPU v5e | The most efficient, versatile, and scalable Cloud TPU. | Cloud TPU v5e will be available in North America (US West/Central/East regions), EMEA (Netherlands), and APAC (Singapore). |
Cloud TPU v4 | The most powerful Cloud TPU for training AI models. | Cloud TPU v4 is available in the us-central2 region. |
How It Works
Get an inside look at the magic of Google Cloud TPUs, including a rare inside view of the data centers where it all happens. Customers use Cloud TPUs to run some of the world's largest AI workloads, and that power comes from much more than just a chip. In this video, take a look at the components of the TPU system, including data center networking, optical circuit switches, water cooling systems, biometric security verification, and more.
Common Uses
Run large-scale AI training workloads
Cost-efficient scaling with Cloud TPU Multislice
TPU v5e provides up to 2x higher training performance per dollar for LLMs and Gen AI models compared to TPU v4. Multislice technology lets users scale AI models beyond a single TPU pod, training on tens of thousands of Cloud TPU chips, on both TPU v5e and TPU v4. With Multislice, developers can use the same XLA programming model to scale workloads over inter-chip interconnect (ICI) within a single pod, or across pods over the data center network (DCN).
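The split between ICI and DCN can be pictured with a toy model. The sketch below is plain Python, not the XLA or Multislice API: the pod size of 256 chips, the flat chip numbering, and the helper names `locate` and `link_between` are all hypothetical, chosen only to illustrate which network two chips would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipLocation:
    pod_id: int    # which pod the chip belongs to
    local_id: int  # position of the chip inside its pod


def locate(global_chip_id: int, chips_per_pod: int) -> ChipLocation:
    """Map a flat chip index onto (pod, position within pod)."""
    pod_id, local_id = divmod(global_chip_id, chips_per_pod)
    return ChipLocation(pod_id, local_id)


def link_between(a: int, b: int, chips_per_pod: int) -> str:
    """Return the network two chips would communicate over in this toy model."""
    same_pod = locate(a, chips_per_pod).pod_id == locate(b, chips_per_pod).pod_id
    return "ICI" if same_pod else "DCN"


# Hypothetical setup: 4 pods of 256 chips each (an assumed pod size).
CHIPS_PER_POD = 256
print(locate(300, CHIPS_PER_POD))             # chip 300 falls in pod 1
print(link_between(0, 255, CHIPS_PER_POD))    # both chips in pod 0
print(link_between(255, 256, CHIPS_PER_POD))  # adjacent ids, different pods
```

In this picture, traffic between chips in the same pod stays on ICI, and only traffic that crosses a pod boundary travels over DCN.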
Fine-tune foundational AI models
Serve large-scale AI inference workloads
Maximize performance/$ with AI infrastructure that scales
Cloud TPU v5e enables high-performance and cost-effective inference for a wide range of AI workloads, including the latest LLMs and Gen AI models. TPU v5e delivers up to 2.5x more throughput performance per dollar and up to 1.7x speedup over Cloud TPU v4. Each TPU v5e chip provides up to 393 trillion int8 operations per second, allowing complex models to make fast predictions. A TPU v5e pod delivers up to 100 quadrillion int8 operations per second, or 100 petaOps of compute power.
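As a quick sanity check, the per-chip and per-pod figures can be related with simple arithmetic. The sketch below assumes a full TPU v5e pod has 256 chips; that pod size is not stated in the text above and is labeled as an assumption in the code. The function names are illustrative.

```python
# Per-chip and per-pod int8 throughput figures quoted in the text above.
CHIP_INT8_OPS_PER_SEC = 393e12  # 393 trillion int8 ops/s per TPU v5e chip
POD_INT8_OPS_PER_SEC = 100e15   # 100 quadrillion int8 ops/s per TPU v5e pod

# ASSUMPTION (not stated in the text above): a full TPU v5e pod has 256 chips.
ASSUMED_CHIPS_PER_POD = 256


def implied_chips_per_pod(pod_ops: float, chip_ops: float) -> float:
    """Chips needed to reach the pod figure at the peak per-chip rate."""
    return pod_ops / chip_ops


def pod_peak_peta_ops(chips: int, chip_ops: float) -> float:
    """Aggregate peak throughput of `chips` chips, in petaOps (1e15 ops/s)."""
    return chips * chip_ops / 1e15


implied = implied_chips_per_pod(POD_INT8_OPS_PER_SEC, CHIP_INT8_OPS_PER_SEC)
pod_peta = pod_peak_peta_ops(ASSUMED_CHIPS_PER_POD, CHIP_INT8_OPS_PER_SEC)

print(f"Chips implied by the quoted figures: {implied:.1f}")
print(f"Peak for {ASSUMED_CHIPS_PER_POD} chips: {pod_peta:.1f} petaOps")
```

Under that assumption, the quoted 100 petaOps is a rounded aggregate of the per-chip peak. These are peak rates, so sustained throughput on a real model will generally be lower.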
Cloud TPU in GKE
Effortless scaling with GKE
Combine the power of Cloud TPUs with the flexibility and scalability of GKE to build and deploy machine learning models faster and more easily than ever before. With Cloud TPUs available in GKE, you can now have a single consistent operations environment for all your workloads, standardizing automated MLOps pipelines.
Cloud TPU in Vertex AI
Vertex AI Training & Predictions with Cloud TPUs
For the simplest way to develop AI models, you can deploy Cloud TPU v5e with Vertex AI, an end-to-end platform for building AI models on fully managed infrastructure that’s purpose-built for low-latency serving and high-performance training.
Pricing
Cloud TPU pricing
All Cloud TPU pricing is per chip-hour
Cloud TPU version | Evaluation price (USD) | 1-year commitment (USD) | 3-year commitment (USD) |
---|---|---|---|
Cloud TPU v5e | Starting at $1.200 per chip-hour | Starting at $0.8400 per chip-hour | Starting at $0.5400 per chip-hour |
Cloud TPU v4 | Starting at $3.2200 per chip-hour | Starting at $2.0286 per chip-hour | Starting at $1.4490 per chip-hour |
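Because pricing is per chip-hour, a rough estimate is the price multiplied by the number of chips and the number of hours. The sketch below uses the "starting at" prices listed above; the 8-chip, 24-hour workload and the function names are hypothetical, and actual prices may be higher than these starting figures.

```python
# Starting prices per chip-hour (USD), copied from the pricing listed above.
PRICES = {
    "v5e": {"evaluation": 1.2000, "1-year": 0.8400, "3-year": 0.5400},
    "v4": {"evaluation": 3.2200, "1-year": 2.0286, "3-year": 1.4490},
}


def estimate_cost(version: str, plan: str, chips: int, hours: float) -> float:
    """Estimated cost in USD: price per chip-hour x chips x hours."""
    return PRICES[version][plan] * chips * hours


def discount_vs_evaluation(version: str, plan: str) -> float:
    """Fractional saving of a plan relative to the evaluation price."""
    base = PRICES[version]["evaluation"]
    return 1.0 - PRICES[version][plan] / base


# Hypothetical workload: 8 chips running for 24 hours.
for version in PRICES:
    for plan in PRICES[version]:
        cost = estimate_cost(version, plan, chips=8, hours=24)
        saving = discount_vs_evaluation(version, plan)
        print(f"{version:>3} {plan:<10} ${cost:9.2f}  ({saving:.0%} below evaluation)")
```

At these starting prices, the 1-year and 3-year commitments work out to 30% and 55% below the evaluation price for TPU v5e, and 37% and 55% below for TPU v4.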