TPU VM images
When you create TPU resources, you pass the --version or --runtime-version
parameter, which specifies a TPU VM image. TPU VM images contain the Ubuntu
operating system, Docker, and other software required to run your code on TPUs.
This document provides guidance on selecting the appropriate TPU VM image when
you create Cloud TPUs.
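As a minimal sketch of where the image name goes, the following Python snippet assembles a command line that passes a TPU VM image through --version. It assumes the gcloud compute tpus tpu-vm create command, which this page does not name. The TPU name, zone, and accelerator type are placeholder values, and the helper function is hypothetical.

```python
import shlex


def tpu_vm_create_command(name, zone, accelerator_type, image):
    """Build the argv for creating a TPU VM with the given TPU VM image.

    Assumption: the command is `gcloud compute tpus tpu-vm create`.
    This function only builds the argument list; it does not run anything.
    """
    return [
        "gcloud", "compute", "tpus", "tpu-vm", "create", name,
        f"--zone={zone}",
        f"--accelerator-type={accelerator_type}",
        f"--version={image}",  # the TPU VM image described on this page
    ]


# Placeholder values, chosen only for illustration.
cmd = tpu_vm_create_command("my-tpu", "us-central1-a", "v3-8", "tpu-vm-tf-2.14.1")
print(shlex.join(cmd))
```

Printing the command rather than executing it lets you review the image name before any resources are created.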
PyTorch and JAX
Use the following common TPU VM base images for PyTorch and JAX, then install the framework you want to use.
- tpu-ubuntu2204-base (default)
- v2-alpha-tpuv6e (TPU v6e)
- v2-alpha-tpuv5 (TPU v5p)
- v2-alpha-tpuv5-lite (TPU v5e)
Refer to the quickstart documents for PyTorch/XLA and JAX for installation instructions.
TensorFlow
There are TPU VM images specific to each version of TensorFlow. The following TensorFlow versions are supported on Cloud TPUs:
- 2.18.0
- 2.17.1
- 2.17.0
- 2.16.2
- 2.16.1
- 2.15.1
- 2.15.0
- 2.14.1
- 2.14.0
- 2.13.1
- 2.13.0
- 2.12.1
- 2.12.0
- 2.11.1
- 2.11.0
- 2.10.1
- 2.10.0
- 2.9.3
- 2.9.1
- 2.8.4
- 2.8.3
- 2.8.0
- 2.7.4
- 2.7.3
The name of a TPU VM image is composed of tpu-vm-tf, the version of
TensorFlow, -pod if you are using a multihost TPU slice, and -pjrt if you are
using the PJRT API. Not all TPU versions support PJRT. See the following
sections for more information on how to specify a TPU VM image.
For more information on TensorFlow patch versions, see Supported TensorFlow patch versions.
For TensorFlow versions 2.15.0 and newer there are TPU VM image variants based on the device API (PJRT or stream executor) you are using.
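The naming scheme described above can be sketched as a small helper. The function name and its parameters are hypothetical. It only joins the parts in the order this page lists them and does not check whether the resulting image exists for a given TPU version.

```python
def tf_image_name(tf_version, multi_host=False, pjrt=False):
    """Compose a TensorFlow TPU VM image name from its parts.

    Order of parts: tpu-vm-tf, the TensorFlow version, then the optional
    "pod" (multihost slice) and "pjrt" (PJRT API) suffixes.
    """
    parts = ["tpu-vm-tf", tf_version]
    if multi_host:
        parts.append("pod")
    if pjrt:
        parts.append("pjrt")
    return "-".join(parts)


print(tf_image_name("2.16.1", multi_host=True, pjrt=True))  # tpu-vm-tf-2.16.1-pod-pjrt
print(tf_image_name("2.14.1"))                              # tpu-vm-tf-2.14.1
```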
Training on v6e, v5p, and v5e
TPU v6e, v5e, and v5p support TensorFlow 2.15.0 and newer. You specify the TPU VM
image using the form tpu-vm-tf-x.y.z-{pod}-pjrt, where x is the major
TensorFlow version, y is the minor version, and z is the patch version. Add pod
after the TensorFlow version if you are using a multi-host TPU. For example, if
you are using TensorFlow 2.16.1 on a multi-host TPU, use the
tpu-vm-tf-2.16.1-pod-pjrt TPU VM image. For other versions of TensorFlow,
replace 2.16.1 with the major, minor, and patch versions of TensorFlow you are
using. If you are using a single-host TPU, omit pod.
Serving on v6e and v5e
There are serving Docker images that contain all needed software requirements for serving with TensorFlow, PyTorch, and JAX. For more information, see Cloud TPU v5e inference introduction.
TPU v4
If you are using TPU v4 and TensorFlow 2.15.0 or newer, follow the instructions for training on v6e, v5p, and v5e. If you are using TensorFlow 2.10.0 or earlier, use a v4-specific TPU VM image:
TensorFlow version | Single-host TPU VM image | Multi-host TPU VM image |
---|---|---|
2.10.0 | tpu-vm-tf-2.10.0-v4 | tpu-vm-tf-2.10.0-pod-v4 |
2.9.3 | tpu-vm-tf-2.9.3-v4 | tpu-vm-tf-2.9.3-pod-v4 |
2.9.2 | tpu-vm-tf-2.9.2-v4 | tpu-vm-tf-2.9.2-pod-v4 |
2.9.1 | tpu-vm-tf-2.9.1-v4 | tpu-vm-tf-2.9.1-pod-v4 |
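The TPU v4 rules can be expressed as a short selection function. This is a sketch under stated assumptions: the function name is hypothetical, the v4-specific names are produced only for the versions listed in the table, and any version this section does not cover raises an error rather than guessing a name.

```python
# TensorFlow versions that have v4-specific images in the table above.
V4_TABLE_VERSIONS = {"2.10.0", "2.9.3", "2.9.2", "2.9.1"}


def v4_image(tf_version, multi_host=False):
    """Pick a TPU VM image name for TPU v4, following this section's rules."""
    pod = "-pod" if multi_host else ""
    if tf_version in V4_TABLE_VERSIONS:
        return f"tpu-vm-tf-{tf_version}{pod}-v4"
    version = tuple(int(part) for part in tf_version.split("."))
    if version >= (2, 15, 0):
        # Same form as training on v6e, v5p, and v5e.
        return f"tpu-vm-tf-{tf_version}{pod}-pjrt"
    raise ValueError(f"No TPU v4 image rule on this page for TensorFlow {tf_version}")


print(v4_image("2.9.3", multi_host=True))  # tpu-vm-tf-2.9.3-pod-v4
print(v4_image("2.16.1"))                  # tpu-vm-tf-2.16.1-pjrt
```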
TPU v2 and v3
If you are using TPU v2 or v3, use the TPU VM image that matches the version of
TensorFlow you are using. For example, if you are using TensorFlow 2.14.1, use
the tpu-vm-tf-2.14.1 TPU VM image. For other versions of TensorFlow, replace
2.14.1 with the TensorFlow version you are using. If you are using a multi-host
TPU, append -pod to the end of the image name, for example tpu-vm-tf-2.14.1-pod.
Beginning with TensorFlow 2.15.0, you must also specify a device API as
part of the image name. For example, if you are using TensorFlow 2.16.1
with the PJRT API, use the tpu-vm-tf-2.16.1-pjrt TPU VM image. If you are using
the stream executor API with the same version of TensorFlow, use the
tpu-vm-tf-2.16.1-se TPU VM image. TensorFlow versions older than 2.15.0
support only stream executor.
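The v2 and v3 rules, including the device API suffix, can be sketched as follows. The function name and parameters are hypothetical. Placing -pod before -se on multi-host TPUs is an assumption made by analogy with the -pod-pjrt form; this page shows no multi-host stream executor example.

```python
def v2_v3_image(tf_version, multi_host=False, device_api="se"):
    """Pick a TPU VM image name for TPU v2 or v3.

    device_api is "pjrt" or "se" (stream executor). It becomes part of the
    name only for TensorFlow 2.15.0 and newer.
    """
    if device_api not in ("pjrt", "se"):
        raise ValueError(f"Unknown device API: {device_api}")
    version = tuple(int(part) for part in tf_version.split("."))
    name = f"tpu-vm-tf-{tf_version}"
    if multi_host:
        name += "-pod"
    if version >= (2, 15, 0):
        # Assumption: the device API suffix follows -pod, as in -pod-pjrt.
        name += f"-{device_api}"
    elif device_api != "se":
        raise ValueError("TensorFlow versions older than 2.15.0 only support stream executor")
    return name


print(v2_v3_image("2.14.1", multi_host=True))       # tpu-vm-tf-2.14.1-pod
print(v2_v3_image("2.16.1", device_api="pjrt"))     # tpu-vm-tf-2.16.1-pjrt
print(v2_v3_image("2.16.1", device_api="se"))       # tpu-vm-tf-2.16.1-se
```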
TensorFlow PJRT support
Beginning with TensorFlow 2.15.0, you can use the PJRT interface for TensorFlow on TPU. PJRT features automatic device memory defragmentation and simplifies the integration of hardware with frameworks. For more information about PJRT, see PJRT: Simplifying ML Hardware and Framework Integration.
Accelerator | Feature | PJRT support | Stream executor support |
---|---|---|---|
TPU v2 - v4 | Dense compute (no TPU embedding API) | Yes | Yes |
TPU v2 - v4 | Dense compute API + TPU embedding API | No | Yes |
TPU v2 - v4 | tf.summary/tf.print with soft device placement | No | Yes |
TPU v5e | Dense compute (no TPU embedding API) | Yes | No |
TPU v5e | TPU embedding API | N/A | No |
TPU v5p | Dense compute (no TPU embedding API) | Yes | No |
TPU v5p | TPU embedding API | Yes | No |
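To check a combination programmatically, the support table can be transcribed into a lookup. The short keys and the function name are hypothetical labels chosen for this sketch. The N/A entry is stored as None and treated as not usable.

```python
# (accelerator, feature) -> (PJRT support, stream executor support)
# None stands for the N/A entry in the table.
DEVICE_API_SUPPORT = {
    ("v2-v4", "dense"): (True, True),
    ("v2-v4", "dense+embedding"): (False, True),
    ("v2-v4", "summary/print with soft device placement"): (False, True),
    ("v5e", "dense"): (True, False),
    ("v5e", "embedding"): (None, False),
    ("v5p", "dense"): (True, False),
    ("v5p", "embedding"): (True, False),
}


def usable_device_apis(accelerator, feature):
    """Return the device APIs marked Yes in the table for this combination."""
    pjrt, stream_executor = DEVICE_API_SUPPORT[(accelerator, feature)]
    apis = []
    if pjrt:
        apis.append("pjrt")
    if stream_executor:
        apis.append("se")
    return apis


print(usable_device_apis("v2-v4", "dense"))            # ['pjrt', 'se']
print(usable_device_apis("v2-v4", "dense+embedding"))  # ['se']
print(usable_device_apis("v5e", "embedding"))          # []
```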
Libtpu versions
TensorFlow TPU VM images contain a specific TensorFlow version and the corresponding libtpu library. If you are creating your own VM image, use the following TensorFlow versions and corresponding libtpu versions:
TensorFlow version | libtpu.so version |
---|---|
2.18.0 | 1.12.0 |
2.17.1 | 1.11.1 |
2.17.0 | 1.11.0 |
2.16.2 | 1.10.1 |
2.16.1 | 1.10.1 |
2.15.1 | 1.9.0 |
2.15.0 | 1.9.0 |
2.14.1 | 1.8.1 |
2.14.0 | 1.8.0 |
2.13.1 | 1.7.1 |
2.13.0 | 1.7.0 |
2.12.1 | 1.6.1 |
2.12.0 | 1.6.0 |
2.11.1 | 1.5.1 |
2.11.0 | 1.5.0 |
2.10.1 | 1.4.1 |
2.10.0 | 1.4.0 |
2.9.3 | 1.3.2 |
2.9.1 | 1.3.0 |
2.8.3 | 1.2.3 |
2.8.0 | 1.2.0 |
2.7.3 | 1.1.2 |
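If you script your own image builds, the table above can be kept as a lookup so that a mismatched pair fails early. This is a sketch; the dictionary and function names are hypothetical, and the values are copied from the table.

```python
# TensorFlow version -> libtpu.so version, transcribed from the table above.
LIBTPU_VERSIONS = {
    "2.18.0": "1.12.0", "2.17.1": "1.11.1", "2.17.0": "1.11.0",
    "2.16.2": "1.10.1", "2.16.1": "1.10.1", "2.15.1": "1.9.0",
    "2.15.0": "1.9.0", "2.14.1": "1.8.1", "2.14.0": "1.8.0",
    "2.13.1": "1.7.1", "2.13.0": "1.7.0", "2.12.1": "1.6.1",
    "2.12.0": "1.6.0", "2.11.1": "1.5.1", "2.11.0": "1.5.0",
    "2.10.1": "1.4.1", "2.10.0": "1.4.0", "2.9.3": "1.3.2",
    "2.9.1": "1.3.0", "2.8.3": "1.2.3", "2.8.0": "1.2.0",
    "2.7.3": "1.1.2",
}


def libtpu_for(tf_version):
    """Return the libtpu version listed for a TensorFlow version."""
    try:
        return LIBTPU_VERSIONS[tf_version]
    except KeyError:
        raise ValueError(f"No libtpu version listed for TensorFlow {tf_version}") from None


print(libtpu_for("2.16.1"))  # 1.10.1
```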
What's next
- Learn more about TPU architecture in the System Architecture page.
- See When to use TPUs to learn about the types of models that are well suited to Cloud TPU.