Supported TPU configurations
To determine the most effective TPU configuration for training your model, see the TPU hardware versions section in the System Architecture document. See the supported reference models page for a list of reference models supported by Cloud TPU.
This page lists the TPU types and the TPU runtime versions. When programming with TensorFlow frameworks, the runtime you use depends on whether you are using the TPU VM or TPU Node architecture. See the System Architecture page for the differences between the two VM architectures.
The following tables show the supported TPU hardware types.
TPU v4 configurations
TPU v4 configurations fall into two groups: those with topologies smaller than 64 chips (small topologies) and those with topologies of 64 chips or more (large topologies). The supported configurations for each group are described in the following sections.
Small v4 topologies
Cloud TPU supports the following TPU v4 slices smaller than 64 chips (a 4x4x4 cube). You can create these small v4 topologies using either their TensorCore-based name (for example, v4-32) or their topology (for example, 2x2x4):
Name (based on TensorCore count) | Number of chips | Topology |
---|---|---|
v4-8 | 4 | 2x2x1 |
v4-16 | 8 | 2x2x2 |
v4-32 | 16 | 2x2x4 |
v4-64 | 32 | 2x4x4 |
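The relationship between the columns of this table can be checked with a few lines of Python. The sketch below assumes, as the table rows imply, that each v4 chip contributes two TensorCores, so the name is v4- followed by twice the chip count. The function names are illustrative and are not part of any Cloud TPU API.

```python
from math import prod


def chips_in_topology(topology: str) -> int:
    """Return the chip count of a topology string such as '2x2x4'."""
    return prod(int(dim) for dim in topology.split("x"))


def v4_name(topology: str) -> str:
    """Derive the TensorCore-based name, assuming two TensorCores per v4 chip."""
    return f"v4-{2 * chips_in_topology(topology)}"


# Reproduce the rows of the small-topology table.
for topology in ("2x2x1", "2x2x2", "2x2x4", "2x4x4"):
    print(topology, chips_in_topology(topology), v4_name(topology))
```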
Large v4 topologies
TPU v4 slices are available in increments of 64 chips, with shapes that are multiples of 4 on all three dimensions. The dimensions must also be in increasing order. Several examples are shown in the following table. Some of these topologies are "custom" topologies that can only be launched using the topology API because they have the same number of chips as a more commonly used named topology.
Name (based on TensorCore count) | Number of chips | Topology |
---|---|---|
v4-128 | 64 | 4x4x4 |
v4-256 | 128 | 4x4x8 |
v4-512 | 256 | 4x8x8 |
See Topology API | 256 | 4x4x16 |
v4-1024 | 512 | 8x8x8 |
v4-1536 | 768 | 8x8x12 |
v4-2048 | 1024 | 8x8x16 |
See Topology API | 1024 | 4x16x16 |
v4-4096 | 2048 | 8x16x16 |
… | … | … |
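The shape rules for large slices can be expressed as a small validity check. This is a sketch under stated assumptions, not an official validator: it reads "increasing order" as non-decreasing, because 4x4x4 and 8x8x8 appear in the table, and the function name is hypothetical. It does not tell you whether a given shape is actually available in your project or zone.

```python
def is_valid_large_v4_topology(topology: str) -> bool:
    """Check a topology string such as '4x4x16' against the rules stated above."""
    try:
        dims = [int(dim) for dim in topology.split("x")]
    except ValueError:
        return False
    if len(dims) != 3:
        return False
    # Every dimension must be a positive multiple of 4.
    if any(dim <= 0 or dim % 4 != 0 for dim in dims):
        return False
    # Assumption: "increasing order" means the dimensions never decrease.
    if dims != sorted(dims):
        return False
    # The chip count must be a multiple of 64. This already follows from the
    # multiple-of-4 rule and is kept only to mirror the text.
    chips = dims[0] * dims[1] * dims[2]
    return chips % 64 == 0


print(is_valid_large_v4_topology("4x4x16"))  # shape listed in the table
print(is_valid_large_v4_topology("8x4x4"))   # dimensions out of order
```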
Topology API
To create Cloud TPU Pod slices with a custom topology, use the gcloud TPU API as follows:
$ gcloud alpha compute tpus tpu-vm create tpu-name \
--zone=us-central2-b \
--subnetwork=tpusubnet \
--type=v4 \
--topology=4x4x16 \
--version=runtime-version
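If you launch slices from a script, the same command can be assembled programmatically. The sketch below only builds the argument list and prints it; it uses exactly the flags shown in the command above. The function name and its default zone and subnetwork values are illustrative, taken from that example, and tpu-name and runtime-version remain placeholders.

```python
import shlex


def custom_topology_command(tpu_name, topology, runtime_version,
                            zone="us-central2-b", subnetwork="tpusubnet"):
    """Return the gcloud argument list for a v4 slice with a custom topology."""
    return [
        "gcloud", "alpha", "compute", "tpus", "tpu-vm", "create", tpu_name,
        f"--zone={zone}",
        f"--subnetwork={subnetwork}",
        "--type=v4",
        f"--topology={topology}",
        f"--version={runtime_version}",
    ]


# Placeholders from the command above; replace them with real values.
args = custom_topology_command("tpu-name", "4x4x16", "runtime-version")
print(shlex.join(args))
```

To run the command rather than print it, the list can be passed to subprocess.run(args, check=True) on a machine where gcloud is installed and authenticated.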
TPU v2 and v3 configurations
The following table lists the supported TPU v2 and v3 types:
TPU version | Support ends |
---|---|
v2-8 | (End date not yet set) |
v2-32 | (End date not yet set) |
v2-128 | (End date not yet set) |
v2-256 | (End date not yet set) |
v2-512 | (End date not yet set) |
v3-8 | (End date not yet set) |
v3-32 | (End date not yet set) |
v3-128 | (End date not yet set) |
v3-256 | (End date not yet set) |
v3-512 | (End date not yet set) |
v3-1024 | (End date not yet set) |
v3-2048 | (End date not yet set) |
TPU software versions
The version of TPU software you should use depends on the TPU architecture (TPU VM or TPU Node) and the ML framework you are using (TensorFlow, PyTorch, or JAX).
TPU VM
When you create a TPU VM, the latest version of TensorFlow is preinstalled on the TPU VM.
TensorFlow
Use the TPU software version that matches the version of TensorFlow with which your model was written. For example, if you are using TensorFlow 2.11.0, use the tpu-vm-tf-2.11.0 TPU software version. If you are using a TPU Pod, use tpu-vm-tf-2.11.0-pod. The currently supported TensorFlow TPU VM software versions are:
- tpu-vm-tf-2.11.0
- tpu-vm-tf-2.10.1
- tpu-vm-tf-2.10.0
- tpu-vm-tf-2.9.3
- tpu-vm-tf-2.9.1
- tpu-vm-tf-2.8.4
- tpu-vm-tf-2.8.3
- tpu-vm-tf-2.8.0
- tpu-vm-tf-2.7.4
- tpu-vm-tf-2.7.3
To specify a TPU VM Pod, add -pod to the TPU software version you want to use, for example, tpu-vm-tf-2.11.0-pod.
TPU VM with TPU v4
If you are training a model on TPU VM v4 with TensorFlow, use one of the v4 versions shown below.
- 2.10.0: tpu-vm-tf-2.10.0-v4, tpu-vm-tf-2.10.0-pod-v4
- 2.9.3: tpu-vm-tf-2.9.3-v4, tpu-vm-tf-2.9.3-pod-v4
- 2.9.2: tpu-vm-tf-2.9.2-v4, tpu-vm-tf-2.9.2-pod-v4
- 2.9.1: tpu-vm-tf-2.9.1-v4, tpu-vm-tf-2.9.1-pod-v4
Beginning with TensorFlow 2.10.1, there is only a single image and you do not need to specify -v4 with the version number.
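The naming rules for TensorFlow TPU VM software versions can be summarized in a short helper. This is a sketch based only on the names listed on this page: -pod is appended for TPU Pods, and -v4 is appended after it for TPU v4 with TensorFlow versions earlier than 2.10.1. The function name is hypothetical, and the helper does not check that the resulting version is actually supported.

```python
def tf_tpu_vm_version(tf_version: str, pod: bool = False, v4: bool = False) -> str:
    """Build a TPU VM software version name such as 'tpu-vm-tf-2.9.3-pod-v4'."""
    name = f"tpu-vm-tf-{tf_version}"
    if pod:
        name += "-pod"
    # Parse 'major.minor.patch' into integers so versions compare numerically
    # (as plain strings, '2.9.3' would sort after '2.10.1').
    parsed = tuple(int(part) for part in tf_version.split("."))
    # From 2.10.1 onward a single image serves all TPU versions.
    if v4 and parsed < (2, 10, 1):
        name += "-v4"
    return name


print(tf_tpu_vm_version("2.11.0", pod=True))
print(tf_tpu_vm_version("2.9.3", pod=True, v4=True))
```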
For more information on TensorFlow patch versions, see Supported TensorFlow patch versions.
TPU VMs are created with TensorFlow and the corresponding libtpu library preinstalled. If you are creating your own VM image, use the following TensorFlow versions and corresponding libtpu.so versions:
TensorFlow version | libtpu.so version |
---|---|
2.11.0 | 1.5.0 |
2.10.1 | 1.4.1 |
2.10.0 | 1.4.0 |
2.9.3 | 1.3.2 |
2.9.1 | 1.3.0 |
2.8.3 | 1.2.3 |
2.8.* | 1.2.0 |
2.7.3 | 1.1.2 |
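When building your own image, it can help to keep this table in code. The sketch below copies the rows verbatim and assumes that the 2.8.* row acts as a wildcard for 2.8 patch releases that have no row of their own, with the exact 2.8.3 row taking precedence. That reading, and the names used, are assumptions rather than documented behavior.

```python
from fnmatch import fnmatch

# Rows copied from the table above, in the same order.
LIBTPU_BY_TF_VERSION = [
    ("2.11.0", "1.5.0"),
    ("2.10.1", "1.4.1"),
    ("2.10.0", "1.4.0"),
    ("2.9.3", "1.3.2"),
    ("2.9.1", "1.3.0"),
    ("2.8.3", "1.2.3"),
    ("2.8.*", "1.2.0"),
    ("2.7.3", "1.1.2"),
]


def libtpu_version_for(tf_version: str):
    """Return the libtpu.so version for a TensorFlow version, or None if unlisted."""
    # The first matching row wins, so the exact 2.8.3 row is checked before 2.8.*.
    for pattern, libtpu in LIBTPU_BY_TF_VERSION:
        if fnmatch(tf_version, pattern):
            return libtpu
    return None


print(libtpu_version_for("2.11.0"))
print(libtpu_version_for("2.8.0"))
```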
PyTorch
Use the TPU software version that matches the version of PyTorch with which your model was written. For example, if you are using PyTorch 1.13, use the tpu-vm-pt-1.13 TPU software version for v2 and v3, or the tpu-vm-v4-pt-1.13 TPU software version for v4. The same TPU software version is used for TPU Pods (for example, v2-32, v3-128, v4-32). The currently supported TPU software versions are:
TPU v2/v3:
- tpu-vm-pt-1.13 (pytorch-1.13)
- tpu-vm-pt-1.12 (pytorch-1.12)
- tpu-vm-pt-1.11 (pytorch-1.11)
- tpu-vm-pt-1.10 (pytorch-1.10)
- v2-alpha (pytorch-1.8.1)
TPU v4:
- tpu-vm-v4-pt-1.13 (pytorch-1.13)
When you create a TPU VM, the latest version of PyTorch is preinstalled on the TPU VM. The correct version of libtpu.so is automatically installed when you install PyTorch.
To change the current PyTorch software version, see Changing PyTorch version.
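The PyTorch naming scheme for TPU VMs can be sketched the same way. The helper below covers only the versions listed above that follow the regular pattern; v2-alpha (pytorch-1.8.1) is left out because its name does not follow it. The function and constant names are hypothetical.

```python
# Versions taken from the lists above; v2-alpha is intentionally omitted.
SUPPORTED_PT_V2_V3 = {"1.13", "1.12", "1.11", "1.10"}
SUPPORTED_PT_V4 = {"1.13"}


def pt_tpu_vm_version(pt_version: str, tpu_generation: str) -> str:
    """Return the TPU VM software version for a PyTorch version and TPU generation."""
    if tpu_generation == "v4":
        supported, prefix = SUPPORTED_PT_V4, "tpu-vm-v4-pt-"
    elif tpu_generation in ("v2", "v3"):
        supported, prefix = SUPPORTED_PT_V2_V3, "tpu-vm-pt-"
    else:
        raise ValueError(f"unknown TPU generation: {tpu_generation}")
    if pt_version not in supported:
        raise ValueError(f"PyTorch {pt_version} is not listed for {tpu_generation}")
    return prefix + pt_version


print(pt_tpu_vm_version("1.13", "v4"))
print(pt_tpu_vm_version("1.12", "v3"))
```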
JAX
You must manually install JAX on your TPU VM, because there is no JAX-specific TPU software version. For v2 and v3 configurations, use the tpu-vm-base TPU software version. For v4 configurations, use tpu-vm-v4-base. The correct version of libtpu.so is automatically installed when you install JAX.
TPU Node
TensorFlow
Use the TPU software version that matches the version of TensorFlow with which your model was written. For example, if you are using TensorFlow 2.11.0, use the 2.11.0 TPU software version. The TensorFlow-specific TPU software versions are:
- 2.11.0
- 2.10.1
- 2.10.0
- 2.9.3
- 2.9.1
- 2.8.4
- 2.8.2
- 2.7.3
For more information on TensorFlow patch versions, see Supported TensorFlow patch versions.
When you create a TPU Node, the latest version of TensorFlow is preinstalled on the TPU Node.
PyTorch
Use the TPU software version that matches the version of PyTorch with which your model was written. For example, if you are using PyTorch 1.9, use the pytorch-1.9 software version.
The PyTorch-specific TPU software versions are:
- pytorch-1.13
- pytorch-1.12
- pytorch-1.11
- pytorch-1.10
- pytorch-1.9
- pytorch-1.8
- pytorch-1.7
- pytorch-1.6
- pytorch-nightly
When you create a TPU Node, the latest version of PyTorch is preinstalled on the TPU Node.
JAX
You must manually install JAX on your TPU VM, because there is no preinstalled JAX-specific TPU software version. You can use any of the software versions listed for TensorFlow.