Cloud TPU v2, Cloud TPU v2 Pod (beta), Cloud TPU v3, and Cloud TPU v3 Pod (beta) run the same TensorFlow software for training and evaluating models. A model that runs on one TPU version can run with no code changes on another. The main differences between Cloud TPU versions are performance and memory capacity.
A single Cloud TPU device, v2 or v3, consists of 8 cores: 2 cores per chip and 4 chips per device. The number following the version specifies the number of cores, for example, v2-8 for a single Cloud TPU device or v2-256 for a Cloud TPU Pod slice with 256 cores. A Cloud TPU v2 Pod (beta) consists of 64 TPU devices containing 256 TPU chips (512 cores). A Cloud TPU v3 Pod (beta) consists of 1024 TPU chips (2048 cores).
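The arithmetic above can be sketched as a small shell snippet. This is a minimal illustration, not a supported tool: it assumes the accelerator-type naming from this document (a `v<version>-<cores>` string, with 2 cores per chip and 8 cores per device).

```shell
# Illustrative only: derive chip and device counts from an
# accelerator type string such as "v2-256", assuming the
# 2-cores-per-chip, 8-cores-per-device layout described above.
type='v2-256'
cores=${type#*-}        # strip the "v2-" prefix -> 256
chips=$((cores / 2))    # 2 cores per chip       -> 128
devices=$((cores / 8))  # 8 cores per device     -> 32
echo "${type}: ${cores} cores, ${chips} chips, ${devices} devices"
```

For v2-256 this prints 256 cores, 128 chips, and 32 devices, matching the per-device counts given above.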
See specifying the accelerator type for instructions on how to specify the Cloud TPU version you want to use.
Cloud TPU v3 hardware improves over Cloud TPU v2 by increasing the FLOPS per core and the HBM memory capacity. The first-order effects that govern a model’s ability to achieve full speedup are memory and step times:
- Compute-bound models using v3 hardware should see significant performance benefits over those using v2. Models that are significantly memory-bound might see little benefit.
- Models that were nearly input-bound ("infeed"-bound) on Cloud TPU v2 because training steps were waiting for input may become infeed-bound with Cloud TPU v3. The pipeline performance guide can help you resolve infeed issues.
Other performance effects of Cloud TPU v3:
- For many models, the additional HBM capacity can improve performance and can also reduce the need to recompute intermediate values when data does not fit into memory (re-materialization).
- Cloud TPU v3 can run new models with batch sizes that did not fit on Cloud TPU v2, for example, deeper ResNets and larger images with RetinaNet.
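To put the memory headroom in rough terms, the following sketch assumes the commonly cited figures of 8 GiB of HBM per v2 core and 16 GiB per v3 core (verify these against the current hardware documentation before relying on them):

```shell
# Rough sketch of per-device HBM, assuming 8 GiB per v2 core and
# 16 GiB per v3 core (assumed figures), with 8 cores per device.
v2_hbm=$((8 * 8))    # GiB per v2 device
v3_hbm=$((16 * 8))   # GiB per v3 device
echo "v2 device: ${v2_hbm} GiB HBM, v3 device: ${v3_hbm} GiB HBM"
```

Under these assumptions a v3 device has twice the HBM of a v2 device, which is what allows the larger batch sizes and reduced re-materialization described above.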
Specifying the accelerator type
You can specify the accelerator type in the following ways:

From the gcloud command-line tool, include the --accelerator-type flag when you create the TPU:
```shell
$ gcloud compute tpus create [TPU name] \
    --zone us-central1-b \
    --range '10.240.0.0' \
    --accelerator-type 'v2-8' \
    --network my-tf-network \
    --version '1.13'
```
- [TPU name] is a name for identifying the TPU that you're creating.
- --zone is the Compute Engine zone in which to create the Cloud TPU. Make sure the requested accelerator type is supported in your region.
- --range specifies the address of the created Cloud TPU resource (for example, 10.240.0.0).
- --accelerator-type is the type of accelerator and number of cores you want to use, for example, v2-32 (32 cores).
- --network specifies the name of the network that your Compute Engine VM instance uses. You must be able to connect to instances on this network over SSH. For most situations, you can use the default network that your Google Cloud Platform project created automatically. However, an error results if the default network is a legacy network.
- --version specifies the TensorFlow version to use with the TPU.
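Before passing a value to --accelerator-type, you may want to sanity-check that it follows the `v<version>-<cores>` form used in this document. This is a minimal sketch, not an official validator; the accepted patterns below are assumed from the examples shown here (v2-8, v2-32, v3, and their Pod slices):

```shell
# Minimal sketch: check that an accelerator-type string matches the
# v2-*/v3-* form used in this document before passing it to gcloud.
validate_type() {
  case "$1" in
    v2-*|v3-*) echo "ok: $1" ;;
    *)         echo "unsupported: $1" ;;
  esac
}

validate_type 'v2-8'    # -> ok: v2-8
validate_type 'v4-8'    # -> unsupported: v4-8
```

A check like this only catches malformed strings; whether a given type is actually available still depends on your zone, as noted for the --zone flag above.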
From the Google Cloud Platform Console:

- From the left navigation menu, select Compute Engine > TPUs.
- On the TPUs screen, click Create TPU node. This brings up a configuration page for your TPU.
- Under TPU type, select one of the supported TPU versions.
- Click the Create button.
- Learn more about the architecture of a single Cloud TPU device and of Cloud TPU v2 and v3 Pods (beta) in the architecture document.
- Explore the TPU setup documentation for help choosing between a single Cloud TPU device and a Cloud TPU Pod.
- See When to use TPUs to learn about the types of models that are well suited to Cloud TPU.
- Check Regions and Zones to determine availability of Cloud TPU versions and Pod slices.
- If you plan to run on Kubernetes or ML Engine, see Deciding on a TPU service.