Choosing a TPU Version

General comparison

Cloud TPU v2, Cloud TPU v2 Pod (beta), Cloud TPU v3, and Cloud TPU v3 Pod (beta) run the same TensorFlow software for training and evaluating models. A model that runs on one TPU version can run on another with no code changes. The main differences between Cloud TPU versions are performance and memory capacity.

A single Cloud TPU device, v2 or v3, consists of 8 cores: 2 cores per chip and 4 chips per device. A number following the version specifies the number of cores, for example, v2-8 for a single Cloud TPU device or v2-256 for a Cloud TPU Pod with 256 cores. A Cloud TPU v2 Pod (beta) consists of 64 TPU devices containing 256 TPU chips (512 cores). A Cloud TPU v3 Pod (beta) consists of 1024 TPU chips (2048 cores).
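The naming scheme above is simple enough to sketch in a few lines. The helper below is hypothetical (not part of any Cloud TPU SDK or tool); it just splits an accelerator type string into its version and core count, and derives the chip count from the 2-cores-per-chip layout described above:

```python
# Hypothetical helper, not part of any Cloud TPU tooling: split an
# accelerator type string such as "v2-8" or "v3-2048" into the TPU
# version and the number of cores.
def parse_accelerator_type(accelerator_type):
    version, cores = accelerator_type.split("-")
    return version, int(cores)

version, cores = parse_accelerator_type("v2-256")
chips = cores // 2  # each TPU chip has 2 cores
print(version, cores, chips)  # v2 256 128
```

For example, a v2-256 Pod slice has 256 cores across 128 chips, matching the chip/core ratio stated above.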

See specifying the accelerator type for instructions on how to specify the Cloud TPU version you want to use.

Performance comparison

Cloud TPU v3 hardware improves over Cloud TPU v2 by increasing the FLOPS per core and the HBM memory capacity. The first-order effects that govern a model’s ability to achieve full speedup are memory and step times:

  • Compute-bound models using v3 hardware should see significant performance benefits over those using v2. Models that are significantly memory-bound might see little benefit.
  • Models that were nearly input bound ("infeed" bound) on Cloud TPU v2 because training steps were waiting for input may become infeed bound with Cloud TPU v3. The pipeline performance guide can help you resolve infeed issues.

Other performance effects of Cloud TPU v3:

  • For many models, having additional HBM availability can improve performance and can also reduce the need to recompute intermediate values in cases where data does not fit into memory (re-materialization).

  • Cloud TPU v3 can run new models with batch sizes that did not fit on Cloud TPU v2, for example deeper ResNets and larger images with RetinaNet.
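The memory headroom behind these effects can be illustrated with a rough sketch. The per-core HBM figures below are assumptions based on published Cloud TPU specifications (8 GiB per v2 core, 16 GiB per v3 core); they are not stated on this page:

```python
# Assumed per-core HBM capacities in GiB (from published Cloud TPU
# specs, not from this page): v2 = 8 GiB/core, v3 = 16 GiB/core.
HBM_PER_CORE_GIB = {"v2": 8, "v3": 16}

def device_hbm_gib(version, cores=8):
    """Total HBM for a single device with the given core count."""
    return HBM_PER_CORE_GIB[version] * cores

print(device_hbm_gib("v2"))  # 64
print(device_hbm_gib("v3"))  # 128
```

Under these assumptions a v3 device has twice the HBM of a v2 device, which is why larger batch sizes and deeper models that did not fit on v2 can fit on v3.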

Specifying the accelerator type

You can specify the accelerator type in the following ways:

ctpu utility

  1. When running the Cloud TPU Provisioning Utility (ctpu), specify the TPU size with the ctpu up command. For example:

      $ ctpu up --tpu-size=[TPU_VERSION]

      where [TPU_VERSION] is the TPU type and number of cores you want to use, for example, v2-8.

gcloud command

  1. When creating a new Cloud TPU resource using the gcloud compute tpus create command, specify the accelerator type you want to use from the supported TPU versions. For example:

    $ gcloud compute tpus create [TPU name] \
     --zone us-central1-b \
     --range '10.240.0.0' \
     --accelerator-type 'v2-8' \
     --network my-tf-network \
     --version '1.13'
    

    where:

    • TPU name is a name for identifying the TPU that you're creating.
    • --zone is the Compute Engine zone where you want to create the Cloud TPU. Make sure the requested accelerator type is supported in that zone.
    • --range specifies the address of the created Cloud TPU resource and can be any value in 10.240.*.*.
    • --accelerator-type is the type of accelerator and number of cores you want to use, for example, v2-32 (32 cores).
    • --network specifies the name of the network that your Compute Engine VM instance uses. You must be able to connect to instances on this network over SSH. For most situations, you can use the default network that your Google Cloud Platform project created automatically. However, an error results if the default network is a legacy network.
    • --version specifies the TensorFlow version to use with the TPU.
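The --range constraint above (any value in 10.240.*.*) can be expressed as a membership check against the 10.240.0.0/16 block. This is an illustrative sketch, not part of the gcloud tooling:

```python
import ipaddress

# Illustrative check, not part of gcloud: verify that a candidate
# --range address falls inside the 10.240.*.* block described above.
def is_valid_tpu_range(address):
    return ipaddress.ip_address(address) in ipaddress.ip_network("10.240.0.0/16")

print(is_valid_tpu_range("10.240.0.0"))  # True
print(is_valid_tpu_range("10.241.0.0"))  # False
```

If the address falls outside 10.240.0.0/16, the create command will reject it, so a pre-check like this can catch the mistake early.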

Cloud Console

  1. From the left navigation menu, select Compute Engine > TPUs.
  2. On the TPUs screen click Create TPU node. This brings up a configuration page for your TPU.
  3. Under TPU type select one of the supported TPU versions.
  4. Click the Create button.
