Run a calculation on a Cloud TPU VM using TensorFlow
This quickstart shows you how to create a Cloud TPU, install TensorFlow, and run a simple calculation on a Cloud TPU. For a more in-depth tutorial showing you how to train a model on a Cloud TPU, see one of the Cloud TPU Tutorials.
Before you begin
Before you follow this quickstart, you must create a Google Cloud Platform
account, install the Google Cloud CLI, and configure the gcloud
command.
For more information, see Set up an account and a Cloud TPU project.
Create a Cloud TPU VM or Node with gcloud
Launch a Compute Engine Cloud TPU using the gcloud command. The command you use depends on whether you are using a TPU VM or a TPU node. For more information on the two architectures, see System Architecture. For more information on the gcloud command, see the gcloud reference.
TPU VM
$ gcloud compute tpus tpu-vm create tpu-name \
--zone=europe-west4-a \
--accelerator-type=v3-8 \
--version=tpu-vm-tf-2.16.1-pjrt
Command flag descriptions
zone
- The zone where you plan to create your Cloud TPU.
accelerator-type
- The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
version
- The Cloud TPU software version.
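If you script TPU creation instead of typing the command, the flags described above can be assembled programmatically. The following is a minimal sketch, not part of the official tooling: the helper name build_create_command is hypothetical, and it only builds the argument list shown above without running it.

```python
import shlex


def build_create_command(name, zone, accelerator_type, version):
    """Return the gcloud argument list for creating a TPU VM.

    build_create_command is a hypothetical helper for illustration. It only
    mirrors the flags documented above and does not call gcloud itself.
    """
    return [
        "gcloud", "compute", "tpus", "tpu-vm", "create", name,
        f"--zone={zone}",
        f"--accelerator-type={accelerator_type}",
        f"--version={version}",
    ]


cmd = build_create_command(
    name="tpu-name",
    zone="europe-west4-a",
    accelerator_type="v3-8",
    version="tpu-vm-tf-2.16.1-pjrt",
)
# Print a copy-pasteable shell command built from the argument list.
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string means each flag stays a separate argument, so the same list could be handed to subprocess.run without shell quoting concerns.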
TPU Node
$ gcloud compute tpus execution-groups create \
--name=tpu-name \
--zone=europe-west4-a \
--disk-size=300 \
--machine-type=n1-standard-16 \
--tf-version=2.12.0
Command flag descriptions
project
- Your Google Cloud project ID
name
- The name of the Cloud TPU to create.
zone
- The zone where you plan to create your Cloud TPU.
disk-size
- The size of the hard disk in GB of the VM created by the gcloud command.
machine-type
- The machine type of the Compute Engine VM to create.
tf-version
- The version of TensorFlow that gcloud installs on the VM.
accelerator-type
- The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
Connect to your Cloud TPU VM
When using TPU VMs, you must explicitly connect to your TPU VM using SSH. When using TPU Nodes, you should be automatically connected to your Compute Engine VM over SSH. If you are not automatically connected, use the following command.
TPU VM
$ gcloud compute tpus tpu-vm ssh tpu-name \
--zone europe-west4-a
TPU Node
$ gcloud compute ssh tpu-name \
--zone=europe-west4-a
Run a simple example using TensorFlow
TPU VM
Once you are connected to the TPU VM, set the following environment variable.
(vm)$ export TPU_NAME=local
When creating your TPU, if you set the --version parameter to a version ending with -pjrt, set the following environment variables to enable the PJRT runtime:
(vm)$ export NEXT_PLUGGABLE_DEVICE_USE_C_API=true
(vm)$ export TF_PLUGGABLE_DEVICE_LIBRARY_PATH=/lib/libtpu.so
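The rule for which variables to export can also be expressed in code, for example when a launcher script prepares the environment before starting the test. This is a minimal sketch under stated assumptions: tpu_vm_environment is a hypothetical helper name, and the sketch assumes the variables must be in place before TensorFlow is imported.

```python
import os


def tpu_vm_environment(version):
    """Return the environment variables to set for a given --version value.

    tpu_vm_environment is a hypothetical helper for illustration only.
    """
    # TPU_NAME=local is set on every TPU VM, as described above.
    env = {"TPU_NAME": "local"}
    # Versions ending in -pjrt additionally need the PJRT runtime variables.
    if version.endswith("-pjrt"):
        env["NEXT_PLUGGABLE_DEVICE_USE_C_API"] = "true"
        env["TF_PLUGGABLE_DEVICE_LIBRARY_PATH"] = "/lib/libtpu.so"
    return env


# Assumption: apply the variables before importing TensorFlow.
os.environ.update(tpu_vm_environment("tpu-vm-tf-2.16.1-pjrt"))
print(os.environ["TPU_NAME"])
```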
Create a file named tpu-test.py in the current directory and copy and paste the following script into it.
import tensorflow as tf
print("Tensorflow version " + tf.__version__)
@tf.function
def add_fn(x,y):
  z = x + y
  return z
cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(cluster_resolver)
tf.tpu.experimental.initialize_tpu_system(cluster_resolver)
strategy = tf.distribute.TPUStrategy(cluster_resolver)
x = tf.constant(1.)
y = tf.constant(1.)
z = strategy.run(add_fn, args=(x,y))
print(z)
TPU Node
Create a file named tpu-test.py in the current directory and copy and paste the following script into it.
import tensorflow as tf
print("Tensorflow version " + tf.__version__)
tpu = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='your-tpu-name') # TPU detection
print('Running on TPU ', tpu.cluster_spec().as_dict()['worker'])
tf.config.experimental_connect_to_cluster(tpu)
tf.tpu.experimental.initialize_tpu_system(tpu)
strategy = tf.distribute.TPUStrategy(tpu)
@tf.function
def add_fn(x,y):
  z = x + y
  return z
x = tf.constant(1.)
y = tf.constant(1.)
z = strategy.run(add_fn, args=(x,y))
print(z)
Run this script with the following command:
(vm)$ python3 tpu-test.py
This script performs a simple computation on each TensorCore of a TPU. The output will look similar to the following:
PerReplica:{
  0: tf.Tensor(2.0, shape=(), dtype=float32),
  1: tf.Tensor(2.0, shape=(), dtype=float32),
  2: tf.Tensor(2.0, shape=(), dtype=float32),
  3: tf.Tensor(2.0, shape=(), dtype=float32),
  4: tf.Tensor(2.0, shape=(), dtype=float32),
  5: tf.Tensor(2.0, shape=(), dtype=float32),
  6: tf.Tensor(2.0, shape=(), dtype=float32),
  7: tf.Tensor(2.0, shape=(), dtype=float32)
}
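The output has one entry per replica because the same function runs once on every TensorCore. The following plain-Python analogy sketches that behavior without TensorFlow or a TPU. It is an illustration only: NUM_REPLICAS is an assumed value chosen to match the eight entries in the sample output, and the dictionary stands in for the PerReplica container.

```python
def add_fn(x, y):
    """Same computation as the script's add_fn, on plain floats."""
    return x + y


# Assumption: eight replicas, matching the eight entries in the sample output.
NUM_REPLICAS = 8

# Each replica receives the same constant inputs and returns its own result.
per_replica = {replica_id: add_fn(1.0, 1.0) for replica_id in range(NUM_REPLICAS)}

for replica_id, value in per_replica.items():
    print(f"{replica_id}: {value}")
```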
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
Disconnect from the Compute Engine instance, if you have not already done so:
(vm)$ exit
Your prompt should now be username@projectname, showing you are in the Cloud Shell.
Delete your Cloud TPU.
TPU VM
$ gcloud compute tpus tpu-vm delete tpu-name \
--zone=europe-west4-a
TPU Node
$ gcloud compute tpus execution-groups delete tpu-name \
--zone=europe-west4-a
Verify the resources have been deleted by running gcloud compute tpus tpu-vm list. The deletion might take several minutes.
TPU VM
$ gcloud compute tpus tpu-vm list --zone=europe-west4-a
TPU Node
$ gcloud compute tpus execution-groups list --zone=europe-west4-a
What's next
For more information about Cloud TPU, see: