Cloud TPU VM TensorFlow Quickstart

Overview: This quickstart provides a brief introduction to working with Cloud TPU. You create a Cloud TPU VM, connect to it, and use it to train ResNet.

Before you begin

Set up a GCP Project

  1. Sign in to your Google Account. If you don't already have one, sign up for a new account.
  2. In the Google Cloud Console, select or create a Cloud project from the project selector page.
  3. Make sure billing is enabled for your project.
  4. Set your project ID using gcloud in Cloud Shell. The project ID is the identifier shown for your project in the Cloud console. Replace project-id with it in the following command.

$ gcloud config set project project-id

Enable the Cloud TPU API

Enable the Cloud TPU API using the following gcloud command in Cloud Shell. (You can also enable it from the Google Cloud Console.)

$ gcloud services enable tpu.googleapis.com

Configure the gcloud command

Run the following commands to configure gcloud to use your Google account and GCP project. Replace your-email-account and project-id with your own values.

$ gcloud config set account your-email-account
$ gcloud config set project project-id

Create a Cloud TPU VM with gcloud

$ gcloud alpha compute tpus tpu-vm create tpu-name \
  --zone=europe-west4-a \
  --accelerator-type=v3-8 \
  --version=v2-alpha

Required fields

zone
    The zone where you plan to create your Cloud TPU.
accelerator-type
    The type of the Cloud TPU to create.
version
    The Cloud TPU runtime version.
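
The TPU name and zone are reused when you connect to the TPU VM and again when you delete it, so it can help to keep them in shell variables. The following is a minimal sketch, not part of gcloud itself: the variable names (TPU_NAME, ZONE, ACCELERATOR_TYPE, RUNTIME_VERSION) and the helper print_create_command are hypothetical, and the values are the ones used in this quickstart. The helper only prints the command so that you can review it before running it.

```shell
# Hypothetical variable names; the values are the ones used in this quickstart.
TPU_NAME="tpu-name"
ZONE="europe-west4-a"
ACCELERATOR_TYPE="v3-8"
RUNTIME_VERSION="v2-alpha"

# Print (do not run) the create command assembled from the variables above.
print_create_command() {
  printf 'gcloud alpha compute tpus tpu-vm create %s --zone=%s --accelerator-type=%s --version=%s\n' \
    "${TPU_NAME}" "${ZONE}" "${ACCELERATOR_TYPE}" "${RUNTIME_VERSION}"
}

print_create_command
```

After reviewing the printed line, you can paste it into Cloud Shell. Changing a value in one place then keeps the create, ssh, and delete steps consistent.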

Connect to your Cloud TPU VM

$ gcloud alpha compute tpus tpu-vm ssh tpu-name --zone europe-west4-a --project project-id

Required fields

tpu-name
    The name of the TPU VM to which you are connecting.
zone
    The zone where you created your Cloud TPU.
project-id
    Your GCP project ID.

Install TensorFlow on your Cloud TPU VM

(vm)$ git clone https://github.com/tensorflow/models.git
(vm)$ pip3 install -r models/official/requirements.txt

System check

Run a simple test that uses the libtpu library directly to load a buffer to the TPU and perform a simple computation. This command is only supported on single-device TPUs.

(vm)$ /usr/share/tpu/libtpu_client

You should see the following output:

------ Going to Query Version ------
TPU Driver Version: 0.0.0-dev
------ Going to Open a TPU Driver ------
------ Going to Query for System Information ------
------ Going to Compile a TPU program ------
------ Going to Load a TPU program ------
------ Going to Allocate a TPU Buffer ------
------ Going to Allocate a TPU Buffer ------
------ Going to Allocate a TPU Buffer ------
------ Going to Transfer To Device ------
------ Going to Transfer To Device ------
------ Going to Execute a TPU program ------
------ Going to Transfer From Device ------
------ Going to Unload a TPU program ------
------ Going to Deallocate a TPU Buffer ------
------ Going to Deallocate a TPU Buffer ------
------ Going to Deallocate a TPU Buffer ------
sum:
3 3 3 3 3 3 3 3 3 3 [...]
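
If you want to script this check rather than read the output by eye, one option is to test for the final sum line. The sketch below assumes the output format shown above and that libtpu_client writes it to standard output; check_libtpu_output is a hypothetical helper, not something shipped on the TPU VM. It succeeds only when a "sum:" line is followed by a line whose first value is 3.

```shell
# Hypothetical helper: reads libtpu_client output on stdin and succeeds only if
# a "sum:" line is followed by a line whose first field is 3, as shown above.
check_libtpu_output() {
  awk '
    prev_was_sum   { ok = ($1 == "3"); exit }   # line right after "sum:"
    $1 == "sum:"   { prev_was_sum = 1 }
    END            { exit (ok ? 0 : 1) }
  '
}

# On the TPU VM (path taken from this quickstart):
#   /usr/share/tpu/libtpu_client | check_libtpu_output && echo "system check passed"

# Demonstration with sample text instead of a real TPU:
printf 'sum:\n3 3 3 3\n' | check_libtpu_output && echo "system check passed"
```

The helper checks only the first value of the sum line, so treat it as a quick smoke test rather than a full validation of the output.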

Run a simple example using TensorFlow

This example performs a simple computation on a TPU.

(vm)$ python3 /usr/share/tpu/tensorflow/simple_example.py

Train ResNet on a single TPU

(vm)$ export PYTHONPATH=/usr/share/tpu/tensorflow/resnet50_keras
(vm)$ python3 /usr/share/tpu/tensorflow/resnet50_keras/resnet50_single_tpu.py
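
The export above replaces any value PYTHONPATH already had. If your session relies on existing PYTHONPATH entries, a possible variation is to prepend the ResNet directory instead. This is a sketch under that assumption; RESNET_DIR is a hypothetical variable name, and the directory is the one used in this quickstart.

```shell
# Directory taken from this quickstart; RESNET_DIR is a hypothetical variable name.
RESNET_DIR="/usr/share/tpu/tensorflow/resnet50_keras"

# Prepend RESNET_DIR and keep any existing PYTHONPATH entries after it.
# ${PYTHONPATH:+:${PYTHONPATH}} expands to nothing when PYTHONPATH is unset or empty,
# which avoids leaving a trailing colon in the result.
export PYTHONPATH="${RESNET_DIR}${PYTHONPATH:+:${PYTHONPATH}}"

# Show the resulting search path.
echo "${PYTHONPATH}"
```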

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this quickstart, follow these steps.

  1. Disconnect from the Cloud TPU VM, if you have not already done so:

    (vm)$ exit
    

    Your prompt should now be username@projectname, showing you are in the Cloud Shell.

  2. Delete your Cloud TPU.

    $ gcloud alpha compute tpus tpu-vm delete tpu-name \
      --zone=europe-west4-a
    
  3. Verify that the resources have been deleted by running gcloud alpha compute tpus tpu-vm list. The deletion might take several minutes. A response like the following, with no TPU VMs listed under the header, indicates that your TPU VM has been deleted.

    $ gcloud alpha compute tpus tpu-vm list --zone=europe-west4-a
    
    NAME             STATUS
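
To check the result in a script, you could count the rows that follow the header. The sketch below assumes the list output has a single header line starting with NAME, as in the response above; count_listed_tpus is a hypothetical helper, not a gcloud feature.

```shell
# Hypothetical helper: reads `tpu-vm list` output on stdin and prints how many
# TPU VMs are listed, assuming one header line whose first field is NAME.
count_listed_tpus() {
  awk 'NF > 0 && $1 != "NAME" { n++ } END { print n + 0 }'
}

# In Cloud Shell (command taken from this quickstart):
#   gcloud alpha compute tpus tpu-vm list --zone=europe-west4-a | count_listed_tpus

# Demonstration with the header-only response shown above:
printf 'NAME             STATUS\n' | count_listed_tpus
```

A count of 0 corresponds to the header-only response above. Any other number means at least one TPU VM is still listed in that zone.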
    

What's next

Read more about Cloud TPU VMs: