The open source ctpu tool is used to create a flock of compute resources, which consist of a Compute Engine VM and one or more Cloud TPU devices. The tool is pre-installed in your Cloud Shell. You can find documentation and code for ctpu on GitHub.
The ctpu tool uses the following syntax:
ctpu <subcommand> <flags> <subcommand> <subcommand args>
The following are the subcommands for ctpu:
- Set or display authorization(s) for Cloud TPUs.
ctpu auth <flags> <subcommand> <subcommand args>
ctpu auth list --project="my-project" --zone=us-central1-a
ctpu auth list --project my-project --zone us-central1-a
The ctpu auth command supports the following subcommands:
- add-bigtable - ensure Cloud TPU is authorized for Cloud Bigtable
- add-gcs - ensure Cloud TPU is authorized for Cloud Storage
- list - display Cloud TPU service account authorizations
- commands - list all command names
- flags - describe all known top-level flags
- help - describe subcommands and their syntax
- Optional Flags
The following are optional flags for ctpu auth:
name | project | zone
- Delete your Compute Engine VM and Cloud TPU.
ctpu rm <flags>
ctpu rm --zone=us-central1-b
- List all ctpu subcommands and top-level flags.
ctpu help
ctpu help <subcommand>
ctpu help // list all ctpu subcommands and top-level flags
ctpu help auth // list all flags that can be used with `ctpu auth`
ctpu help up // list all flags that can be used with `ctpu up`
- List all Compute Engine VMs and Cloud TPU in the specified zone.
ctpu ls <flags>
ctpu ls --zone=us-central1-b
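Because ctpu ls reports on the zone you specify, checking several zones means running it once per zone. The following is a minimal sketch, assuming a POSIX shell. The helper name ls_zones is hypothetical and not part of ctpu, and it relies only on the --zone flag shown above.

```shell
# Hypothetical helper (not part of ctpu): run `ctpu ls` once per zone.
# Usage: ls_zones us-central1-a us-central1-b
ls_zones() {
  for lz_zone in "$@"; do
    # Print a header so the output of each zone is easy to tell apart.
    echo "== $lz_zone =="
    ctpu ls --zone="$lz_zone"
  done
}
```

Calling ls_zones with no arguments does nothing, so pass at least one zone.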
Stop the Compute Engine VM and delete your Cloud TPU. This stops charges for Cloud TPU usage until you bring the resources up again.
To ensure that the Cloud TPU is stopped, you must specify the Cloud TPU name and the zone on the command line.
ctpu pause <name, zone>
ctpu pause --name=my-tpu --zone=us-central1-a // pause the named TPU in the specified zone
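Since pausing stops Cloud TPU charges, one common pattern is to pause as soon as a job finishes, whether or not it succeeded. The following is a minimal sketch, assuming a POSIX shell. The helper name run_then_pause is hypothetical, and it uses only the --name and --zone flags shown above. If ctpu asks for confirmation before pausing, the helper will wait for your answer, so treat it as a convenience rather than a fully unattended solution.

```shell
# Hypothetical helper (not part of ctpu): run a command, then pause the
# named Cloud TPU regardless of whether the command succeeded.
# Usage: run_then_pause <tpu-name> <zone> <command> [args...]
run_then_pause() {
  rtp_name="$1"
  rtp_zone="$2"
  shift 2

  # Run the job and remember its exit status without aborting on failure.
  rtp_status=0
  "$@" || rtp_status=$?

  # Pause the resources so Cloud TPU charges stop.
  ctpu pause --name="$rtp_name" --zone="$rtp_zone"

  # Report the job's own exit status to the caller.
  return "$rtp_status"
}
```

For example, `run_then_pause my-tpu us-central1-a python train.py` would run a (hypothetical) training script and then pause my-tpu in us-central1-a.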
- Print onscreen the current configuration of the Cloud TPU name, project name, and zone.
- Display a URL where you can see quotas.
ctpu quota
Output:
Quotas cannot currently be displayed within ctpu. To view your quota, open <url>
Request additional quota from <url>
Restarts a Cloud TPU that is still in the RUNNING state (shown in ctpu status), but has stopped running because of a hardware problem. Use gcloud compute tpus start or the START button on the Compute Engine > TPUs page in the Cloud Console if the TPU is in the STOPPED state.
restart does not restart a preempted Cloud TPU. You need to run ctpu up if your Cloud TPU has been preempted.
ctpu restart <flags>
ctpu restart --zone=us-central1-a
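The three recovery paths described for restart can be summarized as a small lookup. The following is a minimal sketch, assuming a POSIX shell. The function name recovery_command is hypothetical, RUNNING and STOPPED are the state names used in this section, and PREEMPTED is an assumed label for a preempted Cloud TPU. The function only prints the command to use; it does not run anything.

```shell
# Hypothetical helper (not part of ctpu): print which command applies
# to a Cloud TPU in the given state.
# Usage: recovery_command <STATE>
recovery_command() {
  case "$1" in
    # Reported as RUNNING but hung because of a hardware problem.
    RUNNING)   echo "ctpu restart" ;;
    # Stopped TPUs are started through gcloud (the `tpus` command group)
    # or the Cloud Console.
    STOPPED)   echo "gcloud compute tpus start" ;;
    # Assumed state label: a preempted TPU must be recreated with ctpu up.
    PREEMPTED) echo "ctpu up" ;;
    *)
      echo "unknown state: $1" >&2
      return 1
      ;;
  esac
}
```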
Query the GCP APIs (default zone only) to determine the current status of your Cloud TPU and Compute Engine VM.
ctpu st --zone=us-central1-a
Status message:
Your cluster is running!
Compute Engine VM: RUNNING
Cloud TPU: RUNNING
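A script can check the status output before starting work. The following is a minimal sketch, assuming a POSIX shell and assuming the output contains the "Compute Engine VM: RUNNING" and "Cloud TPU: RUNNING" labels shown in the sample above. The exact output format may differ between ctpu versions, and the function name both_running is hypothetical.

```shell
# Hypothetical helper (not part of ctpu): read status output on stdin and
# succeed only if both the VM and the Cloud TPU are reported as RUNNING.
# Usage: ctpu status --zone=us-central1-a | both_running && echo "ready"
both_running() {
  # Capture stdin once so it can be searched twice.
  br_out="$(cat)"
  printf '%s\n' "$br_out" | grep -Eq 'Compute Engine VM:[[:space:]]*RUNNING' &&
    printf '%s\n' "$br_out" | grep -Eq 'Cloud TPU:[[:space:]]*RUNNING'
}
```

Matching on labels rather than line positions keeps the check tolerant of extra whitespace or surrounding text.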
- List all zones where TPU types are available.
ctpu tpu-locations
Cloud TPU Locations:
asia-east1-c
europe-west4-a
us-central1-a
us-central1-b
us-central1-c
- List all available TPU sizes in specified zone. Some sizes are available only in certain zones. (default = default zone)
ctpu tpu-sizes --zone=us-central1-a
Bring up a ctpu resource set. The first time you run ctpu up on a project, it takes longer than it will in future runs because it is performing tasks such as SSH key propagation and API turn-up.
- Enables the Compute Engine and Cloud TPU services.
- Creates a Compute Engine VM with the latest stable TensorFlow version pre-installed.
- Assigns a default zone, such as us-central1-b, based on your location.
- Passes the name of the Cloud TPU to the Compute Engine VM as an environment variable.
- Ensures your Cloud TPU has access to resources it needs from your Google Cloud project, by granting specific IAM roles to your Cloud TPU service account.
- Performs a number of other checks.
- Logs you into your new Compute Engine VM, at which point your shell prompt changes.
You can run ctpu up as often as you like. For example, if you lose the SSH connection to the Compute Engine VM, run ctpu up to restore the connection. You must specify a zone if your Compute Engine VM is not in the default zone. For example:
$ ctpu up --zone=us-central1-a
ctpu up <flags>
ctpu up --tpu-size=v2-8 --disk-size-gb=320 --preemptible
Configure the root volume size of your Compute Engine VM. Value must be an integer. (default = 250)
Do not make changes; print only what would have happened.
Enable SSH agent forwarding when connecting to the Compute Engine VM over SSH. SSH agent forwarding enables access to shared repositories (such as GitHub) without having to place private keys on the Compute Engine VM. (default = true)
Automatically forward useful ports from the Compute Engine VM to your local machine. Ports forwarded are: 6006 (tensorboard), 8888 (jupyter notebooks), 8470 (TPU port), 8466 (TPU profiler port). (default = true)
Override the automatically chosen Compute Engine Image. Use this flag when you are using your own custom images instead of the ones provided with the installed TensorFlow.
Specify the network in which the Cloud TPU and associated VM should be created. Refer to Virtual Private Cloud (VPC) Network Overview for information on networks. (default = default network)
Print the full content of HTTP request-response pairs. To enable the printout, set this flag to true. Use this flag when you need log output to file a bug report against ctpu. Refer to the ctpu README for details.
Configure the size of your Compute Engine VM. A full list of machine types is available on the Cloud Machine Types page. (default = n1-standard-2)
Override the name to use for VMs and Cloud TPU. (default = your username)
Create a preemptible Cloud TPU node. A preemptible Cloud TPU costs less per hour than a non-preemptible one. The Cloud TPU service can preempt the device at any time. (default = non-preemptible)
Create a preemptible Compute Engine VM. A preemptible VM costs less per hour than a non-preemptible VM. The Compute Engine service can preempt the VM instance at any time. (default = non-preemptible)
Always print the welcome message.
Override the GCP project name to use when allocating VMs and TPUs. Specify a value from cloud config or Compute Engine metadata, usually your project name. If a good value cannot be found, you must provide a value on the command line.
Set the version of TensorFlow to use when creating the Compute Engine VM and the Cloud TPU. (default = latest stable release)
Allocate a Cloud TPU only; use this only if you already have a VM available.
Configure the size and hardware version of a Cloud TPU.
Use Deep Learning VM Images (see https://cloud.google.com/deep-learning-vm/) instead of TPU machine images. (default = TPU machine images)
Allocate a VM only; use this when you are not ready to set up and pay for a TPU.
Override the Compute Engine zone to use when allocating VMs and Cloud TPU. On the command line, run ctpu help up to view the list.
- Prints out the version of ctpu.
ctpu version
Output:
ctpu version: 1.9