This document describes how to set up and attach a persistent disk to a TPU VM.
A TPU VM includes a 100GB boot disk. For some datasets, you might need more local storage for training or preprocessing. In that case, you can attach a persistent disk to expand your local disk capacity.
You need to have a GCP account and project set up before using the following procedures. If you do not already have a Cloud TPU project set up, follow the procedure in Set up an account and a Cloud TPU project before continuing.
The high-level steps to set up a persistent disk with a TPU VM are:
- Create a persistent disk
- Launch a TPU VM with a persistent disk
- SSH into the TPU VM
- List the attached disks
- Format the attached persistent disk
- Create a directory to mount the persistent disk
- Mount the persistent disk
- Set permissions for the persistent disk
- Clean up TPU VM and persistent disk resources
Setting up a TPU VM and a persistent disk
In Cloud Shell, create a persistent disk:
$ gcloud compute disks create disk-name \
  --size disk-size \
  --zone zone \
  --type pd-balanced
Command flag descriptions
- disk-name: A name of your choosing for the persistent disk.
- --size: The size of the persistent disk in GB.
- --zone: The zone in which to create the persistent disk. This must be the same zone used to create the TPU.
- --type: The disk type to add. Supported types are 'pd-standard', 'pd-ssd', and 'pd-balanced'.
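The disk type constraint above can be checked before calling gcloud. The following is a minimal sketch, not part of the official tooling: the function name disk_create_cmd is hypothetical, the default type mirrors the example command, and the function only prints the command so that you can inspect it before running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate the disk type against the supported types
# listed above, then print (not run) the disk-creation command.
disk_create_cmd() {
  local name="$1" size_gb="$2" zone="$3" type="${4:-pd-balanced}"
  case "$type" in
    pd-standard|pd-ssd|pd-balanced) ;;
    *)
      echo "unsupported disk type: $type" >&2
      return 1
      ;;
  esac
  printf 'gcloud compute disks create %s --size %s --zone %s --type %s\n' \
    "$name" "$size_gb" "$zone" "$type"
}

# Example (all values are placeholders):
#   disk_create_cmd my-disk 200 us-central1-b pd-ssd
```

Printing the command rather than executing it keeps the check free of side effects; pipe the output to a shell only after reviewing it.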
Launch a TPU VM with the persistent disk attached:
$ gcloud alpha compute tpus tpu-vm create tpu-name \
  --project project-id \
  --zone=zone \
  --accelerator-type=v3-8 \
  --version=v2-alpha \
  --data-disk source=projects/project-id/zones/zone/disks/disk-name,mode=read-write
Command flag descriptions
- tpu-name: The name you have chosen for the TPU resources.
- --project: Your project ID.
- --zone: The zone where you plan to create your Cloud TPU.
- --accelerator-type: The type of the Cloud TPU to create.
- --version: The Cloud TPU runtime version.
- --data-disk: The resource path and read/write mode of the persistent disk to attach to the TPU VM.
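The --data-disk value repeats the project ID, zone, and disk name, so it is easy to mistype. As an illustration only, the sketch below builds that value from its parts; data_disk_arg is a hypothetical helper name, and PROJECT, ZONE, and the resource names in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble the --data-disk value in the
# source=projects/.../zones/.../disks/...,mode=... form shown above.
data_disk_arg() {
  local project="$1" zone="$2" disk="$3" mode="${4:-read-write}"
  printf 'source=projects/%s/zones/%s/disks/%s,mode=%s\n' \
    "$project" "$zone" "$disk" "$mode"
}

# Example usage (placeholders throughout):
#   gcloud alpha compute tpus tpu-vm create my-tpu \
#     --project "$PROJECT" \
#     --zone="$ZONE" \
#     --accelerator-type=v3-8 \
#     --version=v2-alpha \
#     --data-disk "$(data_disk_arg "$PROJECT" "$ZONE" my-disk)"
```

Passing the same ZONE variable to both the --zone flag and the helper also keeps the TPU and the disk in the same zone, which the disk creation step requires.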
SSH into the TPU VM:
$ gcloud alpha compute tpus tpu-vm ssh tpu-name --zone zone
From the TPU VM, list the disks attached to the TPU VM:
(vm)$ sudo lsblk
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0     7:0    0  55.5M  1 loop /snap/core18/1997
loop1     7:1    0  67.6M  1 loop /snap/lxd/20326
loop2     7:2    0  32.3M  1 loop /snap/snapd/11588
loop3     7:3    0  32.1M  1 loop /snap/snapd/11841
loop4     7:4    0  55.4M  1 loop /snap/core18/2066
sda       8:0    0   300G  0 disk
├─sda1    8:1    0 299.9G  0 part /
├─sda14   8:14   0     4M  0 part
└─sda15   8:15   0   106M  0 part /boot/efi
sdb       8:16   0    10G  0 disk # persistent disk
sda is the boot disk for the VM. The device name of the attached persistent disk depends on how many persistent disks are attached to the VM. In this example it is sdb, which the following commands use.
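Because the device name varies, it can help to pick out candidate disks programmatically instead of reading the tree by eye. The following is a rough sketch under stated assumptions: unused_disks is a hypothetical helper that reads the key="value" output of lsblk -P and prints whole disks that have no partitions and no mount point. A disk that was formatted earlier but is not currently mounted is also listed, so confirm the result before formatting anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsblk -P -o NAME,TYPE,PKNAME,MOUNTPOINT` output
# on stdin and print whole disks with no partitions and no mount point.
unused_disks() {
  awk '
    {
      name = ""; type = ""; parent = ""; mnt = ""
      for (i = 1; i <= NF; i++) {
        eq = index($i, "=")
        if (eq == 0) continue
        key = substr($i, 1, eq - 1)
        val = substr($i, eq + 1)
        gsub(/"/, "", val)
        if (key == "NAME") name = val
        else if (key == "TYPE") type = val
        else if (key == "PKNAME") parent = val
        else if (key == "MOUNTPOINT") mnt = val
      }
      if (type == "disk") disks[++n] = name   # remember disks in input order
      if (parent != "") used[parent] = 1      # the parent disk has a partition
      if (mnt != "") used[name] = 1           # the device itself is mounted
    }
    END {
      for (i = 1; i <= n; i++)
        if (!(disks[i] in used)) print disks[i]
    }
  '
}

# On the TPU VM:
#   lsblk -P -o NAME,TYPE,PKNAME,MOUNTPOINT | unused_disks
```

For the listing above, the boot disk sda is excluded because it has partitions, which leaves the persistent disk.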
Format the attached persistent disk:
(vm)$ sudo mkfs.ext4 -m 0 -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/sdb
Create a directory to mount the persistent disk:
(vm)$ sudo mkdir -p /mnt/disks/persist
Mount the persistent disk:
(vm)$ sudo mount -o discard,defaults /dev/sdb /mnt/disks/persist
Set permissions for the persistent disk:
(vm)$ sudo chmod a+w /mnt/disks/persist
If you want to delete the persistent disk when you delete the TPU VM, you need to set the auto-delete state of the persistent disk using the following command:
$ gcloud alpha compute instances set-disk-auto-delete vm-instance \
  --zone=zone \
  --auto-delete \
  --disk=disk-name
Command flag descriptions
- vm-instance: After you SSH into the TPU VM, your shell prompt changes to include your user ID followed by a generated VM instance name (for example, pjohnston@t1v-n-...$). Replace vm-instance with the generated VM instance name.
- --zone: The zone in which the persistent disk is located.
- --auto-delete: Automatically delete the persistent disk when the TPU resources are deleted.
- --disk: The name of your persistent disk.
If you do not want to have the persistent disk deleted automatically, skip this command. At any time, you can use the command shown in Cleanup to manually remove the persistent disk.
If your VM shuts down for any reason, the persistent disk might be disconnected. To have the persistent disk mount automatically when the VM restarts, see Configure automatic mounting on system restart. For details on managing persistent disks, refer to the persistent disk documentation.
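Automatic mounting on Linux is commonly configured with an /etc/fstab entry keyed by the filesystem UUID, which stays stable even if the device name changes. The sketch below is an assumption-laden illustration, not the procedure from the linked guide: fstab_entry is a hypothetical helper, and the nofail option is an assumption chosen so that the VM can still boot when the disk is absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an /etc/fstab entry for an ext4 filesystem.
# Fields: device, mount point, filesystem type, options, dump, fsck pass.
fstab_entry() {
  local uuid="$1" mount_dir="$2"
  printf 'UUID=%s %s ext4 discard,defaults,nofail 0 2\n' "$uuid" "$mount_dir"
}

# On the TPU VM (assumes the disk is /dev/sdb, as in the example above):
#   sudo cp /etc/fstab /etc/fstab.backup
#   uuid="$(sudo blkid -s UUID -o value /dev/sdb)"
#   fstab_entry "$uuid" /mnt/disks/persist | sudo tee -a /etc/fstab
```

Backing up /etc/fstab first makes it easy to recover from a malformed entry.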
Cleanup
Disconnect from the Compute Engine instance, if you have not already done so:
(vm)$ exit
Your prompt should now be username@projectname, showing you are in the Cloud Shell.
Delete your Cloud TPU and Compute Engine resources:
$ gcloud alpha compute tpus tpu-vm delete tpu-name \
  --zone=zone
Verify that the resources have been deleted by listing the TPU resources in the zone. The deletion might take several minutes. The output should not display any of the TPU VM resources created by this procedure:
$ gcloud alpha compute tpus tpu-vm list --zone=zone
$ gcloud compute tpus execution-groups list --zone zone
Verify that the persistent disk was automatically deleted when the TPU VM was deleted by listing all disks in the zone where you created the persistent disk (us-central1-b in this example):
$ gcloud compute disks list --filter="zone:( us-central1-b )"
If the persistent disk was not deleted when the TPU VM was deleted, use the following command to delete it:
$ gcloud compute disks delete disk-name \
  --zone zone
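The check for a leftover disk can also be scripted. This is a minimal sketch, assuming you pass it one disk name per line, for example from gcloud compute disks list with --format="value(name)"; disk_listed is a hypothetical helper, and ZONE and DISK_NAME are placeholder variables.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given disk name appears on stdin
# as an exact, whole-line match (one disk name per line).
disk_listed() {
  grep -Fxq -- "$1"
}

# Example usage (placeholders throughout):
#   if gcloud compute disks list --filter="zone:( $ZONE )" \
#        --format="value(name)" | disk_listed "$DISK_NAME"; then
#     echo "Disk $DISK_NAME still exists; delete it manually."
#   fi
```

Matching the whole line avoids false positives when one disk name is a prefix of another.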