Add a persistent disk to a TPU VM
A TPU VM includes a 100GB boot disk. For some scenarios, you might need more local storage for training or preprocessing. You can add a persistent disk to expand your local disk capacity.
A persistent disk attached to a single-device TPU (v2-8, v3-8, v4-8, etc.) can be read-write or read-only. When you attach a persistent disk to a TPU VM that is part of a TPU Pod, the disk is attached to each TPU VM in that Pod. To prevent two or more TPU VMs in a Pod from writing to a persistent disk at once, all persistent disks attached to TPU VMs in a Pod must be configured as read-only. Read-only disks are useful for storing a dataset for processing on a TPU Pod.
After creating and attaching a persistent disk to your TPU VM, you must mount the persistent disk, specifying where in the file system the persistent disk can be accessed. For more information, see Mounting a disk.
You need to have a Google Cloud account and project set up before using the following procedures. If you do not already have a Cloud TPU project set up, follow the procedure in Set up an account and a Cloud TPU project before continuing.
The high-level steps to set up a persistent disk are:
- Create a persistent disk
- Attach a persistent disk to a TPU VM
- Mount the persistent disk
- Clean up TPU VM and persistent disk resources
Setting up a TPU VM and a persistent disk
You can attach a persistent disk to a TPU VM when you create the TPU VM. You can also attach a persistent disk to an existing TPU VM.
Create a persistent disk
Use the following command to create a persistent disk:
$ gcloud compute disks create disk-name \
    --size disk-size \
    --zone zone \
    --type pd-balanced
Command flag descriptions
- disk-name: A name of your choosing for the persistent disk.
- disk-size: The size of the persistent disk in GB.
- zone: The zone in which to create the persistent disk. This must be the same zone used to create the TPU.
- type: The disk type to add. Supported types are: `pd-standard`, `pd-ssd`, or `pd-balanced`.
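As a concrete sketch, a filled-in invocation might look like the following. The disk name, size, and zone here are illustrative values, not requirements; use the zone of your own TPU.

```shell
# Illustrative values only: create a 200 GB balanced persistent disk named
# tpu-disk-1 in us-central1-b (the zone must match your TPU's zone).
gcloud compute disks create tpu-disk-1 \
    --size 200GB \
    --zone us-central1-b \
    --type pd-balanced
```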
Attach a persistent disk
You can attach a persistent disk to your TPU VM when you create the TPU VM or you can add one after the TPU VM is created.
Attach a persistent disk when you create a TPU VM
Use the `--data-disk` flag to attach a persistent disk when you create a TPU VM. If you are creating a TPU Pod, you must specify `mode=read-only`. If you are creating a single TPU device, you can specify `mode=read-write` or `mode=read-only`. The following command creates a single TPU and sets the persistent disk mode to `read-write`:
$ gcloud compute tpus tpu-vm create tpu-name \
    --project project-id \
    --zone=zone \
    --accelerator-type=v3-8 \
    --version=tpu-software-version \
    --data-disk source=projects/project-id/zones/zone/disks/disk-name,mode=read-write
Command flag descriptions
- tpu-name: The name you have chosen for the TPU resources.
- project-id: Your project ID.
- zone: The zone in which to create your Cloud TPU.
- accelerator-type: The type of the Cloud TPU to create.
- version: The Cloud TPU software version for your framework.
- data-disk: The source path and read/write mode of the persistent disk to attach to the TPU VM.
Attach a persistent disk to an existing TPU VM
Use the `gcloud alpha compute tpus tpu-vm attach-disk` command to attach a persistent disk to an existing TPU VM. See the `attach-disk` command documentation for more details and examples.
$ gcloud alpha compute tpus tpu-vm attach-disk tpu-name \
    --zone=zone \
    --disk=disk-name \
    --mode=disk-mode
Command flag descriptions
- tpu-name: The name of the TPU resources.
- zone: The zone where the Cloud TPU is located.
- disk: The name of the persistent disk to attach to the TPU VM.
- mode: The mode of the disk. Mode must be one of: `read-only` or `read-write`.
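For example, attaching an existing disk in read-only mode might look like the following. The TPU and disk names here are illustrative placeholders.

```shell
# Illustrative values: attach the disk tpu-disk-1 to the TPU VM my-tpu
# in read-only mode (read-only is required if my-tpu is part of a TPU Pod).
gcloud alpha compute tpus tpu-vm attach-disk my-tpu \
    --zone=us-central1-b \
    --disk=tpu-disk-1 \
    --mode=read-only
```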
If you want to delete the persistent disk when you delete the TPU VM, you need to set the auto-delete state of the persistent disk using the following command:
$ gcloud compute instances set-disk-auto-delete vm-instance \
    --zone=zone \
    --auto-delete \
    --disk=disk-name
Command flag descriptions
- vm-instance: After you SSH into the TPU VM, your shell prompt changes to include your user ID followed by a generated VM instance name (for example, pjohnston@t1v-n-...$). Replace vm-instance with the generated VM instance name.
- zone: The zone in which the persistent disk is located.
- auto-delete: Automatically delete the persistent disk when the TPU resources are deleted.
- disk-name: The name of your persistent disk.
If your VM shuts down for any reason, the persistent disk might be disconnected. To make your persistent disk remount automatically when the VM restarts, see Configure automatic mounting on system restart. For more information, see Modify a persistent disk.
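One common way to make the mount persist across restarts is an /etc/fstab entry keyed by the disk's UUID. The following is a sketch only, assuming the disk is /dev/sdb, formatted ext4, and mounted at /mnt/disks/persist as in the mounting steps below; adjust the device and mount point for your VM.

```shell
# Sketch: add an fstab entry so the disk remounts automatically on restart.
# Assumes the disk is /dev/sdb, formatted ext4, mounted at /mnt/disks/persist.
# The nofail option lets the VM boot even if the disk is detached.
UUID=$(sudo blkid -s UUID -o value /dev/sdb)
echo "UUID=$UUID /mnt/disks/persist ext4 discard,defaults,nofail 0 2" | sudo tee -a /etc/fstab
```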
Mount a persistent disk
In order to access a persistent disk from a TPU VM, you must mount the disk. This specifies a location in the TPU VM file system where the persistent disk can be accessed.
Connect to your TPU VM using SSH:
$ gcloud compute tpus tpu-vm ssh tpu-name --zone zone
When working with a TPU Pod, there is one TPU VM for each TPU in the Pod. The preceding command works for both TPU devices and TPU Pods. If you are using a TPU Pod, this command connects you to the first TPU VM in the Pod (also called worker 0).
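If you need a shell on a particular worker in a Pod rather than worker 0, the ssh command's `--worker` flag selects it (worker indices start at 0). For example:

```shell
# Connect to the second TPU VM (worker 1) in a Pod; tpu-name and zone
# are placeholders, as in the command above.
gcloud compute tpus tpu-vm ssh tpu-name --zone zone --worker=1
```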
From the TPU VM, list the disks attached to the TPU VM:
(vm)$ sudo lsblk
The output from the `lsblk` command should look like the following:

NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0     7:0    0  55.5M  1 loop /snap/core18/1997
loop1     7:1    0  67.6M  1 loop /snap/lxd/20326
loop2     7:2    0  32.3M  1 loop /snap/snapd/11588
loop3     7:3    0  32.1M  1 loop /snap/snapd/11841
loop4     7:4    0  55.4M  1 loop /snap/core18/2066
sda       8:0    0   300G  0 disk
├─sda1    8:1    0 299.9G  0 part /
├─sda14   8:14   0     4M  0 part
└─sda15   8:15   0   106M  0 part /boot/efi
sdb       8:16   0    10G  0 disk   <== persistent disk
In this example, `sda` is the boot disk and `sdb` is the persistent disk. The name of the attached persistent disk depends on how many persistent disks are attached to the VM.
When using a TPU Pod, you need to mount the persistent disk on all TPU VMs in your Pod. The name of the persistent disk should be the same on all TPU VMs, but this is not guaranteed. For example, if you detach and then re-attach the persistent disk, the device name is incremented, changing from `sdb` to `sdc`, and so on.
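If several disks are attached and you are unsure which device name belongs to the persistent disk, matching on the size you chose at creation time can help. A minimal sketch, run here against simplified sample `lsblk` output (without the tree characters) rather than a live VM:

```shell
# Print the name and size of every whole disk (TYPE == "disk") in some
# sample lsblk output; the persistent disk is the one whose size matches
# the --size you passed to `gcloud compute disks create`.
lsblk_sample='NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 300G 0 disk
sda1 8:1 0 299.9G 0 part /
sdb 8:16 0 10G 0 disk'

echo "$lsblk_sample" | awk '$6 == "disk" { print $1, $4 }'
```

On a real TPU VM you would pipe `sudo lsblk` into the same `awk` filter instead of the sample text.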
If the disk has not been formatted, format the attached persistent disk now:
(vm)$ sudo mkfs.ext4 -m 0 -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/sdb
Create a directory to mount the persistent disk:
If you are using a TPU device, run the following command to create a directory to mount the persistent disk:
(vm)$ sudo mkdir -p /mnt/disks/persist
If you are using a TPU Pod, run the following command outside of your TPU VM. This will create the directory on all TPU VMs in the Pod.
$ gcloud compute tpus tpu-vm ssh $TPU_NAME --worker=all --command="sudo mkdir -p /mnt/disks/persist"
Mount the persistent disk:
If you are using a TPU device, run the following command to mount the persistent disk on your TPU VM.
(vm)$ sudo mount -o discard,defaults /dev/sdb /mnt/disks/persist
If you are using a TPU Pod, run the following command outside of your TPU VM. It will mount the persistent disk on all TPU VMs in your Pod.
$ gcloud compute tpus tpu-vm ssh $TPU_NAME --worker=all --command="sudo mount -o discard,defaults /dev/sdb /mnt/disks/persist"
Clean up TPU VM and persistent disk resources
Delete your TPU VM and persistent disk resources when you are done with them.
Disconnect from the Compute Engine instance, if you have not already done so:

(vm)$ exit

Your prompt should now be username@projectname, showing you are in the Cloud Shell.
Delete your Cloud TPU and Compute Engine resources.
$ gcloud compute tpus tpu-vm delete tpu-name \
    --zone=zone
Verify that the resources have been deleted by running the following `gcloud` list commands. The deletion might take several minutes. The output should not display any of the TPU VM resources created by this procedure.
$ gcloud compute tpus tpu-vm list --zone=zone
$ gcloud compute tpus execution-groups list --zone zone
Verify that the persistent disk was automatically deleted when the TPU VM was deleted by listing all disks in the zone where you created the persistent disk:
$ gcloud compute disks list --filter="zone:( us-central1-b )"
If the persistent disk was not deleted when the TPU VM was deleted, use the following commands to delete it:
$ gcloud compute disks delete disk-name \
    --zone zone