Add a Persistent Disk to a TPU VM
A TPU VM includes a 100GB boot disk. For some scenarios, your TPU VM might need additional storage for training or preprocessing. You can add a Persistent Disk to expand your local disk capacity.
Overview
A Persistent Disk attached to a single-device TPU (v2-8, v3-8, v4-8, etc.) can be configured as read-write or read-only. When you attach a Persistent Disk to a TPU VM that is part of a TPU Pod, the disk is attached to each TPU VM in that Pod. To prevent two or more TPU VMs in a Pod from writing to a Persistent Disk at once, all Persistent Disks attached to a TPU VM in a Pod must be configured as read-only. Read-only disks are useful for storing a dataset for processing on a TPU Pod.
After creating and attaching a Persistent Disk to your TPU VM, you must mount the Persistent Disk, specifying where in the file system the Persistent Disk can be accessed. For more information, see Mounting a disk.
Prerequisites
You need to have a Google Cloud account and project set up before using the following procedures. If you don't already have a Cloud TPU project set up, follow the procedure in Set up the Cloud TPU environment before continuing.
High-level steps
The high-level steps to set up a Persistent Disk:
- Create a Persistent Disk
- Attach a Persistent Disk to a TPU VM
- Mount the Persistent Disk
- Clean up TPU VM and Persistent Disk resources
Setting up a TPU VM and a Persistent Disk
You can attach a Persistent Disk to a TPU VM when you create the TPU VM. You can also attach a Persistent Disk to an existing TPU VM.
Create a Persistent Disk
Use the following command to create a Persistent Disk:
$ gcloud compute disks create disk-name \
--size disk-size \
--zone zone \
--type pd-balanced
Command flag descriptions
disk-name
- A name of your choosing for the Persistent Disk.
disk-size
- The size of the Persistent Disk in GB.
zone
- The zone in which to create the Persistent Disk. This needs to be the same zone used to create the TPU.
type
- The disk type to add. Supported types are: pd-standard, pd-ssd, or pd-balanced.
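As a worked example of the flags above, the following sketch fills in the placeholders and prints the resulting command for review instead of running it. The disk name, size, and zone are illustrative values only, and the `case` check is just a convenience that rejects disk types other than the three listed above.

```shell
# Placeholder values for illustration; replace them with your own.
DISK_NAME=my-data-disk
DISK_SIZE=200GB
ZONE=us-central1-b      # must match the zone used to create the TPU
DISK_TYPE=pd-balanced

# Reject disk types other than the three listed above.
case "$DISK_TYPE" in
  pd-standard|pd-ssd|pd-balanced) ;;
  *) echo "unsupported disk type: $DISK_TYPE" >&2; exit 1 ;;
esac

# Print the command for review instead of running it.
CREATE_CMD="gcloud compute disks create $DISK_NAME --size $DISK_SIZE --zone $ZONE --type $DISK_TYPE"
echo "$CREATE_CMD"
```

Once the printed command looks right, run it directly to create the disk.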
Attach a Persistent Disk
You can attach a Persistent Disk to your TPU VM when you create the TPU VM or you can add one after the TPU VM is created.
Attach a Persistent Disk when you create a TPU VM
Use the --data-disk flag to attach a Persistent Disk when you create a TPU VM.
If you are creating a TPU Pod, you must specify mode=read-only. If you are creating a single TPU device, you can specify mode=read-only or mode=read-write.
The following command creates a single TPU and sets the Persistent Disk mode to read-write:
$ gcloud compute tpus tpu-vm create tpu-name \
--project project-id \
--zone=zone \
--accelerator-type=v3-8 \
--version=tpu-vm-image \
--data-disk source=projects/project-id/zones/zone/disks/disk-name,mode=read-write
Command flag descriptions
tpu-name
- The name you have chosen for the TPU resources.
project
- Your project ID.
zone
- The zone to create your Cloud TPU in.
accelerator-type
- The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
version
- The TPU VM image for your framework.
data-disk
- The name and read/write mode of the Persistent Disk to attach to the TPU VM.
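The --data-disk value combines the full resource path of the disk with a mode, which is easy to mistype. The following is a minimal sketch, assuming a hypothetical helper named `data_disk_flag` and made-up project, zone, and disk names, that assembles the value and rejects any mode other than read-only or read-write.

```shell
# Hypothetical helper: build the value for --data-disk from its parts.
# Usage: data_disk_flag PROJECT_ID ZONE DISK_NAME MODE
data_disk_flag() (
  project="$1"; zone="$2"; disk="$3"; mode="$4"
  case "$mode" in
    read-only|read-write) ;;
    *) echo "mode must be read-only or read-write, got: $mode" >&2; exit 1 ;;
  esac
  printf 'source=projects/%s/zones/%s/disks/%s,mode=%s\n' \
    "$project" "$zone" "$disk" "$mode"
)

# Example values are placeholders, not real resources.
DATA_DISK="$(data_disk_flag my-project us-central1-b my-data-disk read-write)"
echo "$DATA_DISK"
```

You could then pass the result to the create command as --data-disk "$DATA_DISK". For a TPU Pod, the mode argument would have to be read-only.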
Attach a Persistent Disk to an existing TPU VM
Use the gcloud alpha compute tpus tpu-vm attach-disk command to attach a Persistent Disk to an existing TPU VM. See the gcloud documentation for more details and examples.
$ gcloud alpha compute tpus tpu-vm attach-disk tpu-name \
--zone=zone \
--disk=disk-name \
--mode=disk-mode
Command flag descriptions
tpu-name
- The name of the TPU resources.
zone
- The zone where the Cloud TPU is located.
disk-name
- The name of the Persistent Disk to attach to the TPU VM.
mode
- The mode of the disk. Mode must be one of: read-only or read-write.
If you want to delete the Persistent Disk when you delete the TPU VM, you need to set the auto-delete state of the Persistent Disk using the following command:
$ gcloud compute instances set-disk-auto-delete vm-instance \
--zone=zone \
--auto-delete \
--disk=disk-name
Command flag descriptions
vm-instance
- After you SSH into the TPU VM, your shell prompt changes to include your user ID followed by a generated VM instance name (for example, pjohnston@t1v-n-...$). Replace vm-instance with the generated VM instance name.
zone
- The zone in which the Persistent Disk is located.
auto-delete
- Automatically delete the Persistent Disk when the TPU resources are deleted.
disk-name
- The name of your Persistent Disk.
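The vm-instance value is the host part of the shell prompt. The sketch below uses a hypothetical helper, `instance_from_prompt`, to pull that part out of a copied prompt string. The instance name in the example is invented for illustration. It is an assumption, not something this procedure states, that running `hostname` inside the TPU VM prints the same name.

```shell
# Hypothetical helper: extract the VM instance name from a "user@host" prompt.
# Usage: instance_from_prompt 'user@host:~$'
instance_from_prompt() (
  prompt="$1"
  host="${prompt#*@}"    # drop everything up to and including the first "@"
  host="${host%%:*}"     # drop a trailing ":~$" style suffix, if present
  host="${host%%\$*}"    # drop a trailing "$", if present
  host="${host%% *}"     # drop anything after a space
  printf '%s\n' "$host"
)

# The instance name below is a made-up example.
instance_from_prompt 'pjohnston@t1v-n-example-w-0:~$'
```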
If your VM shuts down for any reason, the Persistent Disk might be disconnected. See Configure automatic mounting on system restart to cause your Persistent Disk to automatically mount on VM restart.
For more information about automatically deleting a Persistent Disk, see Modify a Persistent Disk.
Mount a Persistent Disk
In order to access a Persistent Disk from a TPU VM, you must mount the disk. This specifies a location in the TPU VM file system where the Persistent Disk can be accessed.
Connect to your TPU VM using SSH:
$ gcloud compute tpus tpu-vm ssh tpu-name --zone zone
When working with a TPU Pod, there is one TPU VM for each TPU in the Pod. The preceding command works for both TPU devices and TPU Pods. If you are using a TPU Pod, this command connects you to the first TPU in the Pod (also called worker 0).
From the TPU VM, list the disks attached to the TPU VM:
(vm)$ sudo lsblk
The output from the lsblk command should look like the following:
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0     7:0    0  55.5M  1 loop /snap/core18/1997
loop1     7:1    0  67.6M  1 loop /snap/lxd/20326
loop2     7:2    0  32.3M  1 loop /snap/snapd/11588
loop3     7:3    0  32.1M  1 loop /snap/snapd/11841
loop4     7:4    0  55.4M  1 loop /snap/core18/2066
sda       8:0    0   300G  0 disk
├─sda1    8:1    0 299.9G  0 part /
├─sda14   8:14   0     4M  0 part
└─sda15   8:15   0   106M  0 part /boot/efi
sdb       8:16   0    10G  0 disk <== Persistent Disk
In this example, sda is the boot disk and sdb is the name of the newly attached Persistent Disk. The name of the attached Persistent Disk depends on how many Persistent Disks are attached to the VM.
When using a TPU Pod, you need to mount the Persistent Disk on all TPU VMs in your Pod. The name of the Persistent Disk should be the same on all TPU VMs, but this is not guaranteed. For example, if you detach and then re-attach the Persistent Disk, the device name is incremented, changing from sdb to sdc.
If the disk has not been formatted, format the attached Persistent Disk now:
(vm)$ sudo mkfs.ext4 -m 0 -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/sdb
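Formatting erases any data on the disk, so it can be worth checking first whether the device already holds a filesystem. The following is a minimal sketch, assuming a hypothetical helper named `format_decision` that takes the device name and the filesystem type reported by `blkid` (an empty string when none is found) and prints what to do next.

```shell
# Hypothetical helper: decide whether a device still needs formatting.
# Usage: format_decision DEVICE FSTYPE
# On the TPU VM, FSTYPE would come from:
#   sudo blkid -o value -s TYPE /dev/sdb
format_decision() (
  device="$1"
  fstype="$2"
  if [ -n "$fstype" ]; then
    echo "skip: $device already has a $fstype filesystem"
  else
    echo "format: $device has no filesystem"
  fi
)

# Sample inputs: a blank disk, then a disk that already holds ext4.
format_decision /dev/sdb ""
format_decision /dev/sdb ext4
```

Only run the mkfs.ext4 command when the helper reports that the device has no filesystem.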
Create a directory to mount the Persistent Disk:
If you are using a TPU device, run the following command to create a directory to mount the Persistent Disk:
(vm)$ sudo mkdir -p /mnt/disks/persist
If you are using a TPU Pod, run the following command outside of your TPU VM. This will create the directory on all TPU VMs in the Pod.
$ gcloud compute tpus tpu-vm ssh $TPU_NAME --worker=all --command="sudo mkdir -p /mnt/disks/persist"
Mount the Persistent Disk:
If you are using a TPU device, run the following command to mount the Persistent Disk on your TPU VM.
(vm)$ sudo mount -o discard,defaults /dev/sdb /mnt/disks/persist
If you are using a TPU Pod, run the following command outside of your TPU VM. It will mount the Persistent Disk on all TPU VMs in your Pod.
$ gcloud compute tpus tpu-vm ssh $TPU_NAME --worker=all --command="sudo mount -o discard,defaults /dev/sdb /mnt/disks/persist"
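Because a device name such as /dev/sdb can change after a detach and re-attach, a mount that should survive a restart is usually keyed on the filesystem UUID rather than the device name. The sketch below assumes a hypothetical helper, `fstab_entry`, and a made-up UUID. It prints an /etc/fstab line for an ext4 disk mounted read-write on a single TPU VM. The mount options for a read-only disk on a TPU Pod may need to differ.

```shell
# Hypothetical helper: print an /etc/fstab line that mounts a filesystem by UUID.
# Usage: fstab_entry UUID MOUNT_DIR
fstab_entry() (
  uuid="$1"; mount_dir="$2"
  if [ -z "$uuid" ] || [ -z "$mount_dir" ]; then
    echo "usage: fstab_entry UUID MOUNT_DIR" >&2
    exit 1
  fi
  # nofail lets the VM finish booting even if the disk is not attached.
  printf 'UUID=%s %s ext4 discard,defaults,nofail 0 2\n' "$uuid" "$mount_dir"
)

# On the TPU VM, the UUID would come from:
#   sudo blkid -s UUID -o value /dev/sdb
# The UUID below is a made-up placeholder.
fstab_entry 01234567-89ab-cdef-0123-456789abcdef /mnt/disks/persist
```

After backing up /etc/fstab, you could append the printed line with `sudo tee -a /etc/fstab`.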
Unmount a Persistent Disk
To unmount (detach) a Persistent Disk, run the following command:
$ gcloud alpha compute tpus tpu-vm detach-disk tpu-name \
--zone=zone \
--disk=disk-name
Command flag descriptions
tpu-name
- The name of the TPU resources.
zone
- The zone where the Cloud TPU is located.
disk-name
- The name of the Persistent Disk to detach from the TPU VM.
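Detaching a disk whose filesystem is still mounted risks losing unwritten data, so you may want to unmount it inside the TPU VM before running the detach command. The following is a minimal sketch, assuming a hypothetical helper named `is_mounted` that looks for a mount point in a mounts table. It reads /proc/mounts by default and accepts another file for testing.

```shell
# Hypothetical helper: succeed if MOUNT_DIR is listed as a mount point.
# Usage: is_mounted MOUNT_DIR [MOUNTS_TABLE]
is_mounted() (
  mount_dir="$1"
  table="${2:-/proc/mounts}"
  # Field 2 of each line in the mounts table is the mount point.
  awk -v dir="$mount_dir" '$2 == dir { found = 1 } END { exit !found }' "$table"
)

# On the TPU VM, before detaching the disk:
#   if is_mounted /mnt/disks/persist; then sudo umount /mnt/disks/persist; fi
```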
Clean up
Delete your TPU resources when you are done with them.
Disconnect from the Compute Engine instance, if you have not already done so:
(vm)$ exit
Your prompt should now be username@projectname, showing you are in the Cloud Shell.
Delete your Cloud TPU and Compute Engine resources.
$ gcloud compute tpus tpu-vm delete tpu-name \
--zone=zone
Verify the resources have been deleted by running gcloud compute tpus tpu-vm list. The deletion might take several minutes. The output from gcloud compute tpus tpu-vm list shouldn't display any of the TPU VM resources created by this procedure.
$ gcloud compute tpus tpu-vm list --zone=zone
Verify that the Persistent Disk was automatically deleted when the TPU VM was deleted by listing all disks in the zone where you created the Persistent Disk:
$ gcloud compute disks list --filter="zone:( us-central1-b )"
If the Persistent Disk was not deleted when the TPU VM was deleted, use the following command to delete it:
$ gcloud compute disks delete disk-name \
--zone zone
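To script the check for a leftover disk, one option is to list disk names only and test for an exact match. This sketch assumes a hypothetical helper named `disk_listed` that reads one disk name per line on standard input. The sample names are invented.

```shell
# Hypothetical helper: succeed if DISK_NAME appears as a whole line on stdin.
# Usage: <list of disk names> | disk_listed DISK_NAME
disk_listed() {
  grep -Fxq -- "$1"
}

# With real resources, the names would come from gcloud, for example:
#   gcloud compute disks list --filter="zone:( us-central1-b )" --format="value(name)" | disk_listed disk-name
# Sample input with made-up names:
if printf 'boot-disk\nmy-data-disk\n' | disk_listed my-data-disk; then
  echo "my-data-disk still exists"
fi
```

The -x option matters here: it prevents a disk named my-data-disk-2 from matching a search for my-data-disk.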