You can create a group of VMs that have attached GPUs by using the bulk instance API. The bulk instance API speeds up VM creation by validating the request upfront, so a request that can't be fulfilled fails fast. Also, if you use the region flag, the bulk instance API automatically chooses a zone that has the capacity to fulfill the request. For more information about the bulk instance API, see Using the bulk instance API.
Before you begin
- If you want to use the command-line examples in this guide, do the following:
- Install or update to the latest version of the Google Cloud CLI.
- Set a default region and zone.
- If you want to use the API examples in this guide, set up API access.
- Read about GPU pricing on Compute Engine to understand the cost to use GPUs on your VMs.
- Read about restrictions for VMs with GPUs.
- Check your GPU quota.
- Choose an operating system image:
- If you are using GPUs for machine learning, you can use a Deep Learning VM image for your VM. The Deep Learning VM images have GPU drivers pre-installed and include packages, such as TensorFlow and PyTorch. You can also use the Deep Learning VM images for general GPU workloads. For information about the images available and the packages installed on the images, see Choosing an image.
- You can also use any public image or custom image, but some images might require a unique driver or install process that is not covered in this document. You must identify which drivers are appropriate for your images. For steps to install drivers, see installing GPU drivers.
Check GPU quota
To protect Compute Engine systems and users, new projects have a global GPU quota, which limits the total number of GPUs you can create in any supported zone.
Use the `regions describe` command to ensure that you have sufficient GPU quota in the region where you want to create VMs with GPUs.

```
gcloud compute regions describe REGION
```

Replace REGION with the region that you want to check for GPU quota.
If you need additional GPU quota, request a quota increase. When you request GPU quota, you must request quota for the GPU types that you want to create in each region, and an additional global quota for the total number of GPUs of all types in all zones.
If your project has an established billing history, it receives the quota automatically after you submit the request.
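The `regions describe` output includes a `quotas` list of `metric`, `limit`, and `usage` entries. The following is a minimal Python sketch (using illustrative sample values, not live output from a project) of how you might pick out the GPU-related entries and compute the remaining headroom:

```python
# Sample quota entries in the shape returned by `gcloud compute regions describe`
# (illustrative values only, not from a live project).
quotas = [
    {"metric": "CPUS", "limit": 24.0, "usage": 8.0},
    {"metric": "NVIDIA_T4_GPUS", "limit": 4.0, "usage": 1.0},
    {"metric": "NVIDIA_A100_GPUS", "limit": 8.0, "usage": 8.0},
]

def gpu_headroom(quotas):
    """Return {metric: remaining capacity} for every GPU quota entry."""
    return {
        q["metric"]: q["limit"] - q["usage"]
        for q in quotas
        if "GPUS" in q["metric"]
    }

# With the sample data, the A100 quota is fully used, so a bulk request
# for A100 VMs in this region would fail fast.
print(gpu_headroom(quotas))
```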
Creating groups of VMs with attached GPUs
When you create VMs with attached GPUs by using the bulk instance API, you can choose to create the VMs in a region (such as `us-central1`) or in a specific zone (such as `us-central1-a`).
If you specify a region, Compute Engine places the VMs in any zone within the region that supports GPUs.
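A zone name is its region name plus a single-letter suffix, for example `us-central1-a` is a zone in the `us-central1` region. The following small Python sketch (a local illustration only, not part of any Google Cloud library) distinguishes the two forms:

```python
import re

# A region looks like `us-central1`; a zone adds a letter suffix, e.g. `us-central1-a`.
ZONE_RE = re.compile(r"^[a-z]+-[a-z]+\d+-[a-z]$")
REGION_RE = re.compile(r"^[a-z]+-[a-z]+\d+$")

def location_kind(name):
    """Classify a location string as a region, a zone, or neither."""
    if ZONE_RE.match(name):
        return "zone"
    if REGION_RE.match(name):
        return "region"
    return "unknown"

print(location_kind("us-central1"))    # region
print(location_kind("us-central1-a"))  # zone
```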
Create groups of VMs with attached A100 GPUs
You create a group of VMs with attached A100 GPUs by using either the Google Cloud CLI or the Compute Engine API.
gcloud
To create a group of VMs, use the `gcloud compute instances bulk create` command. For more information about the parameters and how to use this command, see Creating VMs with the bulk instance API.
Example
The following example creates two VMs with attached GPUs using the following specifications:
- VM names: `my-test-vm-1`, `my-test-vm-2`
- VMs created in any zone in `us-central1` that supports GPUs
- Each VM has two A100 GPUs attached, specified by using the appropriate A2 machine type: `a2-highgpu-2g`
- Each VM has GPU drivers installed
- Each VM uses the Deep Learning VM image `pytorch-latest-gpu-v20211028-debian-10`

```
gcloud compute instances bulk create \
    --name-pattern="my-test-vm-#" \
    --region=us-central1 \
    --count=2 \
    --machine-type=a2-highgpu-2g \
    --boot-disk-size=200 \
    --metadata="install-nvidia-driver=True" \
    --scopes="https://www.googleapis.com/auth/cloud-platform" \
    --image=pytorch-latest-gpu-v20211028-debian-10 \
    --image-project=deeplearning-platform-release \
    --on-host-maintenance=TERMINATE \
    --restart-on-failure
```
If successful, the output is similar to the following:
```
NAME          ZONE
my-test-vm-1  us-central1-b
my-test-vm-2  us-central1-b
Bulk create request finished with status message: [VM instances created: 2, failed: 0.]
```
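The `--name-pattern` flag expands each `#` into a sequence number for each VM. The following minimal Python sketch illustrates that expansion locally (an illustration of the naming only, assuming numbering starts at 1 as in the output above):

```python
def expand_name_pattern(pattern, count, start=1):
    """Expand a bulk-create name pattern like `my-test-vm-#` into VM names."""
    return [pattern.replace("#", str(i)) for i in range(start, start + count)]

print(expand_name_pattern("my-test-vm-#", 2))
# ['my-test-vm-1', 'my-test-vm-2']
```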
API
Use the `instances.bulkInsert` method with the required parameters to create multiple VMs. For more information about the parameters and how to use this method, see Creating VMs with the bulk instance API.
Example
The following example creates two VMs with attached GPUs using the following specifications:
- VM names: `my-test-vm-1`, `my-test-vm-2`
- VMs created in any zone in `us-central1` that supports GPUs
- Each VM has two A100 GPUs attached, specified by using the appropriate A2 machine type: `a2-highgpu-2g`
- Each VM has GPU drivers installed
- Each VM uses the Deep Learning VM image `pytorch-latest-gpu-v20211028-debian-10`
Replace PROJECT_ID with your project ID.

```
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-central1/instances/bulkInsert

{
  "namePattern": "my-test-vm-#",
  "count": "2",
  "instanceProperties": {
    "machineType": "a2-highgpu-2g",
    "disks": [
      {
        "type": "PERSISTENT",
        "initializeParams": {
          "diskSizeGb": "200",
          "sourceImage": "projects/deeplearning-platform-release/global/images/pytorch-latest-gpu-v20211028-debian-10"
        },
        "boot": true
      }
    ],
    "name": "default",
    "networkInterfaces": [
      {
        "network": "projects/PROJECT_ID/global/networks/default"
      }
    ],
    "scheduling": {
      "onHostMaintenance": "TERMINATE",
      "automaticRestart": true
    },
    "metadata": {
      "items": [
        {
          "key": "install-nvidia-driver",
          "value": "True"
        }
      ]
    }
  }
}
```
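You can also assemble the request body programmatically before sending it with your preferred HTTP or Google API client library. The following minimal Python sketch builds the same JSON payload as the example above (the helper function and its parameters are illustrative, not part of any client library):

```python
import json

def bulk_insert_body(name_pattern, count, machine_type, source_image, disk_gb=200):
    """Assemble a bulkInsert request body like the example above."""
    return {
        "namePattern": name_pattern,
        "count": str(count),
        "instanceProperties": {
            "machineType": machine_type,
            "disks": [{
                "type": "PERSISTENT",
                "boot": True,
                "initializeParams": {
                    "diskSizeGb": str(disk_gb),
                    "sourceImage": source_image,
                },
            }],
            "scheduling": {
                "onHostMaintenance": "TERMINATE",
                "automaticRestart": True,
            },
            "metadata": {
                "items": [{"key": "install-nvidia-driver", "value": "True"}]
            },
        },
    }

body = bulk_insert_body(
    "my-test-vm-#", 2, "a2-highgpu-2g",
    "projects/deeplearning-platform-release/global/images/"
    "pytorch-latest-gpu-v20211028-debian-10",
)
print(json.dumps(body, indent=2))
```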
Create groups of VMs with other GPU types attached
You create a group of VMs with attached GPUs by using either the Google Cloud CLI or the Compute Engine API.
This section describes how to create multiple VMs using the following GPU types:
NVIDIA GPUs:
- NVIDIA T4: `nvidia-tesla-t4`
- NVIDIA P4: `nvidia-tesla-p4`
- NVIDIA P100: `nvidia-tesla-p100`
- NVIDIA V100: `nvidia-tesla-v100`
- NVIDIA K80: `nvidia-tesla-k80`

NVIDIA RTX (formerly known as NVIDIA GRID) virtual workstation GPUs:
- NVIDIA T4 Virtual Workstation: `nvidia-tesla-t4-vws`
- NVIDIA P4 Virtual Workstation: `nvidia-tesla-p4-vws`
- NVIDIA P100 Virtual Workstation: `nvidia-tesla-p100-vws`
For these virtual workstations, an NVIDIA RTX Virtual Workstation license is automatically added to your VM.
gcloud
To create a group of VMs, use the `gcloud compute instances bulk create` command. For more information about the parameters and how to use this command, see Creating VMs with the bulk instance API.
Example
The following example creates two VMs with attached GPUs using the following specifications:
- VM names: `my-test-vm-1`, `my-test-vm-2`
- VMs created in any zone in `us-central1` that supports GPUs
- Each VM has two T4 GPUs attached, specified by using the accelerator type and accelerator count flags
- Each VM has GPU drivers installed
- Each VM uses the Deep Learning VM image `pytorch-latest-gpu-v20211028-debian-10`

```
gcloud compute instances bulk create \
    --name-pattern="my-test-vm-#" \
    --count=2 \
    --region=us-central1 \
    --machine-type=n1-standard-2 \
    --accelerator=type=nvidia-tesla-t4,count=2 \
    --boot-disk-size=200 \
    --metadata="install-nvidia-driver=True" \
    --scopes="https://www.googleapis.com/auth/cloud-platform" \
    --image=pytorch-latest-gpu-v20211028-debian-10 \
    --image-project=deeplearning-platform-release \
    --on-host-maintenance=TERMINATE \
    --restart-on-failure
```
If successful, the output is similar to the following:
```
NAME          ZONE
my-test-vm-1  us-central1-b
my-test-vm-2  us-central1-b
Bulk create request finished with status message: [VM instances created: 2, failed: 0.]
```
API
Use the `instances.bulkInsert` method with the required parameters to create multiple VMs. For more information about the parameters and how to use this method, see Creating VMs with the bulk instance API.
Example
The following example creates two VMs with attached GPUs using the following specifications:
- VM names: `my-test-vm-1`, `my-test-vm-2`
- VMs created in any zone in `us-central1` that supports GPUs
- Each VM has two T4 GPUs attached, specified by using the `acceleratorType` and `acceleratorCount` fields
- Each VM has GPU drivers installed
- Each VM uses the Deep Learning VM image `pytorch-latest-gpu-v20211028-debian-10`
Replace PROJECT_ID with your project ID.

```
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/us-central1/instances/bulkInsert

{
  "namePattern": "my-test-vm-#",
  "count": "2",
  "instanceProperties": {
    "machineType": "n1-standard-2",
    "disks": [
      {
        "type": "PERSISTENT",
        "initializeParams": {
          "diskSizeGb": "200",
          "sourceImage": "projects/deeplearning-platform-release/global/images/pytorch-latest-gpu-v20211028-debian-10"
        },
        "boot": true
      }
    ],
    "name": "default",
    "networkInterfaces": [
      {
        "network": "projects/PROJECT_ID/global/networks/default"
      }
    ],
    "guestAccelerators": [
      {
        "acceleratorCount": 2,
        "acceleratorType": "nvidia-tesla-t4"
      }
    ],
    "scheduling": {
      "onHostMaintenance": "TERMINATE",
      "automaticRestart": true
    },
    "metadata": {
      "items": [
        {
          "key": "install-nvidia-driver",
          "value": "True"
        }
      ]
    }
  }
}
```
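Note the difference from the A100 example: A2 machine types carry their GPUs implicitly in the machine type itself, while other machine types such as N1 attach GPUs through an explicit `guestAccelerators` list. The following small Python sketch (a hypothetical helper, not part of any client library) shows how the two request bodies differ:

```python
def instance_properties(machine_type, accelerator=None):
    """Sketch: A2 machine types imply their GPUs; other machine types
    attach GPUs through an explicit guestAccelerators entry."""
    props = {"machineType": machine_type}
    if accelerator is not None:
        gpu_type, gpu_count = accelerator
        props["guestAccelerators"] = [{
            "acceleratorType": gpu_type,
            "acceleratorCount": gpu_count,
        }]
    return props

# A100 GPUs come with the A2 machine type itself:
a100 = instance_properties("a2-highgpu-2g")
# T4 GPUs are attached to an N1 machine type explicitly:
t4 = instance_properties("n1-standard-2", ("nvidia-tesla-t4", 2))
```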
What's next?
- To monitor GPU performance, see Monitor GPU performance.
- To optimize GPU performance, see Optimize GPU performance.
- To handle GPU host maintenance, see Handle GPU host maintenance events.