Adding GPUs to Instances

Google Compute Engine provides graphics processing units (GPUs) that you can add to your virtual machine instances. You can use these GPUs to accelerate specific workloads on your instances such as machine learning and data processing.

For more information about what you can do with GPUs and what types of GPU hardware are available, read GPUs on Compute Engine.

Creating an instance with a GPU

Before you create an instance with a GPU, select which boot disk image you want to use for the instance, and ensure that the appropriate GPU driver is installed. You can use any public image or custom image that you need, but some images might require a unique driver or install process that is not covered in this guide. You must identify what drivers are appropriate for your images. Drivers are not installed by default on public images. Read installing GPU drivers for details.
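
If you are not sure which public image to use, you can first list the available images and their image families with the gcloud command-line tool (shown here only as a convenience):

gcloud compute images list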

When you create an instance with one or more GPUs, you must set the instance to terminate on host maintenance. Instances with GPUs cannot live migrate because they are assigned to specific hardware devices. See GPU restrictions for details.

Create an instance with one or more GPUs using the Google Cloud Platform Console, the gcloud command-line tool, or the API.

Console

  1. Go to the VM instances page.

    Go to the VM instances page

  2. Click the Create instance button.
  3. Select a zone where GPUs are available. See the list of available zones with GPUs.
  4. In the Machine type section, select the machine type that you want to use for this instance. Alternatively, you can specify custom machine type settings later.
  5. In the Machine type section, click Customize to see advanced machine type options and available GPUs.
  6. Click the GPUs drop down menu to see the list of available GPUs.
  7. Specify the GPU type and the number of GPUs that you need.
  8. If necessary, adjust the machine type to accommodate your desired GPU settings. If you leave these settings as they are, the instance uses the predefined machine type that you specified before opening the machine type customization screen.
  9. In the Boot disk section, click Change to begin configuring your boot disk.
  10. In the OS images tab, choose an image.
  11. Click Select to confirm your boot disk options.
  12. Optionally, you can include a startup script to install the GPU driver while the instance starts up. In the Automation section, include the contents of your startup script under Startup script. See installing GPU drivers for example scripts.
  13. At the bottom of the page, click Create to create the instance.

gcloud

Use the regions describe command to ensure that you have sufficient GPU quota in the region where you want to create instances with GPUs.

gcloud compute regions describe [REGION]

where [REGION] is the region where you want to check for GPU quota.
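
For example, to view only the GPU-related quota entries for the us-east1 region (the metric names, such as NVIDIA_K80_GPUS, are shown here for illustration and vary by GPU model):

gcloud compute regions describe us-east1 | grep -B 1 -A 1 GPUS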

Start an instance with the latest image from an image family:

gcloud compute instances create [INSTANCE_NAME] \
    --machine-type [MACHINE_TYPE] --zone [ZONE] \
    --accelerator type=[ACCELERATOR_TYPE],count=[ACCELERATOR_COUNT] \
    --image-family [IMAGE_FAMILY] --image-project [IMAGE_PROJECT] \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='[STARTUP_SCRIPT]'

where:

  • [INSTANCE_NAME] is the name for the new instance.
  • [MACHINE_TYPE] is the machine type that you selected for the instance. See GPUs on Compute Engine to see what machine types are available based on your desired GPU count.
  • [ZONE] is the zone for this instance.
  • [IMAGE_FAMILY] is one of the available image families.
  • [ACCELERATOR_COUNT] is the number of GPUs that you want to add to your instance. See GPUs on Compute Engine for a list of GPU limits based on the machine type of your instance.
  • [ACCELERATOR_TYPE] is the GPU model that you want to use. See GPUs on Compute Engine for a list of available GPU models.
  • [IMAGE_PROJECT] is the image project that the image family belongs to.
  • [STARTUP_SCRIPT] is an optional startup script that you can use to install the GPU driver while the instance is starting up. See installing GPU drivers for examples.

For example, you can use the following gcloud command to start an Ubuntu 16.04 instance with one NVIDIA® Tesla® K80 GPU and 2 vCPUs in the us-east1-d zone. The startup-script metadata instructs the instance to install the CUDA Toolkit with its recommended driver version.

gcloud compute instances create gpu-instance-1 \
    --machine-type n1-standard-2 --zone us-east1-d \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='#!/bin/bash
    echo "Checking for CUDA and installing."
    # Check for CUDA and try to install.
    if ! dpkg-query -W cuda-8-0; then
      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      apt-get update
      apt-get install cuda-8-0 -y
    fi'

This example command starts the instance, but CUDA and the driver will take several minutes to finish installing.
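
Because the script runs during boot, you can follow its progress by viewing the instance's serial port output, where startup script logging appears:

gcloud compute instances get-serial-port-output gpu-instance-1 --zone us-east1-d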

API

Identify the GPU type that you want to add to your instance. Submit a GET request to list the GPU types that are available to your project in a specific zone.

GET https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes

where:

  • [PROJECT_ID] is your project ID.
  • [ZONE] is the zone where you want to list the available GPU types.
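
For example, one way to make this request from the command line is with curl, using an OAuth access token from the gcloud command-line tool (the zone shown is illustrative):

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-east1-d/acceleratorTypes"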

In the API, create a POST request to create a new instance. Include the acceleratorType parameter to specify which GPU type you want to use, and include the acceleratorCount parameter to specify how many GPUs you want to add. Also set the onHostMaintenance parameter to TERMINATE.

POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances?key={YOUR_API_KEY}
{
  "machineType": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/machineTypes/n1-highmem-2",
  "disks":
  [
    {
      "type": "PERSISTENT",
      "initializeParams":
      {
        "diskSizeGb": "[DISK_SIZE]",
        "sourceImage": "https://www.googleapis.com/compute/v1/projects/[IMAGE_PROJECT]/global/images/family/[IMAGE_FAMILY]"
      },
      "boot": true
    }
  ],
  "name": "[INSTANCE_NAME]",
  "networkInterfaces":
  [
    {
      "network": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/networks/[NETWORK]"
    }
  ],
  "guestAccelerators":
  [
    {
      "acceleratorCount": [ACCELERATOR_COUNT],
      "acceleratorType": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes/[ACCELERATOR_TYPE]"
    }
  ],
  "scheduling":
  {
    "onHostMaintenance": "terminate",
    "automaticRestart": true
  },
  "metadata":
  {
    "items":
    [
      {
        "key": "startup-script",
        "value": "[STARTUP_SCRIPT]"
      }
    ]
  }
}

where:

  • [INSTANCE_NAME] is the name of the instance.
  • [PROJECT_ID] is your project ID.
  • [ZONE] is the zone for this instance.
  • [MACHINE_TYPE] is the machine type that you selected for the instance. See GPUs on Compute Engine to see what machine types are available based on your desired GPU count.
  • [IMAGE_PROJECT] is the image project that the image belongs to.
  • [IMAGE_FAMILY] is a boot disk image for your instance. Specify an image family from the list of available public images.
  • [DISK_SIZE] is the size of your boot disk in GB.
  • [NETWORK] is the VPC network that you want to use for this instance. Specify default to use your default network.
  • [ACCELERATOR_COUNT] is the number of GPUs that you want to add to your instance. See GPUs on Compute Engine for a list of GPU limits based on the machine type of your instance.
  • [ACCELERATOR_TYPE] is the GPU model that you want to use. See GPUs on Compute Engine for a list of available GPU models.
  • [STARTUP_SCRIPT] is an optional startup script that you can use to install the GPU driver while the instance is starting up. See installing GPU drivers for examples.
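
As a hedged example, if you save the JSON body above to a local file (the file name instance-request.json is a hypothetical choice), you can submit the request with curl and an OAuth access token from the gcloud command-line tool:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @instance-request.json \
    "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances"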

If you used a startup script to automatically install the GPU device driver, verify that the GPU driver installed correctly.

If you did not use a startup script to install the GPU driver during instance creation, manually install the GPU driver on your instance so that your system can use the device.

Adding or removing GPUs on existing instances

You can add or detach GPUs on your existing instances, but you must first stop the instance and change its host maintenance setting so that it terminates rather than live-migrating. Instances with GPUs cannot live migrate because they are assigned to specific hardware devices. See GPU restrictions for details.

Also be aware that you must install GPU drivers on the instance after you add a GPU. The boot disk image that you used to create the instance determines which drivers you need. Identify the drivers that are appropriate for the operating system on your instance's boot disk. Read installing GPU drivers for details.

You can add or remove GPUs from an instance using the Google Cloud Platform Console or the API.

Console

You can add or remove GPUs from your instance by stopping the instance and editing your instance's configuration.

  1. Verify that all of your critical applications are stopped on the instance. You must stop the instance before you can add a GPU.

  2. Go to the VM instances page to see your list of instances.

    Go to the VM instances page

  3. On the list of instances, click the name of the instance where you want to add GPUs. The instance details page opens.

  4. At the top of the instance details page, click Stop to stop the instance.

  5. After the instance stops running, click Edit to change the instance properties.

  6. If the instance has a shared-core machine type, you must change the machine type to have one or more vCPUs. You cannot add accelerators to instances with shared-core machine types.

  7. In the Machine type settings, click GPUs to expand the GPU selection list.

  8. Select the number of GPUs and the GPU model that you want to add to your instance. Alternatively, you can set the number of GPUs to None to remove existing GPUs from the instance.

  9. If you added GPUs to an instance, set the host maintenance setting to Terminate. If you removed GPUs from the instance, you can optionally set the host maintenance setting back to Migrate VM instance.

  10. At the bottom of the instance details page, click Save to apply your changes.

  11. After the instance settings are saved, click Start at the top of the instance details page to start the instance again.

API

You can add or remove GPUs from your instance by stopping the instance and changing your instance's configuration through the API.

  1. Verify that all of your critical applications are stopped on the instance. Then create a POST request to stop the instance so that it can move to a host system where GPUs are available.

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/stop
    

    where:

    • [PROJECT_ID] is your project ID.
    • [INSTANCE_NAME] is the name of the instance where you want to add GPUs.
    • [ZONE] is the zone where the instance is located.
  2. Identify the GPU type that you want to add to your instance. Submit a GET request to list the GPU types that are available to your project in a specific zone.

    GET https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes
    

    where:

    • [PROJECT_ID] is your project ID.
    • [ZONE] is the zone where you want to list the available GPU types.
  3. If the instance has a shared-core machine type, you must change the machine type to have one or more vCPUs. You cannot add accelerators to instances with shared-core machine types.

  4. After the instance stops, create a POST request to add GPUs to or remove GPUs from your instance.

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/setMachineResources
    
    {
     "guestAccelerators": [
      {
        "acceleratorCount": [ACCELERATOR_COUNT],
        "acceleratorType": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes/[ACCELERATOR_TYPE]"
      }
     ]
    }
    

    where:

    • [INSTANCE_NAME] is the name of the instance.
    • [PROJECT_ID] is your project ID.
    • [ZONE] is the zone for this instance.
    • [ACCELERATOR_COUNT] is the number of GPUs that you want on your instance. To remove GPUs from your instance, set this value to 0. See GPUs on Compute Engine for a list of GPU limits based on the machine type of your instance.
    • [ACCELERATOR_TYPE] is the GPU model that you want to use. See GPUs on Compute Engine for a list of available GPU models.
  5. Create a POST request to set the scheduling options for the instance. If you are adding GPUs to an instance, you must specify "onHostMaintenance": "TERMINATE". Optionally, if you are removing GPUs from an instance, you can specify "onHostMaintenance": "MIGRATE".

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/setScheduling
    
    {
     "onHostMaintenance": "[MAINTENANCE_TYPE]",
     "automaticRestart": true
    }
    

    where:

    • [PROJECT_ID] is your project ID.
    • [INSTANCE_NAME] is the name of the instance where you want to add GPUs.
    • [ZONE] is the zone where the instance is located.
    • [MAINTENANCE_TYPE] is the action you want your instance to take when host maintenance is necessary. Specify TERMINATE if you are adding GPUs to your instance. Alternatively, specify MIGRATE if you have removed all of the GPUs from your instance and want the instance to resume live migration on host maintenance events.
  6. Start the instance.

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/start
    

    where:

    • [PROJECT_ID] is your project ID.
    • [INSTANCE_NAME] is the name of the instance where you want to add GPUs.
    • [ZONE] is the zone where the instance is located.

Next install the GPU driver on your instance so that your system can use the device.

Creating groups of GPU instances using instance templates

You can use instance templates to create managed instance groups with GPUs added to each instance. Managed instance groups use the template to create multiple identical instances. You can scale the number of instances in the group to match your workload. You can create the instance templates with GPUs only through the gcloud beta command-line tool.

Follow the guide to create an instance template and include the --accelerators and --maintenance-policy TERMINATE flags. Optionally, you can include the --metadata startup-script flag and specify a startup script to install the GPU driver while the instance starts up. See installing GPU drivers for example scripts that work on GPU instances.

As an example, you could create an instance template with 2 vCPUs, a 250 GB boot disk with Ubuntu 16.04, an NVIDIA® Tesla® K80 GPU, and a startup script. The startup script instructs the instance to install the CUDA Toolkit with its recommended driver version.

gcloud beta compute instance-templates create gpu-template \
    --machine-type n1-standard-2 \
    --boot-disk-size 250GB \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='#!/bin/bash
    echo "Checking for CUDA and installing."
    # Check for CUDA and try to install.
    if ! dpkg-query -W cuda-8-0; then
      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      apt-get update
      apt-get install cuda-8-0 -y
    fi'

After you create the template, use the template to create an instance group. Every time you add an instance to the group, it starts that instance using the settings in the instance template.
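
For example, the following command is a minimal sketch that creates a group of two instances from the template above (the group name gpu-group is a hypothetical choice):

gcloud compute instance-groups managed create gpu-group \
    --zone us-east1-d \
    --template gpu-template \
    --size 2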

To learn more about managing and scaling groups of instances, read Creating Groups of Managed Instances.

GPUs on preemptible instances

You can start a preemptible VM instance with GPUs and Compute Engine will charge you preemptible prices for the GPUs. GPUs attached to preemptible instances work like normal GPUs but will only persist for the life of the instance. You can request a separate Preemptible GPU quota for preemptible GPUs but you can also choose to use your regular GPU quota when creating preemptible GPUs.
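
For example, the following command is a minimal sketch, based on the earlier gcloud example, that starts a preemptible instance with one NVIDIA® Tesla® K80 GPU (the instance name is hypothetical, and a driver startup script is omitted for brevity):

gcloud compute instances create gpu-preemptible-1 --preemptible \
    --machine-type n1-standard-2 --zone us-east1-d \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE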

Non-preemptible GPU instances are terminated for host maintenance events, but can be configured to automatically restart, and Google provides a one-hour advance notice before these maintenance events terminate an instance. Preemptible instances with GPUs attached do not receive this one-hour notice and are preempted by default during maintenance events. These instances cannot be set to automatically restart.

For more details on GPUs, review the GPUs documentation.

Installing GPU drivers

After you create an instance with one or more GPUs, your system requires device drivers so that your applications can access the device. This guide demonstrates basic procedures to install NVIDIA proprietary drivers on instances with public images.

You can install GPU drivers through one of the following options:

  • Installing GPU drivers using scripts
  • Manually installing GPU drivers

Installing GPU drivers using scripts

NVIDIA GPUs running on Google Compute Engine must use one of the following NVIDIA driver versions:

  • 375.51
  • 384.66 or greater

For most driver installs, you can obtain these drivers by installing the NVIDIA CUDA Toolkit.

On some images, you can use scripts to simplify the driver install process. You can either specify these scripts as startup scripts on your instances or copy these scripts to your instances and run them through the terminal as a user with sudo privileges.
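
For example, assuming you saved one of the scripts below to a local file named install-cuda.sh (a hypothetical name), one way to copy it to an instance and run it with root privileges is the following sketch:

gcloud compute scp install-cuda.sh gpu-instance-1:~/ --zone us-east1-d
gcloud compute ssh gpu-instance-1 --zone us-east1-d --command "sudo bash ~/install-cuda.sh"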

You must prepare the script so that it works with the boot disk image that you selected. If you imported a custom boot disk image for your instances, you might need to customize the startup script to work correctly with that custom image.

For Windows Server instances and SLES 12 instances where you cannot automate the driver installation process, install the driver manually.

For public images, you can install CUDA with the associated drivers for NVIDIA® Tesla® K80 GPUs using the following sample scripts:

CentOS

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

CentOS 7 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  yum clean all
  yum install epel-release -y
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

On instances with CentOS 7 images, you might need to reboot the instance after the script finishes installing the drivers and the CUDA packages. Reboot the instance if the script is finished and the nvidia-smi command returns the following error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA
driver. Make sure that the latest NVIDIA driver is installed and
running.

CentOS 6 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  yum clean all
  yum install epel-release -y
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

RHEL

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

RHEL 7 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
  rpm -i --force ./epel-release-latest-7.noarch.rpm
  yum clean all
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

On instances with RHEL 7 images, you might need to reboot the instance after the script finishes installing the drivers and the CUDA packages. Reboot the instance if the script is finished and the nvidia-smi command returns the following error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA
driver. Make sure that the latest NVIDIA driver is installed and
running.

RHEL 6 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
  rpm -i --force ./epel-release-latest-6.noarch.rpm
  yum clean all
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

SLES

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

SLES 12 - CUDA 8:

On SLES 12 instances, install the driver manually.

SLES 11 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/sles114/x86_64/cuda-repo-sles114-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-sles114-8.0.61-1.x86_64.rpm
  zypper --gpg-auto-import-keys refresh
  zypper install -ny cuda
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  zypper install -ny cuda
fi
# Enable persistence mode
nvidia-smi -pm 1

Ubuntu

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

Ubuntu 16.04 LTS or 17.04 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda-8-0; then
  # The 16.04 installer also works with 17.04.
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

Ubuntu 14.04 LTS - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda-8-0 -y
  apt-get install linux-headers-$(uname -r) -y
fi
# Enable persistence mode
nvidia-smi -pm 1

Windows Server

On Windows Server instances, you must install the driver manually.

After your script finishes running, you can verify that the GPU driver installed correctly.

Manually installing GPU drivers

If you cannot use a script to install the driver for your GPUs, you can manually install the driver yourself. You are responsible for selecting the installer and driver version that works best for your applications. Use this install method if you require a specific driver or you need to install the driver on a custom image or a public image that does not work with one of the install scripts.

You can use this process to manually install drivers on instances with most public images. For custom images, you might need to modify the process to function in your unique environment.

CentOS

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the rpm command to add the repository to your system:

    • CentOS 7

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      

    • CentOS 6

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      

  3. Install the epel-release repository. This repository includes the DKMS packages, which are required to install NVIDIA drivers on CentOS.

    $ sudo yum install epel-release
    

  4. Clean the Yum cache:

    $ sudo yum clean all
    

  5. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo yum install cuda-8-0
    

  6. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

RHEL

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the rpm command to add the repository to your system:

    • RHEL 7

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      

    • RHEL 6

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      

  3. Install the epel-release repository. This repository includes the DKMS packages, which are required to install NVIDIA drivers. On RHEL, you must download the .rpm for this repository from fedoraproject.org and add it to your system.

    • RHEL 7

      $ curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
      
      $ sudo rpm -i epel-release-latest-7.noarch.rpm
      

    • RHEL 6

      $ curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
      
      $ sudo rpm -i epel-release-latest-6.noarch.rpm
      

  4. Clean the Yum cache:

    $ sudo yum clean all
    

  5. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo yum install cuda-8-0
    

  6. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

SLES

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the rpm command to add the repository to your system:

    • SLES 12

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/sles12/x86_64/cuda-repo-sles12-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-sles12-8.0.61-1.x86_64.rpm
      

    • SLES 11

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/sles114/x86_64/cuda-repo-sles114-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-sles114-8.0.61-1.x86_64.rpm
      

  3. Refresh Zypper:

    $ sudo zypper refresh
    

  4. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo zypper install cuda-8-0
    

  5. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

Ubuntu

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the dpkg command to add the repository to your system:

    • Ubuntu 16.04 LTS and 17.04

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      
      $ sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      

    • Ubuntu 14.04 LTS

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
      
      $ sudo dpkg -i cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
      

  3. Update the package lists:

    $ sudo apt-get update
    

  4. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo apt-get install cuda-8-0
    

  5. On Ubuntu 14.04, you might need to install headers for your current kernel version so that the driver can initialize properly.

    $ sudo apt-get install linux-headers-$(uname -r)
    

  6. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

Windows Server

  1. Connect to the instance where you want to install the driver.

  2. Download an .exe installer file to your instance that includes the NVIDIA 375.51 driver, the NVIDIA 384.66 driver, or a later version. For most Windows Server instances, you can obtain a suitable installer from the NVIDIA driver downloads site or from the CUDA Toolkit downloads page.

    For example, you can open a PowerShell terminal as an administrator and use the wget command to download the driver installer that you need.

    PS C:> wget https://developer.nvidia.com/compute/cuda/8.0/prod/network_installers/cuda_8.0.44_windows_network-exe -o cuda_8.0.44_windows_network.exe
    

  3. Run the .exe installer. For example, you can open a PowerShell terminal as an administrator and run the following command.

    PS C:> .\cuda_8.0.44_windows_network.exe
    

After the installer finishes running, you can verify that the GPU driver installed correctly.

Verifying the GPU driver install

After the driver finishes installing, verify that the driver installed and initialized properly.

Linux

Connect to the Linux instance and use the nvidia-smi command to verify that the driver is running properly.

$ nvidia-smi

Mon Jan 26 10:23:26 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.66                 Driver Version: 384.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:04.0     Off |                    0 |
| N/A   43C    P0    72W / 149W |      0MiB / 11439MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

Windows Server

Connect to the Windows Server instance and use the nvidia-smi.exe tool to verify that the driver is running properly.

PS C:> & 'C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe'

Mon Jan 27 13:06:50 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 369.30                 Driver Version: 369.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           TCC  | 0000:00:04.0     Off |                    0 |
| N/A   52C    P8    30W / 149W |      0MiB / 11423MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

If the driver is not functioning and you used a script to install it, check the startup script logs to ensure that the script finished and did not fail during the install process.
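Where those logs appear depends on the image. As a hedged example, on Debian-based and Ubuntu public images the output is typically written to /var/log/syslog, and you can also read the instance's serial console output from your workstation:

# On the instance: search the system log for startup script output.
sudo grep startup-script /var/log/syslog

# From your workstation: read the serial console output.
gcloud compute instances get-serial-port-output [INSTANCE_NAME] --zone [ZONE]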

Optimizing GPU performance

In general, you can optimize the performance of your GPU devices on Linux instances using the following settings:

  • Enable persistence mode. This setting applies to all of the GPUs on your instance.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

  • Set GPU and GPU memory clock speeds to fixed maximum rates:

    • On instances with NVIDIA® Tesla® P100 GPUs:

      $ sudo nvidia-smi -ac 715,1328
      Applications clocks set to "(MEM 715, SM 1328)" for GPU 00000000:00:04.0
      Applications clocks set to "(MEM 715, SM 1328)" for GPU 00000000:00:05.0
      All done.
      

    • On instances with NVIDIA® Tesla® K80 GPUs:

      $ sudo nvidia-smi -ac 2505,875
      Applications clocks set to "(MEM 2505, SM 875)" for GPU 00000000:00:04.0
      Applications clocks set to "(MEM 2505, SM 875)" for GPU 00000000:00:05.0
      All done.
      

  • On instances with NVIDIA® Tesla® K80 GPUs, disable autoboost:

    $ sudo nvidia-smi --auto-boost-default=DISABLED
    All done.
    

Handling host maintenance events

GPU instances must terminate for host maintenance events, but can automatically restart. These maintenance events typically occur once per week, but can occur more frequently when necessary.

You can deal with maintenance events using the following processes:

  • Avoid these disruptions by regularly restarting your instances on a schedule that is more convenient for your applications.
  • Identify when your instance is scheduled for host maintenance and prepare your workload to transition through the system restart.

To receive advance notice of host maintenance events, monitor the /computeMetadata/v1/instance/maintenance-event metadata value. If the request to the metadata server returns NONE, the instance is not scheduled to terminate. For example, run the following command from within an instance:

$ curl http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event -H "Metadata-Flavor: Google"

NONE

If the metadata server returns a timestamp, the timestamp indicates when your instance will be forcefully terminated. Compute Engine gives GPU instances a one hour termination notice, while normal instances receive only a 60 second notice. Configure your application to transition through the maintenance event. For example, you might use one of the following techniques:

  • Configure your application to temporarily move work in progress to a Google Cloud Storage bucket, then retrieve that data after the instance restarts.

  • Write data to a secondary persistent disk. When the instance automatically restarts, the persistent disk can be reattached and your application can resume work.
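
To detect a pending maintenance event from within the instance and trigger either of these steps, you can poll the metadata value described above. The following is a minimal sketch; checkpoint.sh is a hypothetical placeholder for your own save-state step:

#!/bin/bash
# Poll the maintenance-event metadata value every 10 seconds. When a
# timestamp appears instead of NONE, run a checkpoint step before the
# instance is terminated.
while true; do
  EVENT=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event" \
      -H "Metadata-Flavor: Google")
  if [ "$EVENT" != "NONE" ]; then
    echo "Host maintenance scheduled: $EVENT"
    ./checkpoint.sh    # hypothetical: save work to Cloud Storage or a persistent disk
    break
  fi
  sleep 10
done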

You can also receive notification of changes in this metadata value without polling. For examples of how to receive advance notice of host maintenance events without polling, read getting live migration notices.
