You can attach graphics processing units (GPUs) to your virtual machine (VM) instance to accelerate specific workloads on Compute Engine.
This document describes the features and limitations of GPUs running on Compute Engine.
GPUs and machine types
GPUs on preemptible instances
You can add GPUs to your preemptible VM instances at lower spot prices for the GPUs. GPUs attached to preemptible instances work like normal GPUs but persist only for the life of the instance. Preemptible instances with GPUs follow the same preemption process as all preemptible instances.
Consider requesting dedicated Preemptible GPU quota to use for GPUs on preemptible instances. For more information, see Quotas for preemptible VM instances.
During maintenance events, preemptible instances with GPUs are preempted by default and cannot be automatically restarted. If you want to recreate your instances after they have been preempted, use a managed instance group. Managed instance groups recreate your instances if the vCPU, memory, and GPU resources are available.
If you want a warning before your instance is preempted, or want to configure your instance to automatically restart after a maintenance event, use a standard instance with a GPU. For standard instances with GPUs, Google provides one hour of advance notice before preemption.
Compute Engine does not charge you for GPUs if their instances are preempted in the first minute after they start running.
For steps to automatically restart a standard instance, see Updating options for an instance.
To learn how to create preemptible instances with GPUs attached, read Create a VM with attached GPUs.
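As a rough illustration of the behavior described in this section, the following sketch builds the GPU and scheduling portion of an instance request body in the shape used by the Compute Engine REST API (`guestAccelerators` and `scheduling`). The function name, the partial resource paths, and the example values are assumptions for illustration only. A real request also needs disks and network interfaces, which are omitted here.

```python
def preemptible_gpu_instance_body(name, zone, machine_type, gpu_type, gpu_count):
    """Build a partial request body for a preemptible VM with attached GPUs.

    Hypothetical helper: it only assembles a dictionary and does not call
    any API. Disks and network interfaces are intentionally left out.
    """
    zone_prefix = f"zones/{zone}"
    return {
        "name": name,
        "machineType": f"{zone_prefix}/machineTypes/{machine_type}",
        "guestAccelerators": [
            {
                "acceleratorType": f"{zone_prefix}/acceleratorTypes/{gpu_type}",
                "acceleratorCount": gpu_count,
            }
        ],
        "scheduling": {
            # Preemptible instances with GPUs are preempted during
            # maintenance events and cannot be restarted automatically.
            "preemptible": True,
            "onHostMaintenance": "TERMINATE",
            "automaticRestart": False,
        },
    }


# Example values are placeholders, not recommendations.
body = preemptible_gpu_instance_body(
    "gpu-vm", "us-central1-a", "n1-standard-8", "nvidia-tesla-k80", 1
)
```

To have preempted instances recreated, the same settings would typically go into an instance template used by a managed instance group rather than into a single instance request.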
GPUs and host maintenance
VMs with attached GPUs cannot live migrate and must stop for host maintenance events. These maintenance events typically occur once every two weeks. Maintenance events can also occur more frequently when necessary. For information on handling maintenance events, see Handling GPU host maintenance events.
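Because these VMs must stop for maintenance, a workload may want to detect an upcoming event and checkpoint its state. The sketch below assumes the standard Compute Engine metadata server endpoint `instance/maintenance-event` and its `TERMINATE_ON_HOST_MAINTENANCE` value. The function names are hypothetical, and the fetch only works from inside a VM.

```python
import urllib.request

# Metadata server endpoint; reachable only from within a Compute Engine VM.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/maintenance-event"
)


def fetch_maintenance_event(timeout=5):
    """Read the current maintenance-event value from the metadata server."""
    request = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def must_stop_for_maintenance(event):
    """Return True when the value signals that the VM will be stopped."""
    return event == "TERMINATE_ON_HOST_MAINTENANCE"
```

A workload could call `fetch_maintenance_event()` periodically and save a checkpoint when `must_stop_for_maintenance()` returns `True`.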
GPUs and block storage
Most VMs with an attached GPU receive sustained use discounts similar to vCPUs. When you select a GPU for a virtual workstation, an NVIDIA RTX Virtual Workstation license is added to your VM.
For hourly and monthly pricing for GPUs, see the GPU pricing page.
Reserving GPUs with committed use discounts
To reserve GPU resources in a specific zone, see Reserving zonal resources. Reservations are required for committed use discounted pricing for GPUs.
For VMs with attached GPUs, the following restrictions apply:
If you want to use NVIDIA K80 GPUs with your VMs, the VMs cannot use the Intel Skylake or later CPU platforms.
GPUs are currently only supported with general-purpose N1 or accelerator-optimized A2 machine types.
To protect Compute Engine systems and users, new projects have a global GPU quota, which limits the total number of GPUs you can create in any supported zone. When you request a GPU quota, you must request a quota for the GPU models that you want to create in each region, and an additional global quota for the total number of GPUs of all types in all zones.
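The two-level quota described above can be pictured as two independent checks that must both pass. This is a minimal sketch with hypothetical names and numbers. It models only the arithmetic and does not query real quota values.

```python
def gpu_request_fits_quota(requested, regional_used, regional_limit,
                           global_used, global_limit):
    """Return True if a request fits both the regional and global quota.

    `regional_*` refers to the quota for one GPU model in one region;
    `global_*` refers to the quota for all GPU models in all zones.
    """
    fits_regional = regional_used + requested <= regional_limit
    fits_global = global_used + requested <= global_limit
    return fits_regional and fits_global
```

For example, with a regional limit of 4 GPUs (1 in use) and a global limit of 8 (7 in use), a request for 2 more GPUs fits the regional quota but fails the global one.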
VMs with one or more GPUs have a maximum number of vCPUs for each GPU that you add to the instance. For example, each NVIDIA K80 GPU lets you have up to eight vCPUs and up to 52 GB of memory in your instance machine type. To see the available vCPU and memory ranges for different GPU configurations, see the GPUs list.
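The K80 example above can be expressed as a small check. This sketch uses only the per-GPU figures stated in the text (eight vCPUs and 52 GB of memory per K80 GPU). The function and constant names are hypothetical, and other GPU models have different ranges that are listed in the GPUs list.

```python
# Per-GPU limits for NVIDIA K80, taken from the example in the text.
K80_MAX_VCPUS_PER_GPU = 8
K80_MAX_MEMORY_GB_PER_GPU = 52


def fits_k80_limits(gpu_count, vcpus, memory_gb):
    """Return True if a machine shape stays within the K80 per-GPU limits."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    max_vcpus = gpu_count * K80_MAX_VCPUS_PER_GPU
    max_memory_gb = gpu_count * K80_MAX_MEMORY_GB_PER_GPU
    return vcpus <= max_vcpus and memory_gb <= max_memory_gb
```

Under these assumptions, a 16-vCPU machine type needs at least two K80 GPUs attached.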
GPUs require device drivers in order to function properly. NVIDIA GPUs running on Compute Engine must use a minimum driver version. For more information about driver versions, see Required NVIDIA driver versions.
VMs with a specific attached GPU model are covered by the Compute Engine SLA only if that attached GPU model is generally available and is supported in more than one zone in the same region. The Compute Engine SLA does not cover GPU models in the following zones:
- NVIDIA A100:
- NVIDIA T4:
- NVIDIA V100:
- NVIDIA P100:
- NVIDIA K80:
Compute Engine supports one concurrent user per GPU.
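The SLA restriction listed earlier (a generally available GPU model that is supported in more than one zone in the same region) can be sketched as a grouping of zones by region. The zone names in the example are placeholders and do not describe real GPU availability. The helper names are hypothetical, and the region is derived by assuming zone names follow the `region-letter` pattern.

```python
from collections import defaultdict


def region_of(zone):
    """Derive the region from a zone name such as 'us-central1-a'."""
    return zone.rsplit("-", 1)[0]


def sla_covered_zones(zones, generally_available=True):
    """Return the zones where a GPU model meets the SLA coverage rule."""
    if not generally_available:
        return set()
    by_region = defaultdict(set)
    for zone in zones:
        by_region[region_of(zone)].add(zone)
    covered = set()
    for region_zones in by_region.values():
        # Coverage requires more than one supporting zone in the region.
        if len(region_zones) > 1:
            covered |= region_zones
    return covered


# Placeholder zones for illustration only.
example_zones = ["us-central1-a", "us-central1-c", "europe-west4-a"]
covered = sla_covered_zones(example_zones)
```

In this example, the single zone in `europe-west4` would not be covered, because no second zone in that region supports the model.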