About GPUs


You can attach graphics processing units (GPUs) to your virtual machine (VM) instance to accelerate specific workloads on Compute Engine.

This document describes the features and limitations of GPUs running on Compute Engine.

GPUs and machine series

GPUs are supported on the N1 general-purpose machine series and on the accelerator-optimized (A3, A2, and G2) machine series. For VMs that use N1 machine types, you attach the GPU to the VM during or after VM creation. For VMs that use A3, A2, or G2 machine types, the GPUs are automatically attached when you create the VM. GPUs can't be used with other machine series.

Accelerator-optimized machine series

Each accelerator-optimized machine type has a specific model of NVIDIA GPU attached.

  • For A3 accelerator-optimized machine types, NVIDIA H100 80GB GPUs are attached.
  • For A2 accelerator-optimized machine types, NVIDIA A100 GPUs are attached. These are available in both A100 40GB and A100 80GB options.
  • For G2 accelerator-optimized machine types, NVIDIA L4 GPUs are attached.

For more information, see Accelerator-optimized machine series.

N1 general-purpose machine series

For all other GPU types, you can use most N1 machine types, except N1 shared-core machine types.

For this machine series, you can use either predefined or custom machine types.
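
As an illustration of the machine series rules above, the following Python sketch classifies a machine type name by how GPUs can be used with it. The function `gpu_support` and its return strings are hypothetical and not part of any Google Cloud API. The sketch assumes that the series can be read from the prefix of the machine type name, that `f1-micro` and `g1-small` are the N1 shared-core machine types, and that N1 custom machine types are named `custom-<vCPUs>-<memory MB>`.

```python
# GPU model that comes attached to each accelerator-optimized series,
# as listed in this document.
AUTO_ATTACHED_GPUS = {
    "a3": "NVIDIA H100 80GB",
    "a2": "NVIDIA A100 (40GB or 80GB)",
    "g2": "NVIDIA L4",
}

# Assumption: these are the N1 shared-core machine types.
N1_SHARED_CORE = {"f1-micro", "g1-small"}


def gpu_support(machine_type: str) -> str:
    """Classify a machine type name by how GPUs can be used with it."""
    name = machine_type.lower()
    if name in N1_SHARED_CORE:
        return "unsupported: N1 shared-core"
    # Assumption: the text before the first hyphen identifies the series.
    series = name.split("-", 1)[0]
    if series in AUTO_ATTACHED_GPUS:
        return f"automatic: {AUTO_ATTACHED_GPUS[series]}"
    # Assumption: N1 custom machine types start with "custom-".
    if series in ("n1", "custom"):
        return "manual: attach a GPU during or after VM creation"
    return "unsupported: series does not support GPUs"
```

For example, `gpu_support("g2-standard-4")` reports that an NVIDIA L4 GPU is attached automatically, while `gpu_support("e2-medium")` reports that the series does not support GPUs.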

GPUs on preemptible instances

You can add GPUs to your preemptible VM instances at lower spot prices for the GPUs. GPUs attached to preemptible instances work like normal GPUs but persist only for the life of the instance. Preemptible instances with GPUs follow the same preemption process as all preemptible instances.

Consider requesting dedicated Preemptible GPU quota to use for GPUs on preemptible instances. For more information, see Quotas for preemptible VM instances.

During maintenance events, preemptible instances with GPUs are preempted by default and cannot be automatically restarted. If you want to recreate your instances after they have been preempted, use a managed instance group. Managed instance groups recreate your instances if the vCPU, memory, and GPU resources are available.

If you want a warning before your instance is preempted, or want to configure your instance to automatically restart after a maintenance event, use a standard instance with a GPU. For standard instances with GPUs, Google provides one hour of advance notice before preemption.

Compute Engine does not charge you for GPUs if their instances are preempted in the first minute after they start running.

For steps to automatically restart a standard instance, see Updating options for an instance.

To learn how to create preemptible instances with GPUs attached, read Create a VM with attached GPUs.

GPUs and Confidential VM

You can't attach GPUs to Confidential VM instances. For more information about Confidential VM, see Confidential Computing concepts.

GPUs and host maintenance

VMs with attached GPUs cannot live migrate and must stop for host maintenance events. These maintenance events typically occur once every two weeks. Maintenance events can also occur more frequently when necessary. For information on handling maintenance events, see Handling GPU host maintenance events.
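
The maintenance behavior described above can be sketched as the scheduling settings that a VM with GPUs would carry. The following Python sketch is illustrative only: the helper `gpu_scheduling` is hypothetical, and the keys follow the `scheduling` block of the Compute Engine instance resource (`onHostMaintenance`, `automaticRestart`, `preemptible`). Verify the field names and allowed values against the API reference before relying on them.

```python
def gpu_scheduling(preemptible: bool = False) -> dict:
    """Build scheduling settings for a VM with attached GPUs (sketch)."""
    return {
        # VMs with GPUs cannot live migrate, so they must stop
        # for host maintenance events.
        "onHostMaintenance": "TERMINATE",
        # Preemptible instances cannot be automatically restarted;
        # standard instances can be configured to restart.
        "automaticRestart": not preemptible,
        "preemptible": preemptible,
    }
```

Calling `gpu_scheduling()` describes a standard instance that stops for maintenance and then restarts automatically. Calling `gpu_scheduling(preemptible=True)` describes a preemptible instance, which is not restarted automatically.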

GPUs and block storage

You can add Local SSDs to VMs that have GPUs attached. For a list of Local SSD support by GPU types and regions, see Local SSD availability by GPU regions and zones.

GPU pricing

Most VMs with an attached GPU receive sustained use discounts similar to those for vCPUs. When you select a GPU for a virtual workstation, an NVIDIA RTX Virtual Workstation license is added to your VM.

For hourly and monthly pricing for GPUs, see the GPU pricing page.

Reserving GPUs with committed use discounts

To reserve GPU resources in a specific zone, see Reservations of Compute Engine zonal resources.

To receive committed use discounts for GPUs in a specific zone, you must purchase resource-based commitments for the GPUs and also attach reservations that specify matching GPUs to your commitments. For more information, see Attach reservations to resource-based commitments.

GPU restrictions and limitations

For VMs with attached GPUs, the following restrictions and limitations apply:

  • If you want to use NVIDIA K80 GPUs with your VMs, the VMs cannot use the Intel Skylake or later CPU platforms.

  • GPUs are currently supported only with general-purpose N1 or accelerator-optimized (A3, A2, and G2) machine types.

  • To protect Compute Engine systems and users, new projects have a global GPU quota, which limits the total number of GPUs you can create in any supported zone. When you request a GPU quota, you must request a quota for the GPU models that you want to create in each region, and an additional global quota for the total number of GPUs of all types in all zones.

  • VMs with one or more GPUs have a maximum number of vCPUs for each GPU that you add to the instance. For example, each NVIDIA K80 GPU lets you have up to eight vCPUs and up to 52 GB of memory in your instance machine type. To see the available vCPU and memory ranges for different GPU configurations, see the GPUs list.

  • GPUs require device drivers to function properly. NVIDIA GPUs running on Compute Engine must use a minimum driver version. For more information about driver versions, see Required NVIDIA driver versions.

  • VMs with a specific attached GPU model are covered by the Compute Engine SLA only if that attached GPU model is generally available and is supported in more than one zone in the same region. The Compute Engine SLA does not cover GPU models in the following zones:

    • NVIDIA H100 80GB:
      • us-east5-a
    • NVIDIA L4:
      • europe-west3-b
      • europe-west6-b
    • NVIDIA A100 80GB:
      • asia-southeast1-c
      • us-east4-c
      • us-east5-b
    • NVIDIA A100 40GB:
      • us-east1-b
      • us-west1-b
      • us-west3-b
      • us-west4-b
    • NVIDIA T4:
      • europe-west3-b
      • southamerica-east1-c
      • us-west3-b
    • NVIDIA V100:
      • asia-east1-c
      • us-east1-c
    • NVIDIA P100:
      • australia-southeast1-c
      • europe-west4-a
    • NVIDIA K80:
      • us-west1-b
  • Compute Engine supports one concurrent user per GPU.
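
Two of the restrictions above lend themselves to a quick programmatic check. The following Python sketch is a hedged illustration: the helper names `k80_fits` and `in_sla_exclusion_list` are hypothetical, the zone data is copied from the list above and can go out of date, and the K80 check assumes that the per-GPU limits scale linearly with the number of GPUs, as the "for each GPU" wording suggests. For authoritative limits, see the GPUs list.

```python
# Per-GPU ceilings for NVIDIA K80, taken from the example in this document.
K80_MAX_VCPUS_PER_GPU = 8
K80_MAX_MEMORY_GB_PER_GPU = 52

# Zones where the Compute Engine SLA does not cover the GPU model,
# copied from the list in this document.
SLA_EXCLUDED_ZONES = {
    "NVIDIA H100 80GB": {"us-east5-a"},
    "NVIDIA L4": {"europe-west3-b", "europe-west6-b"},
    "NVIDIA A100 80GB": {"asia-southeast1-c", "us-east4-c", "us-east5-b"},
    "NVIDIA A100 40GB": {"us-east1-b", "us-west1-b", "us-west3-b", "us-west4-b"},
    "NVIDIA T4": {"europe-west3-b", "southamerica-east1-c", "us-west3-b"},
    "NVIDIA V100": {"asia-east1-c", "us-east1-c"},
    "NVIDIA P100": {"australia-southeast1-c", "europe-west4-a"},
    "NVIDIA K80": {"us-west1-b"},
}


def k80_fits(gpu_count: int, vcpus: int, memory_gb: float) -> bool:
    """Return True if the vCPU and memory request fits the K80 limits.

    Assumption: the limits scale linearly with the number of GPUs.
    """
    if gpu_count < 1:
        return False
    return (
        vcpus <= K80_MAX_VCPUS_PER_GPU * gpu_count
        and memory_gb <= K80_MAX_MEMORY_GB_PER_GPU * gpu_count
    )


def in_sla_exclusion_list(gpu_model: str, zone: str) -> bool:
    """Return True if the model and zone pair appears in the exclusion list."""
    return zone in SLA_EXCLUDED_ZONES.get(gpu_model, set())
```

A result of `False` from `in_sla_exclusion_list` means only that the pair is not in the list above. It does not confirm SLA coverage, which also requires that the GPU model is generally available and supported in more than one zone in the same region.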

What's next?