GPUs on Compute Engine

Google Compute Engine provides graphics processing units (GPUs) that you can add to your virtual machine instances. You can use these GPUs to accelerate specific workloads on your instances such as machine learning and data processing.

If you have graphics-intensive workloads, such as 3D visualization, 3D rendering, or virtual applications, you can create virtual workstations that use NVIDIA® GRID® technology. For information on GPUs for graphics-intensive applications, see GPUs for graphics workloads.

For steps to add GPUs to your instances, see Adding GPUs to instances.

Introduction

Compute Engine provides NVIDIA® Tesla® GPUs for your instances in passthrough mode so that your virtual machine instances have direct control over the GPUs and their associated memory.

For compute workloads, GPU models are available in the following stages:

  • NVIDIA® Tesla® P4: Beta
  • NVIDIA® Tesla® V100: Generally Available
  • NVIDIA® Tesla® P100: Generally Available
  • NVIDIA® Tesla® K80: Generally Available

For graphics workloads, GPU models are available in the following stages:

  • NVIDIA® Tesla® P4 Virtual Workstations: Beta
  • NVIDIA® Tesla® P100 Virtual Workstations: Beta

For information on GPUs for virtual workstations, see GPUs for graphics workloads.

You can attach GPUs only to instances with a predefined machine type or custom machine type that you are able to create in a zone. GPUs are not supported on shared-core machine types or memory-optimized machine types.

Instances with lower numbers of GPUs are limited to a maximum number of vCPUs. In general, a higher number of GPUs allows you to create instances with a higher number of vCPUs and system memory.

GPUs for compute workloads

GPU model: NVIDIA® Tesla® V100

  GPUs      GPU memory     Available vCPUs    Available memory
  1 GPU     16 GB HBM2     1 - 12 vCPUs       1 - 78 GB
  2 GPUs    32 GB HBM2     1 - 24 vCPUs       1 - 156 GB
  4 GPUs    64 GB HBM2     1 - 48 vCPUs       1 - 312 GB
  8 GPUs    128 GB HBM2    1 - 96 vCPUs       1 - 624 GB

Available zones:

  • us-west1-a
  • us-west1-b
  • us-central1-a
  • us-central1-f
  • europe-west4-a
  • europe-west4-c
  • asia-east1-c

GPU model: NVIDIA® Tesla® P100

  GPUs      GPU memory     Available vCPUs    Available memory
  1 GPU     16 GB HBM2     1 - 16 vCPUs       1 - 104 GB
  2 GPUs    32 GB HBM2     1 - 32 vCPUs       1 - 208 GB
  4 GPUs    64 GB HBM2     1 - 64 vCPUs       1 - 208 GB          (us-east1-c, europe-west1-d, europe-west1-b)
  4 GPUs    64 GB HBM2     1 - 96 vCPUs       1 - 624 GB          (all other zones)

Available zones:

  • us-west1-a
  • us-west1-b
  • us-central1-c
  • us-central1-f
  • us-east1-b
  • us-east1-c
  • europe-west1-b
  • europe-west1-d
  • europe-west4-a
  • asia-east1-a
  • asia-east1-c

GPU model: NVIDIA® Tesla® P4

  GPUs      GPU memory     Available vCPUs    Available memory
  1 GPU     8 GB GDDR5     1 - 24 vCPUs       1 - 156 GB
  2 GPUs    16 GB GDDR5    1 - 48 vCPUs       1 - 312 GB
  4 GPUs    32 GB GDDR5    1 - 96 vCPUs       1 - 624 GB

Available zones:

  • us-west2-c
  • us-west2-b
  • us-central1-a
  • us-central1-c
  • us-east4-a
  • us-east4-b
  • us-east4-c
  • northamerica-northeast1-a
  • northamerica-northeast1-b
  • northamerica-northeast1-c
  • europe-west4-b
  • europe-west4-c

GPU model: NVIDIA® Tesla® K80

  GPUs      GPU memory     Available vCPUs    Available memory
  1 GPU     12 GB GDDR5    1 - 8 vCPUs        1 - 52 GB
  2 GPUs    24 GB GDDR5    1 - 16 vCPUs       1 - 104 GB
  4 GPUs    48 GB GDDR5    1 - 32 vCPUs       1 - 208 GB
  8 GPUs    96 GB GDDR5    1 - 64 vCPUs       1 - 416 GB          (asia-east1-a and us-east1-d)
  8 GPUs    96 GB GDDR5    1 - 64 vCPUs       1 - 208 GB          (all other zones)

Available zones:

  • us-west1-b
  • us-central1-a
  • us-central1-c
  • us-east1-c
  • us-east1-d
  • europe-west1-b
  • europe-west1-d
  • asia-east1-a
  • asia-east1-b
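
The per-GPU-count limits listed above lend themselves to a simple lookup. The following Python sketch encodes the V100 and P4 rows (values copied from this page) and checks a requested machine shape against them. The names `GPU_LIMITS` and `check_shape`, and the short model keys, are illustrative only and are not part of any Google Cloud API. P100 and K80 are left out because some of their rows vary by zone.

```python
# Limits copied from the "GPUs for compute workloads" listing on this page.
# Maps GPU model -> GPU count -> (max vCPUs, max memory in GB).
# The keys "V100" and "P4" are illustrative labels, not API identifiers.
GPU_LIMITS = {
    "V100": {1: (12, 78), 2: (24, 156), 4: (48, 312), 8: (96, 624)},
    "P4": {1: (24, 156), 2: (48, 312), 4: (96, 624)},
}


def check_shape(model, gpu_count, vcpus, memory_gb):
    """Return a list of problems; an empty list means the shape fits."""
    counts = GPU_LIMITS.get(model)
    if counts is None:
        return [f"unknown GPU model: {model}"]
    if gpu_count not in counts:
        valid = ", ".join(str(c) for c in sorted(counts))
        return [f"{model} supports {valid} GPUs, not {gpu_count}"]
    max_vcpus, max_memory_gb = counts[gpu_count]
    problems = []
    if not 1 <= vcpus <= max_vcpus:
        problems.append(f"vCPUs must be 1-{max_vcpus}, got {vcpus}")
    if not 1 <= memory_gb <= max_memory_gb:
        problems.append(f"memory must be 1-{max_memory_gb} GB, got {memory_gb}")
    return problems


if __name__ == "__main__":
    print(check_shape("V100", 1, 16, 60))  # one V100 allows at most 12 vCPUs
    print(check_shape("V100", 2, 16, 60))  # two V100s allow up to 24 vCPUs
```

In this sketch, 16 vCPUs do not fit alongside one V100 but do fit alongside two, which is the sense in which a higher GPU count allows a larger instance.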

GPU devices receive sustained use discounts similar to vCPUs. Read the Compute Engine pricing page to see hourly and monthly pricing for GPU devices.

For multi-GPU workloads, the V100 GPUs are offered with high-speed NVLink™ connections for communication between GPUs.

To see information about how your GPUs connect to each other and to your CPUs, run the following command on your instance:

nvidia-smi topo -m

For information on NVLink and its advantages, see the NVIDIA Developer Blog.
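
If you want to capture the topology report from a script rather than a terminal, the command above can be wrapped in a few lines of Python. This is a minimal sketch: the function name `gpu_topology` is illustrative, and the code assumes only that `nvidia-smi` is on the `PATH` once the NVIDIA driver is installed.

```python
import shutil
import subprocess


def gpu_topology(timeout=30):
    """Return the output of `nvidia-smi topo -m`, or None if it cannot be run."""
    if shutil.which("nvidia-smi") is None:
        return None  # Driver tools are not installed or not on PATH.
    try:
        result = subprocess.run(
            ["nvidia-smi", "topo", "-m"],
            capture_output=True,
            text=True,
            check=False,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout


if __name__ == "__main__":
    topology = gpu_topology()
    print(topology if topology is not None else "nvidia-smi is not available")
```

The sketch returns the report as plain text and does not try to parse it, because the layout of the matrix depends on the driver version and the attached hardware.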

NVIDIA® GRID® GPUs for graphics workloads

If you have graphics-intensive workloads, such as 3D visualization, you can create virtual workstations that use the NVIDIA® GRID® platform.

For background information about GRID, see the GRID overview.

When you select a GPU for a virtual workstation, an NVIDIA GRID license is added to your instance.

After you create your virtual workstation, you can connect to it using a remote desktop protocol such as Teradici® PCoIP or VMware® Horizon View.

GPU model: NVIDIA® Tesla® P4 Virtual Workstation

  GPUs      GPU memory     Available vCPUs    Available memory
  1 GPU     8 GB GDDR5     1 - 16 vCPUs       1 - 192 GB
  2 GPUs    16 GB GDDR5    1 - 48 vCPUs       1 - 312 GB
  4 GPUs    32 GB GDDR5    1 - 96 vCPUs       1 - 624 GB

Available zones:

  • us-west2-c
  • us-west2-b
  • us-central1-a
  • us-central1-c
  • us-east4-a
  • us-east4-b
  • us-east4-c
  • northamerica-northeast1-a
  • northamerica-northeast1-b
  • northamerica-northeast1-c
  • europe-west4-b
  • europe-west4-c

GPU model: NVIDIA® Tesla® P100 Virtual Workstation

  GPUs      GPU memory     Available vCPUs    Available memory
  1 GPU     16 GB HBM2     1 - 16 vCPUs       1 - 104 GB
  2 GPUs    32 GB HBM2     1 - 32 vCPUs       1 - 208 GB
  4 GPUs    64 GB HBM2     1 - 64 vCPUs       1 - 208 GB          (us-east1-c, europe-west1-d, europe-west1-b)
  4 GPUs    64 GB HBM2     1 - 96 vCPUs       1 - 624 GB          (all other zones)

Available zones:

  • us-west1-b
  • us-central1-c
  • us-central1-f
  • us-east1-b
  • us-east1-c
  • europe-west1-b
  • europe-west1-d
  • asia-east1-a
  • asia-east1-c
  • europe-west4-a

GPUs on preemptible instances

You can add GPUs to your preemptible VM instances at lower preemptible prices for the GPUs. GPUs attached to preemptible instances work like normal GPUs but persist only for the life of the instance. Preemptible instances with GPUs follow the same preemption process as all preemptible instances.

When you add a GPU to a preemptible instance, you use your regular GPU quota. If you need a separate quota for preemptible GPUs, request a separate Preemptible GPU quota.

During maintenance events, preemptible instances with GPUs are preempted by default and cannot be automatically restarted. If you want to recreate your instances after they have been preempted, use a managed instance group. Managed instance groups recreate your instances if the vCPU, memory, and GPU resources are available.

If you want a warning before your instance is preempted, or want to configure your instance to automatically restart after a maintenance event, use a non-preemptible instance with a GPU. For non-preemptible instances with GPUs, Google provides one hour of advance notice before preemption.

For steps to automatically restart a non-preemptible instance, see Updating options for an instance.
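
Both kinds of interruption described above can be observed from inside the instance. The following Python sketch assumes the Compute Engine metadata server entries `instance/preempted` and `instance/maintenance-event`, which are not described on this page. It also assumes that the first returns `TRUE` once preemption has begun and that the second returns `NONE` until an event is announced. The function names are illustrative, and the HTTP call is injectable so the logic can be exercised off-instance.

```python
import time
import urllib.request

# Assumed base URL of the Compute Engine metadata server (not from this page).
METADATA_URL = "http://metadata.google.internal/computeMetadata/v1/instance/"


def fetch(path, timeout=5):
    """Read one metadata entry. Works only from inside an instance."""
    request = urllib.request.Request(
        METADATA_URL + path, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def is_preempted(fetcher=fetch):
    """Return True once the instance reports that it is being preempted."""
    return fetcher("preempted").upper() == "TRUE"


def wait_for_maintenance(fetcher=fetch, poll_seconds=30, max_polls=None,
                         sleep=time.sleep):
    """Poll until a maintenance event is announced and return its name.

    Returns None if max_polls is reached while the value is still NONE.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        event = fetcher("maintenance-event")
        polls += 1
        if event != "NONE":
            return event
        sleep(poll_seconds)
    return None
```

A long-running job might call `wait_for_maintenance()` in a background thread and write a checkpoint as soon as it returns, so that the restarted instance can resume from saved state.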

To learn how to create preemptible instances with GPUs attached, read Creating an instance with a GPU.

Restrictions

Instances with GPUs have specific restrictions that make them behave differently than other instance types.

  • If you want to use Tesla K80 GPUs with your instances, the instances cannot use the Intel Skylake or later CPU platforms.

  • GPU instances must terminate for host maintenance events, but can automatically restart. These maintenance events typically occur once per week, but can occur more frequently when necessary. You must configure your workloads to handle these maintenance events cleanly. Specifically, long-running workloads like machine learning and high-performance computing (HPC) must handle the interruption of host maintenance events. Learn how to handle host maintenance events on instances with GPUs.

  • To protect Compute Engine systems and users, new projects have a global GPU quota, which limits the total number of GPUs you can create in any supported zone. When you request a GPU quota, you must request a quota for the GPU models that you want to create in each region, and an additional global quota for the total number of GPUs of all types in all zones.

  • Instances with one or more GPUs have a maximum number of vCPUs for each GPU that you add to the instance. For example, each NVIDIA® Tesla® K80 GPU allows you to have up to eight vCPUs and up to 52 GB of system memory in your instance machine type. To see the available vCPU and memory ranges for different GPU configurations, see the GPUs list.

  • You cannot attach GPUs to instances with shared-core machine types.

  • GPUs require device drivers in order to function properly. NVIDIA GPUs running on Google Compute Engine must use the following driver versions:

    • Linux instances:
      • R384 branch: NVIDIA 384.111 driver or greater
      • R390 branch: Not yet available
    • Windows Server instances:
      • R384 branch: NVIDIA 386.07 driver or greater
      • R390 branch: Not yet available

  • Instances with a specific attached GPU model are covered by the Google Compute Engine SLA only if that attached GPU model is available in more than one zone in the same region where the instance is located. The Google Compute Engine SLA does not cover specific GPU models in the following zones:

    • NVIDIA® Tesla® P100:
      • us-west1-b
      • europe-west4-a
    • NVIDIA® Tesla® K80:
      • us-west1-b
      • us-central1-c

  • Instances with NVIDIA® Tesla® P100 GPUs in europe-west1-d cannot use Local SSD devices.
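
The driver requirement in the list above can be checked programmatically. The following Python sketch compares an installed driver version against the R384-branch minimums listed on this page. The names `MINIMUM_DRIVER`, `parse_version`, and `driver_is_supported` are illustrative, and the sketch assumes that driver versions are dot-separated integers.

```python
# Minimum R384-branch driver versions listed on this page, as integer tuples.
MINIMUM_DRIVER = {
    "linux": (384, 111),
    "windows": (386, 7),  # listed on this page as 386.07
}


def parse_version(text):
    """Turn a driver version such as '384.111' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_supported(os_family, installed):
    """Return True if the installed driver meets the listed minimum."""
    return parse_version(installed) >= MINIMUM_DRIVER[os_family]


if __name__ == "__main__":
    print(driver_is_supported("linux", "384.111"))
    print(driver_is_supported("linux", "384.90"))
```

Comparing integer tuples rather than strings matters here: as strings, "384.90" sorts after "384.111", even though 384.90 is the older driver. On an instance with the driver installed, the version string can typically be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.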
