GPU platforms


Compute Engine provides graphics processing units (GPUs) that you can add to your virtual machine (VM) instances. You can use these GPUs to accelerate specific workloads on your VMs such as machine learning and data processing.

Compute Engine provides NVIDIA GPUs for your VMs in passthrough mode so that your VMs have direct control over the GPUs and their associated memory.
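To attach a GPU in passthrough mode, you specify an accelerator type when you create the VM. The following gcloud command is a minimal sketch: the VM name, zone, and image are placeholders, and a GPU VM must be set to terminate on host maintenance.

    gcloud compute instances create my-gpu-vm \
        --zone=us-central1-a \
        --machine-type=n1-standard-8 \
        --accelerator=type=nvidia-tesla-t4,count=1 \
        --maintenance-policy=TERMINATE \
        --image-family=debian-12 \
        --image-project=debian-cloud

After you install the NVIDIA driver on the VM, running nvidia-smi in the guest lists the GPU directly, because the device is passed through rather than virtualized.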

If you have graphics-intensive workloads, such as 3D visualization, 3D rendering, or virtual applications, you can use NVIDIA RTX virtual workstations (formerly known as NVIDIA GRID).

This document provides an overview of the different GPU models that are available on Compute Engine.

To view available regions and zones for GPUs on Compute Engine, see GPU regions and zones availability.
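You can also query availability from the gcloud CLI; the filter value below is only an illustration:

    # List every GPU model and the zones that offer it.
    gcloud compute accelerator-types list

    # Narrow the output to a single model, for example NVIDIA L4.
    gcloud compute accelerator-types list --filter="name=nvidia-l4"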

NVIDIA GPUs for compute workloads

For compute workloads, GPU models are available in the following stages:

  • NVIDIA H100 80GB: nvidia-h100-80gb: Generally Available
  • NVIDIA L4: nvidia-l4: Generally Available
  • NVIDIA A100
    • NVIDIA A100 40GB: nvidia-tesla-a100: Generally Available
    • NVIDIA A100 80GB: nvidia-a100-80gb: Generally Available
  • NVIDIA T4: nvidia-tesla-t4: Generally Available
  • NVIDIA V100: nvidia-tesla-v100: Generally Available
  • NVIDIA P100: nvidia-tesla-p100: Generally Available
  • NVIDIA P4: nvidia-tesla-p4: Generally Available
  • NVIDIA K80: nvidia-tesla-k80: Generally Available. See NVIDIA K80 end of support.
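To confirm the details of any of the preceding accelerator type identifiers in a specific zone, you can describe it with the gcloud CLI; the zone shown here is a placeholder:

    gcloud compute accelerator-types describe nvidia-h100-80gb \
        --zone=us-central1-a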

NVIDIA H100 GPUs

To run NVIDIA H100 80GB GPUs, you must use an A3 accelerator-optimized machine type.
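Because A3 machine types bundle their GPUs, you request the machine type itself instead of attaching accelerators separately. The following is a minimal sketch with a placeholder name, zone, and image; production A3 deployments typically need additional networking configuration beyond what is shown here:

    gcloud compute instances create my-a3-vm \
        --zone=us-central1-a \
        --machine-type=a3-highgpu-8g \
        --maintenance-policy=TERMINATE \
        --image-family=debian-12 \
        --image-project=debian-cloud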

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA H100 80GB | a3-highgpu-8g | 8 GPUs | 640 GB HBM3 | 208 vCPUs | 1872 GB | Bundled (6000 GB) |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA L4 GPUs

To run NVIDIA L4 GPUs, you must use a G2 accelerator-optimized machine type.

Each G2 machine type has a fixed number of NVIDIA L4 GPUs and vCPUs attached. Each G2 machine type also has a default memory size and a custom memory range. The custom memory range defines the amount of memory that you can allocate to your VM for each machine type. You can specify your custom memory during VM creation.
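To review the fixed vCPU count and default memory of the G2 machine types that a zone offers before you pick one, you can list them; the zone below is a placeholder:

    gcloud compute machine-types list \
        --zones=us-central1-a \
        --filter="name ~ ^g2-"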

| GPU model | Machine type | GPUs | GPU memory* | vCPUs | Default memory | Custom memory range | Max local SSD supported |
|---|---|---|---|---|---|---|---|
| NVIDIA L4 | g2-standard-4 | 1 GPU | 24 GB GDDR6 | 4 vCPUs | 16 GB | 16 - 32 GB | 375 GB |
| | g2-standard-8 | 1 GPU | 24 GB GDDR6 | 8 vCPUs | 32 GB | 32 - 54 GB | 375 GB |
| | g2-standard-12 | 1 GPU | 24 GB GDDR6 | 12 vCPUs | 48 GB | 48 - 54 GB | 375 GB |
| | g2-standard-16 | 1 GPU | 24 GB GDDR6 | 16 vCPUs | 64 GB | 54 - 64 GB | 375 GB |
| | g2-standard-24 | 2 GPUs | 48 GB GDDR6 | 24 vCPUs | 96 GB | 96 - 108 GB | 750 GB |
| | g2-standard-32 | 1 GPU | 24 GB GDDR6 | 32 vCPUs | 128 GB | 96 - 128 GB | 375 GB |
| | g2-standard-48 | 4 GPUs | 96 GB GDDR6 | 48 vCPUs | 192 GB | 192 - 216 GB | 1500 GB |
| | g2-standard-96 | 8 GPUs | 192 GB GDDR6 | 96 vCPUs | 384 GB | 384 - 432 GB | 3000 GB |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA A100 GPUs

To run NVIDIA A100 GPUs, you must use an A2 accelerator-optimized machine type.

Each A2 machine type has a fixed GPU count, vCPU count, and memory size.
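You can verify the GPU type and count that an A2 machine type bundles by describing it; the accelerators field in the response lists the attached GPUs. The zone here is a placeholder:

    gcloud compute machine-types describe a2-highgpu-1g \
        --zone=us-central1-a \
        --format="value(accelerators)"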

A100 40GB

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA A100 40GB | a2-highgpu-1g | 1 GPU | 40 GB HBM2 | 12 vCPUs | 85 GB | Yes |
| | a2-highgpu-2g | 2 GPUs | 80 GB HBM2 | 24 vCPUs | 170 GB | Yes |
| | a2-highgpu-4g | 4 GPUs | 160 GB HBM2 | 48 vCPUs | 340 GB | Yes |
| | a2-highgpu-8g | 8 GPUs | 320 GB HBM2 | 96 vCPUs | 680 GB | Yes |
| | a2-megagpu-16g | 16 GPUs | 640 GB HBM2 | 96 vCPUs | 1360 GB | Yes |

A100 80GB

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA A100 80GB | a2-ultragpu-1g | 1 GPU | 80 GB HBM2e | 12 vCPUs | 170 GB | Bundled (375 GB) |
| | a2-ultragpu-2g | 2 GPUs | 160 GB HBM2e | 24 vCPUs | 340 GB | Bundled (750 GB) |
| | a2-ultragpu-4g | 4 GPUs | 320 GB HBM2e | 48 vCPUs | 680 GB | Bundled (1.5 TB) |
| | a2-ultragpu-8g | 8 GPUs | 640 GB HBM2e | 96 vCPUs | 1360 GB | Bundled (3 TB) |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA T4 GPUs

For T4 GPUs, VMs with fewer GPUs are limited to a lower maximum number of vCPUs. In general, a higher number of GPUs lets you create instances with a higher number of vCPUs and more memory.

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA T4 | N1 machine series except N1 shared-core | 1 GPU | 16 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 2 GPUs | 32 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 64 GB GDDR6 | 1 - 96 vCPUs | 1 - 624 GB | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA P4 GPUs

For P4 GPUs, local SSD is only supported in select regions. For more information, see Local SSD availability by GPU regions and zones.
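In zones that support local SSD with P4 GPUs, you attach the disk when you create the VM. The following is a minimal sketch; the VM name, zone, and image are placeholders:

    gcloud compute instances create my-p4-vm \
        --zone=us-central1-a \
        --machine-type=n1-standard-8 \
        --accelerator=type=nvidia-tesla-p4,count=1 \
        --local-ssd=interface=NVME \
        --maintenance-policy=TERMINATE \
        --image-family=debian-12 \
        --image-project=debian-cloud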

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA P4 | N1 machine series except N1 shared-core | 1 GPU | 8 GB GDDR5 | 1 - 24 vCPUs | 1 - 156 GB | Yes |
| | | 2 GPUs | 16 GB GDDR5 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 32 GB GDDR5 | 1 - 96 vCPUs | 1 - 624 GB | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA V100 GPUs

For V100 GPUs, local SSD is only supported in select regions. For more information, see Local SSD availability by GPU regions and zones.

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA V100 | N1 machine series except N1 shared-core | 1 GPU | 16 GB HBM2 | 1 - 12 vCPUs | 1 - 78 GB | Yes |
| | | 2 GPUs | 32 GB HBM2 | 1 - 24 vCPUs | 1 - 156 GB | Yes |
| | | 4 GPUs | 64 GB HBM2 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 8 GPUs | 128 GB HBM2 | 1 - 96 vCPUs | 1 - 624 GB | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA P100 GPUs

For P100 GPUs, the maximum vCPU count and memory size available for some configurations depend on the zone in which the GPU resource is running.

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA P100 | N1 machine series except N1 shared-core | 1 GPU | 16 GB HBM2 | 1 - 16 vCPUs | 1 - 104 GB | Yes |
| | | 2 GPUs | 32 GB HBM2 | 1 - 32 vCPUs | 1 - 208 GB | Yes |
| | | 4 GPUs | 64 GB HBM2 | 1 - 64 vCPUs (us-east1-c, europe-west1-d, europe-west1-b); 1 - 96 vCPUs (all other P100 zones) | 1 - 208 GB (us-east1-c, europe-west1-d, europe-west1-b); 1 - 624 GB (all other P100 zones) | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA K80 GPUs

NVIDIA K80 boards contain two GPUs each. The pricing for K80 GPUs is by individual GPU, not by the board.

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA K80 | N1 machine series except N1 shared-core | 1 GPU | 12 GB GDDR5 | 1 - 8 vCPUs | 1 - 52 GB | Yes |
| | | 2 GPUs | 24 GB GDDR5 | 1 - 16 vCPUs | 1 - 104 GB | Yes |
| | | 4 GPUs | 48 GB GDDR5 | 1 - 32 vCPUs | 1 - 208 GB | Yes |
| | | 8 GPUs | 96 GB GDDR5 | 1 - 64 vCPUs | 1 - 416 GB (asia-east1-a and us-east1-d); 1 - 208 GB (all other K80 zones) | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA RTX Virtual Workstations (vWS) for graphics workloads

If you have graphics-intensive workloads, such as 3D visualization, you can create virtual workstations that use NVIDIA RTX Virtual Workstations (vWS) (formerly known as NVIDIA GRID). When you create a virtual workstation, an NVIDIA RTX Virtual Workstation (vWS) license is automatically added to your VM.
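Creating a virtual workstation follows the same pattern as other N1 GPU VMs, except that you request one of the -vws accelerator types, which carries the vWS license. The following is a minimal sketch with a placeholder name, zone, and image:

    gcloud compute instances create my-workstation \
        --zone=us-central1-a \
        --machine-type=n1-standard-8 \
        --accelerator=type=nvidia-tesla-t4-vws,count=1 \
        --maintenance-policy=TERMINATE \
        --image-family=debian-12 \
        --image-project=debian-cloud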

For information about pricing for virtual workstations, see the GPU pricing page.

For graphics workloads, NVIDIA RTX virtual workstation (vWS) models are available in the following stages:

  • NVIDIA L4 Virtual Workstations: nvidia-l4-vws: Generally Available
  • NVIDIA T4 Virtual Workstations: nvidia-tesla-t4-vws: Generally Available
  • NVIDIA P100 Virtual Workstations: nvidia-tesla-p100-vws: Generally Available
  • NVIDIA P4 Virtual Workstations: nvidia-tesla-p4-vws: Generally Available

NVIDIA L4 vWS GPUs

| GPU model | Machine type | GPUs | GPU memory* | vCPUs | Default memory | Custom memory range | Max local SSD supported |
|---|---|---|---|---|---|---|---|
| NVIDIA L4 Virtual Workstation | g2-standard-4 | 1 GPU | 24 GB GDDR6 | 4 vCPUs | 16 GB | 16 - 32 GB | 375 GB |
| | g2-standard-8 | 1 GPU | 24 GB GDDR6 | 8 vCPUs | 32 GB | 32 - 54 GB | 375 GB |
| | g2-standard-12 | 1 GPU | 24 GB GDDR6 | 12 vCPUs | 48 GB | 48 - 54 GB | 375 GB |
| | g2-standard-16 | 1 GPU | 24 GB GDDR6 | 16 vCPUs | 64 GB | 54 - 64 GB | 375 GB |
| | g2-standard-24 | 2 GPUs | 48 GB GDDR6 | 24 vCPUs | 96 GB | 96 - 108 GB | 750 GB |
| | g2-standard-32 | 1 GPU | 24 GB GDDR6 | 32 vCPUs | 128 GB | 96 - 128 GB | 375 GB |
| | g2-standard-48 | 4 GPUs | 96 GB GDDR6 | 48 vCPUs | 192 GB | 192 - 216 GB | 1500 GB |
| | g2-standard-96 | 8 GPUs | 192 GB GDDR6 | 96 vCPUs | 384 GB | 384 - 432 GB | 3000 GB |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA T4 vWS GPUs

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA T4 Virtual Workstation | N1 machine series except N1 shared-core | 1 GPU | 16 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 2 GPUs | 32 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 64 GB GDDR6 | 1 - 96 vCPUs | 1 - 624 GB | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA P4 vWS GPUs

For P4 GPUs, local SSD is only supported in select regions. For more information, see Local SSD availability by GPU regions and zones.

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA P4 Virtual Workstation | N1 machine series except N1 shared-core | 1 GPU | 8 GB GDDR5 | 1 - 16 vCPUs | 1 - 156 GB | Yes |
| | | 2 GPUs | 16 GB GDDR5 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 32 GB GDDR5 | 1 - 96 vCPUs | 1 - 624 GB | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

NVIDIA P100 vWS GPUs

| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | Available memory | Local SSD supported |
|---|---|---|---|---|---|---|
| NVIDIA P100 Virtual Workstation | N1 machine series except N1 shared-core | 1 GPU | 16 GB HBM2 | 1 - 16 vCPUs | 1 - 104 GB | Yes |
| | | 2 GPUs | 32 GB HBM2 | 1 - 32 vCPUs | 1 - 208 GB | Yes |
| | | 4 GPUs | 64 GB HBM2 | 1 - 64 vCPUs (us-east1-c, europe-west1-d, europe-west1-b); 1 - 96 vCPUs (all other P100 zones) | 1 - 208 GB (us-east1-c, europe-west1-d, europe-west1-b); 1 - 624 GB (all other P100 zones) | Yes |

*GPU memory is the memory that is available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

General comparison chart

The following table describes the GPU memory size, feature availability, and ideal workload types of different GPU models that are available on Compute Engine.

| GPU model | Memory | Interconnect | NVIDIA RTX Virtual Workstation (vWS) support | Best used for |
|---|---|---|---|---|
| H100 80GB | 80 GB HBM3 @ 3.35 TBps | NVLink Full Mesh @ 900 GBps | No | Large models with massive data tables for ML Training, Inference, HPC, BERT, DLRM |
| A100 80GB | 80 GB HBM2e @ 1.9 TBps | NVLink Full Mesh @ 600 GBps | No | Large models with massive data tables for ML Training, Inference, HPC, BERT, DLRM |
| A100 40GB | 40 GB HBM2 @ 1.6 TBps | NVLink Full Mesh @ 600 GBps | No | ML Training, Inference, HPC |
| L4 | 24 GB GDDR6 @ 300 GBps | N/A | Yes | ML Inference, Training, Remote Visualization Workstations, Video Transcoding, HPC |
| T4 | 16 GB GDDR6 @ 320 GBps | N/A | Yes | ML Inference, Training, Remote Visualization Workstations, Video Transcoding |
| V100 | 16 GB HBM2 @ 900 GBps | NVLink Ring @ 300 GBps | No | ML Training, Inference, HPC |
| P4 | 8 GB GDDR5 @ 192 GBps | N/A | Yes | Remote Visualization Workstations, ML Inference, and Video Transcoding |
| P100 | 16 GB HBM2 @ 732 GBps | N/A | Yes | ML Training, Inference, HPC, Remote Visualization Workstations |
| K80 (EOL) | 12 GB GDDR5 @ 240 GBps | N/A | No | ML Inference, Training, HPC |

To compare GPU pricing for the different GPU models and regions that are available on Compute Engine, see GPU pricing.

Performance comparison chart

The following table describes the performance specifications of different GPU models that are available on Compute Engine.

Compute performance

| GPU model | FP64 | FP32 | FP16 | INT8 |
|---|---|---|---|---|
| H100 80GB | 34 TFLOPS | 67 TFLOPS | | |
| A100 80GB | 9.7 TFLOPS | 19.5 TFLOPS | | |
| A100 40GB | 9.7 TFLOPS | 19.5 TFLOPS | | |
| L4 | 0.5 TFLOPS* | 30.3 TFLOPS | | |
| T4 | 0.25 TFLOPS* | 8.1 TFLOPS | | |
| V100 | 7.8 TFLOPS | 15.7 TFLOPS | | |
| P4 | 0.2 TFLOPS* | 5.5 TFLOPS | | 22 TOPS† |
| P100 | 4.7 TFLOPS | 9.3 TFLOPS | 18.7 TFLOPS | |
| K80 (EOL) | 1.46 TFLOPS | 4.37 TFLOPS | | |

*To allow FP64 code to work correctly, a small number of FP64 hardware units are included in the T4, L4, and P4 GPU architectures.

†TOPS: TeraOperations per second.

Tensor core performance

| GPU model | FP64 | TF32 | Mixed-precision FP16/FP32 | INT8 | INT4 | FP8 |
|---|---|---|---|---|---|---|
| H100 80GB | 67 TFLOPS | 989 TFLOPS | 1,979 TFLOPS*, † | 3,958 TOPS | | 3,958 TFLOPS |
| A100 80GB | 19.5 TFLOPS | 156 TFLOPS | 312 TFLOPS* | 624 TOPS | 1248 TOPS | |
| A100 40GB | 19.5 TFLOPS | 156 TFLOPS | 312 TFLOPS* | 624 TOPS | 1248 TOPS | |
| L4 | | 120 TFLOPS | 242 TFLOPS*, † | 485 TOPS | | 485 TFLOPS |
| T4 | | | 65 TFLOPS | 130 TOPS | 260 TOPS | |
| V100 | | | 125 TFLOPS | | | |
| P4 | | | | | | |
| P100 | | | | | | |
| K80 (EOL) | | | | | | |

*For mixed precision training, NVIDIA H100, A100, and L4 GPUs also support the bfloat16 data type.

†For H100 and L4 GPUs, structural sparsity is supported, which you can use to double the performance value. The values shown include sparsity; without sparsity, the values are one-half of those shown.

What's next?