Compute Engine provides graphics processing units (GPUs) that you can add to your virtual machine (VM) instances. You can use these GPUs to accelerate specific workloads on your VMs such as machine learning and data processing.
Compute Engine provides NVIDIA GPUs for your VMs in passthrough mode so that your VMs have direct control over the GPUs and their associated memory.
For more information about GPUs on Compute Engine, see About GPUs.
If you have graphics-intensive workloads, such as 3D visualization, 3D rendering, or virtual applications, you can use NVIDIA RTX virtual workstations (formerly known as NVIDIA GRID).
This document provides an overview of the different GPU models that are available on Compute Engine.
To view available regions and zones for GPUs on Compute Engine, see GPU regions and zones availability.
NVIDIA GPUs for compute workloads
For compute workloads, GPU models are available in the following stages:

- NVIDIA H100:
  - NVIDIA H100 80GB Mega: `nvidia-h100-mega-80gb`: Generally Available
  - NVIDIA H100 80GB: `nvidia-h100-80gb`: Generally Available
- NVIDIA L4: `nvidia-l4`: Generally Available
- NVIDIA A100:
  - NVIDIA A100 80GB: `nvidia-a100-80gb`: Generally Available
  - NVIDIA A100 40GB: `nvidia-tesla-a100`: Generally Available
- NVIDIA T4: `nvidia-tesla-t4`: Generally Available
- NVIDIA V100: `nvidia-tesla-v100`: Generally Available
- NVIDIA P100: `nvidia-tesla-p100`: Generally Available
- NVIDIA P4: `nvidia-tesla-p4`: Generally Available
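The accelerator-type identifier strings above are what you pass to tools such as `gcloud compute instances create --accelerator type=...`. As an illustrative sketch, they can be collected into a simple lookup table:

```python
# Map of GPU model names to their Compute Engine accelerator-type strings,
# transcribed from the list above.
COMPUTE_GPU_TYPES = {
    "NVIDIA H100 80GB Mega": "nvidia-h100-mega-80gb",
    "NVIDIA H100 80GB": "nvidia-h100-80gb",
    "NVIDIA L4": "nvidia-l4",
    "NVIDIA A100 40GB": "nvidia-tesla-a100",
    "NVIDIA A100 80GB": "nvidia-a100-80gb",
    "NVIDIA T4": "nvidia-tesla-t4",
    "NVIDIA V100": "nvidia-tesla-v100",
    "NVIDIA P100": "nvidia-tesla-p100",
    "NVIDIA P4": "nvidia-tesla-p4",
}

def accelerator_type(model: str) -> str:
    """Return the accelerator-type string for a GPU model name."""
    return COMPUTE_GPU_TYPES[model]
```

Note that the older models keep the legacy `tesla` prefix in their identifiers, while newer models (L4, H100, A100 80GB) do not.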
NVIDIA H100 GPUs
To run NVIDIA H100 GPUs, you must use an A3 accelerator-optimized machine type.
| GPU model | Machine type | GPU count | GPU memory* (GB HBM3) | vCPU count | VM memory (GB) | Attached Local SSD (GiB) | VM network bandwidth (Gbps) | GPU cluster network bandwidth (Gbps) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NVIDIA H100 80GB Mega | `a3-megagpu-8g` | 8 | 640 | 208 | 1,872 | 6,000 | 200 | 1,600 |
| NVIDIA H100 80GB | `a3-highgpu-8g` | 8 | 640 | 208 | 1,872 | 6,000 | 200 | 800 |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
NVIDIA L4 GPUs
To run NVIDIA L4 GPUs, you must use a G2 accelerator-optimized machine type.
Each G2 machine type has a fixed number of NVIDIA L4 GPUs and vCPUs attached. Each G2 machine type also has a default memory and a custom memory range. The custom memory range defines the amount of memory that you can allocate to your VM for each machine type. You can specify your custom memory during VM creation.
| GPU model | Machine type | GPUs | GPU memory* | vCPUs | Default VM memory | Custom VM memory range | Max Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NVIDIA L4 | `g2-standard-4` | 1 GPU | 24 GB GDDR6 | 4 vCPUs | 16 GB | 16 - 32 GB | 375 GiB |
| | `g2-standard-8` | 1 GPU | 24 GB GDDR6 | 8 vCPUs | 32 GB | 32 - 54 GB | 375 GiB |
| | `g2-standard-12` | 1 GPU | 24 GB GDDR6 | 12 vCPUs | 48 GB | 48 - 54 GB | 375 GiB |
| | `g2-standard-16` | 1 GPU | 24 GB GDDR6 | 16 vCPUs | 64 GB | 54 - 64 GB | 375 GiB |
| | `g2-standard-24` | 2 GPUs | 48 GB GDDR6 | 24 vCPUs | 96 GB | 96 - 108 GB | 750 GiB |
| | `g2-standard-32` | 1 GPU | 24 GB GDDR6 | 32 vCPUs | 128 GB | 96 - 128 GB | 375 GiB |
| | `g2-standard-48` | 4 GPUs | 96 GB GDDR6 | 48 vCPUs | 192 GB | 192 - 216 GB | 1,500 GiB |
| | `g2-standard-96` | 8 GPUs | 192 GB GDDR6 | 96 vCPUs | 384 GB | 384 - 432 GB | 3,000 GiB |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
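The custom memory ranges above can be encoded for programmatic validation. The following is a minimal sketch, transcribing the table's fixed shapes into a dictionary with a helper that checks whether a requested custom memory value is allowed for a G2 machine type:

```python
# G2 machine types with their fixed GPU count, vCPU count, default memory,
# and custom memory range (GB), transcribed from the table above.
G2_SHAPES = {
    # machine type     (gpus, vcpus, default_gb, (min_gb, max_gb))
    "g2-standard-4":  (1, 4, 16, (16, 32)),
    "g2-standard-8":  (1, 8, 32, (32, 54)),
    "g2-standard-12": (1, 12, 48, (48, 54)),
    "g2-standard-16": (1, 16, 64, (54, 64)),
    "g2-standard-24": (2, 24, 96, (96, 108)),
    "g2-standard-32": (1, 32, 128, (96, 128)),
    "g2-standard-48": (4, 48, 192, (192, 216)),
    "g2-standard-96": (8, 96, 384, (384, 432)),
}

def valid_custom_memory(machine_type: str, memory_gb: int) -> bool:
    """Return True if memory_gb falls within the custom memory range."""
    _, _, _, (lo, hi) = G2_SHAPES[machine_type]
    return lo <= memory_gb <= hi
```

For example, `valid_custom_memory("g2-standard-8", 40)` is true because 40 GB falls within that machine type's 32 - 54 GB range.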
NVIDIA A100 GPUs
To run NVIDIA A100 GPUs, you must use the A2 accelerator-optimized machine type.
Each A2 machine type has a fixed GPU count, vCPU count, and memory size.
A100 40GB
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA A100 40GB | `a2-highgpu-1g` | 1 GPU | 40 GB HBM2 | 12 vCPUs | 85 GB | Yes |
| | `a2-highgpu-2g` | 2 GPUs | 80 GB HBM2 | 24 vCPUs | 170 GB | Yes |
| | `a2-highgpu-4g` | 4 GPUs | 160 GB HBM2 | 48 vCPUs | 340 GB | Yes |
| | `a2-highgpu-8g` | 8 GPUs | 320 GB HBM2 | 96 vCPUs | 680 GB | Yes |
| | `a2-megagpu-16g` | 16 GPUs | 640 GB HBM2 | 96 vCPUs | 1,360 GB | Yes |
A100 80GB
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Attached Local SSD |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA A100 80GB | `a2-ultragpu-1g` | 1 GPU | 80 GB HBM2e | 12 vCPUs | 170 GB | 375 GiB |
| | `a2-ultragpu-2g` | 2 GPUs | 160 GB HBM2e | 24 vCPUs | 340 GB | 750 GiB |
| | `a2-ultragpu-4g` | 4 GPUs | 320 GB HBM2e | 48 vCPUs | 680 GB | 1,500 GiB |
| | `a2-ultragpu-8g` | 8 GPUs | 640 GB HBM2e | 96 vCPUs | 1,360 GB | 3,000 GiB |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
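Because A2 shapes are fixed, per-VM GPU memory scales linearly with GPU count: 40 GB per GPU on `a2-highgpu` and `a2-megagpu` machine types, and 80 GB per GPU on `a2-ultragpu` machine types. A quick consistency sketch over the tables above:

```python
# A2 machine types with their fixed GPU count and per-GPU memory (GB),
# transcribed from the A100 tables above.
A2_SHAPES = {
    # machine type     (gpus, gpu_gb_each)
    "a2-highgpu-1g":  (1, 40),
    "a2-highgpu-2g":  (2, 40),
    "a2-highgpu-4g":  (4, 40),
    "a2-highgpu-8g":  (8, 40),
    "a2-megagpu-16g": (16, 40),
    "a2-ultragpu-1g": (1, 80),
    "a2-ultragpu-2g": (2, 80),
    "a2-ultragpu-4g": (4, 80),
    "a2-ultragpu-8g": (8, 80),
}

def total_gpu_memory_gb(machine_type: str) -> int:
    """Total GPU memory for an A2 machine type (GPU count x per-GPU memory)."""
    gpus, per_gpu = A2_SHAPES[machine_type]
    return gpus * per_gpu
```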
NVIDIA T4 GPUs
VMs with a lower number of GPUs are limited to a lower maximum number of vCPUs. In general, a higher number of GPUs lets you create instances with a higher number of vCPUs and more memory.
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA T4 | N1 machine series except N1 shared-core | 1 GPU | 16 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 2 GPUs | 32 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 64 GB GDDR6 | 1 - 96 vCPUs | 1 - 624 GB | Yes |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
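Because T4 GPUs attach to general-purpose N1 machine types rather than an accelerator-optimized series, you specify the accelerator explicitly at creation time. The following sketch assembles a `gcloud compute instances create` invocation (for example, for `subprocess.run`); the instance name, zone, and machine type are placeholder values, and the `TERMINATE` maintenance policy reflects that VMs with attached GPUs cannot live-migrate during host maintenance:

```python
# Sketch: build a gcloud command to create an N1 VM with T4 GPUs attached.
# Values such as the instance name and zone are hypothetical placeholders.
def t4_create_command(name: str, zone: str, gpu_count: int = 1,
                      machine_type: str = "n1-standard-8") -> list[str]:
    return [
        "gcloud", "compute", "instances", "create", name,
        "--zone", zone,
        "--machine-type", machine_type,
        # GPU VMs must stop for host maintenance; they cannot live-migrate.
        "--maintenance-policy", "TERMINATE",
        "--accelerator", f"type=nvidia-tesla-t4,count={gpu_count}",
    ]
```

For A2, A3, and G2 machine types, by contrast, the GPUs are part of the machine type itself, so no `--accelerator` flag is needed.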
NVIDIA P4 GPUs
For P4 GPUs, Local SSD is only supported in select regions. For more information, see Local SSD availability by GPU regions and zones.
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA P4 | N1 machine series except N1 shared-core | 1 GPU | 8 GB GDDR5 | 1 - 24 vCPUs | 1 - 156 GB | Yes |
| | | 2 GPUs | 16 GB GDDR5 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 32 GB GDDR5 | 1 - 96 vCPUs | 1 - 624 GB | Yes |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
NVIDIA V100 GPUs
For V100 GPUs, Local SSD is only supported in select regions. For more information, see Local SSD availability by GPU regions and zones.
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA V100 | N1 machine series except N1 shared-core | 1 GPU | 16 GB HBM2 | 1 - 12 vCPUs | 1 - 78 GB | Yes |
| | | 2 GPUs | 32 GB HBM2 | 1 - 24 vCPUs | 1 - 156 GB | Yes |
| | | 4 GPUs | 64 GB HBM2 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 8 GPUs | 128 GB HBM2 | 1 - 96 vCPUs | 1 - 624 GB | Yes |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
NVIDIA P100 GPUs
For P100 GPUs, the maximum vCPU count and VM memory that is available for some configurations depends on the zone in which the GPU resource is running.
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA P100 | N1 machine series except N1 shared-core | 1 GPU | 16 GB HBM2 | 1 - 16 vCPUs | 1 - 104 GB | Yes |
| | | 2 GPUs | 32 GB HBM2 | 1 - 32 vCPUs | 1 - 208 GB | Yes |
| | | 4 GPUs | 64 GB HBM2 | 1 - 64 or 1 - 96 vCPUs (depends on zone) | 1 - 208 or 1 - 624 GB (depends on zone) | Yes |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
NVIDIA RTX Virtual Workstations (vWS) for graphics workloads
If you have graphics-intensive workloads, such as 3D visualization, you can create virtual workstations that use NVIDIA RTX Virtual Workstations (vWS) (formerly known as NVIDIA GRID). When you create a virtual workstation, an NVIDIA RTX Virtual Workstation (vWS) license is automatically added to your VM.
For information about pricing for virtual workstations, see the GPU pricing page.
For graphics workloads, NVIDIA RTX Virtual Workstation (vWS) models are available in the following stages:

- NVIDIA L4 Virtual Workstation: `nvidia-l4-vws`: Generally Available
- NVIDIA T4 Virtual Workstation: `nvidia-tesla-t4-vws`: Generally Available
- NVIDIA P100 Virtual Workstation: `nvidia-tesla-p100-vws`: Generally Available
- NVIDIA P4 Virtual Workstation: `nvidia-tesla-p4-vws`: Generally Available
NVIDIA L4 vWS GPUs
| GPU model | Machine type | GPUs | GPU memory* | vCPUs | Default VM memory | Custom VM memory range | Max Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NVIDIA L4 Virtual Workstation | `g2-standard-4` | 1 GPU | 24 GB GDDR6 | 4 vCPUs | 16 GB | 16 - 32 GB | 375 GiB |
| | `g2-standard-8` | 1 GPU | 24 GB GDDR6 | 8 vCPUs | 32 GB | 32 - 54 GB | 375 GiB |
| | `g2-standard-12` | 1 GPU | 24 GB GDDR6 | 12 vCPUs | 48 GB | 48 - 54 GB | 375 GiB |
| | `g2-standard-16` | 1 GPU | 24 GB GDDR6 | 16 vCPUs | 64 GB | 54 - 64 GB | 375 GiB |
| | `g2-standard-24` | 2 GPUs | 48 GB GDDR6 | 24 vCPUs | 96 GB | 96 - 108 GB | 750 GiB |
| | `g2-standard-32` | 1 GPU | 24 GB GDDR6 | 32 vCPUs | 128 GB | 96 - 128 GB | 375 GiB |
| | `g2-standard-48` | 4 GPUs | 96 GB GDDR6 | 48 vCPUs | 192 GB | 192 - 216 GB | 1,500 GiB |
| | `g2-standard-96` | 8 GPUs | 192 GB GDDR6 | 96 vCPUs | 384 GB | 384 - 432 GB | 3,000 GiB |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
NVIDIA T4 vWS GPUs
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA T4 Virtual Workstation | N1 machine series except N1 shared-core | 1 GPU | 16 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 2 GPUs | 32 GB GDDR6 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 64 GB GDDR6 | 1 - 96 vCPUs | 1 - 624 GB | Yes |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
NVIDIA P4 vWS GPUs
For P4 GPUs, Local SSD is only supported in select regions. For more information, see Local SSD availability by GPU regions and zones.
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA P4 Virtual Workstation | N1 machine series except N1 shared-core | 1 GPU | 8 GB GDDR5 | 1 - 16 vCPUs | 1 - 156 GB | Yes |
| | | 2 GPUs | 16 GB GDDR5 | 1 - 48 vCPUs | 1 - 312 GB | Yes |
| | | 4 GPUs | 32 GB GDDR5 | 1 - 96 vCPUs | 1 - 624 GB | Yes |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
NVIDIA P100 vWS GPUs
| GPU model | Machine type | GPUs | GPU memory* | Available vCPUs | VM memory | Local SSD supported |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA P100 Virtual Workstation | N1 machine series except N1 shared-core | 1 GPU | 16 GB HBM2 | 1 - 16 vCPUs | 1 - 104 GB | Yes |
| | | 2 GPUs | 32 GB HBM2 | 1 - 32 vCPUs | 1 - 208 GB | Yes |
| | | 4 GPUs | 64 GB HBM2 | 1 - 64 or 1 - 96 vCPUs (depends on zone) | 1 - 208 or 1 - 624 GB (depends on zone) | Yes |
*GPU memory is the memory available on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is designed to handle the higher bandwidth demands of graphics-intensive workloads.
General comparison chart
The following table describes the GPU memory size, feature availability, and ideal workload types of different GPU models that are available on Compute Engine.
| GPU model | Memory | Interconnect | NVIDIA RTX Virtual Workstation (vWS) support | Best used for |
| --- | --- | --- | --- | --- |
| H100 80GB | 80 GB HBM3 @ 3.35 TBps | NVLink Full Mesh @ 900 GBps | No | Large models with massive data tables for ML Training, Inference, HPC, BERT, DLRM |
| A100 80GB | 80 GB HBM2e @ 1.9 TBps | NVLink Full Mesh @ 600 GBps | No | Large models with massive data tables for ML Training, Inference, HPC, BERT, DLRM |
| A100 40GB | 40 GB HBM2 @ 1.6 TBps | NVLink Full Mesh @ 600 GBps | No | ML Training, Inference, HPC |
| L4 | 24 GB GDDR6 @ 300 GBps | N/A | Yes | ML Inference, Training, Remote Visualization Workstations, Video Transcoding, HPC |
| T4 | 16 GB GDDR6 @ 320 GBps | N/A | Yes | ML Inference, Training, Remote Visualization Workstations, Video Transcoding |
| V100 | 16 GB HBM2 @ 900 GBps | NVLink Ring @ 300 GBps | No | ML Training, Inference, HPC |
| P4 | 8 GB GDDR5 @ 192 GBps | N/A | Yes | Remote Visualization Workstations, ML Inference, and Video Transcoding |
| P100 | 16 GB HBM2 @ 732 GBps | N/A | Yes | ML Training, Inference, HPC, Remote Visualization Workstations |
To compare GPU pricing for the different GPU models and regions that are available on Compute Engine, see GPU pricing.
Performance comparison chart
The following table describes the performance specifications of different GPU models that are available on Compute Engine.
Compute performance
| GPU model | FP64 | FP32 | FP16 | INT8 |
| --- | --- | --- | --- | --- |
| H100 80GB | 34 TFLOPS | 67 TFLOPS | | |
| A100 80GB | 9.7 TFLOPS | 19.5 TFLOPS | | |
| A100 40GB | 9.7 TFLOPS | 19.5 TFLOPS | | |
| L4 | 0.5 TFLOPS* | 30.3 TFLOPS | | |
| T4 | 0.25 TFLOPS* | 8.1 TFLOPS | | |
| V100 | 7.8 TFLOPS | 15.7 TFLOPS | | |
| P4 | 0.2 TFLOPS* | 5.5 TFLOPS | | 22 TOPS† |
| P100 | 4.7 TFLOPS | 9.3 TFLOPS | 18.7 TFLOPS | |
*To allow FP64 code to work correctly, a small number of FP64 hardware units are included in the T4, L4, and P4 GPU architecture.
†TeraOperations per Second.
Tensor core performance
| GPU model | FP64 | TF32 | Mixed-precision FP16/FP32 | INT8 | INT4 | FP8 |
| --- | --- | --- | --- | --- | --- | --- |
| H100 80GB | 67 TFLOPS | 989 TFLOPS† | 1,979 TFLOPS*, † | 3,958 TOPS† | | 3,958 TFLOPS† |
| A100 80GB | 19.5 TFLOPS | 156 TFLOPS | 312 TFLOPS* | 624 TOPS | 1,248 TOPS | |
| A100 40GB | 19.5 TFLOPS | 156 TFLOPS | 312 TFLOPS* | 624 TOPS | 1,248 TOPS | |
| L4 | | 120 TFLOPS† | 242 TFLOPS*, † | 485 TOPS† | | 485 TFLOPS† |
| T4 | | | 65 TFLOPS | 130 TOPS | 260 TOPS | |
| V100 | | | 125 TFLOPS | | | |
| P4 | | | | | | |
| P100 | | | | | | |
*For mixed-precision training, NVIDIA H100, A100, and L4 GPUs also support the `bfloat16` data type.
†H100 and L4 GPUs support structural sparsity, which you can use to double the performance values. The values shown include sparsity; without sparsity, the values are one-half of those shown.
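The sparsity relationship in the note above is simple arithmetic: the dense (without-sparsity) figure is half the listed with-sparsity value. A tiny sketch:

```python
# Derive the dense (without structural sparsity) tensor-core figure from a
# with-sparsity value, per the footnote above.
def dense_from_sparse(sparse_value: float) -> float:
    """Dense throughput is half the with-sparsity throughput."""
    return sparse_value / 2
```

For example, the H100 TF32 figure of 989 TFLOPS with sparsity corresponds to roughly 494.5 TFLOPS dense.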
What's next?
- For more information about GPUs on Compute Engine, see About GPUs.
- Review the GPU regions and zones availability.
- Learn about GPU pricing.