To use GPUs on Google Cloud, you can either deploy an accelerator-optimized VM that has attached GPUs, or attach GPUs to an N1 general-purpose VM. The following GPU machine types are supported for running your artificial intelligence (AI), machine learning (ML) and high performance computing (HPC) workloads on the AI Hypercomputer platform.
A3 series
The A3 machine series is available in the following configurations. For more information about this machine series, see A3 accelerator-optimized machine series.
A3 Ultra
These machine types have NVIDIA H200 GPUs (nvidia-h200-141gb) attached and are ideal for foundation model training and serving.
Machine type | GPU count | GPU memory* (GB HBM3e) | vCPU count† | VM memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)‡ | Network protocol |
---|---|---|---|---|---|---|---|---|
a3-ultragpu-8g | 8 | 1,128 | 224 | 2,952 | 12,000 | 10 | 3,200 | RDMA over Converged Ethernet (RoCE) |
*GPU memory is the memory on a GPU device that can be used for
temporary storage of data. It is separate from the VM's memory and is
specifically designed to handle the higher bandwidth demands of your
graphics-intensive workloads.
†A vCPU is implemented as a single hardware hyper-thread on one of
the available CPU platforms.
‡Maximum egress bandwidth cannot exceed the number given. Actual
egress bandwidth depends on the destination IP address and other factors.
See Network bandwidth.
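The GPU memory column reports the aggregate across all GPUs attached to the VM, not the memory of a single GPU. A minimal sketch of that arithmetic follows. The per-GPU figures are read from the accelerator type names and table rows in this document, and the helper name `total_gpu_memory_gb` is hypothetical.

```python
# Per-GPU memory in GB, taken from the accelerator type names and
# table rows in this document (for example, nvidia-h200-141gb -> 141).
PER_GPU_MEMORY_GB = {
    "nvidia-h200-141gb": 141,
    "nvidia-h100-mega-80gb": 80,
    "nvidia-h100-80gb": 80,
    "nvidia-a100-80gb": 80,
    "nvidia-a100-40gb": 40,
    "nvidia-l4": 24,
}


def total_gpu_memory_gb(accelerator_type: str, gpu_count: int) -> int:
    """Return aggregate GPU memory: per-GPU memory times GPU count."""
    return PER_GPU_MEMORY_GB[accelerator_type] * gpu_count


# a3-ultragpu-8g: 8 GPUs x 141 GB each
print(total_gpu_memory_gb("nvidia-h200-141gb", 8))  # 1128
```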
A3 Mega
These machine types have NVIDIA H100 80GB GPUs (nvidia-h100-mega-80gb) and are ideal for large model training and multi-host inference.
Machine type | GPU count | GPU memory* (GB HBM3) | vCPU count† | VM memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)‡ | Network protocol |
---|---|---|---|---|---|---|---|---|
a3-megagpu-8g | 8 | 640 | 208 | 1,872 | 6,000 | 9 | 1,800 | GPUDirect-TCPXO |
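Because the maximum network bandwidth is an upper limit on egress, it also gives a lower bound on how long a transfer can take. The sketch below works through that arithmetic. The function name `min_transfer_seconds` and the example payload are illustrative assumptions, and real transfers will be slower because actual bandwidth depends on the destination and other factors.

```python
def min_transfer_seconds(payload_gb: float, max_bandwidth_gbps: float) -> float:
    """Best-case transfer time in seconds.

    payload_gb is in gigabytes (10^9 bytes) and max_bandwidth_gbps is in
    gigabits per second, so the payload is multiplied by 8 to get gigabits.
    """
    if max_bandwidth_gbps <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_gb * 8 / max_bandwidth_gbps


# Hypothetical example: moving 640 GB (the full GPU memory of an
# a3-megagpu-8g VM) at its 1,800 Gbps maximum.
print(round(min_transfer_seconds(640, 1800), 2))
```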
A3 High
These machine types have NVIDIA H100 80GB GPUs (nvidia-h100-80gb) and are well-suited for both large model inference and model fine-tuning.
Machine type | GPU count | GPU memory* (GB HBM3) | vCPU count† | VM memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)‡ | Network protocol |
---|---|---|---|---|---|---|---|---|
a3-highgpu-1g | 1 | 80 | 26 | 234 | 750 | 1 | 25 | GPUDirect-TCPX |
a3-highgpu-2g | 2 | 160 | 52 | 468 | 1,500 | 1 | 50 | GPUDirect-TCPX |
a3-highgpu-4g | 4 | 320 | 104 | 936 | 3,000 | 1 | 100 | GPUDirect-TCPX |
a3-highgpu-8g | 8 | 640 | 208 | 1,872 | 6,000 | 5 | 1,000 | GPUDirect-TCPX |
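One way to use the A3 High table is to pick the smallest shape whose aggregate GPU memory covers a workload. The following is a minimal sketch of that lookup. The function name `smallest_a3_high` is hypothetical, and the sizing rule (total GPU memory only) is a simplifying assumption that ignores vCPU, VM memory, and network needs.

```python
# (machine type, GPU count, total GPU memory in GB), copied from the
# A3 High table and ordered from smallest to largest.
A3_HIGH_SHAPES = [
    ("a3-highgpu-1g", 1, 80),
    ("a3-highgpu-2g", 2, 160),
    ("a3-highgpu-4g", 4, 320),
    ("a3-highgpu-8g", 8, 640),
]


def smallest_a3_high(required_gpu_memory_gb: float):
    """Return the smallest A3 High machine type that has at least the
    requested total GPU memory, or None if no single VM is large enough."""
    for machine_type, _gpu_count, memory_gb in A3_HIGH_SHAPES:
        if memory_gb >= required_gpu_memory_gb:
            return machine_type
    return None


print(smallest_a3_high(140))  # a3-highgpu-2g
print(smallest_a3_high(700))  # None
```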
A3 Edge
These machine types have NVIDIA H100 80GB GPUs (nvidia-h100-80gb), are designed specifically for serving, and are available in a limited set of regions.
Machine type | GPU count | GPU memory* (GB HBM3) | vCPU count† | VM memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)‡ | Network protocol |
---|---|---|---|---|---|---|---|---|
a3-edgegpu-8g | 8 | 640 | 208 | 1,872 | 6,000 | 5 | | GPUDirect-TCPX |
A2 series
The A2 machine series is available in the following configurations. For more information about this machine series, see A2 machine series.
A2 Ultra
These machine types have NVIDIA A100 80GB GPUs (nvidia-a100-80gb) attached and are ideal for model fine-tuning and for large model and cost-optimized inference.
Machine type | GPU count | GPU memory* (GB HBM2e) | vCPU count† | VM memory (GB) | Attached Local SSD (GiB) | Maximum network bandwidth (Gbps)‡ |
---|---|---|---|---|---|---|
a2-ultragpu-1g | 1 | 80 | 12 | 170 | 375 | 24 |
a2-ultragpu-2g | 2 | 160 | 24 | 340 | 750 | 32 |
a2-ultragpu-4g | 4 | 320 | 48 | 680 | 1,500 | 50 |
a2-ultragpu-8g | 8 | 640 | 96 | 1,360 | 3,000 | 100 |
A2 High
These machine types have NVIDIA A100 40GB GPUs (nvidia-a100-40gb) attached and are ideal for model fine-tuning and for large model and cost-optimized inference.
Machine type | GPU count | GPU memory* (GB HBM2) | vCPU count† | VM memory (GB) | Local SSD supported | Maximum network bandwidth (Gbps)‡ |
---|---|---|---|---|---|---|
a2-highgpu-1g | 1 | 40 | 12 | 85 | Yes | 24 |
a2-highgpu-2g | 2 | 80 | 24 | 170 | Yes | 32 |
a2-highgpu-4g | 4 | 160 | 48 | 340 | Yes | 50 |
a2-highgpu-8g | 8 | 320 | 96 | 680 | Yes | 100 |
a2-megagpu-16g | 16 | 640 | 96 | 1,360 | Yes | 100 |
G2 series
These machine types have NVIDIA L4 GPUs (nvidia-l4 or nvidia-l4-vws) attached and are ideal for cost-optimized inference, graphics-intensive, and high performance computing workloads.
Machine type | GPU count | GPU memory* (GB GDDR6) | vCPU count† | Default VM memory (GB) | Custom VM memory range (GB) | Max Local SSD supported (GiB) | Maximum network bandwidth (Gbps)‡ |
---|---|---|---|---|---|---|---|
g2-standard-4 | 1 | 24 | 4 | 16 | 16 to 32 | 375 | 10 |
g2-standard-8 | 1 | 24 | 8 | 32 | 32 to 54 | 375 | 16 |
g2-standard-12 | 1 | 24 | 12 | 48 | 48 to 54 | 375 | 16 |
g2-standard-16 | 1 | 24 | 16 | 64 | 54 to 64 | 375 | 32 |
g2-standard-24 | 2 | 48 | 24 | 96 | 96 to 108 | 750 | 32 |
g2-standard-32 | 1 | 24 | 32 | 128 | 96 to 128 | 375 | 32 |
g2-standard-48 | 4 | 96 | 48 | 192 | 192 to 216 | 1,500 | 50 |
g2-standard-96 | 8 | 192 | 96 | 384 | 384 to 432 | 3,000 | 100 |
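The custom VM memory range column can be checked programmatically before you request a G2 VM with non-default memory. The sketch below copies the ranges from the G2 table. The function name `fits_custom_memory` is hypothetical, and treating both ends of each range as inclusive is an assumption.

```python
# Custom VM memory ranges in GB, copied from the G2 table: (minimum, maximum).
G2_CUSTOM_MEMORY_GB = {
    "g2-standard-4": (16, 32),
    "g2-standard-8": (32, 54),
    "g2-standard-12": (48, 54),
    "g2-standard-16": (54, 64),
    "g2-standard-24": (96, 108),
    "g2-standard-32": (96, 128),
    "g2-standard-48": (192, 216),
    "g2-standard-96": (384, 432),
}


def fits_custom_memory(machine_type: str, memory_gb: float) -> bool:
    """Return True if memory_gb lies within the listed custom range
    for the machine type (bounds assumed inclusive)."""
    low, high = G2_CUSTOM_MEMORY_GB[machine_type]
    return low <= memory_gb <= high


print(fits_custom_memory("g2-standard-8", 48))   # True
print(fits_custom_memory("g2-standard-16", 48))  # False
```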
N1 + GPUs series
You can also attach NVIDIA T4, P4, V100, and P100 GPUs to an N1 machine type, except for the N1 shared-core machine types. These configurations can be used for small-scale inference, graphics-intensive, and high performance computing workloads.
For more information about these N1+GPUs machines, see N1 + GPU machine series.
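As a rough illustration of attaching a GPU to an N1 VM, the sketch below assembles a `gcloud compute instances create` command as a string without running it. The helper name `n1_gpu_create_command`, the VM name, and the zone are hypothetical. The accelerator type `nvidia-tesla-t4` and the `TERMINATE` maintenance policy are assumptions based on common Compute Engine usage and are not stated in this document, so check them against the gcloud reference before use. GPU drivers still need to be installed separately.

```python
import shlex


def n1_gpu_create_command(name: str, zone: str, machine_type: str,
                          accelerator_type: str, count: int) -> str:
    """Build (but do not run) a gcloud command that creates an N1 VM
    with attached GPUs."""
    # Shared-core types (f1-micro, g1-small) do not carry the n1- prefix,
    # so this check also rejects them.
    if not machine_type.startswith("n1-"):
        raise ValueError("GPUs can only be attached to N1 machine types")
    if count < 1:
        raise ValueError("count must be at least 1")
    argv = [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--accelerator=type={accelerator_type},count={count}",
        # Assumption: GPU VMs are stopped, not live migrated, on host maintenance.
        "--maintenance-policy=TERMINATE",
    ]
    return shlex.join(argv)


print(n1_gpu_create_command("demo-vm", "us-central1-a",
                            "n1-standard-8", "nvidia-tesla-t4", 1))
```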
What's next?
- For more information about GPUs on Compute Engine, see About GPUs.
- Review the GPU regions and zones availability.
- Learn about GPU pricing.