GPU machine types

To use GPUs on Google Cloud, you can either deploy an accelerator-optimized VM that has attached GPUs, or attach GPUs to an N1 general-purpose VM. The following GPU machine types are supported for running your artificial intelligence (AI), machine learning (ML) and high performance computing (HPC) workloads on the AI Hypercomputer platform.

A3 series

The A3 machine series is available in the following configurations. For more information about this machine series, see A3 accelerator-optimized machine series.

A3 Ultra

These machine types have NVIDIA H200 GPUs (nvidia-h200-141gb) attached and are ideal for foundation model training and serving.

Tip: When provisioning a3-ultragpu-8g machine types, you must use Hypercompute Cluster to request capacity and create VMs or clusters. To get started, see Overview of creating VMs and clusters in the AI Hypercomputer documentation.

Machine type	GPU count	GPU memory^* (GB HBM3e)	vCPU count^†	VM memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)^‡
`a3-ultragpu-8g`	8	1,128	224	2,952	12,000	10	3,600

^*GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the VM's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.
^†A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
^‡Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. See Network bandwidth.
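
The GPU memory column appears to be the per-GPU capacity multiplied by the GPU count. As a quick check, the sketch below reproduces the a3-ultragpu-8g figure from the 141 GB capacity in the nvidia-h200-141gb model name. The function name is illustrative only and is not part of any Google Cloud API.

```python
def total_gpu_memory_gb(gpu_count, per_gpu_gb):
    """Return the combined GPU memory of a VM, in GB."""
    return gpu_count * per_gpu_gb


# a3-ultragpu-8g: 8 GPUs with 141 GB each (values taken from the table above).
a3_ultra_total = total_gpu_memory_gb(8, 141)
print(a3_ultra_total)  # 1128
```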

A3 Mega

These machine types have NVIDIA H100 80GB GPUs (nvidia-h100-mega-80gb) and are ideal for large model training and multi-host inference.

Tip: When provisioning a3-megagpu-8g machine types, we recommend using a cluster of these VMs and deploying with a scheduler such as Google Kubernetes Engine (GKE) or Slurm. For detailed instructions on either of these options, review the following:

To create a Google Kubernetes Engine cluster, see Deploy an A3 Mega cluster with GKE.
To create a Slurm cluster, see Deploy an A3 Mega Slurm cluster.

Machine type	GPU count	GPU memory^* (GB HBM3)	vCPU count^†	VM memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)^‡
`a3-megagpu-8g`	8	640	208	1,872	6,000	9	1,800

A3 High

These machine types have NVIDIA H100 80GB GPUs (nvidia-h100-80gb) and are well suited for both large model inference and model fine-tuning.

Tip: When provisioning a3-highgpu-1g, a3-highgpu-2g, or a3-highgpu-4g machine types, you must use either Spot VMs or a feature that uses the Dynamic Workload Scheduler (DWS), such as resize requests in a MIG. For detailed instructions on either of these options, review the following:

To create Spot VMs, see Create an accelerator-optimized VM and remember to set the provisioning model to SPOT.
To create a resize request in a MIG, which uses Dynamic Workload Scheduler, see Create a MIG with GPU VMs.
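
As a rough illustration of the Spot option, the sketch below assembles a `gcloud compute instances create` command with the provisioning model set to SPOT. It only builds and prints the command and does not run it. The VM name and zone are hypothetical placeholders, and a real command typically needs further flags (boot image, disks, networking) described in the linked guide.

```python
import shlex
from typing import List


def spot_vm_command(name: str, zone: str, machine_type: str = "a3-highgpu-1g") -> List[str]:
    """Build (but do not execute) a gcloud command that requests a Spot VM."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        "--provisioning-model=SPOT",  # the setting the tip above calls out
    ]


# "example-a3-vm" and "us-central1-a" are placeholders, not recommendations;
# check GPU regions and zones availability before choosing a zone.
cmd = spot_vm_command("example-a3-vm", "us-central1-a")
print(shlex.join(cmd))
```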

Machine type	GPU count	GPU memory^* (GB HBM3)	vCPU count^†	VM memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)^‡
`a3-highgpu-1g`	1	80	26	234	750	1	25
`a3-highgpu-2g`	2	160	52	468	1,500	1	50
`a3-highgpu-4g`	4	320	104	936	3,000	1	100
`a3-highgpu-8g`	8	640	208	1,872	6,000	5	1,000
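
One way to use this table is to pick the smallest A3 High machine type whose total GPU memory covers a workload's needs. The following is a minimal sketch under that assumption. The helper name is hypothetical, and the selection ignores vCPU, VM memory, and network requirements.

```python
# (machine type, GPU count, total GPU memory in GB), copied from the table above.
A3_HIGH = [
    ("a3-highgpu-1g", 1, 80),
    ("a3-highgpu-2g", 2, 160),
    ("a3-highgpu-4g", 4, 320),
    ("a3-highgpu-8g", 8, 640),
]


def smallest_a3_high(required_gpu_memory_gb):
    """Return the smallest A3 High type with enough total GPU memory, or None."""
    for machine_type, _gpu_count, gpu_memory_gb in A3_HIGH:
        if gpu_memory_gb >= required_gpu_memory_gb:
            return machine_type
    return None  # the requirement exceeds a single A3 High VM


print(smallest_a3_high(140))  # a3-highgpu-2g
```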

A3 Edge

These machine types have NVIDIA H100 80GB GPUs (nvidia-h100-80gb), are designed specifically for serving and are available in a limited set of regions.

Machine type	GPU count	GPU memory^* (GB HBM3)	vCPU count^†	VM memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)^‡
`a3-edgegpu-8g`	8	640	208	1,872	6,000	5	800 for asia-south1 and northamerica-northeast2; 400 for all other A3 Edge regions
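
Because the maximum bandwidth of a3-edgegpu-8g depends on the region, capacity-planning scripts may want to encode that rule explicitly. The sketch below does so with a hypothetical helper. It does not check whether a region actually offers A3 Edge, so confirm availability separately.

```python
# Regions listed in the table above as having the higher bandwidth limit.
HIGH_BANDWIDTH_REGIONS = {"asia-south1", "northamerica-northeast2"}


def a3_edge_max_bandwidth_gbps(region):
    """Return the documented maximum network bandwidth for a3-edgegpu-8g."""
    return 800 if region in HIGH_BANDWIDTH_REGIONS else 400


print(a3_edge_max_bandwidth_gbps("asia-south1"))  # 800
```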

A2 series

The A2 machine series is available in the following configurations. For more information about this machine series, see A2 machine series.

A2 Ultra

These machine types have NVIDIA A100 80GB GPUs (nvidia-a100-80gb) attached and are ideal for model fine-tuning, large model inference, and cost-optimized inference.

Machine type	GPU count	GPU memory^* (GB HBM2e)	vCPU count^†	VM memory (GB)	Attached Local SSD (GiB)	Maximum network bandwidth (Gbps)^‡
`a2-ultragpu-1g`	1	80	12	170	375	24
`a2-ultragpu-2g`	2	160	24	340	750	32
`a2-ultragpu-4g`	4	320	48	680	1,500	50
`a2-ultragpu-8g`	8	640	96	1,360	3,000	100

A2 High

These machine types have NVIDIA A100 40GB GPUs (nvidia-a100-40gb) attached and are ideal for model fine-tuning, large model inference, and cost-optimized inference.

Machine type	GPU count	GPU memory^* (GB HBM2)	vCPU count^†	VM memory (GB)	Local SSD supported	Maximum network bandwidth (Gbps)^‡
`a2-highgpu-1g`	1	40	12	85	Yes	24
`a2-highgpu-2g`	2	80	24	170	Yes	32
`a2-highgpu-4g`	4	160	48	340	Yes	50
`a2-highgpu-8g`	8	320	96	680	Yes	100
`a2-megagpu-16g`	16	640	96	1,360	Yes	100

G2 series

These machine types have NVIDIA L4 GPUs (nvidia-l4 or nvidia-l4-vws) attached and are ideal for cost-optimized inference, graphics-intensive workloads, and high performance computing workloads.

Machine type	GPU count	GPU memory^* (GB GDDR6)	vCPU count^†	Default VM memory (GB)	Custom VM memory range (GB)	Max Local SSD supported (GiB)	Maximum network bandwidth (Gbps)^‡
`g2-standard-4`	1	24	4	16	16 to 32	375	10
`g2-standard-8`	1	24	8	32	32 to 54	375	16
`g2-standard-12`	1	24	12	48	48 to 54	375	16
`g2-standard-16`	1	24	16	64	54 to 64	375	32
`g2-standard-24`	2	48	24	96	96 to 108	750	32
`g2-standard-32`	1	24	32	128	96 to 128	375	32
`g2-standard-48`	4	96	48	192	192 to 216	1,500	50
`g2-standard-96`	8	192	96	384	384 to 432	3,000	100
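
When requesting custom memory on a G2 machine type, it can help to validate the value against the documented range before calling any API. The following is a minimal sketch with a hypothetical helper. It assumes both ends of each range are inclusive, which the table does not state explicitly.

```python
# Custom VM memory ranges in GB, copied from the table above: (minimum, maximum).
G2_CUSTOM_MEMORY_GB = {
    "g2-standard-4": (16, 32),
    "g2-standard-8": (32, 54),
    "g2-standard-12": (48, 54),
    "g2-standard-16": (54, 64),
    "g2-standard-24": (96, 108),
    "g2-standard-32": (96, 128),
    "g2-standard-48": (192, 216),
    "g2-standard-96": (384, 432),
}


def is_valid_g2_memory(machine_type, memory_gb):
    """Return True if memory_gb falls within the documented custom range."""
    low, high = G2_CUSTOM_MEMORY_GB[machine_type]
    return low <= memory_gb <= high  # inclusive bounds are an assumption


print(is_valid_g2_memory("g2-standard-8", 48))  # True
print(is_valid_g2_memory("g2-standard-8", 64))  # False
```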

N1 + GPUs series

You can also attach NVIDIA T4, P4, V100, and P100 GPU models to any N1 machine type except the N1 shared-core machine types. These machine types can be used for small-scale inference, graphics-intensive workloads, and high performance computing workloads.

For more information about these N1+GPUs machines, see N1 + GPU machine series.

What's next?

For more information about GPUs on Compute Engine, see About GPUs.
Review the GPU regions and zones availability.
Learn about GPU pricing.