Compute Engine provides graphics processing units (GPUs) that you can add to your virtual machine (VM) instances. You can use these GPUs to accelerate specific workloads on your VMs such as machine learning and data processing.
You can use only two machine families when running GPUs on Compute Engine:
- The accelerator-optimized machine family. All accelerator-optimized machine types have attached GPUs.
- The N1 general-purpose machine family. You can use most N1 machine types except the N1 shared-core machine type. If you are not using an N1 general-purpose machine, you can switch to an N1 general-purpose machine and then add the GPUs.
Before you begin
- To review additional prerequisite steps such as selecting an OS image and checking GPU quota, review the overview document.
- If you haven't already, set up authentication.
Authentication verifies your identity for access to Google Cloud services and APIs. To run
code or samples from a local development environment, you can authenticate to
Compute Engine by selecting one of the following options:
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Accelerator-optimized VMs
Each accelerator-optimized machine type has a specific model of NVIDIA GPUs attached to support the recommended workload type.
| AI and ML workloads | Graphics and visualization |
|---|---|
| Accelerator-optimized A series machine types are designed for high performance computing (HPC), artificial intelligence (AI), and machine learning (ML) workloads. For these machine types, the GPU model is automatically attached to the instance. | Accelerator-optimized G series machine types are designed for workloads such as NVIDIA Omniverse simulation workloads, graphics-intensive applications, video transcoding, and virtual desktops. These machine types support NVIDIA RTX Virtual Workstations (vWS). For these machine types, the GPU model is automatically attached to the instance. |
You can modify each accelerator-optimized instance as follows:
- For A4X, A4, A3, and A2 Ultra instances, you can't modify the machine type. If you are using any of these machine types for your instance and you need to change the machine type, create a new instance.
- For A2 Standard instances, you can modify the GPU count by switching from one A2 Standard machine type to another A2 Standard machine type.
- For G4 instances, you can modify the GPU count by switching from one G4 machine type to another G4 machine type.
- For G2 instances, you can do the following:
  - Modify the GPU count by switching from one G2 machine type to another G2 machine type.
  - Switch from a G2 machine type to a machine type from a different machine family, such as general-purpose or compute-optimized. See Edit the machine type of a VM.
You can't remove GPUs from any of the accelerator-optimized machine types.
Modify the GPU count
You can modify the GPU count of an A2 Standard, G4, or G2 accelerator-optimized instance by using either the Google Cloud console or REST.
Console
You can modify the number of GPUs for your instance by stopping the instance and editing the instance configuration.
Verify that all of your critical applications are stopped on the instance.
In the Google Cloud console, go to the VM instances page to see your list of instances.
Click the name of the instance that you want to modify the number of GPUs for. The Details page opens.
Complete the following steps from the Details page.
If the instance is running, click Stop to stop the instance. If there is no Stop option, click More actions > Stop.
Click Edit.
In the Machine configuration section, select the GPUs machine family, and then do the following:
In the Number of GPUs list, increase or decrease the GPU count.
To apply your changes, click Save.
To restart the instance, click Start/Resume.
REST
You can modify the number of GPUs on your instance by stopping the instance and changing the machine type. Each accelerator-optimized machine type has a specific number of GPUs attached. If you change the machine type, this adjusts the number of GPUs that are attached to the instance.
Verify that all of your critical applications are stopped on the instance, and then create a POST command to stop the instance so it can move to a host system where GPUs are available.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/stop
After the instance stops, create a POST request to modify the machine type.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setMachineType

{
  "machineType": "zones/ZONE/machineTypes/MACHINE_TYPE"
}
Start the instance.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/start
Replace the following:
- `PROJECT_ID`: your project ID.
- `VM_NAME`: the name of the instance that you want to modify.
- `ZONE`: the zone where the instance is located. This zone must support GPUs.
- `MACHINE_TYPE`: the machine type that you want to use. It must be one of the following:
  - If your instance uses an A2 Standard machine type, select another A2 Standard machine type.
  - If your instance uses a G4 machine type, select another G4 machine type.
  - If your instance uses a G2 machine type, select another G2 machine type. G2 machine types also support custom memory. Memory must be a multiple of 1024 MB and within the supported memory range. For example, the machine type name for an instance with 4 vCPUs and 19 GB of memory would be `g2-custom-4-19456`.
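The stop, setMachineType, and start requests above can be sketched as a small shell script. This is a minimal sketch with hypothetical placeholder values (project, zone, instance name, and target machine type are all made up). By default it only prints the requests it would send; setting `EXECUTE=1` sends them with a gcloud-issued OAuth token. A real workflow would also poll each returned zone operation until it completes before issuing the next call.

```shell
#!/usr/bin/env bash
# Sketch of the stop -> setMachineType -> start sequence.
# All values are hypothetical placeholders; replace them with your own.
PROJECT_ID="my-project"
ZONE="us-central1-a"
VM_NAME="my-g2-instance"
# Another machine type in the same series, for example a G2 custom type
# with 4 vCPUs and 19 GB (19 * 1024 = 19456 MB) of memory.
NEW_MACHINE_TYPE="g2-custom-4-19456"

BASE="https://compute.googleapis.com/compute/v1/projects/${PROJECT_ID}/zones/${ZONE}/instances/${VM_NAME}"

# Dry run by default: print each request instead of sending it.
# Run with EXECUTE=1 to send the requests with your gcloud credentials.
send() {
  if [ "${EXECUTE:-0}" = "1" ]; then
    curl -s -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      ${2:+-d "$2"} "$1"
  else
    echo "POST $1 ${2:-}"
  fi
}

send "${BASE}/stop"
# In a real run, wait for the stop operation to finish before this call.
send "${BASE}/setMachineType" \
  "{\"machineType\": \"zones/${ZONE}/machineTypes/${NEW_MACHINE_TYPE}\"}"
send "${BASE}/start"
```

The dry-run default makes it safe to review the exact URLs and bodies before touching a running instance.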
Limitations
A2 instances
- You can only request capacity by using the supported consumption options for an A2 Standard machine type.
- You don't receive sustained use discounts and flexible committed use discounts for instances that use an A2 Standard machine type.
- You can only use an A2 Standard machine type in certain regions and zones.
- The A2 Standard machine type is only available on the Cascade Lake platform.
- If your instance uses an A2 Standard machine type, you can only switch from one A2 Standard machine type to another A2 Standard machine type. You can't change to any other machine type. For more information, see Modify accelerator-optimized instances.
- You can't use the Windows operating system with the `a2-megagpu-16g` machine type. When using a Windows operating system, choose a different A2 Standard machine type.
- You can't do a quick format of the attached Local SSDs on Windows instances that use A2 Standard machine types. To format these Local SSDs, you must do a full format by using the diskpart utility and specifying `format fs=ntfs label=tmpfs`.
- A2 Standard machine types don't support sole-tenancy.
G2 instances
- You can only request capacity by using the supported consumption options for a G2 machine type.
- You don't receive sustained use discounts and flexible committed use discounts for instances that use a G2 machine type.
- You can only use a G2 machine type in certain regions and zones.
- The G2 machine type is only available on the Cascade Lake platform.
- Standard Persistent Disk (`pd-standard`) isn't supported on instances that use the G2 machine type. For supported disk types, see Supported disk types for G2.
- You can't create Multi-Instance GPUs on an instance that uses a G2 machine type.
- If you need to change the machine type of a G2 instance, review Modify accelerator-optimized instances.
- You can't use Deep Learning VM Images as boot disks for instances that use the G2 machine type.
- The current default driver for Container-Optimized OS doesn't support L4 GPUs running on G2 machine types. Also, Container-Optimized OS only supports a select set of drivers. If you want to use Container-Optimized OS on G2 machine types, review the following notes:
  - Use a Container-Optimized OS version that supports the minimum recommended NVIDIA driver version `525.60.13` or later. For more information, review the Container-Optimized OS release notes.
  - When you install the driver, specify the latest available version that works for the L4 GPUs. For example, `sudo cos-extensions install gpu -- -version=525.60.13`.
- You must use the Google Cloud CLI or REST to create G2 instances for the following scenarios:
  - You want to specify custom memory values.
  - You want to customize the number of visible CPU cores.
G4 instances
- You can only request capacity by using the supported consumption options for a G4 machine type.
- You don't receive sustained use discounts and flexible committed use discounts for instances that use a G4 machine type.
- You can only use a G4 machine type in certain regions and zones.
- You can't use Persistent Disk (regional or zonal) on an instance that uses a G4 machine type.
- The G4 machine type is only available on the 5th generation AMD EPYC (Turin) platform.
- You can't create Confidential VM instances that use a G4 machine type.
- You can't create G4 instances on sole-tenant nodes.
- You can't use Windows operating systems on `g4-standard-384` instances.
N1 general-purpose instances
This section covers how to add, modify, or remove GPUs from an N1 general-purpose machine.
In summary, the process to add, modify, or remove GPUs from an existing instance is as follows:
- Check that your instance has a boot disk size of at least 40 GB.
- Stop the instance.
Add, modify, or remove the GPUs.
If your N1 instance doesn't have any GPUs attached, you need to complete the following steps:
- Prepare your instance for the modification.
- Modify the host maintenance setting for the instance. Instances with GPUs cannot live migrate because they are assigned to specific hardware devices. For more information, see GPU restrictions.
- Change the machine type. GPUs are only supported on select N1 machine types.
- Install a GPU driver on your instance, so that your system can use the GPU device.
Prepare your instance
When a GPU is added to an instance, the order of the network interface can change.
Most public images on Compute Engine don't have persistent network interface names and adjust to the new order.
However, if you are using either SLES or a custom image, you must update the system settings to prevent the network interface names from persisting. To remove the persistent interface naming rule, run the following command on your instance:
rm /etc/udev/rules.d/70-persistent-net.rules
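A slightly more defensive variant of the command above (a sketch; the path is the standard udev rules location already shown) checks that the file exists before removing it, so the step can be rerun safely:

```shell
#!/usr/bin/env bash
# Remove the udev rule that pins network interface names, if it exists.
# Needed on SLES and some custom images before changing attached GPUs.
RULES_FILE="/etc/udev/rules.d/70-persistent-net.rules"
if [ -f "${RULES_FILE}" ]; then
  sudo rm "${RULES_FILE}"
  echo "Removed ${RULES_FILE}"
else
  echo "No persistent-net rules file found; nothing to do."
fi
```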
Add GPUs or modify GPU type on existing instances
This section covers how to add GPUs, or modify the GPU type, on an existing N1 general-purpose instance. This procedure supports the following GPU types:

NVIDIA GPUs:
- NVIDIA T4: `nvidia-tesla-t4`
- NVIDIA P4: `nvidia-tesla-p4`
- NVIDIA P100: `nvidia-tesla-p100`
- NVIDIA V100: `nvidia-tesla-v100`

NVIDIA RTX Virtual Workstation (vWS) (formerly known as NVIDIA GRID):
- NVIDIA T4 Virtual Workstation: `nvidia-tesla-t4-vws`
- NVIDIA P4 Virtual Workstation: `nvidia-tesla-p4-vws`
- NVIDIA P100 Virtual Workstation: `nvidia-tesla-p100-vws`

For these virtual workstations, an NVIDIA RTX Virtual Workstation (vWS) license is automatically added to your instance.
Console
To add GPUs or modify the GPU type, complete the following steps.
Verify that all of your critical applications are stopped on the instance.
In the Google Cloud console, go to the VM instances page to see your list of instances.
Click the name of the instance that you want to update. The Details page opens.
Complete the following steps from the Details page.
If the instance is running, click Stop. If there is no Stop option, click More actions > Stop.
Click Edit.
In the Machine configuration section, select the GPUs machine family, and then do the following:
In the GPU type list, select or switch to any of the GPU types supported on N1 VMs.
In the Number of GPUs list, select the number of GPUs.
If your GPU model supports NVIDIA RTX Virtual Workstations (vWS) for graphics workloads, and you plan on running graphics-intensive workloads on this instance, select Enable Virtual Workstation (NVIDIA GRID).
If your instance didn't have GPUs attached before, complete the following:
If the instance has a shared-core machine type, you must change the machine type. In the Machine type list, select one of the preset N1 machine types. Alternatively, you can also specify custom machine type settings.
In the Management section, complete the following:
In the On host maintenance list, select Terminate VM instance. Instances with attached GPUs can't live migrate. See Handle GPU host events.
In the Automatic restart list, select On.
To apply your changes, click Save.
To restart the VM, click Start/Resume.
REST
You can add or modify GPUs on your instance by stopping the instance and changing your instance's configuration through the API.
Verify that all of your critical applications are stopped on the instance and then create a POST command to stop the instance so it can move to a host system where GPUs are available.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/stop
If your instance does not have any GPUs attached, complete the following steps:
Identify the GPU type that you want to add to your instance. You can submit a GET request to list the GPU types that are available to your project in a specific zone.

GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/acceleratorTypes
If the instance has a shared-core machine type, you must change the machine type to have one or more vCPUs. You cannot add accelerators to instances with shared-core machine types.
Create a POST command to set the scheduling options for the instance.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setScheduling

{
  "onHostMaintenance": "TERMINATE",
  "automaticRestart": true
}
Create a POST request to add or modify the GPUs that are attached to your instance.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setMachineResources

{
  "guestAccelerators": [
    {
      "acceleratorCount": ACCELERATOR_COUNT,
      "acceleratorType": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/acceleratorTypes/ACCELERATOR_TYPE"
    }
  ]
}
Start the instance.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/start
Replace the following:
- `PROJECT_ID`: your project ID.
- `VM_NAME`: the name of the instance that you want to add GPUs to.
- `ZONE`: the zone where the instance is located.
- `ACCELERATOR_COUNT`: the number of GPUs that you want attached to your instance. For a list of GPU limits based on the machine type of your instance, see GPUs on Compute Engine.
- `ACCELERATOR_TYPE`: the GPU model that you want to attach or switch to. If you plan on running graphics-intensive workloads on this instance, use one of the virtual workstation models. Choose one of the following values:
  - NVIDIA GPUs:
    - NVIDIA T4: `nvidia-tesla-t4`
    - NVIDIA P4: `nvidia-tesla-p4`
    - NVIDIA P100: `nvidia-tesla-p100`
    - NVIDIA V100: `nvidia-tesla-v100`
  - NVIDIA RTX Virtual Workstation (vWS) (formerly known as NVIDIA GRID):
    - NVIDIA T4 Virtual Workstation: `nvidia-tesla-t4-vws`
    - NVIDIA P4 Virtual Workstation: `nvidia-tesla-p4-vws`
    - NVIDIA P100 Virtual Workstation: `nvidia-tesla-p100-vws`

    For these virtual workstations, an NVIDIA RTX Virtual Workstation (vWS) license is automatically added to your instance.
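Put together, the REST sequence for adding a GPU to an N1 instance can be sketched as a shell script. All values are hypothetical placeholders. By default the script only prints each request; setting `EXECUTE=1` sends them with a gcloud-issued token. A production script would also wait for each zone operation to complete before the next step.

```shell
#!/usr/bin/env bash
# Sketch of the N1 sequence: stop, set scheduling, attach GPUs, start.
# All values are hypothetical placeholders; replace them with your own.
PROJECT_ID="my-project"
ZONE="us-central1-a"
VM_NAME="my-n1-instance"
ACCELERATOR_TYPE="nvidia-tesla-t4"   # one of the models listed above
ACCELERATOR_COUNT=1

BASE="https://compute.googleapis.com/compute/v1/projects/${PROJECT_ID}/zones/${ZONE}/instances/${VM_NAME}"
ACCEL_URI="https://www.googleapis.com/compute/v1/projects/${PROJECT_ID}/zones/${ZONE}/acceleratorTypes/${ACCELERATOR_TYPE}"

# Dry run by default; set EXECUTE=1 to send the requests via curl.
send() {
  if [ "${EXECUTE:-0}" = "1" ]; then
    curl -s -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      ${2:+-d "$2"} "$1"
  else
    echo "POST $1 ${2:-}"
  fi
}

send "${BASE}/stop"
# GPU instances can't live migrate, so maintenance must terminate the VM.
send "${BASE}/setScheduling" \
  '{"onHostMaintenance": "TERMINATE", "automaticRestart": true}'
send "${BASE}/setMachineResources" \
  "{\"guestAccelerators\": [{\"acceleratorCount\": ${ACCELERATOR_COUNT}, \"acceleratorType\": \"${ACCEL_URI}\"}]}"
send "${BASE}/start"
```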
Install drivers
To install the drivers, choose one of the following options:
- If you plan to run graphics-intensive workloads, such as those for gaming and visualization, install drivers for the NVIDIA RTX Virtual Workstation.
- For most workloads, install the GPU drivers.
Remove GPUs
This section covers how to remove the following GPU types from an existing N1 general-purpose instance.
NVIDIA GPUs:
- NVIDIA T4: `nvidia-tesla-t4`
- NVIDIA P4: `nvidia-tesla-p4`
- NVIDIA P100: `nvidia-tesla-p100`
- NVIDIA V100: `nvidia-tesla-v100`

NVIDIA RTX Virtual Workstation (vWS) (formerly known as NVIDIA GRID):
- NVIDIA T4 Virtual Workstation: `nvidia-tesla-t4-vws`
- NVIDIA P4 Virtual Workstation: `nvidia-tesla-p4-vws`
- NVIDIA P100 Virtual Workstation: `nvidia-tesla-p100-vws`

For these virtual workstations, an NVIDIA RTX Virtual Workstation (vWS) license is automatically added to your instance.
You can use the Google Cloud console to remove GPUs from an existing instance. To remove GPUs, complete the following steps:
Verify that all of your critical applications are stopped on the instance.
In the Google Cloud console, go to the VM instances page to see your list of instances.
Click the name of the instance that you want to remove GPUs from. The Details page opens.
Complete the following steps from the Details page.
If the instance is running, click Stop to stop the instance. If there is no Stop option, click More actions > Stop.
On the toolbar, click Edit.
In the Machine configuration section, select the General purpose machine family, and then do the following:
To view attached GPUs, expand Advanced configurations.
In the GPUs section, remove GPUs using one of the following options:
- To remove some GPUs, in the Number of GPUs list, select a new number.
- To remove all GPUs, click Delete GPU.
Optional: Modify the instance host maintenance policy setting. Instances with GPUs must have the host maintenance policy set to Terminate VM instance. But if you removed all GPUs, you have the option to live migrate this instance during host maintenance. For more information, see Set VM host maintenance policy.
To apply your changes, click Save.
To restart the instance, click Start/Resume.
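The steps above use the Google Cloud console. As an untested sketch of a REST alternative: because the setMachineResources method shown earlier takes the full list of attached accelerators, sending it an empty guestAccelerators list should detach all GPUs. This is an assumption based on that request's shape, not a documented console equivalent, and all values below are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch: detach all GPUs by setting an empty accelerator list.
# Assumption: setMachineResources replaces the full guestAccelerators list.
PROJECT_ID="my-project"
ZONE="us-central1-a"
VM_NAME="my-n1-instance"
BASE="https://compute.googleapis.com/compute/v1/projects/${PROJECT_ID}/zones/${ZONE}/instances/${VM_NAME}"

# Print the requests; send each with curl and an OAuth bearer token
# (for example from "gcloud auth print-access-token") to execute them.
echo "POST ${BASE}/stop"
echo "POST ${BASE}/setMachineResources {\"guestAccelerators\": []}"
echo "POST ${BASE}/start"
```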
What's next?
- Learn more about GPU platforms.
- Add Local SSDs to your instances. Local SSD devices pair well with GPUs when your apps require high-performance storage.
- Create groups of GPU instances using instance templates.
- To monitor GPU performance, see Monitoring GPU performance.
- To improve network performance, see Use higher network bandwidth.
- To handle GPU host maintenance, see Handling GPU host events.
- Try the Running TensorFlow Inference Workloads at Scale with TensorRT5 and NVIDIA T4 GPU tutorial.