Adding GPUs to Instances

Google Compute Engine provides graphics processing units (GPUs) that you can add to your virtual machine instances. You can use these GPUs to accelerate specific workloads on your instances such as machine learning and data processing.

For more information about what you can do with GPUs and what types of GPU hardware are available, read GPUs on Compute Engine.

Creating an instance with a GPU

Before you create an instance with a GPU, select which boot disk image you want to use for the instance, and ensure that the appropriate GPU driver is installed. You can use any public image or custom image that you need, but some images might require a unique driver or install process that is not covered in this guide. You must identify what drivers are appropriate for your images. Drivers are not installed by default on public images. Read installing GPU drivers for details.
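
If you are not sure which public image to use, you can first list the available images and their image families with the gcloud command-line tool (shown here only as a convenience):

gcloud compute images list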

When you create an instance with one or more GPUs, you must set the instance to terminate on host maintenance. Instances with GPUs cannot live migrate because they are assigned to specific hardware devices. See GPU restrictions for details.

Create an instance with one or more GPUs using the Google Cloud Platform Console, the gcloud command-line tool, or the API.

Console

  1. Go to the VM instances page.

    Go to the VM instances page

  2. Click the Create instance button.
  3. Select a zone where GPUs are available. See the list of available zones with GPUs.
  4. In the Machine type section, select the machine type that you want to use for this instance. Alternatively, you can specify custom machine type settings later.
  5. In the Machine type section, click Customize to see advanced machine type options and available GPUs.
  6. Click the GPUs drop down menu to see the list of available GPUs.
  7. Specify the GPU type and the number of GPUs that you need.
  8. If necessary, adjust the machine type to accommodate your desired GPU settings. If you leave these settings as they are, the instance uses the predefined machine type that you specified before opening the machine type customization screen.
  9. In the Boot disk section, click Change to begin configuring your boot disk.
  10. In the OS images tab, choose an image.
  11. Click Select to confirm your boot disk options.
  12. Optionally, you can include a startup script to install the GPU driver while the instance starts up. In the Automation section, include the contents of your startup script under Startup script. See installing GPU drivers for example scripts.
  13. At the bottom of the page, click Create to create the instance.

gcloud

Use the regions describe command to ensure that you have sufficient GPU quota in the region where you want to create instances with GPUs.

gcloud compute regions describe [REGION]

where [REGION] is the region where you want to check for GPU quota.
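
For example, to view only the GPU-related quota entries for the us-east1 region (the metric names, such as NVIDIA_K80_GPUS, are shown here for illustration and vary by GPU model):

gcloud compute regions describe us-east1 | grep -B 1 -A 1 GPUS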

Start an instance with the latest image from an image family:

gcloud compute instances create [INSTANCE_NAME] \
    --machine-type [MACHINE_TYPE] --zone [ZONE] \
    --accelerator type=[ACCELERATOR_TYPE],count=[ACCELERATOR_COUNT] \
    --image-family [IMAGE_FAMILY] --image-project [IMAGE_PROJECT] \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='[STARTUP_SCRIPT]'

where:

  • [INSTANCE_NAME] is the name for the new instance.
  • [MACHINE_TYPE] is the machine type that you selected for the instance. See GPUs on Compute Engine to see what machine types are available based on your desired GPU count.
  • [ZONE] is the zone for this instance.
  • [IMAGE_FAMILY] is one of the available image families.
  • [ACCELERATOR_COUNT] is the number of GPUs that you want to add to your instance. See GPUs on Compute Engine for a list of GPU limits based on the machine type of your instance.
  • [ACCELERATOR_TYPE] is the GPU model that you want to use. See GPUs on Compute Engine for a list of available GPU models.
  • [IMAGE_PROJECT] is the image project that the image family belongs to.
  • [STARTUP_SCRIPT] is an optional startup script that you can use to install the GPU driver while the instance is starting up. See installing GPU drivers for examples.

For example, you can use the following gcloud command to start an Ubuntu 16.04 instance with one NVIDIA® Tesla® K80 GPU and 2 vCPUs in the us-east1-d zone. The startup-script metadata instructs the instance to install the CUDA Toolkit with its recommended driver version.

gcloud compute instances create gpu-instance-1 \
    --machine-type n1-standard-2 --zone us-east1-d \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='#!/bin/bash
    echo "Checking for CUDA and installing."
    # Check for CUDA and try to install.
    if ! dpkg-query -W cuda-8-0; then
      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      apt-get update
      apt-get install cuda-8-0 -y
    fi'

This example command starts the instance, but CUDA and the driver will take several minutes to finish installing.
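
Because the script runs during boot, you can follow its progress by viewing the instance's serial port output, where startup script logging appears:

gcloud compute instances get-serial-port-output gpu-instance-1 --zone us-east1-d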

API

Identify the GPU type that you want to add to your instance. Submit a GET request to list the GPU types that are available to your project in a specific zone.

GET https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes

where:

  • [PROJECT_ID] is your project ID.
  • [ZONE] is the zone where you want to list the available GPU types.
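
For example, one way to make this request from the command line is with curl, using an OAuth access token from the gcloud command-line tool (the zone shown is illustrative):

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-east1-d/acceleratorTypes"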

In the API, create a POST request to create a new instance. Include the acceleratorType parameter to specify which GPU type you want to use, and include the acceleratorCount parameter to specify how many GPUs you want to add. Also set the onHostMaintenance parameter to TERMINATE.

POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances?key={YOUR_API_KEY}
{
  "machineType": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/machineTypes/n1-highmem-2",
  "disks":
  [
    {
      "type": "PERSISTENT",
      "initializeParams":
      {
        "diskSizeGb": "[DISK_SIZE]",
        "sourceImage": "https://www.googleapis.com/compute/v1/projects/[IMAGE_PROJECT]/global/images/family/[IMAGE_FAMILY]"
      },
      "boot": true
    }
  ],
  "name": "[INSTANCE_NAME]",
  "networkInterfaces":
  [
    {
      "network": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/networks/[NETWORK]"
    }
  ],
  "guestAccelerators":
  [
    {
      "acceleratorCount": [ACCELERATOR_COUNT],
      "acceleratorType": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes/[ACCELERATOR_TYPE]"
    }
  ],
  "scheduling":
  {
    "onHostMaintenance": "terminate",
    "automaticRestart": true
  },
  "metadata":
  {
    "items":
    [
      {
        "key": "startup-script",
        "value": "[STARTUP_SCRIPT]"
      }
    ]
  }
}

where:

  • [INSTANCE_NAME] is the name of the instance.
  • [PROJECT_ID] is your project ID.
  • [ZONE] is the zone for this instance.
  • [MACHINE_TYPE] is the machine type that you selected for the instance. See GPUs on Compute Engine to see what machine types are available based on your desired GPU count.
  • [IMAGE_PROJECT] is the image project that the image belongs to.
  • [IMAGE_FAMILY] is a boot disk image for your instance. Specify an image family from the list of available public images.
  • [DISK_SIZE] is the size of your boot disk in GB.
  • [NETWORK] is the VPC network that you want to use for this instance. Specify default to use your default network.
  • [ACCELERATOR_COUNT] is the number of GPUs that you want to add to your instance. See GPUs on Compute Engine for a list of GPU limits based on the machine type of your instance.
  • [ACCELERATOR_TYPE] is the GPU model that you want to use. See GPUs on Compute Engine for a list of available GPU models.
  • [STARTUP_SCRIPT] is an optional startup script that you can use to install the GPU driver while the instance is starting up. See installing GPU drivers for examples.
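
As a hedged example, if you save the JSON body above to a local file (the file name instance-request.json is a hypothetical choice), you can submit the request with curl and an OAuth access token from the gcloud command-line tool:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @instance-request.json \
    "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances"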

If you used a startup script to automatically install the GPU device driver, verify that the GPU driver installed correctly.

If you did not use a startup script to install the GPU driver during instance creation, manually install the GPU driver on your instance so that your system can use the device.

Adding or removing GPUs on existing instances

You can add or detach GPUs on your existing instances, but you must first stop the instance and change its host maintenance setting so that it terminates rather than live-migrating. Instances with GPUs cannot live migrate because they are assigned to specific hardware devices. See GPU restrictions for details.

Also be aware that you must install GPU drivers on the instance after you add a GPU. The boot disk image that you used to create the instance determines which drivers you need. Identify the drivers that are appropriate for the operating system on your instance's boot disk. Read installing GPU drivers for details.

You can add or remove GPUs from an instance using the Google Cloud Platform Console or the API.

Console

You can add or remove GPUs from your instance by stopping the instance and editing your instance's configuration.

  1. Verify that all of your critical applications are stopped on the instance. You must stop the instance before you can add a GPU.

  2. Go to the VM instances page to see your list of instances.

    Go to the VM instances page

  3. On the list of instances, click the name of the instance where you want to add GPUs. The instance details page opens.

  4. At the top of the instance details page, click Stop to stop the instance.

  5. After the instance stops running, click Edit to change the instance properties.

  6. If the instance has a shared-core machine type, you must change the machine type to have one or more vCPUs. You cannot add accelerators to instances with shared-core machine types.

  7. In the Machine type settings, click GPUs to expand the GPU selection list.

  8. Select the number of GPUs and the GPU model that you want to add to your instance. Alternatively, you can set the number of GPUs to None to remove existing GPUs from the instance.

  9. If you added GPUs to an instance, set the host maintenance setting to Terminate. If you removed GPUs from the instance, you can optionally set the host maintenance setting back to Migrate VM instance.

  10. At the bottom of the instance details page, click Save to apply your changes.

  11. After the instance settings are saved, click Start at the top of the instance details page to start the instance again.

API

You can add or remove GPUs from your instance by stopping the instance and changing your instance's configuration through the API.

  1. Verify that all of your critical applications are stopped on the instance. Then create a POST request to stop the instance so that it can move to a host system where GPUs are available.

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/stop
    

    where:

    • [PROJECT_ID] is your project ID.
    • [INSTANCE_NAME] is the name of the instance where you want to add GPUs.
    • [ZONE] is the zone where the instance is located.
  2. Identify the GPU type that you want to add to your instance. Submit a GET request to list the GPU types that are available to your project in a specific zone.

    GET https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes
    

    where:

    • [PROJECT_ID] is your project ID.
    • [ZONE] is the zone where you want to list the available GPU types.
  3. If the instance has a shared-core machine type, you must change the machine type to have one or more vCPUs. You cannot add accelerators to instances with shared-core machine types.

  4. After the instance stops, create a POST request to add GPUs to or remove GPUs from your instance.

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/setMachineResources
    
    {
     "guestAccelerators": [
      {
        "acceleratorCount": [ACCELERATOR_COUNT],
        "acceleratorType": "https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes/[ACCELERATOR_TYPE]"
      }
     ]
    }
    

    where:

    • [INSTANCE_NAME] is the name of the instance.
    • [PROJECT_ID] is your project ID.
    • [ZONE] is the zone for this instance.
    • [ACCELERATOR_COUNT] is the number of GPUs that you want on your instance. To remove GPUs from your instance, set this value to 0. See GPUs on Compute Engine for a list of GPU limits based on the machine type of your instance.
    • [ACCELERATOR_TYPE] is the GPU model that you want to use. See GPUs on Compute Engine for a list of available GPU models.
  5. Create a POST request to set the scheduling options for the instance. If you are adding GPUs to an instance, you must specify "onHostMaintenance": "TERMINATE". Optionally, if you are removing GPUs from an instance, you can specify "onHostMaintenance": "MIGRATE".

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/setScheduling
    
    {
     "onHostMaintenance": "[MAINTENANCE_TYPE]",
     "automaticRestart": true
    }
    

    where:

    • [PROJECT_ID] is your project ID.
    • [INSTANCE_NAME] is the name of the instance where you want to add GPUs.
    • [ZONE] is the zone where the instance is located.
    • [MAINTENANCE_TYPE] is the action you want your instance to take when host maintenance is necessary. Specify TERMINATE if you are adding GPUs to your instance. Alternatively, specify MIGRATE if you have removed all of the GPUs from your instance and want the instance to resume live migration on host maintenance events.
  6. Start the instance.

    POST https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/[ZONE]/instances/[INSTANCE_NAME]/start
    

    where:

    • [PROJECT_ID] is your project ID.
    • [INSTANCE_NAME] is the name of the instance where you want to add GPUs.
    • [ZONE] is the zone where the instance is located.

Next install the GPU driver on your instance so that your system can use the device.

Creating groups of GPU instances using instance templates

You can use instance templates to create managed instance groups with GPUs added to each instance. Managed instance groups use the template to create multiple identical instances. You can scale the number of instances in the group to match your workload. You can create the instance templates with GPUs only through the gcloud beta command-line tool.

Follow the guide to create an instance template and include the --accelerators and --maintenance-policy TERMINATE flags. Optionally, you can include the --metadata startup-script flag and specify a startup script to install the GPU driver while the instance starts up. See installing GPU drivers for example scripts that work on GPU instances.

As an example, you could create an instance template with 2 vCPUs, a 250 GB boot disk with Ubuntu 16.04, an NVIDIA® Tesla® K80 GPU, and a startup script. The startup script instructs the instance to install the CUDA Toolkit with its recommended driver version.

gcloud beta compute instance-templates create gpu-template \
    --machine-type n1-standard-2 \
    --boot-disk-size 250GB \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='#!/bin/bash
    echo "Checking for CUDA and installing."
    # Check for CUDA and try to install.
    if ! dpkg-query -W cuda-8-0; then
      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      apt-get update
      apt-get install cuda-8-0 -y
    fi'

After you create the template, use the template to create an instance group. Every time you add an instance to the group, it starts that instance using the settings in the instance template.
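
For example, the following command is a minimal sketch that creates a group of two instances from the template above (the group name gpu-group is a hypothetical choice):

gcloud compute instance-groups managed create gpu-group \
    --zone us-east1-d \
    --template gpu-template \
    --size 2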

To learn more about managing and scaling groups of instances, read Creating Groups of Managed Instances.

GPUs on preemptible instances

You can start a preemptible VM instance with GPUs and Compute Engine will charge you preemptible prices for the GPUs. GPUs attached to preemptible instances work like normal GPUs but will only persist for the life of the instance. You can request a separate Preemptible GPU quota for preemptible GPUs but you can also choose to use your regular GPU quota when creating preemptible GPUs.
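
For example, the following command is a minimal sketch, based on the earlier gcloud example, that starts a preemptible instance with one NVIDIA® Tesla® K80 GPU (the instance name is hypothetical, and a driver startup script is omitted for brevity):

gcloud compute instances create gpu-preemptible-1 --preemptible \
    --machine-type n1-standard-2 --zone us-east1-d \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE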

Non-preemptible GPU instances are terminated for host maintenance events, but can be configured to automatically restart, and Google provides a one-hour advance notice before these maintenance events terminate an instance. Preemptible instances with GPUs attached do not receive this one-hour notice and are preempted by default during maintenance events. These instances cannot be set to automatically restart.

For more details on GPUs, review the GPUs documentation.

Installing GPU drivers

After you create an instance with one or more GPUs, your system requires device drivers so that your applications can access the device. This guide demonstrates basic procedures to install NVIDIA proprietary drivers on instances with public images.

You can install GPU drivers through one of the following options:

  • Installing GPU drivers using scripts
  • Manually installing GPU drivers

Installing GPU drivers using scripts

NVIDIA GPUs running on Google Compute Engine must use one of the following NVIDIA driver versions:

  • 375.51
  • 384.66 or greater

For most driver installs, you can obtain these drivers by installing the NVIDIA CUDA Toolkit.

On some images, you can use scripts to simplify the driver install process. You can either specify these scripts as startup scripts on your instances or copy these scripts to your instances and run them through the terminal as a user with sudo privileges.
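
For example, assuming you saved one of the scripts below to a local file named install-cuda.sh (a hypothetical name), one way to copy it to an instance and run it with root privileges is the following sketch:

gcloud compute scp install-cuda.sh gpu-instance-1:~/ --zone us-east1-d
gcloud compute ssh gpu-instance-1 --zone us-east1-d --command "sudo bash ~/install-cuda.sh"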

You must prepare the script so that it works with the boot disk image that you selected. If you imported a custom boot disk image for your instances, you might need to customize the startup script to work correctly with that custom image.

For Windows Server instances and SLES 12 instances where you cannot automate the driver installation process, install the driver manually.

For public images, you can install CUDA with the associated drivers for NVIDIA® Tesla® K80 GPUs using the following sample scripts:

CentOS

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

CentOS 7 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  yum clean all
  yum install epel-release -y
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

On instances with CentOS 7 images, you might need to reboot the instance after the script finishes installing the drivers and the CUDA packages. Reboot the instance if the script is finished and the nvidia-smi command returns the following error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA
driver. Make sure that the latest NVIDIA driver is installed and
running.

CentOS 6 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  yum clean all
  yum install epel-release -y
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

RHEL

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

RHEL 7 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel7-8.0.61-1.x86_64.rpm
  curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
  rpm -i --force ./epel-release-latest-7.noarch.rpm
  yum clean all
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

On instances with RHEL 7 images, you might need to reboot the instance after the script finishes installing the drivers and the CUDA packages. Reboot the instance if the script is finished and the nvidia-smi command returns the following error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA
driver. Make sure that the latest NVIDIA driver is installed and
running.

RHEL 6 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-rhel6-8.0.61-1.x86_64.rpm
  curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
  rpm -i --force ./epel-release-latest-6.noarch.rpm
  yum clean all
  yum update -y
  yum install cuda-8-0 -y
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  yum install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

SLES

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

SLES 12 - CUDA 8:

On SLES 12 instances, install the driver manually.

SLES 11 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! rpm -q cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/sles114/x86_64/cuda-repo-sles114-8.0.61-1.x86_64.rpm
  rpm -i --force ./cuda-repo-sles114-8.0.61-1.x86_64.rpm
  zypper --gpg-auto-import-keys refresh
  zypper install -ny cuda
fi
# Verify that CUDA installed; retry if not.
if ! rpm -q cuda-8-0; then
  zypper install -ny cuda
fi
# Enable persistence mode
nvidia-smi -pm 1

Ubuntu

This script checks for an existing CUDA install and then installs the full CUDA 8 package and its associated proprietary driver.

Ubuntu 16.04 LTS or 17.04 - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda-8-0; then
  # The 16.04 installer also works with 17.04.
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1

Ubuntu 14.04 LTS - CUDA 8:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda-8-0; then
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda-8-0 -y
  apt-get install linux-headers-$(uname -r) -y
fi
# Enable persistence mode
nvidia-smi -pm 1

Windows Server

On Windows Server instances, you must install the driver manually.

After your script finishes running, you can verify that the GPU driver installed correctly.

Manually installing GPU drivers

If you cannot use a script to install the driver for your GPUs, you can manually install the driver yourself. You are responsible for selecting the installer and driver version that works best for your applications. Use this install method if you require a specific driver or you need to install the driver on a custom image or a public image that does not work with one of the install scripts.

You can use this process to manually install drivers on instances with most public images. For custom images, you might need to modify the process to function in your unique environment.

CentOS

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the rpm command to add the repository to your system:

    • CentOS 7

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      

    • CentOS 6

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      

  3. Install the epel-release repository. This repository includes the DKMS packages, which are required to install NVIDIA drivers on CentOS.

    $ sudo yum install epel-release
    

  4. Clean the Yum cache:

    $ sudo yum clean all
    

  5. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo yum install cuda-8-0
    

  6. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

RHEL

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the rpm command to add the repository to your system:

    • RHEL 7

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel7-8.0.61-1.x86_64.rpm
      

    • RHEL 6

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-rhel6-8.0.61-1.x86_64.rpm
      

  3. Install the epel-release repository. This repository includes the DKMS packages, which are required to install NVIDIA drivers. On RHEL, you must download the .rpm for this repository from fedoraproject.org and add it to your system.

    • RHEL 7

      $ curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
      
      $ sudo rpm -i epel-release-latest-7.noarch.rpm
      

    • RHEL 6

      $ curl -O https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
      
      $ sudo rpm -i epel-release-latest-6.noarch.rpm
      

  4. Clean the Yum cache:

    $ sudo yum clean all
    

  5. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo yum install cuda-8-0
    

  6. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

SLES

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the rpm command to add the repository to your system:

    • SLES 12

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/sles12/x86_64/cuda-repo-sles12-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-sles12-8.0.61-1.x86_64.rpm
      

    • SLES 11

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/sles114/x86_64/cuda-repo-sles114-8.0.61-1.x86_64.rpm
      
      $ sudo rpm -i cuda-repo-sles114-8.0.61-1.x86_64.rpm
      

  3. Refresh Zypper:

    $ sudo zypper refresh
    

  4. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo zypper install cuda-8-0
    

  5. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

Ubuntu

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository and add it to your instance. For example, use curl to download the CUDA Toolkit and use the dpkg command to add the repository to your system:

    • Ubuntu 16.04 LTS and 17.04

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      
      $ sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      

    • Ubuntu 14.04 LTS

      $ curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
      
      $ sudo dpkg -i cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
      

  3. Update the package lists:

    $ sudo apt-get update
    

  4. Install CUDA 8, which includes the NVIDIA driver.

    $ sudo apt-get install cuda-8-0
    

  5. On Ubuntu 14.04, you might need to install headers for your current kernel version so that the driver can initialize properly.

    $ sudo apt-get install linux-headers-$(uname -r)
    

  6. Enable persistence mode.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

Windows Server

  1. Connect to the instance where you want to install the driver.

  2. Download an .exe installer file to your instance that includes the NVIDIA 375.51 driver, the NVIDIA 384.66 driver, or a later version. For most Windows Server instances, you can obtain a suitable installer from the NVIDIA driver downloads site or from the CUDA Toolkit downloads page.

    For example, you can open a PowerShell terminal as an administrator and use the wget command to download the driver installer that you need.

    PS C:> wget https://developer.nvidia.com/compute/cuda/8.0/prod/network_installers/cuda_8.0.44_windows_network-exe -o cuda_8.0.44_windows_network.exe
    

  3. Run the .exe installer. For example, you can open a PowerShell terminal as an administrator and run the following command.

    PS C:> .\cuda_8.0.44_windows_network.exe
    

After the installer finishes running, you can verify that the GPU driver installed correctly.

Verifying the GPU driver install

After the driver finishes installing, verify that the driver installed and initialized properly.

Linux

Connect to the Linux instance and use the nvidia-smi command to verify that the driver is running properly.

$ nvidia-smi

Mon Jan 26 10:23:26 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.66                 Driver Version: 384.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:04.0     Off |                    0 |
| N/A   43C    P0    72W / 149W |      0MiB / 11439MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

Windows Server

Connect to the Windows Server instance and use the nvidia-smi.exe tool to verify that the driver is running properly.

PS C:> & 'C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe'

Mon Jan 27 13:06:50 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 369.30                 Driver Version: 369.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           TCC  | 0000:00:04.0     Off |                    0 |
| N/A   52C    P8    30W / 149W |      0MiB / 11423MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

If the driver is not functioning and you used a script to install it, check the startup script logs to ensure that the script finished and did not fail during the install process.
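Where those logs appear depends on the image. As a hedged example, on Debian-based and Ubuntu public images the output is typically written to /var/log/syslog, and you can also read the instance's serial console output from your workstation:

# On the instance: search the system log for startup script output.
sudo grep startup-script /var/log/syslog

# From your workstation: read the serial console output.
gcloud compute instances get-serial-port-output [INSTANCE_NAME] --zone [ZONE]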

Optimizing GPU performance

In general, you can optimize the performance of your GPU devices on Linux instances using the following settings:

  • Enable persistence mode. This setting applies to all of the GPUs on your instance.

    $ sudo nvidia-smi -pm 1
    Enabled persistence mode for GPU 00000000:00:04.0.
    Enabled persistence mode for GPU 00000000:00:05.0.
    All done.
    

  • Set GPU and GPU memory clock speeds to fixed maximum rates:

    • On instances with NVIDIA® Tesla® P100 GPUs:

      $ sudo nvidia-smi -ac 715,1328
      Applications clocks set to "(MEM 715, SM 1328)" for GPU 00000000:00:04.0
      Applications clocks set to "(MEM 715, SM 1328)" for GPU 00000000:00:05.0
      All done.
      

    • On instances with NVIDIA® Tesla® K80 GPUs:

      $ sudo nvidia-smi -ac 2505,875
      Applications clocks set to "(MEM 2505, SM 875)" for GPU 00000000:00:04.0
      Applications clocks set to "(MEM 2505, SM 875)" for GPU 00000000:00:05.0
      All done.
      

  • On instances with NVIDIA® Tesla® K80 GPUs, disable autoboost:

    $ sudo nvidia-smi --auto-boost-default=DISABLED
    All done.
    

Handling host maintenance events

GPU instances must terminate for host maintenance events, but can automatically restart. These maintenance events typically occur once per week, but can occur more frequently when necessary.

You can deal with maintenance events using the following processes:

  • Avoid these disruptions by regularly restarting your instances on a schedule that is more convenient for your applications.
  • Identify when your instance is scheduled for host maintenance and prepare your workload to transition through the system restart.

To receive advance notice of host maintenance events, monitor the /computeMetadata/v1/instance/maintenance-event metadata value. If the request to the metadata server returns NONE, the instance is not scheduled to terminate. For example, run the following command from within an instance:

$ curl http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event -H "Metadata-Flavor: Google"

NONE

If the metadata server returns a timestamp, the timestamp indicates when your instance will be forcefully terminated. Compute Engine gives GPU instances a one hour termination notice, while normal instances receive only a 60 second notice. Configure your application to transition through the maintenance event. For example, you might use one of the following techniques:

  • Configure your application to temporarily move work in progress to a Google Cloud Storage bucket, then retrieve that data after the instance restarts.

  • Write data to a secondary persistent disk. When the instance automatically restarts, the persistent disk can be reattached and your application can resume work.
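
To detect a pending maintenance event from within the instance and trigger either of these steps, you can poll the metadata value described above. The following is a minimal sketch; checkpoint.sh is a hypothetical placeholder for your own save-state step:

#!/bin/bash
# Poll the maintenance-event metadata value every 10 seconds. When a
# timestamp appears instead of NONE, run a checkpoint step before the
# instance is terminated.
while true; do
  EVENT=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event" \
      -H "Metadata-Flavor: Google")
  if [ "$EVENT" != "NONE" ]; then
    echo "Host maintenance scheduled: $EVENT"
    ./checkpoint.sh    # hypothetical: save work to Cloud Storage or a persistent disk
    break
  fi
  sleep 10
done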

You can also receive notification of changes in this metadata value without polling. For examples of how to receive advance notice of host maintenance events without polling, read getting live migration notices.
