Installing GPU drivers

After you create an instance with one or more GPUs, your system requires device drivers so that your applications can access the GPUs. This guide shows how to install NVIDIA proprietary drivers on instances that use public images.

To install GRID drivers for virtual workstations, see Installing GRID drivers for virtual workstations.

Before you begin

Each version of CUDA requires a minimum GPU driver version. To check the minimum driver required for your version of CUDA, see CUDA Toolkit and Compatible Driver Versions.

NVIDIA GPUs running on Compute Engine must use the following driver versions:

  • Linux instances:

    • NVIDIA 410.79 driver or greater
  • Windows Server instances:

    • NVIDIA 426.00 driver or greater
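
As a rough illustration of what "or greater" means for dotted driver versions, the following sketch compares two version strings. The function name version_at_least is hypothetical, and the sketch assumes a Linux instance where sort supports the -V (version sort) option from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when version $1 is at
# least version $2. Assumes GNU coreutils `sort -V` is available.
version_at_least() {
  # Version-sort both strings and keep the lower one.
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  # $1 meets the minimum when the lower of the two is the minimum itself.
  [ "$lowest" = "$2" ]
}

# Example (not run here): compare an installed Linux driver version with
# the 410.79 minimum listed above.
#   version_at_least "418.87.01" "410.79" && echo "driver is new enough"
```

A plain string or numeric comparison would rank 410.104 below 410.79, which is why the sketch uses a version-aware sort.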

For most driver installs, you can obtain these drivers by installing the NVIDIA CUDA Toolkit.

Use the following steps to install CUDA and the associated drivers for NVIDIA® GPUs. Review your application needs to determine the driver version that works best. If the software you are using requires a specific version of CUDA, modify the commands to download the version of CUDA that you need.

For information about support for CUDA, and for steps to modify your CUDA installation, see the CUDA Toolkit Documentation.

You can use this process to manually install drivers on instances with most public images. For custom images, you might need to modify the process to function in your unique environment.

To ensure a successful installation, your operating system must have the latest package updates.
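
The steps that follow differ by operating system and release. As a minimal sketch, the helper below maps the ID and VERSION_ID fields of /etc/os-release to the repository directory names that appear in the download URLs in this guide. The function name cuda_repo_slug is hypothetical, it covers only the releases listed in this guide, and older releases such as CentOS/RHEL 6 may not provide /etc/os-release at all.

```shell
#!/bin/sh
# Hypothetical helper: prints the repository directory name used in this
# guide's URLs for a given os-release ID ($1) and VERSION_ID ($2).
# Returns non-zero for releases that this guide does not cover.
cuda_repo_slug() {
  os_id="$1"
  os_version="$2"
  case "$os_id:$os_version" in
    centos:6*|rhel:6*) echo "rhel6" ;;
    centos:7*|rhel:7*) echo "rhel7" ;;
    centos:8*|rhel:8*) echo "rhel8" ;;
    sles:15*)          echo "sles15" ;;
    sles:12.4)         echo "sles124" ;;
    ubuntu:16.04)      echo "ubuntu1604" ;;
    ubuntu:18.04)      echo "ubuntu1804" ;;
    *)
      echo "not covered by this guide: $os_id $os_version" >&2
      return 1
      ;;
  esac
}

# Example on an instance (not run here):
#   . /etc/os-release && cuda_repo_slug "$ID" "$VERSION_ID"
```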

CentOS/RHEL

  1. Install the latest kernel package. If needed, this command also reboots the system.

    sudo yum clean all
    sudo yum install -y kernel | grep -q 'already installed' || sudo reboot
    
  2. If the system rebooted in the previous step, reconnect to the instance.

  3. Install kernel headers and development packages.

    sudo yum install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r)
    
  4. Select a driver repository for the CUDA Toolkit and add it to your instance.

    • CentOS/RHEL 8

      sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-repo-rhel8-10.1.243-1.x86_64.rpm
      
    • CentOS/RHEL 7

      sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.0.130-1.x86_64.rpm
      
    • CentOS/RHEL 6

      sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-10.0.130-1.x86_64.rpm
      
  5. Install the epel-release repository. This repository includes the DKMS packages, which are required to install NVIDIA drivers on CentOS.

    • CentOS 6/7/8 and RHEL 6/7

      sudo yum install epel-release
      
    • RHEL 8 only

      sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
      
  6. Clean the Yum cache:

    sudo yum clean all
    
  7. Install CUDA, which includes the NVIDIA driver.

    sudo yum install cuda
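
The kernel update command in step 1 pipes the package manager output into grep and reboots only when the text 'already installed' is absent. The sketch below isolates that decision so it can be read on its own. The function name kernel_was_updated is hypothetical, and the sample messages in the comments are illustrative rather than exact yum output.

```shell
#!/bin/sh
# Hypothetical helper: reads package-manager output on stdin and succeeds
# (exit status 0) when the text does NOT contain 'already installed',
# which the guide treats as "a new kernel was installed, so reboot".
kernel_was_updated() {
  ! grep -q 'already installed'
}

# Example (not run here), equivalent in effect to the command in step 1:
#   if sudo yum install -y kernel | kernel_was_updated; then
#     sudo reboot
#   fi
```

One consequence of this design is that any output lacking the phrase, including an error message from a failed install, also triggers the reboot, so it is worth checking the package manager output if the instance restarts unexpectedly.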
    

SLES

  1. Connect to the instance where you want to install the driver.

  2. Install the latest kernel package. If needed, this command also reboots the system.

    sudo zypper refresh
    sudo zypper up -y kernel-default | grep -q 'already installed' || sudo reboot
    
  3. If the system rebooted in the previous step, reconnect to the instance.

  4. Select a driver repository for the CUDA Toolkit and add it to your instance.

    • SLES 15

      sudo rpm --import https://developer.download.nvidia.com/compute/cuda/repos/sles15/x86_64/7fa2af80.pub
      sudo zypper install https://developer.download.nvidia.com/compute/cuda/repos/sles15/x86_64/cuda-repo-sles15-10.0.130-1.x86_64.rpm
      
    • SLES 12 with Service Pack 4

      sudo rpm --import https://developer.download.nvidia.com/compute/cuda/repos/sles124/x86_64/7fa2af80.pub
      sudo zypper install https://developer.download.nvidia.com/compute/cuda/repos/sles124/x86_64/cuda-repo-sles124-10.1.243-1.x86_64.rpm
      
  5. Refresh Zypper.

    sudo zypper refresh
    
  6. Install CUDA, which includes the NVIDIA driver.

    sudo zypper install cuda
    

Ubuntu

  1. Connect to the instance where you want to install the driver.

  2. Select a driver repository for the CUDA Toolkit and add it to your instance.

    • Ubuntu 18.04 LTS

      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
      sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
      sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
      
    • Ubuntu 16.04 LTS

      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
      sudo dpkg -i cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
      sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
      
  3. Update the package lists.

    sudo apt-get update
    
  4. Install CUDA, which includes the NVIDIA driver.

    sudo apt-get install cuda
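
If your software requires a different CUDA version, the repository package URL in step 2 is the part of the commands to change. The sketch below rebuilds that URL from its two variable parts, following the pattern of the Ubuntu URLs in this guide. The function name cuda_repo_deb_url is hypothetical, and the sketch does not check that a package exists for a given release and version, so confirm the URL against the NVIDIA repository before using it.

```shell
#!/bin/sh
# Hypothetical helper: prints the repository package URL for an Ubuntu
# release directory ($1, for example ubuntu1804) and a repository package
# version ($2, for example 10.0.130-1), following the URL pattern used in
# this guide. It does not verify that the package exists.
cuda_repo_deb_url() {
  release="$1"
  version="$2"
  base="http://developer.download.nvidia.com/compute/cuda/repos"
  echo "$base/$release/x86_64/cuda-repo-${release}_${version}_amd64.deb"
}

# Example (not run here):
#   curl -O "$(cuda_repo_deb_url ubuntu1804 10.0.130-1)"
```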
    

Windows Server

  1. Connect to the instance where you want to install the driver.

  2. Download to your instance an .exe installer that includes a driver from the R426 branch: NVIDIA 426.00 driver or greater. For most Windows Server instances, you can use the CUDA Toolkit network installer shown in the following example.

    For example, in Windows Server 2019, you can open a PowerShell terminal as an administrator and use the wget command to download the driver installer that you need.

    PS C:\> wget https://developer.download.nvidia.com/compute/cuda/10.1/Prod/network_installers/cuda_10.1.243_win10_network.exe -O cuda_10.1.243_win10_network.exe
    
  3. Run the .exe installer. For example, you can open a PowerShell terminal as an administrator and run the following command.

    PS C:\> .\cuda_10.1.243_win10_network.exe
    

Verifying the GPU driver install

After completing the driver installation steps, verify that the driver installed and initialized properly.

Linux

Connect to the Linux instance and use the nvidia-smi command to verify that the driver is running properly.

nvidia-smi

The output resembles the following:

Wed Jan  2 19:51:51 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   42C    P8     7W /  75W |     62MiB /  7611MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
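
To check the result in a script rather than by eye, one option is to pull the driver version out of the banner line shown above. The following is a minimal sketch that assumes the banner keeps the "Driver Version: X" layout from the sample output. The function name parse_driver_version is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads nvidia-smi output on stdin and prints the
# value that follows "Driver Version:" on the first line that has one.
# Prints nothing when no such line is found.
parse_driver_version() {
  sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p' | head -n 1
}

# Example on an instance (not run here):
#   nvidia-smi | parse_driver_version
#
# nvidia-smi can also report the version directly, without parsing:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
```

With the sample output above, the helper prints 410.79, which you can then compare with the minimum version that your CUDA release requires.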

Windows Server

Connect to the Windows Server instance and use the nvidia-smi.exe tool to verify that the driver is running properly.

& 'C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe'

The output resembles the following:

Mon Aug 26 18:09:03 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 426.00      Driver Version: 426.00       CUDA Version: 10.1      |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P4            TCC  | 00000000:00:04.0 Off |                    0 |
| N/A   27C    P8     7W /  75W |      0MiB /  7611MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

What's next?