Creating groups of GPU instances using instance templates

You can use instance templates to create managed instance groups with GPUs added to each instance. Managed instance groups use the template to create multiple identical instances. You can scale the number of instances in the group to match your workload.

Creating an instance template

For steps to create an instance template, see Creating instance templates.

If you create the instance template in the Console, customize the machine type and select the type and number of GPUs that you want to add to the instance template.

If you are using the gcloud command-line tool, include the --accelerator and --maintenance-policy TERMINATE flags. Optionally, include the --metadata startup-script flag and specify a startup script that installs the GPU driver while the instance starts up. For sample scripts that work on GPU instances, see Installing GPU drivers.

The following example creates an instance template with 2 vCPUs, a 250 GB boot disk with Ubuntu 16.04, an NVIDIA® Tesla® K80 GPU, and a startup script. The startup script installs the CUDA Toolkit with its recommended driver version.

gcloud beta compute instance-templates create gpu-template \
    --machine-type n1-standard-2 \
    --boot-disk-size 250GB \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata startup-script='#!/bin/bash
    echo "Checking for CUDA and installing."
    # Check for CUDA and try to install.
    if ! dpkg-query -W cuda-10-0; then
      curl -O
      dpkg -i ./cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
      apt-get update
      apt-get install cuda-10-0 -y
    fi'
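
An inline startup script is easy to break with a stray quote. As a hedged alternative sketch, the script can be kept in a local file and passed with the --metadata-from-file flag, and the resulting template can then be inspected. The file name install-cuda.sh, the template name gpu-template-from-file, and the run helper are assumptions made for illustration. The helper only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Print a command; execute it only when RUN=1 is set explicitly.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Write the startup script to a local file (hypothetical file name).
# Only the package check is shown; the install steps go inside the "if".
cat > install-cuda.sh <<'EOF'
#!/bin/bash
if ! dpkg-query -W cuda-10-0; then
  echo "CUDA not found; run the install steps here."
fi
EOF

# Create the template, reading the startup script from the file.
run gcloud beta compute instance-templates create gpu-template-from-file \
    --machine-type n1-standard-2 \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE --restart-on-failure \
    --metadata-from-file startup-script=install-cuda.sh

# Inspect the template to confirm the GPU and maintenance settings.
run gcloud compute instance-templates describe gpu-template-from-file
```

Run the sketch with RUN=1 only after reviewing the printed commands, because creating a template changes resources in your project.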

Creating an instance group

After you create the template, use it to create an instance group. Every time you add an instance to the group, the group starts that instance using the settings in the instance template.

If you are creating a regional managed instance group, be sure to select zones that specifically support the GPU model that you want. For a list of GPU models and available zones, see GPUs on Compute Engine. The following example creates a regional managed instance group across two zones that support the nvidia-tesla-k80 model.

gcloud beta compute instance-groups managed create example-rmig \
    --template gpu-template --base-instance-name example-instances \
    --size 30 --zones us-east1-c,us-east1-d

Note: If you are choosing specific zones, use the gcloud beta component because the zone selection feature is currently in beta.
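
Two related steps from this page can be sketched with standard gcloud commands: checking which zones offer a GPU model before you choose values for --zones, and resizing the group to match your workload. The us-east1 region and the target size of 10 are illustrative values, and the run helper is an assumption that only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Print a command; execute it only when RUN=1 is set explicitly.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# List the zones that offer the nvidia-tesla-k80 accelerator type.
run gcloud compute accelerator-types list --filter="name=nvidia-tesla-k80"

# List the instances that the regional group currently manages.
run gcloud compute instance-groups managed list-instances example-rmig \
    --region us-east1

# Scale the group to 10 instances (illustrative size).
run gcloud compute instance-groups managed resize example-rmig \
    --size 10 --region us-east1
```

New instances added by a resize are created from the group's instance template, so they receive the same GPU configuration and startup script.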
