Launching a PyTorch Deep Learning VM Instance

Before you begin

If you are using GPUs with your Deep Learning VM, check the quotas page to ensure that you have enough GPUs available in your project. If GPUs are not listed on the quotas page or you require additional GPU quota, request a quota increase.
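If you have the gcloud CLI installed, one way to inspect your current GPU quota from the command line is to look at a region's quota metrics. This is a sketch; the region name is only an example, and the exact metric names depend on the GPU types available to your project:

```shell
# Show quota metrics for a region; GPU quotas appear as entries
# such as NVIDIA_V100_GPUS, with their limit and current usage.
gcloud compute regions describe us-west1 --format="yaml(quotas)"
```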

Launching a PyTorch Deep Learning VM Instance from the Cloud Marketplace

Cloud Marketplace lets you quickly deploy functional software packages that run on Compute Engine. The Deep Learning VM with PyTorch can be launched quickly from the Cloud Marketplace interface without having to use the command line.

Using Cloud Marketplace

  1. Visit the Deep Learning Virtual Machine Image Cloud Marketplace page.
  2. Click Launch on Compute Engine.
  3. Enter a Deployment name, which becomes the root of your VM name. Compute Engine appends -vm to this name when naming your instance.

  4. Set Framework to PyTorch and choose Zone.

To provision an instance with GPUs:

  1. Choose your GPU type. Not all GPU types are available in all zones; confirm that your combination is supported.
  2. Choose the number of GPUs to deploy. Each GPU type supports a different range of counts; confirm that your combination is supported.
  3. An NVIDIA driver is required when using GPUs. You can install the driver yourself, or select the checkbox to have the latest stable driver installed automatically.
  4. Follow the instructions on the page to check your GPU quota, and enter the required phrase to confirm.
  5. In the CPU section, adjust your machine type as needed. For certain workflows, you may want to increase the number of cores (e.g. for CPU-heavy preprocessing) or the amount of memory (e.g. when using the CPU as a parameter server for distributed training).
  6. Click Deploy.
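Before deploying, you can confirm from the command line which GPU types a given zone actually offers; the zone shown here is only an example:

```shell
# List accelerator (GPU) types available in a specific zone.
gcloud compute accelerator-types list --filter="zone:( us-west1-b )"
```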

To provision a CPU-only instance:

  1. In the GPU section, set the number of GPUs to Zero and enter n/a in the Quota confirmation field.
  2. In the CPU section, select your Machine type. Learn more about machine types.
  3. Select your boot disk type and size.
  4. Click Deploy.

Once the VM has been deployed, the page will update with instructions for accessing the instance.

If you've elected to install NVIDIA drivers, allow 3-5 minutes for installation to complete.

Read Connecting to Instances for instructions on connecting to your Deep Learning VM instance through the GCP Console or command line. Your instance name is the Deployment name you specified with -vm appended.
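As a minimal sketch, assuming a Deployment name of my-instance in zone us-west1-b, connecting over SSH from the command line looks like:

```shell
# Compute Engine appends -vm to the Deployment name you chose.
gcloud compute ssh my-instance-vm --zone=us-west1-b
```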

Launching a PyTorch Deep Learning VM Instance from the Command Line

Launching an instance without a GPU

To launch a CPU-only Deep Learning VM with the latest PyTorch image:

export IMAGE_FAMILY="pytorch-latest-cpu"
export ZONE="us-west1-b"
export INSTANCE_NAME="my-instance"

gcloud compute instances create $INSTANCE_NAME \
  --zone=$ZONE \
  --image-family=$IMAGE_FAMILY \
  --image-project=deeplearning-platform-release

Options:

  • --image-family must be either pytorch-latest-cpu or pytorch-VERSION-cpu (e.g. pytorch-0-4-cpu).

  • --image-project must be deeplearning-platform-release.
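If you are unsure which PyTorch image families currently exist, you can list them directly from the public release project rather than guessing version strings:

```shell
# List the PyTorch image families published in the release project.
gcloud compute images list \
  --project=deeplearning-platform-release \
  --filter="family ~ pytorch" \
  --format="value(family)" | sort -u
```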

Launching an instance with a GPU

Compute Engine offers the option of adding GPUs to your virtual machine instances. GPUs offer faster processing for many complex data and machine learning tasks. Learn more about GPUs.

To launch your VM with an attached GPU:

export IMAGE_FAMILY="pytorch-latest-cu92"
export ZONE="us-west1-b"
export INSTANCE_NAME="my-instance"

gcloud compute instances create $INSTANCE_NAME \
  --zone=$ZONE \
  --image-family=$IMAGE_FAMILY \
  --image-project=deeplearning-platform-release \
  --maintenance-policy=TERMINATE \
  --accelerator='type=nvidia-tesla-v100,count=1' \
  --metadata='install-nvidia-driver=True'

Options:

  • --image-family must be pytorch-latest-cu92 or pytorch-VERSION-cu92 (e.g. pytorch-0-4-cu92).

  • --image-project must be deeplearning-platform-release.

  • --maintenance-policy must be TERMINATE. Read GPU Restrictions to learn more.

  • --accelerator specifies the GPU type to use. Must be specified in the format --accelerator='type=TYPE,count=COUNT'. Supported values of TYPE are:

    • nvidia-tesla-v100 (count=1 or 8)
    • nvidia-tesla-p100 (count=1, 2, or 4)
    • nvidia-tesla-p4 (count=1, 2, or 4)
    • nvidia-tesla-k80 (count=1, 2, 4, or 8)

    Not all GPU types are supported in all regions. See GPUs on Compute Engine for details.

  • --metadata is used to specify that the NVIDIA driver should be installed on your behalf. The value is install-nvidia-driver=True. If specified, Compute Engine loads the latest stable driver on the first boot and performs the necessary steps (including a final reboot to activate the driver). It may take up to 5 minutes before your VM is fully provisioned; during this time, you will be unable to SSH into your machine. When the installation is complete, you can verify that it succeeded by connecting over SSH and running nvidia-smi.

    When you've configured your instance, you can save a custom image from it so that you can start derivative instances without having to wait for the driver installation.
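As a sketch of the two steps above (the instance and image names are placeholders, and the instance must be stopped before its boot disk is imaged):

```shell
# Verify the NVIDIA driver is active on the running instance.
gcloud compute ssh my-instance --zone=us-west1-b --command="nvidia-smi"

# Stop the instance, then create a custom image from its boot disk so
# future instances start with the driver already installed.
gcloud compute instances stop my-instance --zone=us-west1-b
gcloud compute images create my-pytorch-image \
  --source-disk=my-instance \
  --source-disk-zone=us-west1-b
```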
