This topic provides instructions for creating a new Deep Learning VM instance with TensorFlow and other tools pre-installed. You have the option of including one or more GPUs in your instance on setup.
Before you begin
If you are using GPUs with your Deep Learning VM, check the Quotas page to ensure that you have enough GPUs available in your project.
If GPUs are not listed on the quotas page or you require additional GPU quota, you can request a quota increase. See "Requesting additional quota" on the Compute Engine Resource Quotas page.
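If you prefer the command line, one way to inspect your GPU quota is to describe the region you plan to deploy in and filter for GPU metrics. This is a sketch; the region `us-west1` is an assumption, so substitute your own:

```shell
# Show GPU-related quota metrics (limit and current usage) for a region.
# "us-west1" is a placeholder region; replace it with your target region.
gcloud compute regions describe us-west1 \
  --flatten="quotas[]" \
  --format="table(quotas.metric,quotas.limit,quotas.usage)" \
  | grep GPUS
```

A limit of `0` for the GPU type you want means you need to request a quota increase before deploying.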
Creating a TensorFlow Deep Learning VM instance from the Google Cloud Marketplace
Cloud Marketplace lets you quickly deploy functional software packages that run on Compute Engine. The Deep Learning VM with TensorFlow can be created quickly from the Cloud Marketplace within the Cloud Console without having to use the command line.
You can create a TensorFlow instance with or without GPUs.
Without GPUs
To provision a Deep Learning VM instance without a GPU:
- Visit the AI Platform Deep Learning VM Image Cloud Marketplace page.
- Click Launch on Compute Engine.
- Enter a Deployment name, which will be the root of your VM name. Compute Engine appends `-vm` to this name when naming your instance.
- Choose a Zone or accept the default.
- Select your Machine type. Click Customize to make specific adjustments to the number of cores or memory. To learn more about machine types, see Machine Types.
- In the GPUs section, set the Number of GPUs to None.
- In the Framework section, select the version of TensorFlow that you want.
- Select your Boot disk type and Boot disk size in GB.
- Click Deploy.
Once the VM has been deployed, the page will update with instructions for accessing the instance.
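If you have the Cloud SDK installed, one way to connect is with `gcloud compute ssh`. The deployment name and zone below are placeholders; substitute the values you chose above:

```shell
# The instance name is your Deployment name with "-vm" appended.
# "my-deployment-vm" and "us-west1-b" are placeholders.
gcloud compute ssh my-deployment-vm --zone=us-west1-b
```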
With one or more GPUs
Compute Engine offers the option of adding GPUs to your virtual machine instances. GPUs offer faster processing for many complex data and machine learning tasks. To learn more about GPUs, see GPUs on Compute Engine.
To provision a Deep Learning VM instance with one or more GPUs:
- Visit the AI Platform Deep Learning VM Image Cloud Marketplace page.
- Click Launch on Compute Engine.
- Enter a Deployment name, which will be the root of your VM name. Compute Engine appends `-vm` to this name when naming your instance.
- Choose a Zone or accept the default.
- Select your Machine type. Click Customize to make specific adjustments to the number of cores or memory. To learn more about machine types, see Machine Types.
- In the GPUs section, select the Number of GPUs and GPU type. Not all GPU types are available in all zones; see the GPUs on Compute Engine page to confirm that your GPU type is supported.
- In the Framework section, select the version of TensorFlow that you want.
- An NVIDIA driver is required when using GPUs. You can install the driver yourself, or select the checkbox to have the latest stable driver installed automatically.
- Select your Boot disk type and Boot disk size in GB.
- Follow the instructions on the page to check your GPU quota, and enter the required phrase to confirm.
- Click Deploy.
- If a message tells you you've gone over your GPU quota, follow the instructions in the message to increase it.
If you've elected to install NVIDIA drivers, allow 3-5 minutes for installation to complete.
Once the VM has been deployed, the page will update with instructions for accessing the instance.
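If you opted for automatic driver installation and want to watch its progress, one option is to read the instance's serial console log. The instance name and zone are placeholders here:

```shell
# Prints the serial console output, where the automatic NVIDIA driver
# installation logs its progress. Names below are placeholders.
gcloud compute instances get-serial-port-output my-deployment-vm \
  --zone=us-west1-b
```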
Creating a TensorFlow Deep Learning VM instance from the command line
To use the `gcloud` command-line tool to create a new Deep Learning VM instance, you must first install and initialize the Cloud SDK:
- Download and install the Cloud SDK using the instructions given on Installing Google Cloud SDK.
- Initialize the SDK using the instructions given on Initializing Cloud SDK.
To use `gcloud` in Cloud Shell, first activate Cloud Shell using the instructions given on Starting Cloud Shell.
You can create a TensorFlow instance with or without GPUs.
Without GPUs
To provision a Deep Learning VM instance without a GPU:
```shell
export IMAGE_FAMILY="tf2-ent-latest-cpu"
export ZONE="us-west1-b"
export INSTANCE_NAME="my-instance"

gcloud compute instances create $INSTANCE_NAME \
  --zone=$ZONE \
  --image-family=$IMAGE_FAMILY \
  --image-project=deeplearning-platform-release
```
Options:

- `--image-family` must be one of the following:
  - `tf2-ent-latest-cpu` to get the latest TensorFlow Enterprise 2 image
  - `tf-ent-latest-cpu` to get the latest TensorFlow Enterprise 1 image
  - An earlier TensorFlow or TensorFlow Enterprise image family name (see Choosing an image)
- `--image-project` must be `deeplearning-platform-release`.
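To see which TensorFlow image families are currently published, you can list the images in the release project. This is a sketch; the filter expression is an assumption that matches family names containing `tf`:

```shell
# List the distinct image families in the Deep Learning VM release
# project whose names contain "tf" (TensorFlow variants).
gcloud compute images list \
  --project=deeplearning-platform-release \
  --filter="family~tf" \
  --format="value(family)" | sort -u
```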
With one or more GPUs
Compute Engine offers the option of adding one or more GPUs to your virtual machine instances. GPUs offer faster processing for many complex data and machine learning tasks. To learn more about GPUs, see GPUs on Compute Engine.
To provision a Deep Learning VM instance with one or more GPUs:
```shell
export IMAGE_FAMILY="tf2-ent-latest-gpu"
export ZONE="us-west1-b"
export INSTANCE_NAME="my-instance"

gcloud compute instances create $INSTANCE_NAME \
  --zone=$ZONE \
  --image-family=$IMAGE_FAMILY \
  --image-project=deeplearning-platform-release \
  --maintenance-policy=TERMINATE \
  --accelerator="type=nvidia-tesla-v100,count=1" \
  --metadata="install-nvidia-driver=True"
```
Options:

- `--image-family` must be one of the following:
  - `tf2-ent-latest-gpu` to get the latest TensorFlow Enterprise 2 image
  - `tf-ent-latest-gpu` to get the latest TensorFlow Enterprise 1 image
  - An earlier TensorFlow or TensorFlow Enterprise image family name (see Choosing an image)
- `--image-project` must be `deeplearning-platform-release`.
- `--maintenance-policy` must be `TERMINATE`. To learn more, see GPU Restrictions.
- `--accelerator` specifies the GPU type to use. It must be specified in the format `--accelerator="type=TYPE,count=COUNT"`. Supported values of `TYPE` are:
  - `nvidia-tesla-v100` (`count=1` or `8`)
  - `nvidia-tesla-p100` (`count=1`, `2`, or `4`)
  - `nvidia-tesla-p4` (`count=1`, `2`, or `4`)
  - `nvidia-tesla-k80` (`count=1`, `2`, `4`, or `8`)

  Not all GPU types are supported in all regions. For details, see GPUs on Compute Engine.
- `--metadata` is used to specify that the NVIDIA driver should be installed on your behalf. The value is `install-nvidia-driver=True`. If specified, Compute Engine loads the latest stable driver on the first boot and performs the necessary steps (including a final reboot to activate the driver).
If you've elected to install NVIDIA drivers, allow 3-5 minutes for installation to complete.
It may take up to 5 minutes before your VM is fully provisioned. During this time, you will be unable to SSH into your machine. When the installation is complete, to verify that the driver installation was successful, you can SSH in and run `nvidia-smi`.
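One way to run that check in a single step is to pass the command through SSH. The instance name and zone below are placeholders:

```shell
# Runs nvidia-smi on the remote instance and prints the GPU status.
# "my-instance" and "us-west1-b" are placeholders; use your own values.
gcloud compute ssh my-instance --zone=us-west1-b --command="nvidia-smi"
```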
When you've configured your image, you can save a snapshot of your boot disk so that you can start derivative instances without having to wait for the driver installation.
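A snapshot can be taken with `gcloud compute disks snapshot`. This sketch assumes the boot disk shares the instance's name, which is the Compute Engine default; the names and zone are placeholders:

```shell
# Snapshot the boot disk of a configured instance. By default the boot
# disk has the same name as the instance. All names here are placeholders.
gcloud compute disks snapshot my-instance \
  --zone=us-west1-b \
  --snapshot-names=my-instance-configured
```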
About TensorFlow Enterprise
TensorFlow Enterprise is a distribution of TensorFlow that has been optimized to run on Google Cloud and includes Long Term Version Support.
Creating a preemptible instance
You can create a preemptible Deep Learning VM instance. A preemptible instance is an instance you can create and run at a much lower price than normal instances. However, Compute Engine might stop (preempt) these instances if it requires access to those resources for other tasks. Preemptible instances always stop after 24 hours. To learn more about preemptible instances, see Preemptible VM Instances.
To create a preemptible Deep Learning VM instance:
Follow the instructions located above to create a new instance using the command line, and append the following to the `gcloud compute instances create` command:

`--preemptible`
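Putting it together, the GPU example from above becomes the following (same placeholder zone and instance name as before):

```shell
export IMAGE_FAMILY="tf2-ent-latest-gpu"
export ZONE="us-west1-b"
export INSTANCE_NAME="my-instance"

# Identical to the earlier GPU command, with --preemptible appended.
gcloud compute instances create $INSTANCE_NAME \
  --zone=$ZONE \
  --image-family=$IMAGE_FAMILY \
  --image-project=deeplearning-platform-release \
  --maintenance-policy=TERMINATE \
  --accelerator="type=nvidia-tesla-v100,count=1" \
  --metadata="install-nvidia-driver=True" \
  --preemptible
```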
What's next
For instructions on connecting to your new Deep Learning VM instance through the Cloud Console or command line, see Connecting to Instances. Your instance name is the Deployment name you specified with `-vm` appended.