Managing the lifecycle of a Cloud Datalab instance

This page describes the lifecycle of a Cloud Datalab instance and the options available for managing and conserving compute resources.

Cloud Datalab runs inside a Google Compute Engine VM with an attached persistent disk that stores your notebooks. Cloud Datalab VMs are connected to a dedicated network in your project named datalab-network. By default, this network allows only incoming SSH connections.

Prerequisites

To use the commands discussed below, you must have done the following:

  1. Installed the Cloud SDK, including the datalab component
  2. Authenticated with the gcloud command-line tool
  3. Configured the gcloud command-line tool to use your selected project and zone
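
The three prerequisites above can be sketched as a small shell function. This is a minimal sketch rather than an official setup script: the function name setup_datalab_prereqs is hypothetical, the project ID and zone are values you supply, and it assumes the Cloud SDK is already installed so that gcloud is on your PATH.

```shell
# Hypothetical helper: prepares gcloud for the datalab commands on this page.
# Arguments: $1 = project ID, $2 = Compute Engine zone (both supplied by you).
setup_datalab_prereqs() {
  if [ "$#" -ne 2 ]; then
    echo "usage: setup_datalab_prereqs PROJECT_ID ZONE" >&2
    return 1
  fi
  # 1. Install the datalab component of the Cloud SDK.
  gcloud components install datalab || return 1
  # 2. Authenticate the gcloud command-line tool (opens a browser prompt).
  gcloud auth login || return 1
  # 3. Select the project and zone that later commands will use.
  gcloud config set project "$1" || return 1
  gcloud config set compute/zone "$2"
}
```

The function stops at the first step that fails, so a failed login does not go on to change your configured project or zone.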

Creating an instance

You create a Cloud Datalab instance using the datalab create command.

datalab create instance_name

There are several command-line options available with this command. For example, if you want to create an instance with more memory than the default, you can pass in the --machine-type flag:

datalab create --machine-type n1-highmem-2 instance_name

To list all available options, run:

datalab create --help

By default, the datalab create command connects to the newly created instance. To create the instance but not connect to it, pass in the --no-connect flag:

datalab create --no-connect instance_name

The datalab create command also creates the following Google Cloud Platform resources (if not already available):

  • The datalab-network network
  • A firewall rule on the datalab-network network that allows incoming SSH connections
  • The datalab-notebooks Google Cloud Source Repository
  • The persistent disk for storing Cloud Datalab notebooks

Note that creating some of these resources may require owner permission (see Using Cloud Datalab in a team environment).
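
To check which of the resources listed above already exist in your project, you can list them with gcloud. The following is a sketch under stated assumptions: the function name list_datalab_resources is hypothetical, gcloud must already be configured for the project that holds the instance, and the filter expressions are one plausible way to narrow the output, not the only one.

```shell
# Hypothetical helper: lists the kinds of resources that datalab create sets up.
# Assumes gcloud is already configured for the relevant project.
list_datalab_resources() {
  # The datalab-network network.
  gcloud compute networks list --filter="name=datalab-network"
  # Firewall rules attached to that network.
  gcloud compute firewall-rules list --filter="network:datalab-network"
  # Cloud Source Repositories in the project (look for datalab-notebooks).
  gcloud source repos list
  # Persistent disks in the project (look for the disk holding your notebooks).
  gcloud compute disks list
}
```

The last two commands are left unfiltered so that you can inspect the full list yourself.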

Connecting to an instance

The datalab tool can create a persistent SSH tunnel to your Cloud Datalab instance that allows you to connect to the instance from your local browser as though Cloud Datalab were running on your local machine.

To create this connection, use the datalab connect command:

datalab connect instance_name

The datalab connect command restarts your instance if it is not running. The command continues to run until you stop it (the connection remains available for as long as the command is running).

By default, the local port used for the connection is 8081. To change to a different port, pass in the --port flag. For example, to use local port 8082, run the following:

datalab connect --port 8082 instance_name
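
The connection steps above can be wrapped in a small helper that falls back to the default port and reminds you which local address to open. This is only a sketch: the function name connect_datalab and the variable datalab_port are hypothetical, and the printed address simply combines localhost with the local port described above.

```shell
# Hypothetical helper: connects to an instance on a chosen local port.
# Arguments: $1 = instance name, $2 = optional local port.
connect_datalab() {
  if [ "$#" -lt 1 ]; then
    echo "usage: connect_datalab INSTANCE_NAME [LOCAL_PORT]" >&2
    return 1
  fi
  # Fall back to the default local port, 8081, when no port is given.
  datalab_port="${2:-8081}"
  echo "While this command runs, open http://localhost:${datalab_port}/ in your browser."
  # Runs until you stop it; the connection closes when the command exits.
  datalab connect --port "$datalab_port" "$1"
}
```

For example, connect_datalab instance_name 8082 would run the same command as the --port example above.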

Stopping an instance

To avoid incurring unnecessary costs while you are not using Cloud Datalab, stop your instance with the following command:

datalab stop instance_name

When you are ready to start using Cloud Datalab again, run the datalab connect command to restart the instance.

Deleting and recreating an instance without deleting the notebooks disk

To change VM properties, such as the machine type or the service account, you can delete and then recreate the VM without losing the notebooks stored on the persistent disk:

datalab delete --keep-disk instance_name
datalab create instance_name
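
As an illustration, the two commands above can be combined with the --machine-type flag from the Creating an instance section to move an instance to a different machine type. This is a minimal sketch, assuming the instance is recreated under the same name so that it picks up the retained disk; the function name recreate_datalab_instance is hypothetical.

```shell
# Hypothetical helper: recreates an instance with a new machine type
# while keeping the persistent disk that holds the notebooks.
# Arguments: $1 = instance name, $2 = new machine type.
recreate_datalab_instance() {
  if [ "$#" -ne 2 ]; then
    echo "usage: recreate_datalab_instance INSTANCE_NAME MACHINE_TYPE" >&2
    return 1
  fi
  # Delete the VM but keep its persistent disk; stop here if deletion fails.
  datalab delete --keep-disk "$1" || return 1
  # Recreate the VM under the same name with the new machine type.
  datalab create --machine-type "$2" "$1"
}
```

For example, recreate_datalab_instance instance_name n1-highmem-2 would recreate the instance with more memory. The create step runs only if the delete step succeeds.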

Deleting an instance and the notebooks disk

By default, the datalab delete command does not delete the persistent disk holding your notebooks. This allows you to easily change the VM without accidentally losing your data (see Deleting and recreating an instance without deleting the notebooks disk).

If you want to delete both the VM and the attached persistent disk, then add the --delete-disk flag to the command:

datalab delete --delete-disk instance_name

Reducing usage of compute resources

Google Compute Engine VMs incur costs. You are charged for the time that a Cloud Datalab instance is running whether or not you are using it. You can reduce Cloud Datalab VM charges by stopping the instance when you are not using it. You will continue to incur charges for the resources attached to the VM (such as the persistent disk and the external IP address), but the VM instance itself will not incur charges while it is stopped.

When you need to use your stopped instance again, run datalab connect instance_name to connect to your instance, and the datalab tool will restart the instance before attempting to connect to it.

To stop incurring all charges associated with a Cloud Datalab instance, you must delete both the VM and the attached persistent disk by running the datalab delete command with the --delete-disk option.
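
Because a running instance is billed whether or not you are using it, it can help to check its state before you walk away. The following sketch stops an instance only if Compute Engine reports it as RUNNING. It assumes the zone is already configured as described in the prerequisites; the function name stop_datalab_if_running and the variable datalab_status are hypothetical.

```shell
# Hypothetical helper: stops a Cloud Datalab instance only if it is running.
# Argument: $1 = instance name.
stop_datalab_if_running() {
  if [ "$#" -ne 1 ]; then
    echo "usage: stop_datalab_if_running INSTANCE_NAME" >&2
    return 1
  fi
  # Ask Compute Engine for the VM status (for example RUNNING or TERMINATED).
  datalab_status="$(gcloud compute instances describe "$1" --format='value(status)')" || return 1
  if [ "$datalab_status" = "RUNNING" ]; then
    datalab stop "$1"
  else
    echo "Instance $1 is not running (status: $datalab_status); nothing to stop."
  fi
}
```

Stopping the instance this way still leaves the persistent disk in place, so the attached resources continue to incur charges as described above.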
