Create an instance using a custom container

This page describes how to create a Vertex AI Workbench instance based on a custom container.

Overview

Vertex AI Workbench instances support using a custom container derived from one of the Google-provided base containers. You can modify these base containers to make a custom container image and use these custom containers to create a Vertex AI Workbench instance.

The base containers are configured with a Container-Optimized OS in the host virtual machine (VM). The host image is built from the cos-stable image family.

Limitations

Consider the following limitations when planning your project:

The custom container must be derived from a Google-provided base container. Using a container that isn't derived from a base container increases the risk of compatibility issues and limits our ability to support your usage of Vertex AI Workbench instances.
Use of more than one container with a Vertex AI Workbench instance isn't supported.
Supported metadata for custom containers from user-managed notebooks and managed notebooks can have different behavior when used with Vertex AI Workbench instances.
The VM hosting the custom container runs Container-Optimized OS, which restricts how you can interact with the host machine. For example, Container-Optimized OS doesn't include a package manager. This means that operations that install or modify packages on the host must be performed from a container with mounts. This affects post-startup scripts that are migrated from managed notebooks instances and user-managed notebooks instances, where the host machine contains significantly more tooling than Container-Optimized OS.
Vertex AI Workbench instances use nerdctl (a containerd CLI) to run the custom container. This is required for compatibility with the Image streaming service. Any container parameters that are added using a metadata value must be supported by nerdctl.
Vertex AI Workbench instances are configured to pull either from Artifact Registry or a public container repository. To configure an instance to pull from a private repository, you must manually configure the credentials used by containerd.

Base containers

Standard base container

The standard base container supports all Vertex AI Workbench features and includes the following:

Pre-installed data science packages.
CUDA libraries similar to Deep Learning Containers.
Google Cloud JupyterLab integrations such as the Dataproc and BigQuery integrations.
Common system packages such as curl or git.
Metadata-based JupyterLab configuration.
Micromamba-based kernel management.

Specifications

The standard base container has the following specifications:

Base image: nvidia/cuda:12.6.1-cudnn-devel-ubuntu24.04
Image size: Approximately 22 GB
URI: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest

Slim base container

The slim base container provides a minimal set of configurations that permit a proxy connection to the instance. Standard Vertex AI Workbench features and packages aren't included, except for the following:

JupyterLab
Metadata-based JupyterLab configuration
Micromamba-based kernel management

Additional packages or JupyterLab extensions must be installed and managed independently.

Specifications

The slim base container has the following specifications:

Base image: marketplace.gcr.io/google/ubuntu24.04
Image size: Approximately 2 GB
URI: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container-slim:latest

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Notebooks API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the API

Required roles

To get the permissions that you need to create a Vertex AI Workbench instance with a custom container, ask your administrator to grant you the following IAM roles:

Notebooks Runner (roles/notebooks.runner) on the user account
To pull images from the Artifact Registry repository: Artifact Registry Reader (roles/artifactregistry.reader) on the service account

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.
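
If you administer the project yourself, the two role grants above can be sketched with the gcloud CLI as follows. This is a minimal sketch, not the only way to grant access: it assumes project-level bindings, and USER_EMAIL and SERVICE_ACCOUNT_EMAIL are hypothetical placeholders for the user account and the instance's service account.

```shell
# Grants the roles listed above at the project level (an assumption; your
# organization might prefer narrower bindings).
# Arguments: PROJECT_ID USER_EMAIL SERVICE_ACCOUNT_EMAIL (hypothetical names).
grant_workbench_roles() {
  project_id="$1"
  user_email="$2"
  service_account_email="$3"

  # Notebooks Runner for the user account.
  gcloud projects add-iam-policy-binding "$project_id" \
    --member="user:${user_email}" \
    --role="roles/notebooks.runner"

  # Artifact Registry Reader for the service account that pulls the image.
  gcloud projects add-iam-policy-binding "$project_id" \
    --member="serviceAccount:${service_account_email}" \
    --role="roles/artifactregistry.reader"
}

# Example call, commented out so that nothing runs by accident:
# grant_workbench_roles PROJECT_ID USER_EMAIL SERVICE_ACCOUNT_EMAIL
```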

Create a custom container

To create a custom container for use with Vertex AI Workbench instances:

Create a derivative container derived from a Google-provided base container image.
Build and push the container to Artifact Registry. You'll use the container's URI when you create your Vertex AI Workbench instance. For example, the URI might look like this: gcr.io/PROJECT_ID/IMAGE_NAME.
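
For images stored in an Artifact Registry Docker repository, the URI follows the REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME pattern used later on this page. The helper below is an illustrative sketch that assembles such a URI; the function name and the sample values are hypothetical.

```shell
# Builds an Artifact Registry image URI of the form
# REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME[:TAG].
# The tag is optional because the gcloud CLI takes the repository and the
# tag as separate flags.
artifact_registry_uri() {
  region="$1"
  project_id="$2"
  repository="$3"
  image="$4"
  tag="${5:-}"

  uri="${region}-docker.pkg.dev/${project_id}/${repository}/${image}"
  if [ -n "$tag" ]; then
    uri="${uri}:${tag}"
  fi
  printf '%s\n' "$uri"
}

# Sample values only; replace them with your own.
artifact_registry_uri us-central1 my-project my-repo my-image latest
# prints us-central1-docker.pkg.dev/my-project/my-repo/my-image:latest
```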

Create the instance

You can create a Vertex AI Workbench instance based on a custom container by using the Google Cloud console or the Google Cloud CLI.

Console

To create a Vertex AI Workbench instance based on a custom container, do the following:

In the Google Cloud console, go to the Instances page.

Go to Instances
Click Create new.
In the New instance dialog, click Advanced options.
In the Create instance dialog, in the Environment section, select Use custom container.
For Docker container image, click Select.
In the Select container image dialog, navigate to the container image that you want to use, and then click Select.
Optional. For Post-startup script, enter a path to a post-startup script that you want to use.
Optional. Add metadata for your instance. To learn more, see Custom container metadata.
Optional. In the Networking section, customize your network settings. To learn more, see Network configuration options.
Complete the rest of the instance creation dialog, and then click Create.

Vertex AI Workbench creates an instance and automatically starts it. When the instance is ready to use, Vertex AI Workbench activates an Open JupyterLab link.

gcloud

Before using any of the command data below, make the following replacements:

INSTANCE_NAME: the name of your Vertex AI Workbench instance; must start with a letter followed by up to 62 lowercase letters, numbers, or hyphens (-), and cannot end with a hyphen
PROJECT_ID: your project ID
LOCATION: the zone where you want your instance to be located
CUSTOM_CONTAINER_URL: the path to the container image repository, for example: gcr.io/PROJECT_ID/IMAGE_NAME
METADATA: custom metadata to apply to this instance, in KEY=VALUE format; for example, to specify a post-startup script, set the post-startup-script metadata key to the script's location: post-startup-script=gs://BUCKET_NAME/hello.sh, which results in the flag "--metadata=post-startup-script=gs://BUCKET_NAME/hello.sh"

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud workbench instances create INSTANCE_NAME \
    --project=PROJECT_ID \
    --location=LOCATION \
    --container-repository=CUSTOM_CONTAINER_URL \
    --container-tag=latest \
    --metadata=METADATA

Windows (PowerShell)

gcloud workbench instances create INSTANCE_NAME `
    --project=PROJECT_ID `
    --location=LOCATION `
    --container-repository=CUSTOM_CONTAINER_URL `
    --container-tag=latest `
    --metadata=METADATA

Windows (cmd.exe)

gcloud workbench instances create INSTANCE_NAME ^
    --project=PROJECT_ID ^
    --location=LOCATION ^
    --container-repository=CUSTOM_CONTAINER_URL ^
    --container-tag=latest ^
    --metadata=METADATA

For more information about the command for creating an instance from the command line, see the gcloud CLI documentation.

Vertex AI Workbench creates an instance and automatically starts it. When the instance is ready to use, Vertex AI Workbench activates an Open JupyterLab link in the Google Cloud console.
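
The naming rule for INSTANCE_NAME can be checked locally before you call the gcloud CLI. The following sketch encodes the rule as stated on this page (a letter followed by up to 62 lowercase letters, numbers, or hyphens, not ending with a hyphen). It assumes that the leading letter must also be lowercase; the function name is hypothetical.

```shell
# Returns success if the name satisfies the INSTANCE_NAME rule described
# above. Assumption: the first letter must be lowercase.
is_valid_instance_name() {
  # LC_ALL=C keeps the [a-z] range limited to ASCII lowercase letters.
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$'
}

if is_valid_instance_name "my-workbench-instance"; then
  echo "valid"
else
  echo "invalid"
fi
# prints valid
```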

Network configuration options

In addition to the general network options, a Vertex AI Workbench instance with a custom container must have access to the Artifact Registry service.

If you have turned off public IP access for your VPC, ensure that you have enabled Private Google Access.

Enable Image streaming

The custom container host is provisioned to interact with Image streaming in Google Kubernetes Engine (GKE), which pulls containers faster and reduces initialization time for large containers once they are cached in the GKE remote file system.

To view the requirements for enabling Image streaming, see Requirements. Often, Image streaming can be used with Vertex AI Workbench instances by enabling the Container File System API.

Enable Container File System API
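
As an alternative to the console, the API can be enabled from the command line. This is a minimal sketch that assumes the service name containerfilesystem.googleapis.com and that your account has permission to enable services in the project; the function name is hypothetical.

```shell
# Enables the Container File System API for the given project.
# Argument: PROJECT_ID.
enable_image_streaming_api() {
  gcloud services enable containerfilesystem.googleapis.com --project="$1"
}

# Example call, commented out so that nothing runs by accident:
# enable_image_streaming_api PROJECT_ID
```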

How the host VM runs the custom container

Instead of using Docker to run the custom container, the host VM uses nerdctl under the Kubernetes namespace to load and run the container. This lets Vertex AI Workbench use Image streaming for custom containers.

# Runs the custom container.
sudo /var/lib/google/nerdctl/nerdctl --snapshotter=gcfs -n k8s.io run --name payload-container

Example installation: custom container with a custom default kernel

The following example shows how to create a new kernel with a pip package pre-installed.

Create a new custom container:

FROM us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest

ENV MAMBA_ROOT_PREFIX=/opt/micromamba

RUN micromamba create -n ENVIRONMENT_NAME -c conda-forge python=PYTHON_VERSION -y

SHELL ["micromamba", "run", "-n", "ENVIRONMENT_NAME", "/bin/bash", "-c"]

RUN micromamba install -c conda-forge pip -y
RUN pip install PACKAGE
RUN pip install ipykernel
RUN python -m ipykernel install --prefix /opt/micromamba/envs/ENVIRONMENT_NAME --name ENVIRONMENT_NAME --display-name KERNEL_NAME
# Creation of a micromamba kernel automatically creates a python3 kernel
# that must be removed if it's in conflict with the new kernel.
RUN rm -rf "/opt/micromamba/envs/ENVIRONMENT_NAME/share/jupyter/kernels/python3"

Add the new container to Artifact Registry:

gcloud auth configure-docker REGION-docker.pkg.dev
docker build -t REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME .
docker push REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME:latest
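
The push command assumes that the REPOSITORY_NAME repository already exists in Artifact Registry. If it might not, a sketch like the following creates it only when it's missing. The function name is hypothetical, and the check treats any failure of the describe command as "repository not found", which is a simplification.

```shell
# Creates a Docker-format Artifact Registry repository if it doesn't exist.
# Arguments: REPOSITORY_NAME REGION PROJECT_ID.
ensure_docker_repository() {
  repository="$1"
  region="$2"
  project_id="$3"

  if gcloud artifacts repositories describe "$repository" \
      --location="$region" --project="$project_id" >/dev/null 2>&1; then
    echo "Repository ${repository} already exists"
  else
    gcloud artifacts repositories create "$repository" \
      --repository-format=docker \
      --location="$region" \
      --project="$project_id"
  fi
}

# Example call, commented out so that nothing runs by accident:
# ensure_docker_repository REPOSITORY_NAME REGION PROJECT_ID
```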

Create an instance:

gcloud workbench instances create INSTANCE_NAME \
    --project=PROJECT_ID \
    --location=ZONE \
    --container-repository=REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY_NAME/IMAGE_NAME \
    --container-tag=latest

Persistent kernels for custom containers

Vertex AI Workbench custom containers only mount a data disk to the /home/USER directory within each container, where jupyter is the default user. This means that any change outside of /home/USER is ephemeral and won't persist after a restart. If you need installed packages to persist for a specific kernel, you can create a kernel in the /home/USER directory.

To create a kernel in the /home/USER directory:

Create a micromamba environment:

micromamba create -p /home/USER/ENVIRONMENT_NAME -c conda-forge python=PYTHON_VERSION -y
micromamba activate /home/USER/ENVIRONMENT_NAME
pip install ipykernel
pip install -r ~/requirement.txt
python -m ipykernel install --prefix "/home/USER/ENVIRONMENT_NAME" --display-name "Example Kernel"

Replace the following:

USER: the user directory name, which is jupyter by default
ENVIRONMENT_NAME: the name of the environment
PYTHON_VERSION: the Python version, for example 3.11

Wait 30 seconds to 1 minute for the kernels to refresh.
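
If the kernel doesn't appear, you can check that its kernel spec was written under the environment prefix. The following sketch relies on ipykernel's layout of PREFIX/share/jupyter/kernels/NAME/kernel.json, where NAME defaults to python3 when --name isn't passed. The function name is hypothetical.

```shell
# Returns success if a kernel spec exists under the given environment prefix.
# Arguments: ENV_PREFIX [KERNEL_DIR_NAME]; the directory name defaults to
# python3, which ipykernel uses when --name isn't passed.
kernel_spec_exists() {
  env_prefix="$1"
  kernel_dir_name="${2:-python3}"
  [ -f "${env_prefix}/share/jupyter/kernels/${kernel_dir_name}/kernel.json" ]
}

# Replace ENVIRONMENT_NAME with the name of your environment.
if kernel_spec_exists "/home/jupyter/ENVIRONMENT_NAME"; then
  echo "Kernel spec found"
else
  echo "Kernel spec not found"
fi
```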

Updating the startup of the base container

The base container for a Vertex AI Workbench instance (us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest) starts JupyterLab by running /run_jupyter.sh.

If you modify the container's startup in a derivative container, you must append /run_jupyter.sh to run the default configuration of JupyterLab.

The following is an example of how the Dockerfile might be modified:

# Dockerfile
FROM us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest

# Copy the startup script into the root of the image.
COPY startup_file.sh /
# Ensure that you have the correct permissions and startup is executable.
RUN chmod 755 /startup_file.sh && \
    chown jupyter:jupyter /startup_file.sh

# Override the existing CMD directive from the base container.
CMD ["/startup_file.sh"]

# /startup_file.sh (the shebang must be the first line of the actual file)
#!/bin/bash

echo "Running startup scripts"
...

/run_jupyter.sh

Updating the JupyterLab configuration within the base container

If you need to modify the JupyterLab configuration on the base container, you must do the following:

Ensure that JupyterLab is configured to use port 8080. The proxy agent is configured to forward any request to port 8080, and if the Jupyter server isn't listening on the correct port, the instance encounters provisioning issues.
Modify JupyterLab packages under the jupyterlab micromamba environment. We provide a separate package environment to run JupyterLab and its plugin to ensure that there aren't any dependency conflicts with the kernel environment. If you want to install an additional JupyterLab extension, you must install it within the jupyterlab environment. For example:
```
# Dockerfile
FROM us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container:latest
RUN micromamba activate jupyterlab && \
  jupyter nbextension install nbdime
```

Custom container metadata

In addition to the standard list of metadata that can be applied to a Vertex AI Workbench instance, instances with custom containers include the following metadata for managing the instantiation of the payload container:

Feature	Description	Metadata key	Accepted values and defaults
Enables Cloud Storage FUSE on a container image	Mounts `/dev/fuse` onto the container and enables `gcsfuse` for use on the container.	`container-allow-fuse`	`true`: Enables Cloud Storage FUSE. `false` (default): Doesn't enable Cloud Storage FUSE.
Additional container run parameters	Appends additional container parameters to `nerdctl run`, where `nerdctl` is the containerd CLI.	`container-custom-params`	A string of container run parameters. Example: `-v /mnt/disk1:/mnt/disk1`.
Additional container environment flags	Stores environment variables in a file under `/mnt/stateful_partition/workbench/container_env` and appends it to `nerdctl run`.	`container-env-file`	A string of container environment variables. Example: `CONTAINER_NAME=derivative-container`.
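
When you set several metadata keys at once, the gcloud CLI expects them as a comma-separated list of KEY=VALUE pairs in a single --metadata flag. The following sketch joins pairs into that form. The helper name is hypothetical, and it doesn't handle values that themselves contain commas, which need the gcloud CLI's alternate delimiter syntax.

```shell
# Joins KEY=VALUE arguments into one comma-separated string suitable for
# the --metadata flag. Values containing commas aren't handled here.
join_metadata() {
  result=""
  for pair in "$@"; do
    if [ -z "$result" ]; then
      result="$pair"
    else
      result="${result},${pair}"
    fi
  done
  printf '%s\n' "$result"
}

METADATA="$(join_metadata \
  "container-allow-fuse=true" \
  "post-startup-script=gs://BUCKET_NAME/hello.sh")"
echo "--metadata=${METADATA}"
# prints --metadata=container-allow-fuse=true,post-startup-script=gs://BUCKET_NAME/hello.sh
```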

Upgrade a custom container

When your instance starts for the first time, it pulls the container image from a URI stored in the custom-container-payload metadata. If you use the :latest tag, the container is updated at every restart. The custom-container-payload metadata value can't be modified directly because it's a protected metadata key.

To update your instance's custom container image, you can use the following methods supported by the Google Cloud CLI, Terraform, or the Notebooks API.

gcloud

You can update the custom container image metadata on a Vertex AI Workbench instance by using the following command:

gcloud workbench instances update INSTANCE_NAME \
    --container-repository=CONTAINER_URI \
    --container-tag=CONTAINER_TAG

Terraform

You can change the container_image field in the Terraform configuration to update the container payload.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

resource "google_workbench_instance" "default" {
  name     = "workbench-instance-example"
  location = "us-central1-a"

  gce_setup {
    machine_type = "n1-standard-1"
    container_image {
      repository = "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-container"
      tag        = "latest"
    }
  }
}

Notebooks API

Use the instances.patch method with changes to gce_setup.container_image.repository and gce_setup.container_image.tag in the updateMask.
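
The following sketch shows what such a patch request might look like with curl. It rests on several assumptions that you should verify against the Notebooks API reference: the v2 endpoint path, the use of snake_case field names in the request body, and an access token obtained from the gcloud CLI. All function names are hypothetical.

```shell
# Builds the request URL, including the updateMask named above.
# Arguments: PROJECT_ID LOCATION INSTANCE_NAME.
# Assumption: the instance is served from the v2 Notebooks API endpoint.
patch_url() {
  printf 'https://notebooks.googleapis.com/v2/projects/%s/locations/%s/instances/%s?updateMask=%s\n' \
    "$1" "$2" "$3" \
    "gce_setup.container_image.repository,gce_setup.container_image.tag"
}

# Builds the JSON request body. Arguments: CONTAINER_URI CONTAINER_TAG.
patch_body() {
  printf '{"gce_setup": {"container_image": {"repository": "%s", "tag": "%s"}}}\n' \
    "$1" "$2"
}

# Sends the request.
# Arguments: PROJECT_ID LOCATION INSTANCE_NAME CONTAINER_URI CONTAINER_TAG.
update_container_image() {
  curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "$(patch_body "$4" "$5")" \
    "$(patch_url "$1" "$2" "$3")"
}

# Example call, commented out so that nothing runs by accident:
# update_container_image PROJECT_ID LOCATION INSTANCE_NAME CONTAINER_URI CONTAINER_TAG
```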

Run the diagnostic tool

The diagnostic tool checks and verifies the status of various Vertex AI Workbench services. To learn more, see Tasks performed by the diagnostic tool.

When you create a Vertex AI Workbench instance using a custom container, the diagnostic tool isn't available as a script in the host environment that users can run. Instead, it is compiled into a binary and loaded onto a Google runtime container that is built to run diagnostic services in a Container-Optimized OS environment. See Container-Optimized OS Overview.

To run the diagnostic tool, complete the following steps:

Use ssh to connect to your Vertex AI Workbench instance.

In the SSH terminal, run the following command:

sudo docker exec diagnostic-service ./diagnostic_tool

To view additional command options, run the following command:

sudo docker exec diagnostic-service ./diagnostic_tool --help

For more information about the diagnostic tool's options, see the monitoring health status documentation.

To run the diagnostic tool by using the REST API, see the REST API documentation.

Access your instance

You can access your instance through a proxy URL.

After your instance has been created and is active, you can get the proxy URL by using the gcloud CLI.

Before using any of the command data below, make the following replacements:

INSTANCE_NAME: the name of your Vertex AI Workbench instance
PROJECT_ID: your project ID
LOCATION: the zone where your instance is located

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud workbench instances describe INSTANCE_NAME \
--project=PROJECT_ID \
--location=LOCATION | grep proxy-url

Windows (PowerShell)

gcloud workbench instances describe INSTANCE_NAME `
--project=PROJECT_ID `
--location=LOCATION | Select-String proxy-url

Windows (cmd.exe)

gcloud workbench instances describe INSTANCE_NAME ^
--project=PROJECT_ID ^
--location=LOCATION | findstr proxy-url

proxy-url: 7109d1b0d5f850f-dot-datalab-vm-staging.googleusercontent.com

The describe command returns your proxy URL. To access your instance, open the proxy URL in a web browser.
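
The returned value is a hostname without a scheme, as in the sample output. On Linux, macOS, or Cloud Shell, a small filter like the following sketch can turn the describe output into a URL that you can paste into a browser. The function name is hypothetical, and the filter assumes that the output contains a line of the form proxy-url: HOSTNAME.

```shell
# Reads describe output on stdin and prints https://HOSTNAME for the first
# line of the form "proxy-url: HOSTNAME".
proxy_url_from_describe() {
  sed -n 's#^[[:space:]]*proxy-url:[[:space:]]*\(.*\)$#https://\1#p' | head -n 1
}

# Demonstration with the sample output shown above.
printf '  proxy-url: 7109d1b0d5f850f-dot-datalab-vm-staging.googleusercontent.com\n' \
  | proxy_url_from_describe
# prints https://7109d1b0d5f850f-dot-datalab-vm-staging.googleusercontent.com
```

To use it with a real instance, pipe the output of the describe command into proxy_url_from_describe instead of the sample line.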

For more information about the command for describing an instance from the command line, see the gcloud CLI documentation.