Introduction to managed notebooks

Vertex AI Workbench managed notebooks instances are Google-managed environments with integrations and capabilities that help you set up and work in an end-to-end Jupyter notebook-based production environment.

Managed notebooks instances are prepackaged with JupyterLab and have a preinstalled suite of deep learning packages, including support for the TensorFlow and PyTorch frameworks. Managed notebooks instances support GPU accelerators and the ability to sync with a GitHub repository. Your managed notebooks instances are protected by Google Cloud authentication and authorization.

Google-managed compute infrastructure

A Vertex AI Workbench managed notebooks instance is Google-managed, Jupyter notebook-based compute infrastructure.

When you create a managed notebooks instance, it is deployed as a Google-managed virtual machine (VM) instance in a tenant project.

Your managed notebooks instance includes many common data science framework environments, such as TensorFlow and PyTorch. You can also add your own custom container images to your managed notebooks instance. Each of these environments is available as a kernel in which you can run your notebook file.

When you run a notebook in one of the kernels, Vertex AI Workbench starts the corresponding container, creates a Jupyter session on it, and uses that Jupyter session to run your notebook on the container.

This Google-managed compute infrastructure includes integrations and capabilities that help you implement data science and machine learning workflows from start to finish. See the following sections for details.

Use custom containers

You can add custom Docker container images to your managed notebooks instance to run your notebook code in an environment customized for your needs.

These custom containers are available to use directly from the JupyterLab user interface, alongside the preinstalled frameworks. For more information, see Add a custom container to a managed notebooks instance.

Notebook-based workflow

Managed notebooks instances let you perform workflow-oriented tasks without leaving the JupyterLab user interface.

Control your hardware and framework from JupyterLab

In a managed notebooks instance, your JupyterLab user interface is where you specify what compute resources your code will run on. For example, you can configure how many vCPUs or GPUs you want, how much RAM you want, and what framework you want to run the code in. You can write your code first, and then choose how to run it without leaving JupyterLab or restarting your instance. For quick tests of your code, you can scale your hardware down and then scale it back up to run your code against more data.

Access to data

You can access your data without leaving the JupyterLab user interface.

In JupyterLab's navigation menu on a managed notebooks instance, you can use the Cloud Storage integration to browse data and other files that you have access to. See Access Cloud Storage buckets and files from within JupyterLab.
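The files that you browse through the integration can also be read from notebook code. The following is a minimal sketch, assuming that the `google-cloud-storage` client library is installed in the kernel and that the instance's credentials can read the bucket. The helper names and the bucket in the commented example are hypothetical.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_path)."""
    scheme = "gs://"
    if not uri.startswith(scheme):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    bucket, _, object_path = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in: {uri!r}")
    return bucket, object_path


def list_object_names(uri):
    """Return the names of the objects stored under a gs:// prefix.

    Assumes google-cloud-storage is installed and default credentials
    are available. The import is deferred so that split_gcs_uri can be
    used without the library.
    """
    from google.cloud import storage

    bucket_name, prefix = split_gcs_uri(uri)
    client = storage.Client()
    return [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]


# Hypothetical bucket and prefix, for illustration only:
# names = list_object_names("gs://my-example-bucket/datasets/")
```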

You can also use the BigQuery integration to browse tables that you have access to, write queries, preview results, and load data into your notebook. See Query data in BigQuery tables from within JupyterLab.
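Loading query results into a notebook can also be done in code. The following is a minimal sketch, assuming that the `google-cloud-bigquery` client library and pandas are installed in the kernel and that your credentials can query the table. The helper names `preview_query` and `load_dataframe` are hypothetical.

```python
def preview_query(table, limit=10):
    """Build a SQL statement that previews the first rows of a table.

    `table` must be a fully qualified 'project.dataset.table' name.
    """
    parts = table.split(".")
    if len(parts) != 3 or not all(parts) or "`" in table:
        raise ValueError("expected a table name like 'project.dataset.table'")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM `{table}` LIMIT {limit}"


def load_dataframe(sql):
    """Run a query and return the result as a pandas DataFrame.

    Assumes google-cloud-bigquery and pandas are installed and default
    credentials are available. The import is deferred so that
    preview_query can be used without the library.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    return client.query(sql).to_dataframe()


# Example against a public dataset:
# df = load_dataframe(preview_query("bigquery-public-data.samples.shakespeare", 5))
```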

Execute notebook runs

Use the executor to run a notebook file as a one-time execution or on a schedule. Choose the specific environment and hardware that you want your execution to run on. Your notebook's code will run on Vertex AI custom training, which can make it easier to do distributed training, optimize hyperparameters, or schedule continuous training jobs. See Run notebook files with the executor.

You can use parameters in your execution to make specific changes to each run. For example, you might specify a different dataset to use, change the learning rate on your model, or change the version of the model.
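To illustrate the idea, the sketch below shows one way that per-run overrides can be merged with the defaults a notebook defines. It is an illustration only: the `key=value` text format, the function names, and the default values are assumptions made for this example, not the executor's actual interface.

```python
import ast

# Hypothetical defaults that a notebook might define in its first cell.
DEFAULTS = {
    "dataset": "gs://my-example-bucket/train.csv",
    "learning_rate": 0.01,
    "model_version": "v1",
}


def parse_overrides(text):
    """Parse 'key=value,key=value' into a dict.

    Values that look like Python literals (numbers, quoted strings)
    are converted; anything else is kept as a plain string.
    """
    overrides = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, raw = item.partition("=")
        key, raw = key.strip(), raw.strip()
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        try:
            overrides[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            overrides[key] = raw
    return overrides


def resolve_parameters(defaults, overrides):
    """Return the defaults with the overrides applied; reject unknown names."""
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"unknown parameters: {unknown}")
    return {**defaults, **overrides}


params = resolve_parameters(
    DEFAULTS, parse_overrides("learning_rate=0.001, model_version=v2")
)
```

In this sketch, `params` keeps the default dataset while the learning rate and model version take the values given for the run. Values that themselves contain commas are not handled by this simple parser.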

You can also set a notebook to run on a recurring schedule. Even while your instance is shut down, Vertex AI Workbench will run your notebook file and save the results for you to look at and share with others.

Share insights

Executed notebook runs are stored in a Cloud Storage bucket, so you can share your insights with others by granting access to the results. See the previous section on executing notebook runs.

Secure your instance

You can deploy your managed notebooks instance with the default Google-managed network, which uses a default VPC network and subnet. Instead of the default network, you can specify a VPC network to use with your instance. For more information, see Set up a network. You can use VPC Service Controls to provide additional security for your managed notebooks instances.

By default, Google Cloud automatically encrypts data when it is at rest using encryption keys managed by Google. If you have specific compliance or regulatory requirements related to the keys that protect your data, you can use customer-managed encryption keys (CMEK) with your managed notebooks instances. For more information, see Use customer-managed encryption keys.
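A customer-managed key is identified by its Cloud KMS resource name, which has the form `projects/PROJECT/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY`. The following sketch shows one way to build and check such a name before using it in configuration. The helper names and the example values are hypothetical.

```python
import re

# Cloud KMS CryptoKey resource name, one path segment per component.
_KEY_NAME = re.compile(
    r"^projects/([^/]+)/locations/([^/]+)/keyRings/([^/]+)/cryptoKeys/([^/]+)$"
)


def kms_key_name(project, location, key_ring, key):
    """Build the resource name of a Cloud KMS key."""
    name = (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )
    parse_kms_key_name(name)  # rejects empty parts or parts containing '/'
    return name


def parse_kms_key_name(name):
    """Split a key resource name into its components, or raise ValueError."""
    match = _KEY_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Cloud KMS key resource name: {name!r}")
    project, location, key_ring, key = match.groups()
    return {
        "project": project,
        "location": location,
        "key_ring": key_ring,
        "key": key,
    }


# Hypothetical values, for illustration only.
example = kms_key_name("my-project", "us-central1", "my-key-ring", "my-key")
```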

Automated shutdown for idle instances

To help manage costs, managed notebooks instances shut down after being idle for a specific time period by default. You can change the amount of time or turn this feature off. For more information, see Idle shutdown.
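Conceptually, idle shutdown compares the time since the last activity with a configurable timeout. The sketch below illustrates only that decision. It does not describe how Vertex AI Workbench detects activity, and the function name and the timeout value are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def should_shut_down(last_activity, now, idle_timeout, enabled=True):
    """Return True when the instance has been idle for at least idle_timeout.

    `enabled=False` models turning the feature off.
    """
    if not enabled:
        return False
    return now - last_activity >= idle_timeout


# Hypothetical timestamps and timeout, for illustration only.
last_activity = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
timeout = timedelta(hours=3)

idle_too_long = should_shut_down(last_activity, now, timeout)
feature_off = should_shut_down(last_activity, now, timeout, enabled=False)
```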

Dataproc integration

You can process data quickly by running a notebook on a Dataproc cluster. After your cluster is set up, you can run a notebook file on it without leaving the JupyterLab user interface. For more information, see Run a managed notebooks instance on a Dataproc cluster.

Limitations

Consider the following limitations of managed notebooks when planning your project:

  • Managed notebooks instances are Google-managed and therefore less customizable than Vertex AI Workbench user-managed notebooks instances. User-managed notebooks instances can be a better fit for users who need a lot of control over their environment. For more information, see Introduction to user-managed notebooks.

  • Third-party JupyterLab extensions are not supported.

What's next