Separate operations and development when using user-managed notebooks: Deploy
This document describes how to deploy the notebooks manager for managing Vertex AI Workbench user-managed notebooks. The code for this deployment is available on GitHub.
The document is part of a series that includes the following documents:
- Overview, which describes a solution that you can use for deploying the notebooks manager and extended notebooks UIs.
- Deploy (this document), which guides IT administrators on how to deploy the notebooks manager and extended notebooks UIs.
- Use, which guides data practitioners on how to use the notebooks manager and extended notebooks UIs.
- Troubleshooting, which describes potential issues and suggested resolutions.
Objectives
- Set up an environment that limits Google Cloud console access for data practitioners but lets users interact with Google Cloud services through an extended notebooks UI.
- Set up the OAuth 2.0 flow for the notebooks manager web application.
- Deploy the notebooks manager.
- Create extended notebooks UIs for data practitioners.
- Provide data practitioners with a link to the notebooks manager.
Costs
In this document, you use the following billable components of Google Cloud:
To generate a cost estimate based on your projected usage, use the pricing calculator.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project. Learn how to check if billing is enabled on a project.
- Enable the Notebooks and Cloud Storage APIs.
Prepare your environment
Open Cloud Shell.
Clone the repository that contains the code for this tutorial:
git clone "https://github.com/GoogleCloudPlatform/notebooks-extended-uis.git"
For more information, see the GitHub repository.
If you do not have Terraform set up for Google Cloud, follow the instructions in Getting Started with the Google Provider to install Terraform with the Google Cloud provider.
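As a quick sanity check before continuing, the following minimal Python sketch reports which of the command-line tools used in this tutorial are not on the `PATH`. The function name `missing_tools` is hypothetical and is not part of the repository.

```python
import shutil


def missing_tools(tools=("git", "terraform")):
    """Return the names of the given command-line tools that are not on PATH."""
    # shutil.which returns None when no executable with that name is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install these tools before continuing:", ", ".join(missing))
    else:
        print("git and terraform are available.")
```

An empty result only means that the executables exist; it does not confirm that the Google Cloud provider for Terraform is configured.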
Set up OAuth 2.0 flow for the notebooks manager web application
This procedure is performed manually in the Google Cloud console to mitigate abuse risks.
Go to the Google Cloud console.
To set up a consent screen, follow the instructions in Setting up your OAuth consent screen.
This defines the user experience when users authenticate to the notebooks manager and authorize scope access. We recommend setting the user type as internal if possible.
Create an OAuth 2.0 web client ID. This lets you grant the notebooks manager access to the Vertex AI API on behalf of the user while keeping credentials private. To create an OAuth 2.0 client, follow the steps in Setting up OAuth 2.0 for Web Application.
Copy the client ID from the web application page in the Google Cloud console. The client ID looks similar to the following:
123456789-a1b2c3d4e5.apps.googleusercontent.com
Terraform uses this client ID to update the config.tpl file when it deploys the application.
For more information, including a list of Terraform variables, see the GitHub repository's README file.
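A common mistake is to paste the client secret or a truncated value instead of the client ID. The following sketch checks that a value has the general shape of the example shown above. The pattern is an assumption based on that example only, and the function name is hypothetical.

```python
import re

# Assumption: client IDs follow the shape of the example in this document,
# "<digits>-<lowercase letters and digits>.apps.googleusercontent.com".
CLIENT_ID_PATTERN = re.compile(r"\d+-[a-z0-9]+\.apps\.googleusercontent\.com")


def looks_like_web_client_id(value: str) -> bool:
    """Return True if value has the general shape of an OAuth 2.0 client ID."""
    return CLIENT_ID_PATTERN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    print(looks_like_web_client_id("123456789-a1b2c3d4e5.apps.googleusercontent.com"))
```

A passing check does not prove that the client ID exists or belongs to your project; it only catches obvious copy-and-paste errors.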
Deploy the notebooks manager
The notebooks manager is an HTML page that you can deploy on any Google Cloud service, like Cloud Run, that can serve web pages and that's supported by VPC Service Controls. In this solution, the default hosting option is Cloud Storage.
In Cloud Shell, go to the repository directory:
cd notebooks-extended-uis
Open the terraform.tfvars file in a text editor.
For information about the required variables in this file, see Inputs in the GitHub repository. The repository provides an example terraform.tfvars file.
In the terraform.tfvars file, set the client_id variable using the value that you copied in the previous section.
Set the value of the console_url variable to a unique name.
This value is used as the name of a Cloud Storage bucket, so it must be globally unique.
Save and close the terraform.tfvars file.
Deploy the infrastructure:
terraform apply
The Terraform script deploys the notebooks manager by doing the following:
- Creating a bucket in Cloud Storage. The name of the bucket is derived from the value of the console_url Terraform variable.
- Using *.tpl template files to create static files such as index.html, 404.html, and config.js.
- Copying all static files to the Cloud Storage bucket.
The .tpl files contain variables, including the following:
- client_id: Found in the config.tpl file. This value is required for the OAuth 2.0 flow. When you set the value in the tfvars file, the client_id value is used in the config.tpl file to create a config.js file.
- relative_path: Found in the index.tpl and 404.tpl files. This value defines where to find the static files; it's based on variables that are defined in the main.tf file. This value is required in order to load static files locally.
After Terraform completes these steps, the notebooks manager is available at the following URL:
https://storage.googleapis.com/BUCKET_NAME/index.html
BUCKET_NAME is the name of the Cloud Storage bucket where you deployed the notebooks manager. It must match the console_url value that's in your terraform.tfvars file.
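Because the console_url value becomes a bucket name, it can help to check the value before you run Terraform. The following sketch applies a simplified subset of the Cloud Storage bucket naming requirements. It treats every name as limited to 63 characters, which is stricter than the actual rule for names that contain dots, and it can't tell whether a name is already taken. The function name is hypothetical.

```python
import re

# 3-63 characters: lowercase letters, digits, '-', '_' and '.',
# starting and ending with a letter or digit (simplified rule).
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def bucket_name_problems(name: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if _BUCKET_NAME_RE.fullmatch(name) is None:
        problems.append(
            "use 3-63 characters from a-z, 0-9, '-', '_' and '.', "
            "starting and ending with a letter or digit"
        )
    if name.startswith("goog"):
        problems.append("must not start with 'goog'")
    if "google" in name:
        problems.append("must not contain 'google'")
    return problems


if __name__ == "__main__":
    for candidate in ("acme-notebooks-manager", "Acme_Notebooks", "google-notebooks"):
        print(candidate, "->", bucket_name_problems(candidate) or "looks valid")
```

Global uniqueness is enforced only when the bucket is created, so a name that passes this check can still be rejected during terraform apply.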
The Terraform script also lets you use a static bucket with your own domain name, but you must grant access to an additional IP address. For more information, see the README file in the GitHub repository.
Deploy extended notebooks UIs
The solution described in this tutorial assumes that you deploy the extended notebooks UIs on behalf of end users. This section is optional and serves as an example of how to do that. You can integrate this example in your own processes to create instances of user-managed notebooks, whether you do it at the same time as when you deploy the notebooks manager or separately. The example shows how to create a user-managed notebooks instance with the extended notebooks UI by using a custom container image. The image includes features such as the following:
- Cloud Storage and BigQuery extensions provide interactive features that are similar to features in the Google Cloud console.
- Git support lets users store and manage their user-managed notebooks and local files.
- The notebook executor lets you run user-managed notebooks end to end in the background.
The extended notebooks UI uses Google Cloud and BigQuery extension add-ons for JupyterLab. The add-ons are enabled when the enable-extended-ui key is set to True in the user-managed notebooks instance metadata. For the architecture described in this document, the key is set in the Terraform script that deploys the example instance.
The main.tf script deploys a user-managed notebooks instance as an example. You can adapt this part of the script to create your pool of extended notebooks UIs based on your hardware, software, and accelerator requirements.
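If you create instances through your own tooling rather than through the example Terraform script, the metadata key can be modeled as an entry in a plain key-value map. The following sketch is illustrative only: the helper names are hypothetical, and the exact, case-sensitive comparison against the string "True" is an assumption based on the description above.

```python
from typing import Dict, Optional

EXTENDED_UI_KEY = "enable-extended-ui"


def with_extended_ui(metadata: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the instance metadata with the extended UI key set."""
    merged = dict(metadata or {})
    merged[EXTENDED_UI_KEY] = "True"
    return merged


def extended_ui_enabled(metadata: Dict[str, str]) -> bool:
    """Assumption: the key counts as enabled only for the exact string 'True'."""
    return metadata.get(EXTENDED_UI_KEY) == "True"


if __name__ == "__main__":
    # "team" is a made-up metadata entry used only for illustration.
    metadata = with_extended_ui({"team": "data-science"})
    print(metadata, extended_ui_enabled(metadata))
```

The helper copies the input map instead of mutating it, so the same base metadata can be reused for instances that should not have the extended UI.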
This solution does not enforce any IAM permissions. If your company policies prevent users from accessing each other's user-managed notebooks, you should use additional security features such as OS Login, single-user access, or IAM permissions. Setting up access to user-managed notebooks is out of scope for this solution.
Provide access for data practitioners
Before users can use the notebooks manager, you need to do the following:
Publish your application.
This step lets users access your application's endpoint and is required only for non-internal applications. You perform this step from the OAuth consent screen. To learn more, see Setting up your OAuth consent screen.
Provide users with the link to the notebooks manager.
This lets users access the notebooks manager from a client that's authorized by the perimeters of the VPC Service Controls. When you use the default deployment_context parameter, the link looks similar to the following:
https://storage.googleapis.com/BUCKET_NAME/index.html?projectId=PROJECT_ID
For more information about running the Terraform commands, see the GitHub README file.
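If you distribute links for several projects, you can generate them instead of editing the URL by hand. The following sketch builds the link in the format shown above for the default deployment option. The function name and the sample bucket and project values are hypothetical.

```python
from urllib.parse import urlencode


def notebooks_manager_link(bucket_name: str, project_id: str) -> str:
    """Build the notebooks manager link for the default Cloud Storage hosting."""
    base = f"https://storage.googleapis.com/{bucket_name}/index.html"
    # urlencode escapes any characters that are not safe in a query string.
    return f"{base}?{urlencode({'projectId': project_id})}"


if __name__ == "__main__":
    print(notebooks_manager_link("acme-notebooks-manager", "my-project"))
```

This format applies only to the default hosting on Cloud Storage; a custom domain or another hosting option needs a different base URL.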
Select hosting options for the notebooks manager
The solution described in this document hosts the notebooks manager on Cloud Storage because the application is a static web page. Cloud Storage makes it easy to deploy the solution. The solution is supported by VPC Service Controls, and it provides regional options.
As you can see in the GitHub repository, the static page is part of a Docker folder with an Nginx setup. That folder hierarchy is independent from deploying on Cloud Storage, but it provides flexibility in case you want to expand the capabilities of the notebooks manager and you need to build a container image.
For example, you might want to add a custom backend server and deploy it on another Google Cloud offering that supports containers. Options include Cloud Run, an internal deployment on Google Kubernetes Engine (GKE), or a managed instance group with container images.
If you do not use the default deployment option to Cloud Storage, you must create the index.html, 404.html, and config.js files from the .tpl files. You can create those files either manually by replacing the templated variables or through a Terraform script that's similar to the one that's provided in the GitHub repository.
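Replacing the templated variables can also be scripted. The following sketch uses Python's `string.Template` and assumes that the .tpl files use `${name}` placeholders. The template text below is a made-up stand-in; the real config.tpl in the repository may look different.

```python
from string import Template

# Hypothetical template content, used only to illustrate the substitution.
CONFIG_TPL = 'const CONFIG = { clientId: "${client_id}" };\n'


def render_template(template_text: str, variables: dict) -> str:
    """Replace ${name} placeholders in template_text with values from variables."""
    # substitute() raises KeyError when a placeholder has no value, which
    # surfaces a forgotten variable instead of producing a broken file.
    return Template(template_text).substitute(variables)


if __name__ == "__main__":
    rendered = render_template(
        CONFIG_TPL,
        {"client_id": "123456789-a1b2c3d4e5.apps.googleusercontent.com"},
    )
    print(rendered, end="")
```

To produce the actual files, read each .tpl file from the repository, render it with your values, and write the result under the matching name, such as config.js.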
Set up a bastion host
Access to the notebooks manager and extended notebooks UIs requires a client that's within the same perimeter as those applications. For example, some companies use a bastion host approach with remote desktops.
Grant access to URLs
To make sure that users can go through the OAuth 2.0 authorization flow, interact with the Vertex AI API, and access BigQuery from Vertex AI Workbench, you need to grant access to the following external URLs in the organization's firewall rules.
You might need to grant access to additional URLs that pertain to your identity provider to support the browser sign-in flow.
A user's ability to perform tasks with the Vertex AI API depends on the IAM permissions that you set up in your project and organization. The notebooks manager does not enforce any security because it is only a client-side tool.
External URL | Description |
---|---|
*.accounts.google.com | Used for OAuth 2.0 flow. |
*.accounts.youtube.com | Used for OAuth 2.0 flow. |
*.gstatic.com | Used for OAuth 2.0 flow and a favicon. |
*.googleusercontent.com | Implicitly allows notebooks.googleusercontent.com. |
*.datalab.cloud.google.com | Used to get a notebook's proxy URL. |
content-cloudresourcemanager.googleapis.com | Used to list projects. |
content-notebooks.googleapis.com | Used when calling actions on notebooks. |
https://apis.google.com/js/googleapis.proxy.js.* | Used for OAuth 2.0 flow. |
https://apis.google.com/_/scs/apps-static/.* | Used for OAuth 2.0 flow. |
*.notebooks.googleapis.com | Used when calling actions on notebooks. |
*.notebooks.cloud.google.com | Used by the Vertex AI Workbench viewer service. |
cdn.jsdelivr.net/npm | Enables the BigQuery add-on to work. |
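To test a hostname against the host-only entries in the table, you can use a small matcher like the following sketch. It uses shell-style wildcard matching from Python's `fnmatch` module, which is an assumption about how your firewall interprets `*`. It also leaves out the three path-based entries (the two apis.google.com URLs and cdn.jsdelivr.net/npm), which need URL-level rules.

```python
from fnmatch import fnmatchcase

# Host-only patterns copied from the table above.
ALLOWED_HOST_PATTERNS = [
    "*.accounts.google.com",
    "*.accounts.youtube.com",
    "*.gstatic.com",
    "*.googleusercontent.com",
    "*.datalab.cloud.google.com",
    "content-cloudresourcemanager.googleapis.com",
    "content-notebooks.googleapis.com",
    "*.notebooks.googleapis.com",
    "*.notebooks.cloud.google.com",
]


def host_is_allowed(host: str) -> bool:
    """Return True if host matches one of the host-only allowlist patterns."""
    # Hostnames are case-insensitive and may carry a trailing dot.
    normalized = host.lower().rstrip(".")
    return any(fnmatchcase(normalized, pattern) for pattern in ALLOWED_HOST_PATTERNS)


if __name__ == "__main__":
    for name in ("notebooks.googleusercontent.com", "example.com"):
        print(name, host_is_allowed(name))
```

With this matching, a pattern such as `*.gstatic.com` covers subdomains but not the bare domain gstatic.com; check how your firewall product treats that case.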
What's next
- To learn more about the architecture and objectives of the solution described in this document, see Separate operations and development when using user-managed notebooks: Overview.
- Learn how to use and troubleshoot the solution described in this document.
- To learn details about the code, see the GitHub repository.
- To add features, report issues, or contribute to the code, use the GitHub repository.
- For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.