On-premises hosts can reach a Vertex AI online prediction endpoint either through the public internet or privately through a hybrid networking architecture that uses Private Service Connect (PSC) over Cloud VPN or Cloud Interconnect. Both options offer SSL/TLS encryption. However, the private option offers much better performance and is therefore recommended for critical applications.
In this tutorial, you use High-Availability VPN (HA VPN) to access an online prediction endpoint both publicly, through Cloud NAT, and privately, between two Virtual Private Cloud (VPC) networks that can serve as a basis for multi-cloud and on-premises private connectivity.
This tutorial is intended for enterprise network administrators, data scientists, and researchers who are familiar with Vertex AI, Virtual Private Cloud (VPC), the Google Cloud console, and the Cloud Shell. Familiarity with Vertex AI Workbench is helpful but not required.
Objectives
- Create two Virtual Private Cloud (VPC) networks, as shown in the preceding diagram:
  - One (on-prem-vpc) represents an on-premises network.
  - The other (aiml-vpc) is for building and deploying a Vertex AI online prediction model.
- Deploy HA VPN gateways, Cloud VPN tunnels, and Cloud Routers to connect aiml-vpc and on-prem-vpc.
- Build and deploy a Vertex AI online prediction model.
- Create a Private Service Connect (PSC) endpoint to forward private online prediction requests to the deployed model.
- Enable the Cloud Router custom advertisement mode in aiml-vpc to announce routes for the Private Service Connect endpoint to on-prem-vpc.
- Create two Compute Engine VM instances in on-prem-vpc to represent client applications:
  - One (nat-client) sends online prediction requests over the public internet (through Cloud NAT). This access method is indicated by a red arrow and the number 1 in the diagram.
  - The other (private-client) sends prediction requests privately over HA VPN. This access method is indicated by a green arrow and the number 2.
Costs
In this document, you use the following billable components of Google Cloud:
To generate a cost estimate based on your projected usage,
use the pricing calculator.
When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.
Before you begin
- In the Google Cloud console, go to the project selector page.
- Select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Open Cloud Shell to execute the commands listed in this tutorial. Cloud Shell is an interactive shell environment for Google Cloud that lets you manage your projects and resources from your web browser.
- In the Cloud Shell, set the current project to your Google Cloud project ID and store the same project ID into the projectid shell variable:
  projectid="PROJECT_ID"
  gcloud config set project ${projectid}
  Replace PROJECT_ID with your project ID. If necessary, you can locate your project ID in the Google Cloud console. For more information, see Find your project ID.
- Grant roles to your user account. Run the following command once for each of the following IAM roles:
  roles/appengine.appViewer, roles/artifactregistry.admin, roles/compute.instanceAdmin.v1, roles/compute.networkAdmin, roles/compute.securityAdmin, roles/dns.admin, roles/iap.admin, roles/iap.tunnelResourceAccessor, roles/notebooks.admin, roles/oauthconfig.editor, roles/resourcemanager.projectIamAdmin, roles/servicemanagement.quotaAdmin, roles/iam.serviceAccountAdmin, roles/iam.serviceAccountUser, roles/servicedirectory.editor, roles/storage.admin, roles/aiplatform.user
  gcloud projects add-iam-policy-binding PROJECT_ID --member="USER_IDENTIFIER" --role=ROLE
  - Replace PROJECT_ID with your project ID.
  - Replace USER_IDENTIFIER with the identifier for your user account. For example, user:myemail@example.com.
  - Replace ROLE with each individual role.
- Enable the DNS, Artifact Registry, IAM, Compute Engine, Notebooks, and Vertex AI APIs:
  gcloud services enable dns.googleapis.com \
      artifactregistry.googleapis.com \
      iam.googleapis.com \
      compute.googleapis.com \
      notebooks.googleapis.com \
      aiplatform.googleapis.com
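The role-granting step above asks you to run the same command once per role. As a convenience, the following minimal sketch prints one command per role so that you can review them before running anything. The build_commands helper and the placeholder values are illustrative assumptions and not part of the tutorial; the role list is copied from the step above.

```python
import shlex

# Roles copied from the "Grant roles to your user account" step.
ROLES = [
    "roles/appengine.appViewer",
    "roles/artifactregistry.admin",
    "roles/compute.instanceAdmin.v1",
    "roles/compute.networkAdmin",
    "roles/compute.securityAdmin",
    "roles/dns.admin",
    "roles/iap.admin",
    "roles/iap.tunnelResourceAccessor",
    "roles/notebooks.admin",
    "roles/oauthconfig.editor",
    "roles/resourcemanager.projectIamAdmin",
    "roles/servicemanagement.quotaAdmin",
    "roles/iam.serviceAccountAdmin",
    "roles/iam.serviceAccountUser",
    "roles/servicedirectory.editor",
    "roles/storage.admin",
    "roles/aiplatform.user",
]


def build_commands(project_id, member, roles=ROLES):
    """Return one add-iam-policy-binding command string per role (hypothetical helper)."""
    return [
        shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}", f"--role={role}",
        ])
        for role in roles
    ]


# Placeholder values; replace them with your own project ID and user identifier.
for command in build_commands("PROJECT_ID", "user:myemail@example.com"):
    print(command)
```

The sketch only prints the commands; it does not call gcloud, so nothing changes in your project until you run the printed lines yourself.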
Create the VPC networks
In this section, you create two VPC networks: one for creating an online prediction model and deploying it to an endpoint, the other for private access to that endpoint. In each of the two VPC networks, you create a Cloud Router and Cloud NAT gateway. A Cloud NAT gateway provides outgoing connectivity for Compute Engine virtual machine (VM) instances without external IP addresses.
Create the VPC network for the online prediction endpoint (aiml-vpc)
Create the VPC network:
gcloud compute networks create aiml-vpc \
    --project=$projectid \
    --subnet-mode=custom
Create a subnet named workbench-subnet, with a primary IPv4 range of 172.16.10.0/28:
gcloud compute networks subnets create workbench-subnet \
    --project=$projectid \
    --range=172.16.10.0/28 \
    --network=aiml-vpc \
    --region=us-central1 \
    --enable-private-ip-google-access
Create a regional Cloud Router named cloud-router-us-central1-aiml-nat:
gcloud compute routers create cloud-router-us-central1-aiml-nat \
    --network aiml-vpc \
    --region us-central1
Add a Cloud NAT gateway to the Cloud Router:
gcloud compute routers nats create cloud-nat-us-central1 \
    --router=cloud-router-us-central1-aiml-nat \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges \
    --region us-central1
Create the "on-premises" VPC network (on-prem-vpc)
Create the VPC network:
gcloud compute networks create on-prem-vpc \
    --project=$projectid \
    --subnet-mode=custom
Create a subnet named nat-subnet, with a primary IPv4 range of 192.168.10.0/28:
gcloud compute networks subnets create nat-subnet \
    --project=$projectid \
    --range=192.168.10.0/28 \
    --network=on-prem-vpc \
    --region=us-central1
Create a subnet named private-ip-subnet, with a primary IPv4 range of 192.168.20.0/28:
gcloud compute networks subnets create private-ip-subnet \
    --project=$projectid \
    --range=192.168.20.0/28 \
    --network=on-prem-vpc \
    --region=us-central1
Create a regional Cloud Router named cloud-router-us-central1-on-prem-nat:
gcloud compute routers create cloud-router-us-central1-on-prem-nat \
    --network on-prem-vpc \
    --region us-central1
Add a Cloud NAT gateway to the Cloud Router:
gcloud compute routers nats create cloud-nat-us-central1 \
    --router=cloud-router-us-central1-on-prem-nat \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges \
    --region us-central1
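Because the two VPC networks later exchange subnet routes over HA VPN, their subnet ranges should not overlap. The following minimal sketch checks the three ranges used in this section with Python's standard ipaddress module. The find_overlaps helper is a hypothetical name introduced for illustration; the subnet names and ranges are copied from the commands above.

```python
import ipaddress
from itertools import combinations

# Subnet ranges copied from the gcloud commands in this section.
SUBNETS = {
    "workbench-subnet": "172.16.10.0/28",
    "nat-subnet": "192.168.10.0/28",
    "private-ip-subnet": "192.168.20.0/28",
}


def find_overlaps(subnets):
    """Return pairs of subnet names whose CIDR ranges overlap (hypothetical helper)."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (a, b)
        for a, b in combinations(networks, 2)
        if networks[a].overlaps(networks[b])
    ]


print("overlapping subnets:", find_overlaps(SUBNETS))
for name, cidr in SUBNETS.items():
    # num_addresses counts every address in the range, including reserved ones.
    print(name, ipaddress.ip_network(cidr).num_addresses, "addresses")
```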
Create the Private Service Connect (PSC) endpoint
In this section, you create the Private Service Connect (PSC) endpoint that the VM instances in the on-prem-vpc network use to access the online prediction endpoint through the Vertex AI API.
The Private Service Connect (PSC) endpoint is an internal IP address in the aiml-vpc network that can be directly accessed by clients in that network. This endpoint is created by deploying a forwarding rule that directs network traffic that matches the PSC endpoint's IP address to a bundle of Google APIs.
The PSC endpoint's IP address (100.100.10.10) will be advertised from the aiml-cr-us-central1 Cloud Router as a custom advertised route to the on-prem-vpc network in a later step.
Reserve an IP address for the PSC endpoint:
gcloud compute addresses create psc-ip \
    --global \
    --purpose=PRIVATE_SERVICE_CONNECT \
    --addresses=100.100.10.10 \
    --network=aiml-vpc
Create the PSC endpoint:
gcloud compute forwarding-rules create pscvertex \
    --global \
    --network=aiml-vpc \
    --address=psc-ip \
    --target-google-apis-bundle=all-apis
List the configured PSC endpoints and verify that the pscvertex endpoint was created:
gcloud compute forwarding-rules list \
    --filter target="(all-apis OR vpc-sc)" --global
Get the details of the configured PSC endpoint and verify that the IP address is 100.100.10.10:
gcloud compute forwarding-rules describe pscvertex \
    --global
Configure hybrid connectivity
In this section, you create two HA VPN gateways that are connected to each other. Each gateway is paired with a Cloud Router and a pair of VPN tunnels.
Create the HA VPN gateway for the aiml-vpc VPC network:
gcloud compute vpn-gateways create aiml-vpn-gw \
    --network=aiml-vpc \
    --region=us-central1
Create the HA VPN gateway for the on-prem-vpc VPC network:
gcloud compute vpn-gateways create on-prem-vpn-gw \
    --network=on-prem-vpc \
    --region=us-central1
In the Google Cloud console, go to the VPN page.
On the VPN page, click the Cloud VPN Gateways tab.
In the list of VPN gateways, verify that there are two gateways and that each one has two IP addresses.
In the Cloud Shell, create a Cloud Router for the aiml-vpc Virtual Private Cloud network:
gcloud compute routers create aiml-cr-us-central1 \
    --region=us-central1 \
    --network=aiml-vpc \
    --asn=65001
Create a Cloud Router for the on-prem-vpc Virtual Private Cloud network:
gcloud compute routers create on-prem-cr-us-central1 \
    --region=us-central1 \
    --network=on-prem-vpc \
    --asn=65002
Create the VPN tunnels for aiml-vpc
Create a VPN tunnel called aiml-vpc-tunnel0:
gcloud compute vpn-tunnels create aiml-vpc-tunnel0 \
    --peer-gcp-gateway on-prem-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [ZzTLxKL8fmRykwNDfCvEFIjmlYLhMucH] \
    --router aiml-cr-us-central1 \
    --vpn-gateway aiml-vpn-gw \
    --interface 0
Create a VPN tunnel called aiml-vpc-tunnel1:
gcloud compute vpn-tunnels create aiml-vpc-tunnel1 \
    --peer-gcp-gateway on-prem-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [bcyPaboPl8fSkXRmvONGJzWTrc6tRqY5] \
    --router aiml-cr-us-central1 \
    --vpn-gateway aiml-vpn-gw \
    --interface 1
Create the VPN tunnels for on-prem-vpc
Create a VPN tunnel called on-prem-tunnel0:
gcloud compute vpn-tunnels create on-prem-tunnel0 \
    --peer-gcp-gateway aiml-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [ZzTLxKL8fmRykwNDfCvEFIjmlYLhMucH] \
    --router on-prem-cr-us-central1 \
    --vpn-gateway on-prem-vpn-gw \
    --interface 0
Create a VPN tunnel called on-prem-tunnel1:
gcloud compute vpn-tunnels create on-prem-tunnel1 \
    --peer-gcp-gateway aiml-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [bcyPaboPl8fSkXRmvONGJzWTrc6tRqY5] \
    --router on-prem-cr-us-central1 \
    --vpn-gateway on-prem-vpn-gw \
    --interface 1
In the Google Cloud console, go to the VPN page.
On the VPN page, click the Cloud VPN Tunnels tab.
In the list of VPN tunnels, verify that four VPN tunnels have been established.
Establish BGP sessions
Cloud Router uses Border Gateway Protocol (BGP) to exchange routes between your VPC network (in this case, aiml-vpc) and your on-premises network (represented by on-prem-vpc). On Cloud Router, you configure an interface and a BGP peer for your on-premises router. The interface and BGP peer configuration together form a BGP session. In this section, you create two BGP sessions for aiml-vpc and two for on-prem-vpc.
Establish BGP sessions for aiml-vpc
In the Cloud Shell, create the first BGP interface:
gcloud compute routers add-interface aiml-cr-us-central1 \
    --interface-name if-tunnel0-to-onprem \
    --ip-address 169.254.1.1 \
    --mask-length 30 \
    --vpn-tunnel aiml-vpc-tunnel0 \
    --region us-central1
Create the first BGP peer, attaching it to the interface that you just created:
gcloud compute routers add-bgp-peer aiml-cr-us-central1 \
    --peer-name bgp-on-premises-tunnel0 \
    --interface if-tunnel0-to-onprem \
    --peer-ip-address 169.254.1.2 \
    --peer-asn 65002 \
    --region us-central1
Create the second BGP interface:
gcloud compute routers add-interface aiml-cr-us-central1 \
    --interface-name if-tunnel1-to-onprem \
    --ip-address 169.254.2.1 \
    --mask-length 30 \
    --vpn-tunnel aiml-vpc-tunnel1 \
    --region us-central1
Create the second BGP peer:
gcloud compute routers add-bgp-peer aiml-cr-us-central1 \
    --peer-name bgp-on-premises-tunnel1 \
    --interface if-tunnel1-to-onprem \
    --peer-ip-address 169.254.2.2 \
    --peer-asn 65002 \
    --region us-central1
Establish BGP sessions for on-prem-vpc
Create the first BGP interface:
gcloud compute routers add-interface on-prem-cr-us-central1 \
    --interface-name if-tunnel0-to-aiml-vpc \
    --ip-address 169.254.1.2 \
    --mask-length 30 \
    --vpn-tunnel on-prem-tunnel0 \
    --region us-central1
Create the first BGP peer:
gcloud compute routers add-bgp-peer on-prem-cr-us-central1 \
    --peer-name bgp-aiml-vpc-tunnel0 \
    --interface if-tunnel0-to-aiml-vpc \
    --peer-ip-address 169.254.1.1 \
    --peer-asn 65001 \
    --region us-central1
Create the second BGP interface:
gcloud compute routers add-interface on-prem-cr-us-central1 \
    --interface-name if-tunnel1-to-aiml-vpc \
    --ip-address 169.254.2.2 \
    --mask-length 30 \
    --vpn-tunnel on-prem-tunnel1 \
    --region us-central1
Create the second BGP peer:
gcloud compute routers add-bgp-peer on-prem-cr-us-central1 \
    --peer-name bgp-aiml-vpc-tunnel1 \
    --interface if-tunnel1-to-aiml-vpc \
    --peer-ip-address 169.254.2.1 \
    --peer-asn 65001 \
    --region us-central1
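A BGP session only comes up when the two sides agree on addressing: each router's local address must be the other router's peer address, and both must sit in the same /30 link-local range. The following minimal sketch checks the addresses used in this section with Python's standard ipaddress module. The session_problems helper and the dictionary layout are illustrative assumptions; the IP addresses and mask length are copied from the commands above.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

# (local IP, peer IP) per tunnel, copied from the add-interface and add-bgp-peer commands.
AIML_SIDE = {
    "tunnel0": ("169.254.1.1", "169.254.1.2"),
    "tunnel1": ("169.254.2.1", "169.254.2.2"),
}
ON_PREM_SIDE = {
    "tunnel0": ("169.254.1.2", "169.254.1.1"),
    "tunnel1": ("169.254.2.2", "169.254.2.1"),
}


def session_problems(side_a, side_b, mask_length=30):
    """Return a list of addressing problems; an empty list means the pair is consistent."""
    problems = []
    local_a, peer_a = side_a
    local_b, peer_b = side_b
    # Each side's local address must be the other side's peer address.
    if (local_a, peer_a) != (peer_b, local_b):
        problems.append("local and peer addresses are not mirrored")
    # Both addresses must fall in the same link-local block of the given mask length.
    network = ipaddress.ip_interface(f"{local_a}/{mask_length}").network
    for address in (local_a, peer_a):
        ip = ipaddress.ip_address(address)
        if ip not in network:
            problems.append(f"{address} is outside {network}")
        if ip not in LINK_LOCAL:
            problems.append(f"{address} is not link-local")
    return problems


for tunnel in AIML_SIDE:
    print(tunnel, session_problems(AIML_SIDE[tunnel], ON_PREM_SIDE[tunnel]) or "ok")
```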
Validate BGP session creation
In the Google Cloud console, go to the VPN page.
On the VPN page, click the Cloud VPN Tunnels tab.
In the list of VPN tunnels, you should now see that the value in the BGP session status column for each of the four tunnels has changed from Configure BGP session to BGP established. You may need to refresh the Google Cloud console browser tab to see the new values.
Validate that aiml-vpc has learned subnet routes over HA VPN
In the Google Cloud console, go to the VPC networks page.
In the list of VPC networks, click aiml-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the aiml-vpc VPC network has learned routes from the on-prem-vpc VPC network's nat-subnet subnet (192.168.10.0/28) and private-ip-subnet subnet (192.168.20.0/28).
Validate that on-prem-vpc has learned subnet routes over HA VPN
In the Google Cloud console, go to the VPC networks page.
In the list of VPC networks, click on-prem-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the on-prem-vpc VPC network has learned routes from the aiml-vpc VPC network's workbench-subnet subnet (172.16.10.0/28).
Create a custom advertised route for aiml-vpc
The Private Service Connect endpoint IP address is not automatically advertised by the aiml-cr-us-central1 Cloud Router because the address doesn't belong to any subnet configured in the VPC network.
Therefore, you need to create a custom advertised route from the aiml-cr-us-central1 Cloud Router for the endpoint IP address 100.100.10.10, so that the address is advertised over BGP to the on-premises environment (on-prem-vpc).
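To see why the endpoint address needs a custom advertisement, the following minimal sketch checks whether 100.100.10.10 falls inside any subnet of aiml-vpc. The covering_subnets helper is a hypothetical name introduced for illustration; the address and the subnet range are the ones used earlier in this tutorial.

```python
import ipaddress

PSC_ENDPOINT_IP = ipaddress.ip_address("100.100.10.10")

# The only subnet created in aiml-vpc in this tutorial.
AIML_VPC_SUBNETS = {
    "workbench-subnet": ipaddress.ip_network("172.16.10.0/28"),
}


def covering_subnets(ip, subnets):
    """Return the names of the subnets whose range contains the given address."""
    return [name for name, network in subnets.items() if ip in network]


covered_by = covering_subnets(PSC_ENDPOINT_IP, AIML_VPC_SUBNETS)
if covered_by:
    print(f"{PSC_ENDPOINT_IP} is covered by subnet routes: {covered_by}")
else:
    print(f"{PSC_ENDPOINT_IP} is not in any subnet, so it needs a custom advertised route")
```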
In the Google Cloud console, go to the Cloud Routers page.
In the Cloud Router list, click aiml-cr-us-central1.
On the Router details page, click Edit.
In the Advertised routes section, for Routes, select Create custom routes.
Click Add a custom route.
For Source, select Custom IP range.
For IP address range, enter 100.100.10.10.
For Description, enter Private Service Connect Endpoint IP.
Click Done, and then click Save.
Validate that on-prem-vpc has learned the PSC endpoint IP address over HA VPN
In the Google Cloud console, go to the VPC networks page.
In the list of VPC networks, click on-prem-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the on-prem-vpc VPC network has learned the PSC endpoint's IP address (100.100.10.10).
Create a custom advertised route for on-prem-vpc
The on-prem-vpc Cloud Router advertises all subnets by default, but only the private-ip-subnet subnet is needed.
In the following section, update the route advertisements from the on-prem-cr-us-central1 Cloud Router.
In the Google Cloud console, go to the Cloud Routers page.
In the Cloud Router list, click on-prem-cr-us-central1.
On the Router details page, click Edit.
In the Advertised routes section, for Routes, select Create custom routes.
If the Advertise all subnets visible to the Cloud Router checkbox is selected, clear it.
Click Add a custom route.
For Source, select Custom IP range.
For IP address range, enter 192.168.20.0/28.
For Description, enter Private Service Connect Endpoint IP subnet (private-ip-subnet).
Click Done, and then click Save.
Validate that aiml-vpc has learned the private-ip-subnet route from the on-prem-vpc
In the Google Cloud console, go to the VPC networks page.
In the list of VPC networks, click aiml-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the aiml-vpc VPC network has learned the private-ip-subnet route (192.168.20.0/28).
Create the test VM instances
Create a user-managed service account
If you have applications that need to call Google Cloud APIs, Google recommends that you attach a user-managed service account to the VM on which the application or workload is running. Accordingly, in this section you create a user-managed service account to be applied to the VM instances that you create later in this tutorial.
In the Cloud Shell, create the service account:
gcloud iam service-accounts create gce-vertex-sa \
    --description="service account for vertex" \
    --display-name="gce-vertex-sa"
Assign the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role to the service account:
gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:gce-vertex-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/compute.instanceAdmin.v1"
Assign the Vertex AI User (roles/aiplatform.user) IAM role to the service account:
gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:gce-vertex-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
Create the test VM instances
In this step, you create test VM instances to validate different methods to reach Vertex AI APIs, specifically:
- The nat-client instance uses Cloud NAT to reach Vertex AI and access the online prediction endpoint over the public internet.
- The private-client instance uses the Private Service Connect IP address 100.100.10.10 to access the online prediction endpoint over HA VPN.
To allow Identity-Aware Proxy (IAP) to connect to your VM instances, you create a firewall rule that:
- Applies to all VM instances that you want to make accessible through IAP.
- Allows TCP traffic through port 22 from the IP range 35.235.240.0/20. This range contains all IP addresses that IAP uses for TCP forwarding.
Create the nat-client VM instance:
gcloud compute instances create nat-client \
    --zone=us-central1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=nat-subnet \
    --service-account=gce-vertex-sa@$projectid.iam.gserviceaccount.com \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --metadata startup-script="#! /bin/bash
      sudo apt-get update
      sudo apt-get install tcpdump dnsutils -y"
Create the private-client VM instance:
gcloud compute instances create private-client \
    --zone=us-central1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=private-ip-subnet \
    --service-account=gce-vertex-sa@$projectid.iam.gserviceaccount.com \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --metadata startup-script="#! /bin/bash
      sudo apt-get update
      sudo apt-get install tcpdump dnsutils -y"
Create the IAP firewall rule:
gcloud compute firewall-rules create ssh-iap-on-prem-vpc \
    --network on-prem-vpc \
    --allow tcp:22 \
    --source-ranges=35.235.240.0/20
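The firewall rule admits SSH traffic only from the IAP TCP forwarding range. The following minimal sketch shows what that /20 range covers and how a source address can be tested against it. The is_iap_source helper and the sample addresses are illustrative assumptions; the range itself is the one used in the rule above.

```python
import ipaddress

# Source range used by the ssh-iap-on-prem-vpc firewall rule.
IAP_TCP_FORWARDING_RANGE = ipaddress.ip_network("35.235.240.0/20")


def is_iap_source(address):
    """Return True if the address falls inside the IAP TCP forwarding range."""
    return ipaddress.ip_address(address) in IAP_TCP_FORWARDING_RANGE


print("first address:", IAP_TCP_FORWARDING_RANGE[0])
print("last address:", IAP_TCP_FORWARDING_RANGE[-1])
print("size:", IAP_TCP_FORWARDING_RANGE.num_addresses)

# Sample addresses chosen only for illustration.
for candidate in ("35.235.241.7", "35.236.0.1"):
    print(candidate, "allowed" if is_iap_source(candidate) else "blocked")
```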
Create a Vertex AI Workbench instance
Create a user-managed service account for Vertex AI Workbench
When you create a Vertex AI Workbench instance, Google strongly recommends that you specify a user-managed service account instead of using the Compute Engine default service account. If your organization doesn't enforce the iam.automaticIamGrantsForDefaultServiceAccounts organization policy constraint, the Compute Engine default service account (and thus anyone you specify as an instance user) is granted the Editor role (roles/editor) on your Google Cloud project. To turn off this behavior, see Disable automatic role grants for default service accounts.
In the Cloud Shell, create a service account named workbench-sa:
gcloud iam service-accounts create workbench-sa \
    --display-name="workbench-sa"
Assign the Storage Admin (roles/storage.admin) IAM role to the service account:
gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/storage.admin"
Assign the Vertex AI User (roles/aiplatform.user) IAM role to the service account:
gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
Assign the Artifact Registry Administrator (roles/artifactregistry.admin) IAM role to the service account:
gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/artifactregistry.admin"
Create the Vertex AI Workbench instance
In Cloud Shell, create a Vertex AI Workbench instance, specifying the workbench-sa service account:
gcloud workbench instances create workbench-tutorial \
    --vm-image-project=deeplearning-platform-release \
    --vm-image-family=common-cpu-notebooks \
    --machine-type=n1-standard-4 \
    --location=us-central1-a \
    --subnet-region=us-central1 \
    --shielded-secure-boot=True \
    --subnet=workbench-subnet \
    --disable-public-ip \
    --service-account-email=workbench-sa@$projectid.iam.gserviceaccount.com
Create and deploy an online prediction model
Prepare your environment
In the Google Cloud console, go to the Instances tab on the Vertex AI Workbench page.
Next to your Vertex AI Workbench instance's name (workbench-tutorial), click Open JupyterLab.
Your Vertex AI Workbench instance opens JupyterLab.
In the rest of this section, up to and including model deployment, you'll be working in JupyterLab, not the Google Cloud console or the Cloud Shell.
Select File > New > Terminal.
In the JupyterLab terminal (not the Cloud Shell), define an environment variable for your project. Replace PROJECT_ID with your project ID:
PROJECT_ID=PROJECT_ID
Create a new directory called cpr-codelab and cd into it (still in the JupyterLab terminal):
mkdir cpr-codelab
cd cpr-codelab
In the File Browser, double-click the new cpr-codelab folder.
If this folder doesn't appear in the file browser, refresh the Google Cloud console browser tab, and try again.
Select File > New > Notebook.
From the Select Kernel menu, select Python [conda env:base] * (Local) and click Select.
Rename your new notebook file as follows:
In the File Browser, right-click the Untitled.ipynb file icon and enter task.ipynb.
Your cpr-codelab directory should now look like this:
+ cpr-codelab/
    + task.ipynb
In the following steps, you create your model in the JupyterLab notebook by creating new notebook cells, pasting code into them, and running the cells.
Install dependencies as follows.
When you open your new notebook, there is a default code cell where you can enter code. It looks like [ ]: followed by a text field. That text field is where you paste your code.
Paste the following code into the cell and click Run the selected cells and advance to create a requirements.txt file to be used as input to the following step:
%%writefile requirements.txt
fastapi
uvicorn==0.17.6
joblib~=1.1.1
numpy>=1.17.3, <1.24.0
scikit-learn>=1.2.2
pandas
google-cloud-storage>=2.2.1,<3.0.0dev
google-cloud-aiplatform[prediction]>=1.18.2
In this step and each of the following ones, add a code cell by clicking Insert a cell below, paste the code into the cell, and then click Run the selected cells and advance.
Use pip to install dependencies in the notebook instance:
!pip install -U --user -r requirements.txt
When installation is complete, select Kernel > Restart kernel to restart the kernel and ensure that the library is available for import.
Paste the following code into a new notebook cell to create the directories to store the model and preprocessing artifacts:
USER_SRC_DIR = "src_dir"
!mkdir $USER_SRC_DIR
!mkdir model_artifacts

# copy the requirements to the source dir
!cp requirements.txt $USER_SRC_DIR/requirements.txt
In the File Browser, your cpr-codelab directory structure should now look like this:
+ cpr-codelab/
    + model_artifacts/
    + src_dir/
        + requirements.txt
    + requirements.txt
    + task.ipynb
Train the model
Continue adding code cells to the task.ipynb notebook, and paste in and run the following code in each new cell:
Import the libraries:
import seaborn as sns
import numpy as np
import pandas as pd

from sklearn import preprocessing
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.compose import make_column_transformer

import joblib
import logging

# set logging to see the docker container logs
logging.basicConfig(level=logging.INFO)
Define the following variables, replacing PROJECT_ID with your project ID:
REGION = "us-central1"
MODEL_ARTIFACT_DIR = "sklearn-model-artifacts"
REPOSITORY = "diamonds"
IMAGE = "sklearn-image"
MODEL_DISPLAY_NAME = "diamonds-cpr"
PROJECT_ID = "PROJECT_ID"
BUCKET_NAME = "gs://PROJECT_ID-cpr-bucket"
Create a Cloud Storage bucket:
!gcloud storage buckets create $BUCKET_NAME --location=us-central1
Load the data from the seaborn library and then create two data frames, one with the features and the other with the label:
data = sns.load_dataset('diamonds', cache=True, data_home=None)

label = 'price'

y_train = data['price']
x_train = data.drop(columns=['price'])
Look at the training data and verify that each row represents a diamond.
x_train.head()
Look at the labels, which are the corresponding prices.
y_train.head()
Define a sklearn column transform to one hot encode the categorical features and scale the numerical features:
column_transform = make_column_transformer( (preprocessing.OneHotEncoder(), [1,2,3]), (preprocessing.StandardScaler(), [0,4,5,6,7,8]))
Define the random forest model:
regr = RandomForestRegressor(max_depth=10, random_state=0)
Make a sklearn pipeline. This pipeline takes input data, encodes and scales it, and passes it to the model.
my_pipeline = make_pipeline(column_transform, regr)
Train the model:
my_pipeline.fit(x_train, y_train)
Call the predict method on the model, passing in a test sample.
my_pipeline.predict([[0.23, 'Ideal', 'E', 'SI2', 61.5, 55.0, 3.95, 3.98, 2.43]])
You may see warnings like "X does not have valid feature names, but", but you can ignore them.
Save the pipeline to the model_artifacts directory and copy it to your Cloud Storage bucket:
joblib.dump(my_pipeline, 'model_artifacts/model.joblib')

!gcloud storage cp model_artifacts/model.joblib {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/
Save a preprocessing artifact
Create a preprocessing artifact. This artifact will be loaded into the custom container when the model server starts up. Your preprocessing artifact can be of almost any form (such as a pickle file), but in this case you'll write a dictionary to a JSON file:
clarity_dict={"Flawless": "FL", "Internally Flawless": "IF", "Very Very Slightly Included": "VVS1", "Very Slightly Included": "VS2", "Slightly Included": "SI2", "Included": "I3"}
The clarity feature in our training data was always in the abbreviated form (that is, "FL" instead of "Flawless"). At serving time, we want to check that the data for this feature is also abbreviated. This is because our model knows how to one hot encode "FL" but not "Flawless". You'll write this custom preprocessing logic later. But for now, just save this lookup table to a JSON file and then write it to your Cloud Storage bucket:
import json
with open("model_artifacts/preprocessor.json", "w") as f:
    json.dump(clarity_dict, f)

!gcloud storage cp model_artifacts/preprocessor.json {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/
In the File Browser, your directory structure should now look like this:
+ cpr-codelab/
    + model_artifacts/
        + model.joblib
        + preprocessor.json
    + src_dir/
        + requirements.txt
    + requirements.txt
    + task.ipynb
Build a custom serving container using the CPR model server
In your notebook, paste in and run the following code to subclass the SklearnPredictor and write it to a Python file in the src_dir/ directory. Note that in this example we are only customizing the load, preprocess, and postprocess methods, and not the predict method.
%%writefile $USER_SRC_DIR/predictor.py

import joblib
import numpy as np
import json

from google.cloud import storage
from google.cloud.aiplatform.prediction.sklearn.predictor import SklearnPredictor


class CprPredictor(SklearnPredictor):

    def __init__(self):
        return

    def load(self, artifacts_uri: str) -> None:
        """Loads the sklearn pipeline and preprocessing artifact."""

        super().load(artifacts_uri)

        # open preprocessing artifact
        with open("preprocessor.json", "rb") as f:
            self._preprocessor = json.load(f)

    def preprocess(self, prediction_input: np.ndarray) -> np.ndarray:
        """Performs preprocessing by checking if clarity feature is in abbreviated form."""

        inputs = super().preprocess(prediction_input)

        for sample in inputs:
            if sample[3] not in self._preprocessor.values():
                sample[3] = self._preprocessor[sample[3]]
        return inputs

    def postprocess(self, prediction_results: np.ndarray) -> dict:
        """Performs postprocessing by rounding predictions and converting to str."""

        return {"predictions": [f"${value}" for value in np.round(prediction_results)]}
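The preprocessing idea in the predictor can be tried out without the Vertex AI SDK. The following minimal sketch applies the same lookup to plain Python lists. The normalize_clarity helper is a hypothetical name introduced for illustration, the lookup table is a shortened copy of the clarity dictionary, and raising ValueError for an unrecognized value is an assumption of this sketch rather than the behavior of the predictor above.

```python
# Shortened copy of the clarity lookup table, used only for this illustration.
CLARITY_LOOKUP = {
    "Flawless": "FL",
    "Internally Flawless": "IF",
    "Very Slightly Included": "VS2",
}


def normalize_clarity(sample, lookup=CLARITY_LOOKUP, clarity_index=3):
    """Return a copy of the sample with the clarity feature in abbreviated form."""
    value = sample[clarity_index]
    if value in lookup.values():
        # Already abbreviated, nothing to do.
        return list(sample)
    if value not in lookup:
        # Assumption of this sketch: fail loudly on unknown clarity names.
        raise ValueError(f"unrecognized clarity value: {value!r}")
    normalized = list(sample)
    normalized[clarity_index] = lookup[value]
    return normalized


samples = [
    [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
    [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11],
]
for sample in samples:
    print(normalize_clarity(sample)[3])
```

The first sample passes through unchanged and the second has its clarity value replaced by the abbreviation, which mirrors what the preprocess method does before the pipeline sees the data.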
Use the Vertex AI SDK for Python to build the image using custom prediction routines. The Dockerfile is generated and an image is built for you.
from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)

import os

from google.cloud.aiplatform.prediction import LocalModel

from src_dir.predictor import CprPredictor  # Should be path of variable $USER_SRC_DIR

local_model = LocalModel.build_cpr_model(
    USER_SRC_DIR,
    f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
    predictor=CprPredictor,
    requirements_path=os.path.join(USER_SRC_DIR, "requirements.txt"),
)
Write a test file with two samples for prediction. One of the instances has the abbreviated clarity name, but the other needs to be converted first.
import json

sample = {"instances": [
    [0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43],
    [0.29, 'Premium', 'J', 'Internally Flawless', 52.5, 49.0, 4.00, 2.13, 3.11]]}

with open('instances.json', 'w') as fp:
    json.dump(sample, fp)
Test the container locally by deploying a local model.
with local_model.deploy_to_local_endpoint(
    artifact_uri = 'model_artifacts/', # local path to artifacts
) as local_endpoint:
    predict_response = local_endpoint.predict(
        request_file='instances.json',
        headers={"Content-Type": "application/json"},
    )

    health_check_response = local_endpoint.run_health_check()
You can see the prediction results with:
predict_response.content
The output looks like the following:
b'{"predictions": ["$479.0", "$586.0"]}'
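The response body is a JSON document in bytes, with each price formatted as a string by the postprocess method. If you want numeric values back, a minimal sketch such as the following can parse it. The parse_prices helper is a hypothetical name, and the raw value is the sample output shown above rather than a live response.

```python
import json

# Sample response body copied from the output shown above.
raw_response = b'{"predictions": ["$479.0", "$586.0"]}'


def parse_prices(raw):
    """Decode the JSON body and convert each "$<number>" string to a float."""
    payload = json.loads(raw)
    return [float(price.lstrip("$")) for price in payload["predictions"]]


print(parse_prices(raw_response))
```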
Deploy the model to the online prediction model endpoint
Now that you've tested the container locally, it's time to push the image to Artifact Registry and upload the model to Vertex AI Model Registry.
Create a Docker repository in Artifact Registry and configure Docker to access it:
!gcloud artifacts repositories create {REPOSITORY} \
    --repository-format=docker \
    --location=us-central1 \
    --description="Docker repository"

!gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet
Push the image.
local_model.push_image()
Upload the model.
model = aiplatform.Model.upload(local_model = local_model, display_name=MODEL_DISPLAY_NAME, artifact_uri=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}",)
Deploy the model:
endpoint = model.deploy(machine_type="n1-standard-2")
Wait until your model deploys before you continue to the next step. Expect deployment to take about 10 to 15 minutes.
Test the deployed model by getting a prediction:
endpoint.predict(instances=[[0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43]])
The output looks like the following:
Prediction(predictions=['$479.0'], deployed_model_id='3171115779319922688', metadata=None, model_version_id='1', model_resource_name='projects/721032480027/locations/us-central1/models/8554949231515795456', explanations=None)
Validate public internet access to Vertex AI APIs
In this section, you log into the nat-client VM instance in one Cloud Shell session tab and use another session tab to validate connectivity to Vertex AI APIs by running the dig and tcpdump commands against the domain us-central1-aiplatform.googleapis.com.
In the Cloud Shell (Tab One), run the following commands, replacing PROJECT_ID with your project ID:
projectid=PROJECT_ID
gcloud config set project ${projectid}
Log into the nat-client VM instance using IAP:
gcloud compute ssh nat-client \
    --project=$projectid \
    --zone=us-central1-a \
    --tunnel-through-iap
Run the dig command:
dig us-central1-aiplatform.googleapis.com
From the nat-client VM (Tab One), run the following command to validate DNS resolution when you send an online prediction request to the endpoint.
sudo tcpdump -i any port 53 -n
Open a new Cloud Shell session (Tab Two) by clicking Open a new tab in Cloud Shell.
In the new Cloud Shell session (Tab Two), run the following commands, replacing PROJECT_ID with your project ID:
projectid=PROJECT_ID
gcloud config set project ${projectid}
Log into the nat-client VM instance:
gcloud compute ssh --zone "us-central1-a" "nat-client" --project "$projectid"
From the nat-client VM (Tab Two), use a text editor such as vim or nano to create an instances.json file. You need to prepend sudo in order to have permission to write to the file, for example:
sudo vim instances.json
Add the following data string to the file. The strings use double quotes because JSON doesn't accept single-quoted strings:
{"instances": [
  [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}
Save the file as follows:
- If you're using vim, press the Esc key, and then type :wq to save the file and exit.
- If you're using nano, type Control+O and press Enter to save the file, and then type Control+X to exit.
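The request body must be valid JSON, and JSON parsers accept only double-quoted strings, so it can help to check the file before sending it. The following is a minimal sketch, assuming Python 3 is available on the VM. The helper name `count_instances` is hypothetical.

```python
import json


def count_instances(text: str) -> int:
    """Parse text as JSON and return how many rows the "instances" list holds.

    Raises json.JSONDecodeError (a ValueError subclass) if the text is not valid JSON.
    """
    payload = json.loads(text)
    return len(payload["instances"])


# Inline sample with double-quoted strings. On the VM you would read the file
# instead, for example: count_instances(open("instances.json").read())
valid = '{"instances": [[0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43]]}'
print(count_instances(valid))

# The same kind of row written with single-quoted strings is rejected by the parser.
invalid = "{\"instances\": [[0.23, 'Ideal']]}"
try:
    count_instances(invalid)
except json.JSONDecodeError as err:
    print("invalid JSON:", err.msg)
```

A parse error at this stage is easier to diagnose than an HTTP error returned by the prediction request.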
Locate the online prediction endpoint ID for the PSC endpoint:
In the Google Cloud console, in the Vertex AI section, go to the Endpoints tab in the Online prediction page.
Find the row of the endpoint that you created, named diamonds-cpr_endpoint.
Locate the 19-digit endpoint ID in the ID column and copy it.
In the Cloud Shell, from the nat-client VM (Tab Two), run the following commands, replacing PROJECT_ID with your project ID and ENDPOINT_ID with the PSC endpoint ID:
projectid=PROJECT_ID
gcloud config set project ${projectid}
ENDPOINT_ID=ENDPOINT_ID
From the nat-client VM (Tab Two), run the following command to send an online prediction request:
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
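For reference, the same request can be assembled in Python. This is a sketch under stated assumptions, not part of the tutorial: the helper name `build_predict_request` is hypothetical, the project ID, endpoint ID, and token are placeholders, and the code only builds the request without sending it.

```python
import json
import urllib.request


def build_predict_request(project_id, endpoint_id, token, instances, region="us-central1"):
    """Build (but do not send) a POST request equivalent to the curl command above."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only. A real token would come from
# `gcloud auth print-access-token`.
req = build_predict_request(
    "my-project",
    "1234567890123456789",
    "TOKEN",
    [[0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43]],
)
print(req.full_url)
# To send the request, pass it to urllib.request.urlopen(req).
```

Building the URL from the project ID, region, and endpoint ID makes it clear which parts of the path change between the two clients (none) and which part of the setup changes (only how the hostname resolves).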
Now that you've run the prediction, you'll see that the tcpdump results (Tab One) show the nat-client VM instance (192.168.10.2) performing a Cloud DNS query to the local DNS server (169.254.169.254) for the Vertex AI API domain (us-central1-aiplatform.googleapis.com). The DNS query returns public Virtual IP Addresses (VIPs) for Vertex AI APIs.
Validate private access to Vertex AI APIs
In this section, you log into the private-client VM instance using Identity-Aware Proxy in a new Cloud Shell session (Tab Three), and then you validate connectivity to Vertex AI APIs by running the dig command against the Vertex AI domain (us-central1-aiplatform.googleapis.com).
Open a new Cloud Shell session (Tab Three) by clicking Open a new tab in Cloud Shell.
In the new Cloud Shell session (Tab Three), run the following commands, replacing PROJECT_ID with your project ID:
projectid=PROJECT_ID
gcloud config set project ${projectid}
Log into the private-client VM instance using IAP:
gcloud compute ssh private-client \
    --project=$projectid \
    --zone=us-central1-a \
    --tunnel-through-iap
Run the dig command:
dig us-central1-aiplatform.googleapis.com
In the private-client VM instance (Tab Three), use a text editor such as vim or nano to add the following line to the /etc/hosts file:
100.100.10.10 us-central1-aiplatform.googleapis.com
This line assigns the PSC endpoint's IP address (100.100.10.10) to the fully qualified domain name for the Vertex AI Google API (us-central1-aiplatform.googleapis.com). The edited file should look like this:
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
100.100.10.10 us-central1-aiplatform.googleapis.com # Added by you
192.168.20.2 private-client.c.$projectid.internal private-client # Added by Google
169.254.169.254 metadata.google.internal # Added by Google
From the private-client VM (Tab Three), ping the Vertex AI endpoint, and press Control+C to exit when you see the output:
ping us-central1-aiplatform.googleapis.com
The ping command returns the following output containing the PSC endpoint IP address:
PING us-central1-aiplatform.googleapis.com (100.100.10.10) 56(84) bytes of data.
From the private-client VM (Tab Three), run the following tcpdump command to validate DNS resolution and the IP data path when you send an online prediction request to the endpoint:
sudo tcpdump -i any port 53 -n or host 100.100.10.10
Open a new Cloud Shell session (Tab Four) by clicking Open a new tab in Cloud Shell.
In the new Cloud Shell session (Tab Four), run the following commands, replacing PROJECT_ID with your project ID:
projectid=PROJECT_ID
gcloud config set project ${projectid}
In Tab Four, log into the private-client instance:
gcloud compute ssh \
    --zone "us-central1-a" "private-client" \
    --project "$projectid"
From the private-client VM (Tab Four), using a text editor such as vim or nano, create an instances.json file containing the following data string. The strings use double quotes because JSON doesn't accept single-quoted strings:
{"instances": [
  [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}
From the private-client VM (Tab Four), run the following commands, replacing PROJECT_ID with your project ID and ENDPOINT_ID with the PSC endpoint ID:
projectid=PROJECT_ID
echo $projectid
ENDPOINT_ID=ENDPOINT_ID
From the private-client VM (Tab Four), run the following command to send an online prediction request:
curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
From the private-client VM in Cloud Shell (Tab Three), verify that the PSC endpoint IP address (100.100.10.10) was used to access Vertex AI APIs.
From the private-client tcpdump terminal in Cloud Shell Tab Three, you can see that a DNS lookup to us-central1-aiplatform.googleapis.com isn't needed, because the line that you added to the /etc/hosts file takes precedence, and the PSC IP address 100.100.10.10 is used in the data path.
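The same check can be scripted. The following is a minimal sketch, assuming a typical Linux resolver configuration in which `socket.gethostbyname` consults /etc/hosts before DNS. The function name `resolves_to_psc` and the injectable `resolver` parameter are hypothetical, added so the logic can be exercised without network access.

```python
import socket

PSC_IP = "100.100.10.10"  # PSC endpoint address used in this tutorial
API_HOST = "us-central1-aiplatform.googleapis.com"


def resolves_to_psc(host=API_HOST, psc_ip=PSC_IP, resolver=socket.gethostbyname):
    """Return True if host resolves to the PSC endpoint address."""
    return resolver(host) == psc_ip


# On the private-client VM, call resolves_to_psc() with no arguments.
# Here a stand-in resolver simulates the /etc/hosts override.
print(resolves_to_psc(resolver=lambda name: "100.100.10.10"))
```

On the nat-client VM, where no /etc/hosts entry exists, the same call would be expected to return False because the name resolves to public VIPs.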
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
You can delete the individual resources in the project as follows:
Delete the Vertex AI Workbench instance as follows:
In the Google Cloud console, in the Vertex AI section, go to the Instances tab in the Workbench page.
Select the workbench-tutorial Vertex AI Workbench instance and click Delete.
Delete the container image as follows:
In the Google Cloud console, go to the Artifact Registry page.
Select the diamonds Docker container, and click Delete.
Delete the storage bucket as follows:
In the Google Cloud console, go to the Cloud Storage page.
Select your storage bucket, and click Delete.
Undeploy the model from the endpoint as follows:
In the Google Cloud console, in the Vertex AI section, go to the Endpoints page.
Click diamonds-cpr_endpoint to go to the endpoint details page.
On the row for your model, diamonds-cpr, click Undeploy model.
In the Undeploy model from endpoint dialog, click Undeploy.
Delete the model as follows:
In the Google Cloud console, in the Vertex AI section, go to the Model Registry page.
Select the diamonds-cpr model.
To delete the model, click Actions, and then click Delete model.
Delete the online prediction endpoint as follows:
In the Google Cloud console, in the Vertex AI section, go to the Online prediction page.
Select the diamonds-cpr_endpoint endpoint.
To delete the endpoint, click Actions, and then click Delete endpoint.
In the Cloud Shell, delete the remaining resources by executing the following commands.
projectid=PROJECT_ID
gcloud config set project ${projectid}

gcloud compute forwarding-rules delete pscvertex \
    --global \
    --quiet

gcloud compute addresses delete psc-ip \
    --global \
    --quiet

gcloud compute networks subnets delete workbench-subnet \
    --region=us-central1 \
    --quiet

gcloud compute vpn-tunnels delete aiml-vpc-tunnel0 aiml-vpc-tunnel1 on-prem-tunnel0 on-prem-tunnel1 \
    --region=us-central1 \
    --quiet

gcloud compute vpn-gateways delete aiml-vpn-gw on-prem-vpn-gw \
    --region=us-central1 \
    --quiet

gcloud compute routers nats delete cloud-nat-us-central1 \
    --router=cloud-router-us-central1-aiml-nat \
    --region=us-central1 \
    --quiet

gcloud compute routers delete aiml-cr-us-central1 cloud-router-us-central1-aiml-nat \
    --region=us-central1 \
    --quiet

gcloud compute routers delete cloud-router-us-central1-on-prem-nat on-prem-cr-us-central1 \
    --region=us-central1 \
    --quiet

gcloud compute instances delete nat-client private-client \
    --zone=us-central1-a \
    --quiet

gcloud compute firewall-rules delete ssh-iap-on-prem-vpc \
    --quiet

gcloud compute networks subnets delete nat-subnet private-ip-subnet \
    --region=us-central1 \
    --quiet

gcloud compute networks delete on-prem-vpc \
    --quiet

gcloud compute networks delete aiml-vpc \
    --quiet

gcloud iam service-accounts delete gce-vertex-sa@$projectid.iam.gserviceaccount.com \
    --quiet

gcloud iam service-accounts delete workbench-sa@$projectid.iam.gserviceaccount.com \
    --quiet
What's next
- Learn about enterprise networking options for accessing Vertex AI endpoints and services.
- Learn how Private Service Connect works and why it offers significant performance benefits.
- Learn how to use VPC Service Controls to create secure perimeters to allow or deny access to Vertex AI and other Google APIs on your online prediction endpoint.
- Learn how and why to use a DNS forwarding zone instead of updating the /etc/hosts file in large scale and production environments.