Use Private Service Connect to access Vertex AI online predictions from on-premises

On-premises hosts can reach a Vertex AI online prediction endpoint either through the public internet or privately through a hybrid networking architecture that uses Private Service Connect (PSC) over Cloud VPN or Cloud Interconnect. Both options offer SSL/TLS encryption. However, the private option offers much better performance and is therefore recommended for critical applications.

In this tutorial, you use High-Availability VPN (HA VPN) to access an online prediction endpoint both publicly, through Cloud NAT, and privately, between two Virtual Private Cloud networks that can serve as a basis for multi-cloud and on-premises private connectivity.

This tutorial is intended for enterprise network administrators, data scientists, and researchers who are familiar with Vertex AI, Virtual Private Cloud (VPC), the Google Cloud console, and the Cloud Shell. Familiarity with Vertex AI Workbench is helpful but not required.

Architectural diagram of accessing an online prediction endpoint via Private Service Connect.

Objectives

Create two Virtual Private Cloud (VPC) networks, as shown in the preceding diagram:
- One (on-prem-vpc) represents an on-premises network.
- The other (aiml-vpc) is for building and deploying a Vertex AI online prediction model.
Deploy HA VPN gateways, Cloud VPN tunnels, and Cloud Routers to connect aiml-vpc and on-prem-vpc.
Build and deploy a Vertex AI online prediction model.
Create a Private Service Connect (PSC) endpoint to forward private online prediction requests to the deployed model.
Enable the Cloud Router custom advertisement mode in aiml-vpc to announce routes for the Private Service Connect endpoint to on-prem-vpc.
Create two Compute Engine VM instances in on-prem-vpc to represent client applications:
- One (nat-client) sends online prediction requests over the public internet (through Cloud NAT). This access method is indicated by a red arrow and the number 1 in the diagram.
- The other (private-client) sends prediction requests privately over HA VPN. This access method is indicated by a green arrow and the number 2.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

In the Google Cloud console, go to the project selector page.

Select or create a Google Cloud project.
Make sure that billing is enabled for your Google Cloud project.
Open Cloud Shell to execute the commands listed in this tutorial. Cloud Shell is an interactive shell environment for Google Cloud that lets you manage your projects and resources from your web browser.
In the Cloud Shell, set the current project to your Google Cloud project ID and store the same project ID into the projectid shell variable:
```
  projectid="PROJECT_ID"
  gcloud config set project ${projectid}
```
Replace PROJECT_ID with your project ID. If necessary, you can locate your project ID in the Google Cloud console. For more information, see Find your project ID.
Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/appengine.appViewer, roles/artifactregistry.admin, roles/compute.instanceAdmin.v1, roles/compute.networkAdmin, roles/compute.securityAdmin, roles/dns.admin, roles/iap.admin, roles/iap.tunnelResourceAccessor, roles/notebooks.admin, roles/oauthconfig.editor, roles/resourcemanager.projectIamAdmin, roles/servicemanagement.quotaAdmin, roles/iam.serviceAccountAdmin, roles/iam.serviceAccountUser, roles/servicedirectory.editor, roles/storage.admin, roles/aiplatform.user
```
gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE
```
- Replace PROJECT_ID with your project ID.
- Replace USER_IDENTIFIER with the identifier for your user account. For example, myemail@example.com. The command already supplies the user: prefix.
- Replace ROLE with each individual role.
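
Because the command runs once per role, it can help to generate the full set of invocations up front and review them before running anything. The following is a minimal Python sketch, not a required step of this tutorial. The helper name `build_grant_commands` and the example values are assumptions made for illustration; the sketch only prints command strings and does not call Google Cloud.

```python
# Sketch: build one `gcloud projects add-iam-policy-binding` command per role.
# The role list is copied from the step above; nothing is executed here.
ROLES = [
    "roles/appengine.appViewer",
    "roles/artifactregistry.admin",
    "roles/compute.instanceAdmin.v1",
    "roles/compute.networkAdmin",
    "roles/compute.securityAdmin",
    "roles/dns.admin",
    "roles/iap.admin",
    "roles/iap.tunnelResourceAccessor",
    "roles/notebooks.admin",
    "roles/oauthconfig.editor",
    "roles/resourcemanager.projectIamAdmin",
    "roles/servicemanagement.quotaAdmin",
    "roles/iam.serviceAccountAdmin",
    "roles/iam.serviceAccountUser",
    "roles/servicedirectory.editor",
    "roles/storage.admin",
    "roles/aiplatform.user",
]


def build_grant_commands(project_id, user_email, roles=ROLES):
    """Return one command string per IAM role (hypothetical helper)."""
    return [
        f"gcloud projects add-iam-policy-binding {project_id} "
        f'--member="user:{user_email}" --role={role}'
        for role in roles
    ]


# Placeholder values; replace them with your own project ID and account.
for command in build_grant_commands("PROJECT_ID", "myemail@example.com"):
    print(command)
```

You can paste the printed lines into Cloud Shell one at a time, which keeps each role grant visible and easy to audit.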

Enable the DNS, Artifact Registry, IAM, Compute Engine, Notebooks, and Vertex AI APIs:

gcloud services enable dns.googleapis.com artifactregistry.googleapis.com iam.googleapis.com compute.googleapis.com notebooks.googleapis.com aiplatform.googleapis.com

Create the VPC networks

In this section, you create two VPC networks: one for creating an online prediction model and deploying it to an endpoint, the other for private access to that endpoint. In each of the two VPC networks, you create a Cloud Router and Cloud NAT gateway. A Cloud NAT gateway provides outgoing connectivity for Compute Engine virtual machine (VM) instances without external IP addresses.

Create the VPC network for the online prediction endpoint (`aiml-vpc`)

Create the VPC network:

gcloud compute networks create aiml-vpc \
    --project=$projectid \
    --subnet-mode=custom

Create a subnet named workbench-subnet, with a primary IPv4 range of 172.16.10.0/28:

gcloud compute networks subnets create workbench-subnet \
    --project=$projectid \
    --range=172.16.10.0/28 \
    --network=aiml-vpc \
    --region=us-central1 \
    --enable-private-ip-google-access

Create a regional Cloud Router named cloud-router-us-central1-aiml-nat:

gcloud compute routers create cloud-router-us-central1-aiml-nat \
    --network aiml-vpc \
    --region us-central1

Add a Cloud NAT gateway to the Cloud Router:

gcloud compute routers nats create cloud-nat-us-central1 \
    --router=cloud-router-us-central1-aiml-nat \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges \
    --region us-central1

Create the "on-premises" VPC network (`on-prem-vpc`)

Create the VPC network:

gcloud compute networks create on-prem-vpc \
    --project=$projectid \
    --subnet-mode=custom

Create a subnet named nat-subnet, with a primary IPv4 range of 192.168.10.0/28:

gcloud compute networks subnets create nat-subnet \
    --project=$projectid \
    --range=192.168.10.0/28 \
    --network=on-prem-vpc \
    --region=us-central1

Create a subnet named private-ip-subnet, with a primary IPv4 range of 192.168.20.0/28:

gcloud compute networks subnets create private-ip-subnet \
    --project=$projectid \
    --range=192.168.20.0/28 \
    --network=on-prem-vpc \
    --region=us-central1

Create a regional Cloud Router named cloud-router-us-central1-on-prem-nat:

gcloud compute routers create cloud-router-us-central1-on-prem-nat \
    --network on-prem-vpc \
    --region us-central1

Add a Cloud NAT gateway to the Cloud Router:

gcloud compute routers nats create cloud-nat-us-central1 \
    --router=cloud-router-us-central1-on-prem-nat \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges \
    --region us-central1
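
The two networks later exchange subnet routes over BGP, so it is worth confirming that the three subnet ranges chosen above are distinct. The following Python sketch uses only the standard `ipaddress` module; the helper name `find_overlaps` is an assumption for illustration, and the ranges are the ones used in this section.

```python
import ipaddress
from itertools import combinations

# Subnet names and ranges as created in this section.
SUBNETS = {
    "workbench-subnet": "172.16.10.0/28",
    "nat-subnet": "192.168.10.0/28",
    "private-ip-subnet": "192.168.20.0/28",
}


def find_overlaps(subnets):
    """Return pairs of subnet names whose CIDR ranges overlap (hypothetical helper)."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (first, second)
        for first, second in combinations(networks, 2)
        if networks[first].overlaps(networks[second])
    ]


print(find_overlaps(SUBNETS))  # prints [] because no ranges overlap

# Each /28 range spans 16 addresses.
for name, cidr in SUBNETS.items():
    print(name, ipaddress.ip_network(cidr).num_addresses)
```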

Create the Private Service Connect (PSC) endpoint

In this section, you create the Private Service Connect (PSC) endpoint that the VM instances in the on-prem-vpc network use to access the online prediction endpoint through the Vertex AI API. The PSC endpoint is an internal IP address in the aiml-vpc network that clients in the on-prem-vpc network reach over HA VPN. You create the endpoint by deploying a forwarding rule that directs network traffic matching the PSC endpoint's IP address to a bundle of Google APIs. In a later step, the aiml-cr-us-central1 Cloud Router advertises the PSC endpoint's IP address (100.100.10.10) to the on-prem-vpc network as a custom advertised route.

Reserve an IP address for the PSC endpoint:

gcloud compute addresses create psc-ip \
    --global \
    --purpose=PRIVATE_SERVICE_CONNECT \
    --addresses=100.100.10.10 \
    --network=aiml-vpc

Create the PSC endpoint:

gcloud compute forwarding-rules create pscvertex \
    --global \
    --network=aiml-vpc \
    --address=psc-ip \
    --target-google-apis-bundle=all-apis

List the configured PSC endpoints and verify that the pscvertex endpoint was created:

gcloud compute forwarding-rules list \
    --filter target="(all-apis OR vpc-sc)" --global

Get the details of the configured PSC endpoint and verify that the IP address is 100.100.10.10:
```
gcloud compute forwarding-rules describe pscvertex \
    --global
```

Configure hybrid connectivity

In this section, you create two HA VPN gateways that are connected to each other. Each gateway contains a Cloud Router and a pair of VPN tunnels.

Create the HA VPN gateway for the aiml-vpc VPC network:

gcloud compute vpn-gateways create aiml-vpn-gw \
    --network=aiml-vpc \
    --region=us-central1

Create the HA VPN gateway for the on-prem-vpc VPC network:

gcloud compute vpn-gateways create on-prem-vpn-gw \
    --network=on-prem-vpc \
    --region=us-central1

In the Google Cloud console, go to the VPN page.

On the VPN page, click the Cloud VPN Gateways tab.
In the list of VPN gateways, verify that there are two gateways and that each one has two IP addresses.

In the Cloud Shell, create a Cloud Router for the aiml-vpc Virtual Private Cloud network:

gcloud compute routers create aiml-cr-us-central1 \
    --region=us-central1 \
    --network=aiml-vpc \
    --asn=65001

Create a Cloud Router for the on-prem-vpc Virtual Private Cloud network:

gcloud compute routers create on-prem-cr-us-central1 \
    --region=us-central1 \
    --network=on-prem-vpc \
    --asn=65002

Create the VPN tunnels for `aiml-vpc`

Create a VPN tunnel called aiml-vpc-tunnel0:

gcloud compute vpn-tunnels create aiml-vpc-tunnel0 \
    --peer-gcp-gateway on-prem-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [ZzTLxKL8fmRykwNDfCvEFIjmlYLhMucH] \
    --router aiml-cr-us-central1 \
    --vpn-gateway aiml-vpn-gw \
    --interface 0

Create a VPN tunnel called aiml-vpc-tunnel1:

gcloud compute vpn-tunnels create aiml-vpc-tunnel1 \
    --peer-gcp-gateway on-prem-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [bcyPaboPl8fSkXRmvONGJzWTrc6tRqY5] \
    --router aiml-cr-us-central1 \
    --vpn-gateway aiml-vpn-gw \
    --interface 1

Create the VPN tunnels for `on-prem-vpc`

Create a VPN tunnel called on-prem-tunnel0:

gcloud compute vpn-tunnels create on-prem-tunnel0 \
    --peer-gcp-gateway aiml-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [ZzTLxKL8fmRykwNDfCvEFIjmlYLhMucH] \
    --router on-prem-cr-us-central1 \
    --vpn-gateway on-prem-vpn-gw \
    --interface 0

Create a VPN tunnel called on-prem-tunnel1:

gcloud compute vpn-tunnels create on-prem-tunnel1 \
    --peer-gcp-gateway aiml-vpn-gw \
    --region us-central1 \
    --ike-version 2 \
    --shared-secret [bcyPaboPl8fSkXRmvONGJzWTrc6tRqY5] \
    --router on-prem-cr-us-central1 \
    --vpn-gateway on-prem-vpn-gw \
    --interface 1

In the Google Cloud console, go to the VPN page.

On the VPN page, click the Cloud VPN Tunnels tab.
In the list of VPN tunnels, verify that four VPN tunnels have been established.

Establish BGP sessions

Cloud Router uses Border Gateway Protocol (BGP) to exchange routes between your VPC network (in this case, aiml-vpc) and your on-premises network (represented by on-prem-vpc). On Cloud Router, you configure an interface and a BGP peer for your on-premises router. The interface and BGP peer configuration together form a BGP session. In this section, you create two BGP sessions for aiml-vpc and two for on-prem-vpc.
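
Each BGP session depends on the two routers agreeing on a shared link-local /30 range: the interface address on one side must be the peer address on the other. The following Python sketch models the four sessions configured in the subsections that follow and checks that pairing. It is an illustrative planning aid under stated assumptions, not part of the tutorial's required steps; the function name `check_sessions` and the tuple layout are hypothetical.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

# (router, interface name, interface IP, peer IP, peer ASN), as used in this tutorial.
SESSIONS = [
    ("aiml-cr-us-central1", "if-tunnel0-to-onprem", "169.254.1.1", "169.254.1.2", 65002),
    ("aiml-cr-us-central1", "if-tunnel1-to-onprem", "169.254.2.1", "169.254.2.2", 65002),
    ("on-prem-cr-us-central1", "if-tunnel0-to-aiml-vpc", "169.254.1.2", "169.254.1.1", 65001),
    ("on-prem-cr-us-central1", "if-tunnel1-to-aiml-vpc", "169.254.2.2", "169.254.2.1", 65001),
]


def check_sessions(sessions, mask_length=30):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    peer_of = {local: peer for _, _, local, peer, _ in sessions}
    for router, interface, local, peer, _ in sessions:
        link = ipaddress.ip_interface(f"{local}/{mask_length}").network
        if not link.subnet_of(LINK_LOCAL):
            problems.append(f"{router}/{interface}: {link} is not link-local")
        if ipaddress.ip_address(peer) not in link:
            problems.append(f"{router}/{interface}: peer {peer} is outside {link}")
        if peer_of.get(peer) != local:
            problems.append(f"{router}/{interface}: no mirrored session for peer {peer}")
    return problems


print(check_sessions(SESSIONS))  # prints [] when every session has a matching peer
```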

Establish BGP sessions for `aiml-vpc`

In the Cloud Shell, create the first BGP interface:

gcloud compute routers add-interface aiml-cr-us-central1 \
    --interface-name if-tunnel0-to-onprem \
    --ip-address 169.254.1.1 \
    --mask-length 30 \
    --vpn-tunnel aiml-vpc-tunnel0 \
    --region us-central1

Create the first BGP peer:

gcloud compute routers add-bgp-peer aiml-cr-us-central1 \
    --peer-name bgp-on-premises-tunnel0 \
    --interface if-tunnel0-to-onprem \
    --peer-ip-address 169.254.1.2 \
    --peer-asn 65002 \
    --region us-central1

Create the second BGP interface:

gcloud compute routers add-interface aiml-cr-us-central1 \
    --interface-name if-tunnel1-to-onprem \
    --ip-address 169.254.2.1 \
    --mask-length 30 \
    --vpn-tunnel aiml-vpc-tunnel1 \
    --region us-central1

Create the second BGP peer:

gcloud compute routers add-bgp-peer aiml-cr-us-central1 \
    --peer-name bgp-on-premises-tunnel1 \
    --interface if-tunnel1-to-onprem \
    --peer-ip-address 169.254.2.2 \
    --peer-asn 65002 \
    --region us-central1

Establish BGP sessions for `on-prem-vpc`

Create the first BGP interface:

gcloud compute routers add-interface on-prem-cr-us-central1 \
    --interface-name if-tunnel0-to-aiml-vpc \
    --ip-address 169.254.1.2 \
    --mask-length 30 \
    --vpn-tunnel on-prem-tunnel0 \
    --region us-central1

Create the first BGP peer:

gcloud compute routers add-bgp-peer on-prem-cr-us-central1 \
    --peer-name bgp-aiml-vpc-tunnel0 \
    --interface if-tunnel0-to-aiml-vpc \
    --peer-ip-address 169.254.1.1 \
    --peer-asn 65001 \
    --region us-central1

Create the second BGP interface:

gcloud compute routers add-interface on-prem-cr-us-central1 \
    --interface-name if-tunnel1-to-aiml-vpc \
    --ip-address 169.254.2.2 \
    --mask-length 30 \
    --vpn-tunnel on-prem-tunnel1 \
    --region us-central1

Create the second BGP peer:

gcloud compute routers add-bgp-peer on-prem-cr-us-central1 \
    --peer-name bgp-aiml-vpc-tunnel1 \
    --interface if-tunnel1-to-aiml-vpc \
    --peer-ip-address 169.254.2.1 \
    --peer-asn 65001 \
    --region us-central1

Validate BGP session creation

In the Google Cloud console, go to the VPN page.

On the VPN page, click the Cloud VPN Tunnels tab.
In the list of VPN tunnels, you should now see that the value in the BGP session status column for each of the four tunnels has changed from Configure BGP session to BGP established. You may need to refresh the Google Cloud console browser tab to see the new values.

Validate that `aiml-vpc` has learned subnet routes over HA VPN

In the Google Cloud console, go to the VPC networks page.

In the list of VPC networks, click aiml-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the aiml-vpc VPC network has learned routes from the on-prem-vpc VPC network's nat-subnet (192.168.10.0/28) and private-ip-subnet (192.168.20.0/28) subnets.

Validate that `on-prem-vpc` has learned subnet routes over HA VPN

In the Google Cloud console, go to the VPC networks page.

In the list of VPC networks, click on-prem-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the on-prem-vpc VPC network has learned routes from the aiml-vpc VPC network's workbench-subnet subnet (172.16.10.0/28).

Create a custom advertised route for `aiml-vpc`

The Private Service Connect endpoint IP address is not automatically advertised by the aiml-cr-us-central1 Cloud Router because the subnet is not configured in the VPC network.

Therefore, you need to create a custom advertised route on the aiml-cr-us-central1 Cloud Router for the endpoint IP address 100.100.10.10, so that the address is advertised over BGP to the on-prem-vpc network.
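
The reasoning above can be checked with a few lines of Python: the endpoint address does not fall inside any subnet range of aiml-vpc, so a subnet-based advertisement never covers it. This is a minimal sketch that uses the standard `ipaddress` module; the helper name `is_covered_by_subnet` is an assumption for illustration.

```python
import ipaddress

PSC_ENDPOINT_IP = ipaddress.ip_address("100.100.10.10")

# The only subnet created in aiml-vpc in this tutorial (workbench-subnet).
AIML_SUBNETS = [ipaddress.ip_network("172.16.10.0/28")]


def is_covered_by_subnet(address, subnets):
    """Return True if the address falls inside any of the subnet ranges."""
    return any(address in subnet for subnet in subnets)


print(is_covered_by_subnet(PSC_ENDPOINT_IP, AIML_SUBNETS))  # False

# Written as a prefix, a single IPv4 address is a /32.
print(ipaddress.ip_network(f"{PSC_ENDPOINT_IP}/32"))  # 100.100.10.10/32
```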

In the Google Cloud console, go to the Cloud Routers page.

In the Cloud Router list, click aiml-cr-us-central1.
On the Router details page, click Edit.
In the Advertised routes section, for Routes, select Create custom routes.
Click Add a custom route.
For Source, select Custom IP range.
For IP address range, enter 100.100.10.10.
For Description, enter Private Service Connect Endpoint IP.
Click Done, and then click Save.

Validate that `on-prem-vpc` has learned the PSC endpoint IP address over HA VPN

In the Google Cloud console, go to the VPC networks page.

In the list of VPC networks, click on-prem-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the on-prem-vpc VPC network has learned the PSC endpoint's IP address (100.100.10.10).

Create a custom advertised route for `on-prem-vpc`

The on-prem-vpc Cloud Router advertises all subnets by default, but only the private-ip-subnet subnet is needed.

In the following steps, you update the route advertisements from the on-prem-cr-us-central1 Cloud Router.

In the Google Cloud console, go to the Cloud Routers page.

In the Cloud Router list, click on-prem-cr-us-central1.
On the Router details page, click Edit.
In the Advertised routes section, for Routes, select Create custom routes.
If the Advertise all subnets visible to the Cloud Router checkbox is selected, clear it.
Click Add a custom route.
For Source, select Custom IP range.
For IP address range, enter 192.168.20.0/28.
For Description, enter Private Service Connect Endpoint IP subnet (private-ip-subnet).
Click Done, and then click Save.

Validate that `aiml-vpc` has learned the `private-ip-subnet` route from the `on-prem-vpc`

In the Google Cloud console, go to the VPC networks page.

In the list of VPC networks, click aiml-vpc.
Click the Routes tab.
Select us-central1 (Iowa) in the Region list and click View.
In the Destination IP range column, verify that the aiml-vpc VPC network has learned the private-ip-subnet route (192.168.20.0/28).

Create the test VM instances

Create a user-managed service account

If you have applications that need to call Google Cloud APIs, Google recommends that you attach a user-managed service account to the VM on which the application or workload is running. Accordingly, in this section you create a user-managed service account to be applied to the VM instances that you create later in this tutorial.

In the Cloud Shell, create the service account:

gcloud iam service-accounts create gce-vertex-sa \
    --description="service account for vertex" \
    --display-name="gce-vertex-sa"

Assign the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role to the service account:

gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:gce-vertex-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/compute.instanceAdmin.v1"

Assign the Vertex AI User (roles/aiplatform.user) IAM role to the service account:

gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:gce-vertex-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

Create the test VM instances

In this step you create test VM instances to validate different methods to reach Vertex AI APIs, specifically:

The nat-client instance uses Cloud NAT to access the online prediction endpoint over the public internet.
The private-client instance uses the Private Service Connect IP address 100.100.10.10 to access the online prediction endpoint over HA VPN.

To allow Identity-Aware Proxy (IAP) to connect to your VM instances, you create a firewall rule that:

Applies to all VM instances that you want to make accessible through IAP.
Allows TCP traffic through port 22 from the IP range 35.235.240.0/20. This range contains all IP addresses that IAP uses for TCP forwarding.
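
To make the rule's match condition concrete, the following Python sketch tests whether a connection would be admitted by a rule with this source range and port. It models only the range and port check described above (TCP is assumed) and is not how Google Cloud evaluates firewall rules internally; the function name `allowed_by_iap_rule` is hypothetical.

```python
import ipaddress

# Source range that IAP uses for TCP forwarding, as stated above.
IAP_TCP_FORWARDING_RANGE = ipaddress.ip_network("35.235.240.0/20")
SSH_PORT = 22


def allowed_by_iap_rule(source_ip, port):
    """Return True if a TCP connection matches the IAP SSH rule (illustrative only)."""
    return port == SSH_PORT and ipaddress.ip_address(source_ip) in IAP_TCP_FORWARDING_RANGE


# First and last addresses covered by the /20 range.
print(IAP_TCP_FORWARDING_RANGE[0], IAP_TCP_FORWARDING_RANGE[-1])  # 35.235.240.0 35.235.255.255

print(allowed_by_iap_rule("35.235.241.7", 22))   # True
print(allowed_by_iap_rule("192.168.20.5", 22))   # False
```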

Create the nat-client VM instance:

gcloud compute instances create nat-client \
    --zone=us-central1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=nat-subnet \
    --service-account=gce-vertex-sa@$projectid.iam.gserviceaccount.com \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --metadata startup-script="#! /bin/bash
        sudo apt-get update
        sudo apt-get install tcpdump dnsutils -y"

Create the private-client VM instance:

gcloud compute instances create private-client \
    --zone=us-central1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=private-ip-subnet \
    --service-account=gce-vertex-sa@$projectid.iam.gserviceaccount.com \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --metadata startup-script="#! /bin/bash
        sudo apt-get update
        sudo apt-get install tcpdump dnsutils -y"

Create the IAP firewall rule:

gcloud compute firewall-rules create ssh-iap-on-prem-vpc \
    --network on-prem-vpc \
    --allow tcp:22 \
    --source-ranges=35.235.240.0/20

Create a Vertex AI Workbench instance

Create a user-managed service account for Vertex AI Workbench

When you create a Vertex AI Workbench instance, Google strongly recommends that you specify a user-managed service account instead of using the Compute Engine default service account. If your organization doesn't enforce the iam.automaticIamGrantsForDefaultServiceAccounts organization policy constraint, the Compute Engine default service account (and thus anyone you specify as an instance user) is granted the Editor role (roles/editor) on your Google Cloud project. To turn off this behavior, see Disable automatic role grants for default service accounts.

In the Cloud Shell, create a service account named workbench-sa:

gcloud iam service-accounts create workbench-sa \
    --display-name="workbench-sa"

Assign the Storage Admin (roles/storage.admin) IAM role to the service account:

gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/storage.admin"

Assign the Vertex AI User (roles/aiplatform.user) IAM role to the service account:

gcloud projects add-iam-policy-binding $projectid \
     --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
     --role="roles/aiplatform.user"

Assign the Artifact Registry Administrator (roles/artifactregistry.admin) IAM role to the service account:

gcloud projects add-iam-policy-binding $projectid \
    --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
    --role="roles/artifactregistry.admin"

Create the Vertex AI Workbench instance

In Cloud Shell, create a Vertex AI Workbench instance, specifying the workbench-sa service account:

gcloud workbench instances create workbench-tutorial \
  --vm-image-project=deeplearning-platform-release \
  --vm-image-family=common-cpu-notebooks \
  --machine-type=n1-standard-4 \
  --location=us-central1-a \
  --subnet-region=us-central1 \
  --shielded-secure-boot=True \
  --subnet=workbench-subnet \
  --disable-public-ip \
  --service-account-email=workbench-sa@$projectid.iam.gserviceaccount.com

Create and deploy an online prediction model

Prepare your environment

In the Google Cloud console, go to the Instances tab on the Vertex AI Workbench page.

Next to your Vertex AI Workbench instance's name (workbench-tutorial), click Open JupyterLab.

Your Vertex AI Workbench instance opens JupyterLab.

In the rest of this section, up to and including model deployment, you work in JupyterLab, not the Google Cloud console or the Cloud Shell.
Select File > New > Terminal.
In the JupyterLab terminal (not the Cloud Shell), define an environment variable for your project. Replace PROJECT_ID with your project ID:
```
PROJECT_ID=PROJECT_ID
```
Create a new directory called cpr-codelab and cd into it (still in the JupyterLab terminal):
```
mkdir cpr-codelab
cd cpr-codelab
```
In the File Browser, double-click the new cpr-codelab folder.

If this folder doesn't appear in the file browser, refresh the Google Cloud console browser tab, and try again.
Select File > New > Notebook.
From the Select Kernel menu, select Python [conda env:base] * (Local) and click Select.
Rename your new notebook file as follows:

In the File Browser, right-click the Untitled.ipynb file icon, select Rename, and enter task.ipynb.

Your cpr-codelab directory should now look like this:
```
+ cpr-codelab/
   + task.ipynb
```
In the following steps, you create your model in the JupyterLab notebook by creating new notebook cells, pasting code into them, and running the cells.
Install dependencies as follows.
1. When you open your new notebook, there is a default code cell where you can enter code. It looks like [ ]: followed by a text field. That text field is where you paste your code.
  
  Paste the following code into the cell and click Run the selected cells and advance to create a requirements.txt file to be used as input to the following step:
```
%%writefile requirements.txt
fastapi
uvicorn==0.17.6
joblib~=1.1.1
numpy>=1.17.3, <1.24.0
scikit-learn>=1.2.2
pandas
google-cloud-storage>=2.2.1,<3.0.0dev
google-cloud-aiplatform[prediction]>=1.18.2
```
2. In this step and each of the following ones, add a code cell by clicking Insert a cell below, paste the code into the cell, and then click Run the selected cells and advance.
  
  Use pip to install the dependencies in the notebook instance:
```
!pip install -U --user -r requirements.txt
```
3. When installation is complete, select Kernel > Restart kernel to restart the kernel and ensure that the library is available for import.
4. Paste the following code into a new notebook cell to create the directories to store the model and preprocessing artifacts:
```
USER_SRC_DIR = "src_dir"
!mkdir $USER_SRC_DIR
!mkdir model_artifacts
# copy the requirements to the source dir
!cp requirements.txt $USER_SRC_DIR/requirements.txt
```
In the File Browser, your cpr-codelab directory structure should now look like this:
```
+ cpr-codelab/
  + model_artifacts/
  + src_dir/
     + requirements.txt
  + requirements.txt
  + task.ipynb
```

Train the model

Continue adding code cells to the task.ipynb notebook, and paste in and run the following code in each new cell:

Import the libraries:

import seaborn as sns
import numpy as np
import pandas as pd

from sklearn import preprocessing
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.compose import make_column_transformer

import joblib
import logging

# set logging to see the docker container logs
logging.basicConfig(level=logging.INFO)

Define the following variables, replacing PROJECT_ID with your project ID:

REGION = "us-central1"
MODEL_ARTIFACT_DIR = "sklearn-model-artifacts"
REPOSITORY = "diamonds"
IMAGE = "sklearn-image"
MODEL_DISPLAY_NAME = "diamonds-cpr"
PROJECT_ID = "PROJECT_ID"
BUCKET_NAME = "gs://PROJECT_ID-cpr-bucket"
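
These variables are combined later in the notebook into a bucket URI, a model artifact URI, and a container image URI. The following sketch shows how the strings compose, so you can confirm the resulting names before creating any resources. The helper name `derive_names` is an assumption for illustration; the string patterns are the ones used by later cells in this tutorial.

```python
def derive_names(
    project_id,
    region="us-central1",
    repository="diamonds",
    image="sklearn-image",
    model_artifact_dir="sklearn-model-artifacts",
):
    """Return the derived resource names used later in the notebook (hypothetical helper)."""
    bucket = f"gs://{project_id}-cpr-bucket"
    return {
        "bucket": bucket,
        "artifact_uri": f"{bucket}/{model_artifact_dir}",
        "image_uri": f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}",
    }


# "my-project" is a placeholder project ID.
for key, value in derive_names("my-project").items():
    print(f"{key}: {value}")
```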

Create a Cloud Storage bucket:

!gcloud storage buckets create $BUCKET_NAME --location=us-central1

Load the data from the seaborn library and then create two data frames, one with the features and the other with the label:

data = sns.load_dataset('diamonds', cache=True, data_home=None)

label = 'price'

y_train = data['price']
x_train = data.drop(columns=['price'])

Look at the training data and verify that each row represents a diamond.
```
x_train.head()
```
Look at the labels, which are the corresponding prices.
```
y_train.head()
```

Define a sklearn column transform to one hot encode the categorical features and scale the numerical features:

column_transform = make_column_transformer(
   (preprocessing.OneHotEncoder(), [1,2,3]),
   (preprocessing.StandardScaler(), [0,4,5,6,7,8]))
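
The transformer selects columns by position, so it helps to see which features those positions refer to. The following sketch maps the indices to names. The column order is an assumption based on the seaborn diamonds dataset after the price column is dropped; it matches the order of the test sample used later in this section.

```python
# Assumed feature order of x_train (diamonds dataset without the price column).
FEATURE_COLUMNS = ["carat", "cut", "color", "clarity", "depth", "table", "x", "y", "z"]

CATEGORICAL_IDX = [1, 2, 3]        # passed to OneHotEncoder
NUMERIC_IDX = [0, 4, 5, 6, 7, 8]   # passed to StandardScaler

categorical = [FEATURE_COLUMNS[i] for i in CATEGORICAL_IDX]
numeric = [FEATURE_COLUMNS[i] for i in NUMERIC_IDX]

# Every column is handled by exactly one of the two transformers.
all_covered = sorted(CATEGORICAL_IDX + NUMERIC_IDX) == list(range(len(FEATURE_COLUMNS)))

print(categorical)  # ['cut', 'color', 'clarity']
print(numeric)      # ['carat', 'depth', 'table', 'x', 'y', 'z']
print(all_covered)  # True
```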

Define the random forest model:

regr = RandomForestRegressor(max_depth=10, random_state=0)

Make a sklearn pipeline. This pipeline takes input data, encodes and scales it, and passes it to the model.
```
my_pipeline = make_pipeline(column_transform, regr)
```
Train the model:
```
my_pipeline.fit(x_train, y_train)
```
Call the predict method on the model, passing in a test sample.
```
my_pipeline.predict([[0.23, 'Ideal', 'E', 'SI2', 61.5, 55.0, 3.95, 3.98, 2.43]])
```
You might see warnings that begin with "X does not have valid feature names, but"; you can ignore them.

Save the pipeline to the model_artifacts directory and copy it to your Cloud Storage bucket:

joblib.dump(my_pipeline, 'model_artifacts/model.joblib')

!gcloud storage cp model_artifacts/model.joblib {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/

Save a preprocessing artifact

Create a preprocessing artifact. This artifact will be loaded into the custom container when the model server starts up. Your preprocessing artifact can be of almost any form (such as a pickle file), but in this case you'll write a dictionary to a JSON file:
```
clarity_dict={"Flawless": "FL",
   "Internally Flawless": "IF",
   "Very Very Slightly Included": "VVS1",
   "Very Slightly Included": "VS2",
   "Slightly Included": "SI2",
   "Included": "I3"}
```

The clarity feature in the training data was always in abbreviated form (that is, "FL" instead of "Flawless"). At serving time, you want to check that the data for this feature is also abbreviated, because the model knows how to one-hot encode "FL" but not "Flawless". You'll write this custom preprocessing logic later. For now, save this lookup table to a JSON file and then write it to your Cloud Storage bucket:
```
import json
with open("model_artifacts/preprocessor.json", "w") as f:
   json.dump(clarity_dict, f)

!gcloud storage cp model_artifacts/preprocessor.json {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/
```
In the File Browser, your directory structure should now look like this:
```
+ cpr-codelab/
   + model_artifacts/
      + model.joblib
      + preprocessor.json
   + src_dir/
      + requirements.txt
   + requirements.txt
   + task.ipynb
```

Build a custom serving container using the CPR model server

In your notebook, paste in and run the following code to subclass SklearnPredictor and write it to a Python file in the src_dir/ directory. This example customizes only the load, preprocess, and postprocess methods, not the predict method.

%%writefile $USER_SRC_DIR/predictor.py

import joblib
import numpy as np
import json

from google.cloud import storage
from google.cloud.aiplatform.prediction.sklearn.predictor import SklearnPredictor

class CprPredictor(SklearnPredictor):

 def __init__(self):
     return

 def load(self, artifacts_uri: str) -> None:
     """Loads the sklearn pipeline and preprocessing artifact."""

     super().load(artifacts_uri)

     # open preprocessing artifact
     with open("preprocessor.json", "rb") as f:
         self._preprocessor = json.load(f)

 def preprocess(self, prediction_input: np.ndarray) -> np.ndarray:
     """Performs preprocessing by checking if clarity feature is in abbreviated form."""

     inputs = super().preprocess(prediction_input)

     for sample in inputs:
         if sample[3] not in self._preprocessor.values():
             sample[3] = self._preprocessor[sample[3]]
     return inputs

 def postprocess(self, prediction_results: np.ndarray) -> dict:
     """Performs postprocessing by rounding predictions and converting to str."""

     return {"predictions": [f"${value}" for value in np.round(prediction_results)]}
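
To see what the preprocess step does without the Vertex AI SDK, the following standalone sketch applies the same lookup rule to plain Python lists. It is an illustration under stated assumptions, not the predictor itself: the function name `normalize_clarity` is hypothetical, and the lookup table is a three-entry excerpt of the preprocessing artifact.

```python
# Excerpt of the lookup table saved in preprocessor.json (full name -> abbreviation).
CLARITY_LOOKUP = {
    "Flawless": "FL",
    "Internally Flawless": "IF",
    "Very Slightly Included": "VS2",
}


def normalize_clarity(instances, lookup, clarity_index=3):
    """Return copies of the instances with the clarity value abbreviated.

    A value that is neither a known abbreviation nor a known full name
    raises KeyError, which mirrors the dictionary lookup in the predictor.
    """
    abbreviations = set(lookup.values())
    normalized = []
    for sample in instances:
        sample = list(sample)  # copy so the caller's data is not modified
        if sample[clarity_index] not in abbreviations:
            sample[clarity_index] = lookup[sample[clarity_index]]
        normalized.append(sample)
    return normalized


instances = [
    [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
    [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11],
]
result = normalize_clarity(instances, CLARITY_LOOKUP)
print([sample[3] for sample in result])  # ['VS2', 'IF']
```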

Use the Vertex AI SDK for Python to build the image using custom prediction routines. The Dockerfile is generated and an image is built for you.

from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)

import os

from google.cloud.aiplatform.prediction import LocalModel

from src_dir.predictor import CprPredictor  # Should be path of variable $USER_SRC_DIR

local_model = LocalModel.build_cpr_model(
   USER_SRC_DIR,
   f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
   predictor=CprPredictor,
   requirements_path=os.path.join(USER_SRC_DIR, "requirements.txt"),
)

Write a test file with two samples for prediction. One of the instances has the abbreviated clarity name, but the other needs to be converted first.

import json

sample = {"instances": [
   [0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43],
   [0.29, 'Premium', 'J', 'Internally Flawless', 52.5, 49.0, 4.00, 2.13, 3.11]]}

with open('instances.json', 'w') as fp:
   json.dump(sample, fp)

Test the container locally by deploying a local model.

with local_model.deploy_to_local_endpoint(
   artifact_uri = 'model_artifacts/', # local path to artifacts
) as local_endpoint:
   predict_response = local_endpoint.predict(
      request_file='instances.json',
      headers={"Content-Type": "application/json"},
   )

   health_check_response = local_endpoint.run_health_check()

You can see the prediction results with:
```
predict_response.content
```
The output looks like the following:
```
b'{"predictions": ["$479.0", "$586.0"]}'
```
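
The response content is a JSON document encoded as bytes. A minimal sketch of decoding it follows, using the sample output shown above. The helper name `extract_predictions` is an assumption made for illustration.

```python
import json


def extract_predictions(content: bytes) -> list:
    """Decode a prediction response body and return its predictions list."""
    payload = json.loads(content)
    return payload["predictions"]


# Sample bytes copied from the output above.
sample_content = b'{"predictions": ["$479.0", "$586.0"]}'

predictions = extract_predictions(sample_content)
print(predictions)

# Strip the dollar sign if numeric values are needed downstream.
prices = [float(p.lstrip("$")) for p in predictions]
print(prices)
```

In the notebook, you would pass `predict_response.content` instead of `sample_content`.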

Deploy the model to the online prediction model endpoint

Now that you've tested the container locally, it's time to push the image to Artifact Registry and upload the model to Vertex AI Model Registry.

Create a Docker repository in Artifact Registry, and then configure Docker to access it.

!gcloud artifacts repositories create {REPOSITORY} \
    --repository-format=docker \
    --location={REGION} \
    --description="Docker repository"

!gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet

Push the image.
```
local_model.push_image()
```

Upload the model.

model = aiplatform.Model.upload(
    local_model=local_model,
    display_name=MODEL_DISPLAY_NAME,
    artifact_uri=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}",
)

Deploy the model:
```
endpoint = model.deploy(machine_type="n1-standard-2")
```
Wait until your model deploys before you continue to the next step. Expect deployment to take about 10 to 15 minutes.

Test the deployed model by getting a prediction:

endpoint.predict(instances=[[0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43]])

The output looks like the following:

Prediction(predictions=['$479.0'], deployed_model_id='3171115779319922688', metadata=None, model_version_id='1', model_resource_name='projects/721032480027/locations/us-central1/models/8554949231515795456', explanations=None)

Validate public internet access to Vertex AI APIs

In this section, you log into the nat-client VM instance in one Cloud Shell session tab and use another session tab to validate connectivity to Vertex AI APIs by running the dig and tcpdump commands against the domain us-central1-aiplatform.googleapis.com.

In the Cloud Shell (Tab One), run the following commands, replacing PROJECT_ID with your project ID:
```
projectid=PROJECT_ID
gcloud config set project ${projectid}
```

Log into the nat-client VM instance using IAP:

gcloud compute ssh nat-client \
    --project=$projectid \
    --zone=us-central1-a \
    --tunnel-through-iap

Run the dig command:

dig us-central1-aiplatform.googleapis.com

From the nat-client VM (Tab One), run the following command to validate DNS resolution when you send an online prediction request to the endpoint.
```
sudo tcpdump -i any port 53 -n
```
Open a new Cloud Shell session (Tab Two) by clicking open a new tab in Cloud Shell.
In the new Cloud Shell session (Tab Two), run the following commands, replacing PROJECT_ID with your project ID:
```
projectid=PROJECT_ID
gcloud config set project ${projectid}
```

Log into the nat-client VM instance:

gcloud compute ssh --zone "us-central1-a" "nat-client" --project "$projectid"

From the nat-client VM (Tab Two), use a text editor such as vim or nano to create an instances.json file. You need to prepend sudo in order to have permission to write to the file, for example:
```
sudo vim instances.json
```

Add the following data string to the file:

{"instances": [
   [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
   [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}

Save the file as follows:
- If you're using vim, press the Esc key, and then type :wq to save the file and exit.
- If you're using nano, type Control+O and press Enter to save the file, and then type Control+X to exit.
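
The request body must be strict JSON, which requires double-quoted strings. Single-quoted strings are valid in a Python literal but not in a JSON file. As an optional check, the following sketch serializes the sample with `json.dumps` and confirms that the result parses. The helper `is_valid_json` is an assumption made for illustration, and the sketch assumes `python3` is available on the VM.

```python
import json

SAMPLE = {
    "instances": [
        [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
        [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11],
    ]
}


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


body = json.dumps(SAMPLE)
print(body)

# A Python-style literal with single quotes is rejected by a JSON parser.
single_quoted = "{'instances': [[0.23, 'Ideal']]}"
print(is_valid_json(body), is_valid_json(single_quoted))
```

If you save the sketch as a script, redirecting its first output line to a file is one way to produce `instances.json` without typing the data by hand.
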
Locate the ID of the online prediction endpoint, which you use in the prediction request:
1. In the Google Cloud console, in the Vertex AI section, go to the Endpoints tab in the Online prediction page.
2. Find the row of the endpoint that you created, named diamonds-cpr_endpoint.
3. Locate the 19-digit endpoint ID in the ID column and copy it.
In the Cloud Shell, from the nat-client VM (Tab Two), run the following commands, replacing PROJECT_ID with your project ID and ENDPOINT_ID with the endpoint ID that you copied:
```
projectid=PROJECT_ID
gcloud config set project ${projectid}
ENDPOINT_ID=ENDPOINT_ID
```

From the nat-client VM (Tab Two), run the following command to send an online prediction request:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict" \
    -d @instances.json

After you run the prediction, the tcpdump results (Tab One) show the nat-client VM instance (192.168.10.2) sending a Cloud DNS query to the local DNS server (169.254.169.254) for the Vertex AI API domain (us-central1-aiplatform.googleapis.com). The DNS query returns public virtual IP addresses (VIPs) for Vertex AI APIs.
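
The same request can be assembled in Python, which makes the structure of the predict URL explicit. This is a minimal sketch using only the standard library. The function name `build_predict_request` and the placeholder project, endpoint, and token values are assumptions, and the request is built but not sent.

```python
import json
import urllib.request


def build_predict_request(project_id, endpoint_id, access_token, instances,
                          region="us-central1"):
    """Build (but do not send) a POST request to the endpoint's :predict method."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your project ID, endpoint ID, and the token
# printed by `gcloud auth print-access-token`.
request = build_predict_request(
    "my-project",
    "1234567890123456789",
    "ACCESS_TOKEN",
    [[0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43]],
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` and read the response body.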

Validate private access to Vertex AI APIs

In this section, you log into the private-client VM instance using Identity-Aware Proxy in a new Cloud Shell session (Tab Three), and then you validate connectivity to Vertex AI APIs by running the dig command against the Vertex AI domain (us-central1-aiplatform.googleapis.com).

Open a new Cloud Shell session (Tab Three) by clicking open a new tab in Cloud Shell.
In the new Cloud Shell session (Tab Three), run the following commands, replacing PROJECT_ID with your project ID:
```
projectid=PROJECT_ID
gcloud config set project ${projectid}
```

Log into the private-client VM instance using IAP:

gcloud compute ssh private-client \
    --project=$projectid \
    --zone=us-central1-a \
    --tunnel-through-iap

Run the dig command:

dig us-central1-aiplatform.googleapis.com

In the private-client VM instance (Tab Three), use a text editor such as vim or nano to add the following line to the /etc/hosts file. Editing this file requires root privileges, so prepend sudo to the editor command:

100.100.10.10 us-central1-aiplatform.googleapis.com

This line assigns the PSC endpoint's IP address (100.100.10.10) to the fully qualified domain name for the Vertex AI Google API (us-central1-aiplatform.googleapis.com). The edited file should look like this:

127.0.0.1       localhost
::1             localhost ip6-localhost ip6-loopback
ff02::1         ip6-allnodes
ff02::2         ip6-allrouters

100.100.10.10 us-central1-aiplatform.googleapis.com # Added by you
192.168.20.2 private-client.c.$projectid.internal private-client  # Added by Google
169.254.169.254 metadata.google.internal  # Added by Google
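
One way to confirm the override before sending traffic is to parse the hosts file and look up the Vertex AI domain. The following is a minimal sketch. The function `lookup_hosts_entry` is an assumption made for illustration, and the embedded text is an excerpt of the file shown above rather than a read of the real `/etc/hosts`.

```python
def lookup_hosts_entry(hosts_text, hostname):
    """Return the first IP address mapped to hostname in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            return ip
    return None


# Excerpt of the edited file shown above.
HOSTS = """\
127.0.0.1       localhost
::1             localhost ip6-localhost ip6-loopback

100.100.10.10 us-central1-aiplatform.googleapis.com # Added by you
169.254.169.254 metadata.google.internal  # Added by Google
"""

psc_ip = lookup_hosts_entry(HOSTS, "us-central1-aiplatform.googleapis.com")
print(psc_ip)
```

On the VM, you could pass `open("/etc/hosts").read()` instead of the `HOSTS` excerpt.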

From the private-client VM (Tab Three), ping the Vertex AI endpoint, and press Control+C to exit when you see the output:
```
ping us-central1-aiplatform.googleapis.com
```
The ping command returns the following output containing the PSC endpoint IP address:
```
PING us-central1-aiplatform.googleapis.com (100.100.10.10) 56(84) bytes of data.
```
From the private-client VM (Tab Three), run the following tcpdump command to validate DNS resolution and the IP data path when you send an online prediction request to the endpoint:
```
sudo tcpdump -i any -n port 53 or host 100.100.10.10
```
Open a new Cloud Shell session (Tab Four) by clicking open a new tab in Cloud Shell.
In the new Cloud Shell session (Tab Four), run the following commands, replacing PROJECT_ID with your project ID:
```
projectid=PROJECT_ID
gcloud config set project ${projectid}
```

In Tab Four, log into the private-client instance:

gcloud compute ssh \
    --zone "us-central1-a" "private-client" \
    --project "$projectid"

From the private-client VM (Tab Four), using a text editor such as vim or nano, create an instances.json file containing the following data string:

{"instances": [
   [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
   [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}

From the private-client VM (Tab Four), run the following commands, replacing PROJECT_ID with your project ID and ENDPOINT_ID with the online prediction endpoint ID:
```
projectid=PROJECT_ID
echo $projectid
ENDPOINT_ID=ENDPOINT_ID
```

From the private-client VM (Tab Four), run the following command to send an online prediction request:

curl -v -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict" \
    -d @instances.json

From the private-client VM in Cloud Shell (Tab Three), verify that the PSC endpoint IP address (100.100.10.10) was used to access Vertex AI APIs.

From the private-client tcpdump terminal in Cloud Shell Tab Three, you can see that a DNS lookup to us-central1-aiplatform.googleapis.com isn't needed, because the line that you added to the /etc/hosts file takes precedence, and the PSC IP address 100.100.10.10 is used in the data path.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

You can delete the individual resources in the project as follows:

Delete the Vertex AI Workbench instance as follows:
1. In the Google Cloud console, in the Vertex AI section, go to the Instances tab in the Workbench page.
2. Select the workbench-tutorial Vertex AI Workbench instance and click Delete.

Delete the container image as follows:
1. In the Google Cloud console, go to the Artifact Registry page.
2. Select the diamonds Docker container, and click Delete.

Delete the storage bucket as follows:
1. In the Google Cloud console, go to the Cloud Storage page.
2. Select your storage bucket, and click Delete.

Undeploy the model from the endpoint as follows:
1. In the Google Cloud console, in the Vertex AI section, go to the Endpoints page.
2. Click diamonds-cpr_endpoint to go to the endpoint details page.
3. On the row for your model, diamonds-cpr, click Undeploy model.
4. In the Undeploy model from endpoint dialog, click Undeploy.

Delete the model as follows:
1. In the Google Cloud console, in the Vertex AI section, go to the Model Registry page.
2. Select the diamonds-cpr model.
3. To delete the model, click Actions, and then click Delete model.

Delete the online prediction endpoint as follows:
1. In the Google Cloud console, in the Vertex AI section, go to the Online prediction page.
2. Select the diamonds-cpr_endpoint endpoint.
3. To delete the endpoint, click Actions, and then click Delete endpoint.

In the Cloud Shell, delete the remaining resources by running the following commands, replacing PROJECT_ID with your project ID:

projectid=PROJECT_ID
gcloud config set project ${projectid}

gcloud compute forwarding-rules delete pscvertex \
    --global \
    --quiet

gcloud compute addresses delete psc-ip \
    --global \
    --quiet

gcloud compute networks subnets delete workbench-subnet \
    --region=us-central1 \
    --quiet

gcloud compute vpn-tunnels delete aiml-vpc-tunnel0 aiml-vpc-tunnel1 on-prem-tunnel0 on-prem-tunnel1 \
    --region=us-central1 \
    --quiet

gcloud compute vpn-gateways delete aiml-vpn-gw on-prem-vpn-gw \
    --region=us-central1 \
    --quiet

gcloud compute routers nats delete cloud-nat-us-central1 \
    --router=cloud-router-us-central1-aiml-nat \
    --region=us-central1 \
    --quiet

gcloud compute routers delete aiml-cr-us-central1 cloud-router-us-central1-aiml-nat \
    --region=us-central1 \
    --quiet

gcloud compute routers delete cloud-router-us-central1-on-prem-nat on-prem-cr-us-central1 \
    --region=us-central1 \
    --quiet

gcloud compute instances delete nat-client private-client \
    --zone=us-central1-a \
    --quiet

gcloud compute firewall-rules delete ssh-iap-on-prem-vpc \
    --quiet

gcloud compute networks subnets delete nat-subnet private-ip-subnet \
    --region=us-central1 \
    --quiet

gcloud compute networks delete on-prem-vpc \
    --quiet

gcloud compute networks delete aiml-vpc \
    --quiet

gcloud iam service-accounts delete gce-vertex-sa@$projectid.iam.gserviceaccount.com \
    --quiet

gcloud iam service-accounts delete workbench-sa@$projectid.iam.gserviceaccount.com \
    --quiet

What's next

Learn about enterprise networking options for accessing Vertex AI endpoints and services.
Learn how Private Service Connect works and why it offers significant performance benefits.
Learn how to use VPC Service Controls to create secure perimeters that allow or deny access to Vertex AI and other Google APIs for your online prediction endpoint.
Learn how and why to use a DNS forwarding zone instead of updating the /etc/hosts file in large-scale and production environments.
