Use Private Service Connect to access Vertex AI online predictions from on-premises


On-premises hosts can reach a Vertex AI online prediction endpoint either through the public internet or privately through a hybrid networking architecture that uses Private Service Connect (PSC) over Cloud VPN or Cloud Interconnect. Both options offer SSL/TLS encryption. However, the private option offers much better performance and is therefore recommended for critical applications.

In this tutorial, you access an online prediction endpoint in two ways: publicly, through Cloud NAT, and privately, over High-Availability VPN (HA VPN) between two Virtual Private Cloud (VPC) networks. This setup can serve as a basis for multi-cloud and on-premises private connectivity.

This tutorial is intended for enterprise network administrators, data scientists, and researchers who are familiar with Vertex AI, Virtual Private Cloud (VPC), the Google Cloud console, and the Cloud Shell. Familiarity with Vertex AI Workbench is helpful but not required.

Architectural diagram of accessing an online prediction endpoint via Private Service Connect.

Objectives

  • Create two Virtual Private Cloud (VPC) networks, as shown in the preceding diagram:
    • One (on-prem-vpc) represents an on-premises network.
    • The other (aiml-vpc) is for building and deploying a Vertex AI online prediction model.
  • Deploy HA VPN gateways, Cloud VPN tunnels, and Cloud Routers to connect aiml-vpc and on-prem-vpc.
  • Build and deploy a Vertex AI online prediction model.
  • Create a Private Service Connect (PSC) endpoint to forward private online prediction requests to the deployed model.
  • Enable the Cloud Router custom advertisement mode in aiml-vpc to announce routes for the Private Service Connect endpoint to on-prem-vpc.
  • Create two Compute Engine VM instances in on-prem-vpc to represent client applications:
    • One (nat-client) sends online prediction requests over the public internet (through Cloud NAT). This access method is indicated by a red arrow and the number 1 in the diagram.
    • The other (private-client) sends prediction requests privately over HA VPN. This access method is indicated by a green arrow and the number 2.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. In the Google Cloud console, go to the project selector page.

  2. Select or create a Google Cloud project.

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Open Cloud Shell to execute the commands listed in this tutorial. Cloud Shell is an interactive shell environment for Google Cloud that lets you manage your projects and resources from your web browser.
  5. In Cloud Shell, set the current project to your Google Cloud project ID and store the same ID in the projectid shell variable:
      projectid="PROJECT_ID"
      gcloud config set project ${projectid}
    Replace PROJECT_ID with your project ID. If necessary, you can locate your project ID in the Google Cloud console. For more information, see Find your project ID.
  6. Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/appengine.appViewer, roles/artifactregistry.admin, roles/compute.instanceAdmin.v1, roles/compute.networkAdmin, roles/compute.securityAdmin, roles/dns.admin, roles/iap.admin, roles/iap.tunnelResourceAccessor, roles/notebooks.admin, roles/oauthconfig.editor, roles/resourcemanager.projectIamAdmin, roles/servicemanagement.quotaAdmin, roles/iam.serviceAccountAdmin, roles/iam.serviceAccountUser, roles/servicedirectory.editor, roles/storage.admin, roles/aiplatform.user

    gcloud projects add-iam-policy-binding PROJECT_ID --member="USER_IDENTIFIER" --role=ROLE
    • Replace PROJECT_ID with your project ID.
    • Replace USER_IDENTIFIER with the identifier for your user account. For example, user:myemail@example.com.

    • Replace ROLE with each individual role.
  7. Enable the DNS, Artifact Registry, IAM, Compute Engine, Notebooks, and Vertex AI APIs:

    gcloud services enable dns.googleapis.com artifactregistry.googleapis.com iam.googleapis.com compute.googleapis.com notebooks.googleapis.com aiplatform.googleapis.com
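
The role grants in step 6 are repetitive to type by hand. The following Python sketch only builds the command strings from the role list in that step so that you can review them before running them. It does not call gcloud itself. The helper name build_grant_commands and the example values my-project and user:myemail@example.com are placeholders introduced here for illustration.

```python
import shlex

# IAM roles listed in step 6 of "Before you begin".
ROLES = [
    "roles/appengine.appViewer",
    "roles/artifactregistry.admin",
    "roles/compute.instanceAdmin.v1",
    "roles/compute.networkAdmin",
    "roles/compute.securityAdmin",
    "roles/dns.admin",
    "roles/iap.admin",
    "roles/iap.tunnelResourceAccessor",
    "roles/notebooks.admin",
    "roles/oauthconfig.editor",
    "roles/resourcemanager.projectIamAdmin",
    "roles/servicemanagement.quotaAdmin",
    "roles/iam.serviceAccountAdmin",
    "roles/iam.serviceAccountUser",
    "roles/servicedirectory.editor",
    "roles/storage.admin",
    "roles/aiplatform.user",
]


def build_grant_commands(project_id, member, roles=ROLES):
    """Return one add-iam-policy-binding command line per role (hypothetical helper)."""
    return [
        shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}", f"--role={role}",
        ])
        for role in roles
    ]


# Placeholder project ID and user identifier; replace them with your own values.
commands = build_grant_commands("my-project", "user:myemail@example.com")
for command in commands:
    print(command)
```

You can paste the printed lines into Cloud Shell after checking them.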

Create the VPC networks

In this section, you create two VPC networks: one for creating an online prediction model and deploying it to an endpoint, the other for private access to that endpoint. In each of the two VPC networks, you create a Cloud Router and Cloud NAT gateway. A Cloud NAT gateway provides outgoing connectivity for Compute Engine virtual machine (VM) instances without external IP addresses.

Create the VPC network for the online prediction endpoint (aiml-vpc)

  1. Create the VPC network:

    gcloud compute networks create aiml-vpc \
        --project=$projectid \
        --subnet-mode=custom
    
  2. Create a subnet named workbench-subnet, with a primary IPv4 range of 172.16.10.0/28:

    gcloud compute networks subnets create workbench-subnet \
        --project=$projectid \
        --range=172.16.10.0/28 \
        --network=aiml-vpc \
        --region=us-central1 \
        --enable-private-ip-google-access
    
  3. Create a regional Cloud Router named cloud-router-us-central1-aiml-nat:

    gcloud compute routers create cloud-router-us-central1-aiml-nat \
        --network aiml-vpc \
        --region us-central1
    
  4. Add a Cloud NAT gateway to the Cloud Router:

    gcloud compute routers nats create cloud-nat-us-central1 \
        --router=cloud-router-us-central1-aiml-nat \
        --auto-allocate-nat-external-ips \
        --nat-all-subnet-ip-ranges \
        --region us-central1
    

Create the "on-premises" VPC network (on-prem-vpc)

  1. Create the VPC network:

    gcloud compute networks create on-prem-vpc \
        --project=$projectid \
        --subnet-mode=custom
    
  2. Create a subnet named nat-subnet, with a primary IPv4 range of 192.168.10.0/28:

    gcloud compute networks subnets create nat-subnet \
        --project=$projectid \
        --range=192.168.10.0/28 \
        --network=on-prem-vpc \
        --region=us-central1
    
  3. Create a subnet named private-ip-subnet, with a primary IPv4 range of 192.168.20.0/28:

    gcloud compute networks subnets create private-ip-subnet \
        --project=$projectid \
        --range=192.168.20.0/28 \
        --network=on-prem-vpc \
        --region=us-central1
    
  4. Create a regional Cloud Router named cloud-router-us-central1-on-prem-nat:

    gcloud compute routers create cloud-router-us-central1-on-prem-nat \
        --network on-prem-vpc \
        --region us-central1
    
  5. Add a Cloud NAT gateway to the Cloud Router:

    gcloud compute routers nats create cloud-nat-us-central1 \
        --router=cloud-router-us-central1-on-prem-nat \
        --auto-allocate-nat-external-ips \
        --nat-all-subnet-ip-ranges \
        --region us-central1
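
Because the two VPC networks later exchange subnet routes over HA VPN, their subnet ranges should not overlap. As a quick sanity check, the following sketch uses Python's standard ipaddress module to confirm this for the three ranges used in this section. The dictionary keys are descriptive labels chosen for this example only.

```python
import ipaddress
from itertools import combinations

# Primary IPv4 ranges of the subnets created in this section.
SUBNETS = {
    "workbench-subnet (aiml-vpc)": "172.16.10.0/28",
    "nat-subnet (on-prem-vpc)": "192.168.10.0/28",
    "private-ip-subnet (on-prem-vpc)": "192.168.20.0/28",
}

networks = {name: ipaddress.ip_network(cidr) for name, cidr in SUBNETS.items()}

# Collect every pair of subnets whose ranges overlap.
overlaps = [
    (name_a, name_b)
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
    if net_a.overlaps(net_b)
]

for name, net in networks.items():
    print(f"{name}: {net} contains {net.num_addresses} addresses")
print("Overlapping pairs:", overlaps)
```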
    

Create the Private Service Connect (PSC) endpoint

In this section, you create the Private Service Connect (PSC) endpoint that the VM instances in the on-prem-vpc network use to access the online prediction endpoint through the Vertex AI API. The PSC endpoint is an internal IP address in the aiml-vpc network that clients in that network can reach directly and that on-prem-vpc clients reach over HA VPN. You create the endpoint by deploying a forwarding rule that directs network traffic matching the PSC endpoint's IP address to a bundle of Google APIs. In a later step, the aiml-cr-us-central1 Cloud Router advertises the PSC endpoint's IP address (100.100.10.10) to the on-premises network as a custom advertised route.

  1. Reserve an IP address for the PSC endpoint:

    gcloud compute addresses create psc-ip \
        --global \
        --purpose=PRIVATE_SERVICE_CONNECT \
        --addresses=100.100.10.10 \
        --network=aiml-vpc
    
  2. Create the PSC endpoint:

    gcloud compute forwarding-rules create pscvertex \
        --global \
        --network=aiml-vpc \
        --address=psc-ip \
        --target-google-apis-bundle=all-apis
    
  3. List the configured PSC endpoints and verify that the pscvertex endpoint was created:

    gcloud compute forwarding-rules list \
        --filter target="(all-apis OR vpc-sc)" --global
    
  4. Get the details of the configured PSC endpoint and verify that the IP address is 100.100.10.10:

    gcloud compute forwarding-rules describe pscvertex \
        --global
    

Configure hybrid connectivity

In this section, you create two HA VPN gateways that are connected to each other. Each gateway is paired with a Cloud Router and two VPN tunnels.

  1. Create the HA VPN gateway for the aiml-vpc VPC network:

    gcloud compute vpn-gateways create aiml-vpn-gw \
        --network=aiml-vpc \
        --region=us-central1
    
  2. Create the HA VPN gateway for the on-prem-vpc VPC network:

    gcloud compute vpn-gateways create on-prem-vpn-gw \
        --network=on-prem-vpc \
        --region=us-central1
    
  3. In the Google Cloud console, go to the VPN page.

  4. On the VPN page, click the Cloud VPN Gateways tab.

  5. In the list of VPN gateways, verify that there are two gateways and that each one has two IP addresses.

  6. In the Cloud Shell, create a Cloud Router for the aiml-vpc Virtual Private Cloud network:

    gcloud compute routers create aiml-cr-us-central1 \
        --region=us-central1 \
        --network=aiml-vpc \
        --asn=65001
    
  7. Create a Cloud Router for the on-prem-vpc Virtual Private Cloud network:

    gcloud compute routers create on-prem-cr-us-central1 \
        --region=us-central1 \
        --network=on-prem-vpc \
        --asn=65002
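
The two Cloud Routers above use the ASNs 65001 and 65002, and each BGP peer configured later refers to the other router's ASN. The following sketch checks that both values are distinct and fall in the private-use ASN ranges defined in RFC 6996. The function name is_private_asn is a hypothetical helper introduced for this example.

```python
# Private-use ASN ranges defined in RFC 6996 (16-bit and 32-bit).
PRIVATE_ASN_RANGES = [(64512, 65534), (4200000000, 4294967294)]

# ASNs assigned to the Cloud Routers in this section.
ROUTER_ASNS = {
    "aiml-cr-us-central1": 65001,
    "on-prem-cr-us-central1": 65002,
}


def is_private_asn(asn):
    """Return True if the ASN lies in a private-use range (hypothetical helper)."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


for router, asn in ROUTER_ASNS.items():
    print(f"{router}: ASN {asn}, private-use: {is_private_asn(asn)}")

# The two routers must not share an ASN for the sessions in this tutorial.
asns_are_distinct = len(set(ROUTER_ASNS.values())) == len(ROUTER_ASNS)
print("ASNs are distinct:", asns_are_distinct)
```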
    

Create the VPN tunnels for aiml-vpc

  1. Create a VPN tunnel called aiml-vpc-tunnel0:

    gcloud compute vpn-tunnels create aiml-vpc-tunnel0 \
        --peer-gcp-gateway on-prem-vpn-gw \
        --region us-central1 \
        --ike-version 2 \
        --shared-secret [ZzTLxKL8fmRykwNDfCvEFIjmlYLhMucH] \
        --router aiml-cr-us-central1 \
        --vpn-gateway aiml-vpn-gw \
        --interface 0
    
  2. Create a VPN tunnel called aiml-vpc-tunnel1:

    gcloud compute vpn-tunnels create aiml-vpc-tunnel1 \
        --peer-gcp-gateway on-prem-vpn-gw \
        --region us-central1 \
        --ike-version 2 \
        --shared-secret [bcyPaboPl8fSkXRmvONGJzWTrc6tRqY5] \
        --router aiml-cr-us-central1 \
        --vpn-gateway aiml-vpn-gw \
        --interface 1
    

Create the VPN tunnels for on-prem-vpc

  1. Create a VPN tunnel called on-prem-tunnel0:

    gcloud compute vpn-tunnels create on-prem-tunnel0 \
        --peer-gcp-gateway aiml-vpn-gw \
        --region us-central1 \
        --ike-version 2 \
        --shared-secret [ZzTLxKL8fmRykwNDfCvEFIjmlYLhMucH] \
        --router on-prem-cr-us-central1 \
        --vpn-gateway on-prem-vpn-gw \
        --interface 0
    
  2. Create a VPN tunnel called on-prem-tunnel1:

    gcloud compute vpn-tunnels create on-prem-tunnel1 \
        --peer-gcp-gateway aiml-vpn-gw \
        --region us-central1 \
        --ike-version 2 \
        --shared-secret [bcyPaboPl8fSkXRmvONGJzWTrc6tRqY5] \
        --router on-prem-cr-us-central1 \
        --vpn-gateway on-prem-vpn-gw \
        --interface 1
    
  3. In the Google Cloud console, go to the VPN page.

  4. On the VPN page, click the Cloud VPN Tunnels tab.

  5. In the list of VPN tunnels, verify that four VPN tunnels have been established.
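
The tunnel commands above use fixed example values for the shared secret. Outside of a throwaway lab, you would probably generate your own. The following is a minimal sketch that assumes a 32-character alphanumeric secret, like the example values. The function name generate_shared_secret is hypothetical.

```python
import secrets
import string

# Letters and digits only, matching the style of the example secrets above.
ALPHABET = string.ascii_letters + string.digits


def generate_shared_secret(length=32):
    """Return a random alphanumeric pre-shared key (hypothetical helper)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# One secret per tunnel pair; both ends of a pair must use the same value.
secret_tunnel0 = generate_shared_secret()
secret_tunnel1 = generate_shared_secret()
print("tunnel0 secret:", secret_tunnel0)
print("tunnel1 secret:", secret_tunnel1)
```

Use the same generated value for both tunnels of a pair, for example aiml-vpc-tunnel0 and on-prem-tunnel0.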

Establish BGP sessions

Cloud Router uses Border Gateway Protocol (BGP) to exchange routes between your VPC network (in this case, aiml-vpc) and your on-premises network (represented by on-prem-vpc). On Cloud Router, you configure an interface and a BGP peer for your on-premises router. The interface and BGP peer configuration together form a BGP session. In this section, you create two BGP sessions for aiml-vpc and two for on-prem-vpc.
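
Each BGP session in this section uses a pair of link-local addresses from one /30 range: one address on the aiml-vpc router interface and one on the on-prem-vpc router interface. The following sketch uses the addresses from the commands in this section to show that both ends of each session fall in the same /30 network. The helper name session_network is an assumption made for this example.

```python
import ipaddress

MASK_LENGTH = 30

# (aiml-vpc side, on-prem-vpc side) BGP addresses for each tunnel.
BGP_SESSIONS = {
    "tunnel0": ("169.254.1.1", "169.254.1.2"),
    "tunnel1": ("169.254.2.1", "169.254.2.2"),
}


def session_network(local_ip, peer_ip, mask_length=MASK_LENGTH):
    """Return the shared network of both ends, or raise if they differ."""
    local_if = ipaddress.ip_interface(f"{local_ip}/{mask_length}")
    peer_if = ipaddress.ip_interface(f"{peer_ip}/{mask_length}")
    if local_if.network != peer_if.network:
        raise ValueError(f"{local_ip} and {peer_ip} are not in the same /{mask_length}")
    return local_if.network


for name, (aiml_ip, on_prem_ip) in BGP_SESSIONS.items():
    network = session_network(aiml_ip, on_prem_ip)
    usable = [str(host) for host in network.hosts()]
    print(f"{name}: network {network}, usable addresses {usable}")
```

A /30 range has exactly two usable host addresses, which is why each session needs its own range.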

Establish BGP sessions for aiml-vpc

  1. In the Cloud Shell, create the first BGP interface:

    gcloud compute routers add-interface aiml-cr-us-central1 \
        --interface-name if-tunnel0-to-onprem \
        --ip-address 169.254.1.1 \
        --mask-length 30 \
        --vpn-tunnel aiml-vpc-tunnel0 \
        --region us-central1
    
  2. Create the first BGP peer:

    gcloud compute routers add-bgp-peer aiml-cr-us-central1 \
        --peer-name bgp-on-premises-tunnel0 \
        --interface if-tunnel0-to-onprem \
        --peer-ip-address 169.254.1.2 \
        --peer-asn 65002 \
        --region us-central1
    
  3. Create the second BGP interface:

    gcloud compute routers add-interface aiml-cr-us-central1 \
        --interface-name if-tunnel1-to-onprem \
        --ip-address 169.254.2.1 \
        --mask-length 30 \
        --vpn-tunnel aiml-vpc-tunnel1 \
        --region us-central1
    
  4. Create the second BGP peer:

    gcloud compute routers add-bgp-peer aiml-cr-us-central1 \
        --peer-name bgp-on-premises-tunnel1 \
        --interface if-tunnel1-to-onprem \
        --peer-ip-address 169.254.2.2 \
        --peer-asn 65002 \
        --region us-central1
    

Establish BGP sessions for on-prem-vpc

  1. Create the first BGP interface:

    gcloud compute routers add-interface on-prem-cr-us-central1 \
        --interface-name if-tunnel0-to-aiml-vpc \
        --ip-address 169.254.1.2 \
        --mask-length 30 \
        --vpn-tunnel on-prem-tunnel0 \
        --region us-central1
    
  2. Create the first BGP peer:

    gcloud compute routers add-bgp-peer on-prem-cr-us-central1 \
        --peer-name bgp-aiml-vpc-tunnel0 \
        --interface if-tunnel0-to-aiml-vpc \
        --peer-ip-address 169.254.1.1 \
        --peer-asn 65001 \
        --region us-central1
    
  3. Create the second BGP interface:

    gcloud compute routers add-interface on-prem-cr-us-central1 \
        --interface-name if-tunnel1-to-aiml-vpc \
        --ip-address 169.254.2.2 \
        --mask-length 30 \
        --vpn-tunnel on-prem-tunnel1 \
        --region us-central1
    
  4. Create the second BGP peer:

    gcloud compute routers add-bgp-peer on-prem-cr-us-central1 \
        --peer-name bgp-aiml-vpc-tunnel1 \
        --interface if-tunnel1-to-aiml-vpc \
        --peer-ip-address 169.254.2.1 \
        --peer-asn 65001 \
        --region us-central1
    

Validate BGP session creation

  1. In the Google Cloud console, go to the VPN page.

  2. On the VPN page, click the Cloud VPN Tunnels tab.

  3. In the list of VPN tunnels, you should now see that the value in the BGP session status column for each of the four tunnels has changed from Configure BGP session to BGP established. You may need to refresh the Google Cloud console browser tab to see the new values.

Validate that aiml-vpc has learned subnet routes over HA VPN

  1. In the Google Cloud console, go to the VPC networks page.

  2. In the list of VPC networks, click aiml-vpc.

  3. Click the Routes tab.

  4. Select us-central1 (Iowa) in the Region list and click View.

  5. In the Destination IP range column, verify that the aiml-vpc VPC network has learned routes for the on-prem-vpc VPC network's nat-subnet (192.168.10.0/28) and private-ip-subnet (192.168.20.0/28) subnets.

Validate that on-prem-vpc has learned subnet routes over HA VPN

  1. In the Google Cloud console, go to the VPC networks page.

  2. In the list of VPC networks, click on-prem-vpc.

  3. Click the Routes tab.

  4. Select us-central1 (Iowa) in the Region list and click View.

  5. In the Destination IP range column, verify that the on-prem-vpc VPC network has learned a route for the aiml-vpc VPC network's workbench-subnet subnet (172.16.10.0/28).

Create a custom advertised route for aiml-vpc

The aiml-cr-us-central1 Cloud Router doesn't automatically advertise the Private Service Connect endpoint's IP address, because that address doesn't belong to any subnet in the VPC network.

Therefore, you need to create a custom advertised route on the aiml-cr-us-central1 Cloud Router for the endpoint IP address (100.100.10.10). The router advertises this route over BGP to on-prem-vpc, which represents the on-premises environment.

  1. In the Google Cloud console, go to the Cloud Routers page.

  2. In the Cloud Router list, click aiml-cr-us-central1.

  3. On the Router details page, click Edit.

  4. In the Advertised routes section, for Routes, select Create custom routes.

  5. Click Add a custom route.

  6. For Source, select Custom IP range.

  7. For IP address range, enter 100.100.10.10.

  8. For Description, enter Private Service Connect Endpoint IP.

  9. Click Done, and then click Save.
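
The reason the custom route is needed can be checked with a short sketch: the PSC endpoint address 100.100.10.10 lies outside the only subnet in aiml-vpc, so it is not covered by the subnet routes that the Cloud Router advertises by default. The variable names are chosen for this example only.

```python
import ipaddress

PSC_ENDPOINT_IP = ipaddress.ip_address("100.100.10.10")

# Subnets configured in aiml-vpc (workbench-subnet only).
AIML_VPC_SUBNETS = [ipaddress.ip_network("172.16.10.0/28")]

# True only if one of the advertised subnet ranges contains the endpoint IP.
covered_by_subnet_routes = any(
    PSC_ENDPOINT_IP in subnet for subnet in AIML_VPC_SUBNETS
)

# A single address corresponds to a /32 host route.
host_route = ipaddress.ip_network(f"{PSC_ENDPOINT_IP}/32")

print("Covered by subnet routes:", covered_by_subnet_routes)
print("Custom route to advertise:", host_route)
```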

Validate that on-prem-vpc has learned the PSC Endpoint IP Address over HA VPN

  1. In the Google Cloud console, go to the VPC networks page.

  2. In the list of VPC networks, click on-prem-vpc.

  3. Click the Routes tab.

  4. Select us-central1 (Iowa) in the Region list and click View.

  5. In the Destination IP range column, verify that the on-prem-vpc VPC network has learned the PSC endpoint's IP address (100.100.10.10).

Create a custom advertised route for on-prem-vpc

The on-prem-vpc Cloud Router advertises all subnets by default, but only the private-ip-subnet subnet is needed.

In the following steps, you update the route advertisements from the on-prem-cr-us-central1 Cloud Router.

  1. In the Google Cloud console, go to the Cloud Routers page.

  2. In the Cloud Router list, click on-prem-cr-us-central1.

  3. On the Router details page, click Edit.

  4. In the Advertised routes section, for Routes, select Create custom routes.

  5. If the Advertise all subnets visible to the Cloud Router checkbox is selected, clear it.

  6. Click Add a custom route.

  7. For Source, select Custom IP range.

  8. For IP address range, enter 192.168.20.0/28.

  9. For Description, enter Private Service Connect Endpoint IP subnet (private-ip-subnet).

  10. Click Done, and then click Save.

Validate that aiml-vpc has learned the private-ip-subnet route from the on-prem-vpc

  1. In the Google Cloud console, go to the VPC networks page.

  2. In the list of VPC networks, click aiml-vpc.

  3. Click the Routes tab.

  4. Select us-central1 (Iowa) in the Region list and click View.

  5. In the Destination IP range column, verify that the aiml-vpc VPC network has learned the private-ip-subnet route (192.168.20.0/28).

Create the test VM instances

Create a user-managed service account

If you have applications that need to call Google Cloud APIs, Google recommends that you attach a user-managed service account to the VM on which the application or workload is running. Accordingly, in this section you create a user-managed service account to be applied to the VM instances that you create later in this tutorial.

  1. In the Cloud Shell, create the service account:

    gcloud iam service-accounts create gce-vertex-sa \
        --description="service account for vertex" \
        --display-name="gce-vertex-sa"
    
  2. Assign the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role to the service account:

    gcloud projects add-iam-policy-binding $projectid \
        --member="serviceAccount:gce-vertex-sa@$projectid.iam.gserviceaccount.com" \
        --role="roles/compute.instanceAdmin.v1"
    
  3. Assign the Vertex AI User (roles/aiplatform.user) IAM role to the service account:

    gcloud projects add-iam-policy-binding $projectid \
        --member="serviceAccount:gce-vertex-sa@$projectid.iam.gserviceaccount.com" \
        --role="roles/aiplatform.user"
    

Create the test VM instances

In this step you create test VM instances to validate different methods to reach Vertex AI APIs, specifically:

  • The nat-client instance uses Cloud NAT to reach the Vertex AI API and access the online prediction endpoint over the public internet.
  • The private-client instance uses the Private Service Connect IP address 100.100.10.10 to access the online prediction endpoint over HA VPN.

To allow Identity-Aware Proxy (IAP) to connect to your VM instances, you create a firewall rule that:

  • Applies to all VM instances that you want to make accessible through IAP.
  • Allows TCP traffic through port 22 from the IP range 35.235.240.0/20. This range contains all IP addresses that IAP uses for TCP forwarding.
  1. Create the nat-client VM instance:

    gcloud compute instances create nat-client \
        --zone=us-central1-a \
        --image-family=debian-11 \
        --image-project=debian-cloud \
        --subnet=nat-subnet \
        --service-account=gce-vertex-sa@$projectid.iam.gserviceaccount.com \
        --scopes=https://www.googleapis.com/auth/cloud-platform \
        --no-address \
        --metadata startup-script="#! /bin/bash
            sudo apt-get update
            sudo apt-get install tcpdump dnsutils -y"
    
  2. Create the private-client VM instance:

    gcloud compute instances create private-client \
        --zone=us-central1-a \
        --image-family=debian-11 \
        --image-project=debian-cloud \
        --subnet=private-ip-subnet \
        --service-account=gce-vertex-sa@$projectid.iam.gserviceaccount.com \
        --scopes=https://www.googleapis.com/auth/cloud-platform \
        --no-address \
        --metadata startup-script="#! /bin/bash
            sudo apt-get update
            sudo apt-get install tcpdump dnsutils -y"
    
  3. Create the IAP firewall rule:

    gcloud compute firewall-rules create ssh-iap-on-prem-vpc \
        --network on-prem-vpc \
        --allow tcp:22 \
        --source-ranges=35.235.240.0/20
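
To see which source addresses the firewall rule above admits, the following sketch tests a few sample addresses against the IAP TCP forwarding range 35.235.240.0/20. The function name allowed_by_iap_rule and the sample addresses are illustrative assumptions.

```python
import ipaddress

# Source range used by the ssh-iap-on-prem-vpc firewall rule.
IAP_TCP_FORWARDING_RANGE = ipaddress.ip_network("35.235.240.0/20")


def allowed_by_iap_rule(source_ip):
    """Return True if source_ip matches the rule's source range (hypothetical helper)."""
    return ipaddress.ip_address(source_ip) in IAP_TCP_FORWARDING_RANGE


print("Addresses in range:", IAP_TCP_FORWARDING_RANGE.num_addresses)
for candidate in ("35.235.240.1", "35.235.255.255", "35.236.0.1", "192.168.20.2"):
    print(candidate, "allowed:", allowed_by_iap_rule(candidate))
```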
    

Create a Vertex AI Workbench instance

Create a user-managed service account for Vertex AI Workbench

When you create a Vertex AI Workbench instance, Google strongly recommends that you specify a user-managed service account instead of using the Compute Engine default service account. If your organization doesn't enforce the iam.automaticIamGrantsForDefaultServiceAccounts organization policy constraint, the Compute Engine default service account (and thus anyone you specify as an instance user) is granted the Editor role (roles/editor) on your Google Cloud project. To turn off this behavior, see Disable automatic role grants for default service accounts.

  1. In the Cloud Shell, create a service account named workbench-sa:

    gcloud iam service-accounts create workbench-sa \
        --display-name="workbench-sa"
    
  2. Assign the Storage Admin (roles/storage.admin) IAM role to the service account:

    gcloud projects add-iam-policy-binding $projectid \
        --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
        --role="roles/storage.admin"
    
  3. Assign the Vertex AI User (roles/aiplatform.user) IAM role to the service account:

    gcloud projects add-iam-policy-binding $projectid \
        --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
        --role="roles/aiplatform.user"
    
  4. Assign the Artifact Registry Administrator (roles/artifactregistry.admin) IAM role to the service account:

    gcloud projects add-iam-policy-binding $projectid \
        --member="serviceAccount:workbench-sa@$projectid.iam.gserviceaccount.com" \
        --role="roles/artifactregistry.admin"
    

Create the Vertex AI Workbench instance

  1. In Cloud Shell, create a Vertex AI Workbench instance, specifying the workbench-sa service account:

    gcloud workbench instances create workbench-tutorial \
      --vm-image-project=deeplearning-platform-release \
      --vm-image-family=common-cpu-notebooks \
      --machine-type=n1-standard-4 \
      --location=us-central1-a \
      --subnet-region=us-central1 \
      --shielded-secure-boot=True \
      --subnet=workbench-subnet \
      --disable-public-ip \
      --service-account-email=workbench-sa@$projectid.iam.gserviceaccount.com
    

Create and deploy an online prediction model

Prepare your environment

  1. In the Google Cloud console, go to the Instances tab on the Vertex AI Workbench page.

  2. Next to your Vertex AI Workbench instance's name (workbench-tutorial), click Open JupyterLab.

    Your Vertex AI Workbench instance opens JupyterLab.

    In the rest of this section, up to and including model deployment, you work in JupyterLab, not the Google Cloud console or Cloud Shell.

  3. Select File > New > Terminal.

  4. In the JupyterLab terminal (not the Cloud Shell), define an environment variable for your project. Replace PROJECT_ID with your project ID:

    PROJECT_ID=PROJECT_ID
    
  5. Create a new directory called cpr-codelab and cd into it (still in the JupyterLab terminal):

    mkdir cpr-codelab
    cd cpr-codelab
    
  6. In the File Browser, double-click the new cpr-codelab folder.

    If this folder doesn't appear in the file browser, refresh the Google Cloud console browser tab, and try again.

  7. Select File > New > Notebook.

  8. From the Select Kernel menu, select Python [conda env:base] * (Local) and click Select.

  9. Rename your new notebook file as follows:

    In the File Browser, right-click the Untitled.ipynb file icon, select Rename, and enter task.ipynb.

    Your cpr-codelab directory should now look like this:

    + cpr-codelab/
       + task.ipynb
    

    In the following steps, you create your model in the JupyterLab notebook by creating new notebook cells, pasting code into them, and running the cells.

  10. Install dependencies as follows.

    1. When you open your new notebook, there is a default code cell where you can enter code. It looks like [ ]: followed by a text field. That text field is where you paste your code.

      Paste the following code into the cell and click Run the selected cells and advance to create a requirements.txt file to be used as input to the following step:

      %%writefile requirements.txt
      fastapi
      uvicorn==0.17.6
      joblib~=1.1.1
      numpy>=1.17.3, <1.24.0
      scikit-learn>=1.2.2
      pandas
      google-cloud-storage>=2.2.1,<3.0.0dev
      google-cloud-aiplatform[prediction]>=1.18.2
      
    2. In this step and each of the following ones, add a code cell by clicking Insert a cell below, paste the code into the cell, and then click Run the selected cells and advance.

      Use pip to install the dependencies in the notebook instance:

      !pip install -U --user -r requirements.txt
      
    3. When installation is complete, select Kernel > Restart kernel to restart the kernel and ensure that the library is available for import.

    4. Paste the following code into a new notebook cell to create the directories to store the model and preprocessing artifacts:

      USER_SRC_DIR = "src_dir"
      !mkdir $USER_SRC_DIR
      !mkdir model_artifacts
      # copy the requirements to the source dir
      !cp requirements.txt $USER_SRC_DIR/requirements.txt
      

    In the File Browser, your cpr-codelab directory structure should now look like this:

    + cpr-codelab/
      + model_artifacts/
      + src_dir/
         + requirements.txt
      + requirements.txt
      + task.ipynb
    

Train the model

Continue adding code cells to the task.ipynb notebook, and paste in and run the following code in each new cell:

  1. Import the libraries:

    import seaborn as sns
    import numpy as np
    import pandas as pd
    
    from sklearn import preprocessing
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.compose import make_column_transformer
    
    import joblib
    import logging
    
    # set logging to see the docker container logs
    logging.basicConfig(level=logging.INFO)
    
  2. Define the following variables, replacing PROJECT_ID with your project ID:

    REGION = "us-central1"
    MODEL_ARTIFACT_DIR = "sklearn-model-artifacts"
    REPOSITORY = "diamonds"
    IMAGE = "sklearn-image"
    MODEL_DISPLAY_NAME = "diamonds-cpr"
    PROJECT_ID = "PROJECT_ID"
    BUCKET_NAME = "gs://PROJECT_ID-cpr-bucket"
    
  3. Create a Cloud Storage bucket:

    !gcloud storage buckets create $BUCKET_NAME --location=us-central1
    
  4. Load the data from the seaborn library and then create two data frames, one with the features and the other with the label:

    data = sns.load_dataset('diamonds', cache=True, data_home=None)
    
    label = 'price'
    
    y_train = data['price']
    x_train = data.drop(columns=['price'])
    
  5. Look at the training data and verify that each row represents a diamond.

    x_train.head()
    
  6. Look at the labels, which are the corresponding prices.

    y_train.head()
    
  7. Define a sklearn column transform to one-hot encode the categorical features and scale the numerical features:

    column_transform = make_column_transformer(
       (preprocessing.OneHotEncoder(), [1,2,3]),
       (preprocessing.StandardScaler(), [0,4,5,6,7,8]))
    
  8. Define the random forest model:

    regr = RandomForestRegressor(max_depth=10, random_state=0)
    
  9. Make a sklearn pipeline. This pipeline takes input data, encodes and scales it, and passes it to the model.

    my_pipeline = make_pipeline(column_transform, regr)
    
  10. Train the model:

    my_pipeline.fit(x_train, y_train)
    
  11. Call the predict method on the model, passing in a test sample.

    my_pipeline.predict([[0.23, 'Ideal', 'E', 'SI2', 61.5, 55.0, 3.95, 3.98, 2.43]])
    

    You might see warnings that begin with "X does not have valid feature names". You can ignore them.

  12. Save the pipeline to the model_artifacts directory and copy it to your Cloud Storage bucket:

    joblib.dump(my_pipeline, 'model_artifacts/model.joblib')
    
    !gcloud storage cp model_artifacts/model.joblib {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/
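
The column transform in step 7 selects features by position, which is easy to misread. The following sketch maps those positions to feature names, assuming the column order of the seaborn diamonds dataset after the price column is dropped (carat, cut, color, clarity, depth, table, x, y, z). It uses the test sample from step 11 and does not need the trained model.

```python
# Assumed column order of x_train (seaborn diamonds dataset without "price").
FEATURE_COLUMNS = ["carat", "cut", "color", "clarity", "depth", "table", "x", "y", "z"]

# Positions passed to OneHotEncoder and StandardScaler in step 7.
CATEGORICAL_IDX = [1, 2, 3]
NUMERICAL_IDX = [0, 4, 5, 6, 7, 8]

# Test sample from step 11.
sample = [0.23, "Ideal", "E", "SI2", 61.5, 55.0, 3.95, 3.98, 2.43]

categorical = {FEATURE_COLUMNS[i]: sample[i] for i in CATEGORICAL_IDX}
numerical = {FEATURE_COLUMNS[i]: sample[i] for i in NUMERICAL_IDX}

print("One-hot encoded features:", categorical)
print("Scaled features:", numerical)
```

This also shows why prediction requests must list the features in the same order as the training data.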
    

Save a preprocessing artifact

  1. Create a preprocessing artifact. This artifact will be loaded into the custom container when the model server starts up. Your preprocessing artifact can be of almost any form (such as a pickle file), but in this case you'll write a dictionary to a JSON file:

    clarity_dict={"Flawless": "FL",
       "Internally Flawless": "IF",
       "Very Very Slightly Included": "VVS1",
       "Very Slightly Included": "VS2",
       "Slightly Included": "SI2",
       "Included": "I3"}
    

Build a custom serving container using the CPR model server

  1. The clarity feature in our training data was always in the abbreviated form (that is, "FL" instead of "Flawless"). At serving time, we want to check that the data for this feature is also abbreviated, because our model knows how to one-hot encode "FL" but not "Flawless". You'll write this custom preprocessing logic later. For now, save this lookup table to a JSON file and then copy it to your Cloud Storage bucket:

    import json
    with open("model_artifacts/preprocessor.json", "w") as f:
       json.dump(clarity_dict, f)
    
    !gcloud storage cp model_artifacts/preprocessor.json {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/
    

    In the File Browser, your directory structure should now look like this:

    + cpr-codelab/
       + model_artifacts/
          + model.joblib
          + preprocessor.json
       + src_dir/
          + requirements.txt
       + requirements.txt
       + task.ipynb
    
  2. In your notebook, paste in and run the following code to subclass the SklearnPredictor and write it to a Python file in the src_dir/. Note that in this example we are only customizing the load, preprocess, and postprocess methods, and not the predict method.

    %%writefile $USER_SRC_DIR/predictor.py
    
    import joblib
    import numpy as np
    import json
    
    from google.cloud import storage
    from google.cloud.aiplatform.prediction.sklearn.predictor import SklearnPredictor
    
    class CprPredictor(SklearnPredictor):
    
        def __init__(self):
            return
    
        def load(self, artifacts_uri: str) -> None:
            """Loads the sklearn pipeline and preprocessing artifact."""
    
            super().load(artifacts_uri)
    
            # Open the preprocessing artifact.
            with open("preprocessor.json", "rb") as f:
                self._preprocessor = json.load(f)
    
        def preprocess(self, prediction_input: np.ndarray) -> np.ndarray:
            """Performs preprocessing by checking if clarity feature is in abbreviated form."""
    
            inputs = super().preprocess(prediction_input)
    
            # Index 3 holds the clarity feature; map long names to abbreviations.
            for sample in inputs:
                if sample[3] not in self._preprocessor.values():
                    sample[3] = self._preprocessor[sample[3]]
            return inputs
    
        def postprocess(self, prediction_results: np.ndarray) -> dict:
            """Performs postprocessing by rounding predictions and converting to str."""
    
            return {"predictions": [f"${value}" for value in np.round(prediction_results)]}
    
  3. Use the Vertex AI SDK for Python to build the image using custom prediction routines. The Dockerfile is generated and an image is built for you.

    from google.cloud import aiplatform
    
    aiplatform.init(project=PROJECT_ID, location=REGION)
    
    import os
    
    from google.cloud.aiplatform.prediction import LocalModel
    
    from src_dir.predictor import CprPredictor  # Should be path of variable $USER_SRC_DIR
    
    local_model = LocalModel.build_cpr_model(
       USER_SRC_DIR,
       f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
       predictor=CprPredictor,
       requirements_path=os.path.join(USER_SRC_DIR, "requirements.txt"),
    )
    
  4. Write a test file with two samples for prediction. One of the instances has the abbreviated clarity name, but the other needs to be converted first.

    import json
    
    sample = {"instances": [
       [0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43],
       [0.29, 'Premium', 'J', 'Internally Flawless', 52.5, 49.0, 4.00, 2.13, 3.11]]}
    
    with open('instances.json', 'w') as fp:
       json.dump(sample, fp)
    
  5. Test the container locally by deploying a local model.

    with local_model.deploy_to_local_endpoint(
       artifact_uri = 'model_artifacts/', # local path to artifacts
    ) as local_endpoint:
       predict_response = local_endpoint.predict(
          request_file='instances.json',
          headers={"Content-Type": "application/json"},
       )
    
       health_check_response = local_endpoint.run_health_check()
    
  6. You can see the prediction results with:

    predict_response.content
    

    The output looks like the following:

    b'{"predictions": ["$479.0", "$586.0"]}'
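
The preprocessing and postprocessing logic from step 2 can also be exercised without the Vertex AI SDK. The following is a minimal sketch, assuming plain Python lists instead of NumPy arrays and an abridged copy of the lookup table. The helper names normalize_clarity and format_predictions are hypothetical and are not part of the SDK. Unlike the predictor above, the sketch leaves unknown clarity names unchanged instead of raising a KeyError.

```python
# Abridged copy of the clarity lookup table (assumption: same shape as
# the clarity_dict that is saved to preprocessor.json).
CLARITY_LOOKUP = {
    "Flawless": "FL",
    "Internally Flawless": "IF",
    "Very Slightly Included": "VS2",
}

CLARITY_INDEX = 3  # position of the clarity feature in each instance


def normalize_clarity(instances, lookup=CLARITY_LOOKUP):
    """Return copies of the instances with long clarity names abbreviated."""
    abbreviations = set(lookup.values())
    normalized = []
    for instance in instances:
        row = list(instance)
        value = row[CLARITY_INDEX]
        if value not in abbreviations:
            # Fall back to the original value for names that are not
            # in the lookup table.
            row[CLARITY_INDEX] = lookup.get(value, value)
        normalized.append(row)
    return normalized


def format_predictions(values):
    """Round each prediction and render it as a dollar string."""
    return {"predictions": [f"${float(round(v))}" for v in values]}


rows = normalize_clarity([
    [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
    [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11],
])
print([row[CLARITY_INDEX] for row in rows])  # ['VS2', 'IF']
```

Returning copies rather than editing the input in place keeps the caller's data unchanged, which makes the helper easier to test in isolation.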
    

Deploy the model to the online prediction model endpoint

Now that you've tested the container locally, it's time to push the image to Artifact Registry and upload the model to Vertex AI Model Registry.

  1. Create an Artifact Registry repository and configure Docker to authenticate to it.

    !gcloud artifacts repositories create {REPOSITORY} \
        --repository-format=docker \
        --location={REGION} \
        --description="Docker repository"
    
    !gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet
    
  2. Push the image.

    local_model.push_image()
    
  3. Upload the model.

    model = aiplatform.Model.upload(
        local_model=local_model,
        display_name=MODEL_DISPLAY_NAME,
        artifact_uri=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}",
    )
    
  4. Deploy the model:

    endpoint = model.deploy(machine_type="n1-standard-2")
    

    Wait until your model deploys before you continue to the next step. Expect deployment to take about 10 to 15 minutes.

  5. Test the deployed model by getting a prediction:

    endpoint.predict(instances=[[0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43]])
    

    The output looks like the following:

    Prediction(predictions=['$479.0'], deployed_model_id='3171115779319922688', metadata=None, model_version_id='1', model_resource_name='projects/721032480027/locations/us-central1/models/8554949231515795456', explanations=None)
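
Because the custom postprocess method returns dollar-formatted strings, client code that needs numeric values has to convert them back. The following is a minimal sketch, assuming predictions in the "$479.0" format shown above. The parse_prices helper is hypothetical and is not part of the Vertex AI SDK.

```python
import json


def parse_prices(predictions):
    """Convert dollar-formatted strings such as "$479.0" to floats."""
    return [float(p.lstrip("$")) for p in predictions]


# Response body in the same shape as the local endpoint test output.
local_body = b'{"predictions": ["$479.0", "$586.0"]}'
prices = parse_prices(json.loads(local_body)["predictions"])
print(prices)  # [479.0, 586.0]
```

The same helper can be applied to the predictions field of the Prediction object that endpoint.predict returns.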
    

Validate public internet access to Vertex AI APIs

In this section, you log into the nat-client VM instance in one Cloud Shell session tab and use another session tab to validate connectivity to Vertex AI APIs by running the dig and tcpdump commands against the domain us-central1-aiplatform.googleapis.com.

  1. In the Cloud Shell (Tab One), run the following commands, replacing PROJECT_ID with your project ID:

    projectid=PROJECT_ID
    gcloud config set project ${projectid}
    
  2. Log into the nat-client VM instance using IAP:

    gcloud compute ssh nat-client \
        --project=$projectid \
        --zone=us-central1-a \
        --tunnel-through-iap
    
  3. Run the dig command:

    dig us-central1-aiplatform.googleapis.com
    
  4. From the nat-client VM (Tab One), run the following command to validate DNS resolution when you send an online prediction request to the endpoint.

    sudo tcpdump -i any port 53 -n
    
  5. Open a new Cloud Shell session (Tab Two) by opening a new tab in Cloud Shell.

  6. In the new Cloud Shell session (Tab Two), run the following commands, replacing PROJECT_ID with your project ID:

    projectid=PROJECT_ID
    gcloud config set project ${projectid}
    
  7. Log into the nat-client VM instance:

    gcloud compute ssh --zone "us-central1-a" "nat-client" --project "$projectid"
    
  8. From the nat-client VM (Tab Two), use a text editor such as vim or nano to create an instances.json file. You need to prepend sudo in order to have permission to write to the file, for example:

    sudo vim instances.json
    
  9. Add the following data string to the file:

    {"instances": [
       [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
       [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}
    
  10. Save the file as follows:

    • If you're using vim, press the Esc key, and then type :wq to save the file and exit.
    • If you're using nano, type Control+O and press Enter to save the file, and then type Control+X to exit.
  11. Locate the ID of the online prediction endpoint:

    1. In the Google Cloud console, in the Vertex AI section, go to the Endpoints tab in the Online prediction page.

      Go to Endpoints

    2. Find the row of the endpoint that you created, named diamonds-cpr_endpoint.

    3. Locate the 19-digit endpoint ID in the ID column and copy it.

  12. From the nat-client VM (Tab Two), run the following commands, replacing PROJECT_ID with your project ID and ENDPOINT_ID with the online prediction endpoint ID that you copied:

    projectid=PROJECT_ID
    gcloud config set project ${projectid}
    ENDPOINT_ID=ENDPOINT_ID
    
  13. From the nat-client VM (Tab Two), run the following command to send an online prediction request:

    curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
    

After you run the prediction, the tcpdump results (Tab One) show the nat-client VM instance (192.168.10.2) performing a Cloud DNS query to the local DNS server (169.254.169.254) for the Vertex AI API domain (us-central1-aiplatform.googleapis.com). The DNS query returns public virtual IP addresses (VIPs) for Vertex AI APIs.
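
The request that the curl command sends can also be assembled programmatically, which avoids hand-editing the JSON file. The following is a minimal sketch, assuming the same region and REST path as the curl command. The build_predict_request helper and the project and endpoint IDs in the usage lines are hypothetical placeholders.

```python
import json


def build_predict_request(project_id, endpoint_id, instances, region="us-central1"):
    """Return the (url, body) pair for an online prediction request."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{region}/"
        f"endpoints/{endpoint_id}:predict"
    )
    # json.dumps always emits double-quoted strings, which JSON requires.
    body = json.dumps({"instances": instances})
    return url, body


url, body = build_predict_request(
    "my-project",           # hypothetical project ID
    "1234567890123456789",  # hypothetical 19-digit endpoint ID
    [[0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43]],
)
print(url)
print(body)
```

The returned body can be written to instances.json and passed to curl with -d @instances.json, as in the step above.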

Validate private access to Vertex AI APIs

In this section, you log into the private-client VM instance using Identity-Aware Proxy in a new Cloud Shell session (Tab Three), and then you validate connectivity to Vertex AI APIs by running the dig command against the Vertex AI domain (us-central1-aiplatform.googleapis.com).

  1. Open a new Cloud Shell session (Tab Three) by opening a new tab in Cloud Shell.

  2. In the new Cloud Shell session (Tab Three), run the following commands, replacing PROJECT_ID with your project ID:

    projectid=PROJECT_ID
    gcloud config set project ${projectid}
    
  3. Log into the private-client VM instance using IAP:

    gcloud compute ssh private-client \
        --project=$projectid \
        --zone=us-central1-a \
        --tunnel-through-iap
    
  4. Run the dig command:

    dig us-central1-aiplatform.googleapis.com
    
  5. In the private-client VM instance (Tab Three), use a text editor such as vim or nano with sudo (for example, sudo vim /etc/hosts) to add the following line to the /etc/hosts file:

    100.100.10.10 us-central1-aiplatform.googleapis.com
    

    This line assigns the PSC endpoint's IP address (100.100.10.10) to the fully qualified domain name for the Vertex AI Google API (us-central1-aiplatform.googleapis.com). The edited file should look like this:

    127.0.0.1       localhost
    ::1             localhost ip6-localhost ip6-loopback
    ff02::1         ip6-allnodes
    ff02::2         ip6-allrouters
    
    100.100.10.10 us-central1-aiplatform.googleapis.com # Added by you
    192.168.20.2 private-client.c.$projectid.internal private-client  # Added by Google
    169.254.169.254 metadata.google.internal  # Added by Google
    
  6. From the private-client VM (Tab Three), ping the Vertex AI endpoint, and press Control+C to exit when you see the output:

    ping us-central1-aiplatform.googleapis.com
    

    The ping command returns the following output containing the PSC endpoint IP address:

    PING us-central1-aiplatform.googleapis.com (100.100.10.10) 56(84) bytes of data.
    
  7. From the private-client VM (Tab Three), run the following tcpdump command to validate DNS resolution and the IP data path when you send an online prediction request to the endpoint:

    sudo tcpdump -i any -n port 53 or host 100.100.10.10
    
  8. Open a new Cloud Shell session (Tab Four) by opening a new tab in Cloud Shell.

  9. In the new Cloud Shell session (Tab Four), run the following commands, replacing PROJECT_ID with your project ID:

    projectid=PROJECT_ID
    gcloud config set project ${projectid}
    
  10. In Tab Four, log into the private-client instance:

    gcloud compute ssh \
        --zone "us-central1-a" "private-client" \
        --project "$projectid"
    
  11. From the private-client VM (Tab Four), use a text editor such as vim or nano to create an instances.json file that contains the following data string:

    {"instances": [
       [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
       [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}
    
  12. From the private-client VM (Tab Four), run the following commands, replacing PROJECT_ID with your project ID and ENDPOINT_ID with the online prediction endpoint ID:

    projectid=PROJECT_ID
    echo $projectid
    ENDPOINT_ID=ENDPOINT_ID
    
  13. From the private-client VM (Tab Four), run the following command to send an online prediction request:

    curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
    
  14. From the private-client VM in Cloud Shell (Tab Three), verify that the PSC endpoint IP address (100.100.10.10) was used to access Vertex AI APIs.

    From the private-client tcpdump terminal in Cloud Shell Tab Three, you can see that a DNS lookup to us-central1-aiplatform.googleapis.com isn't needed, because the line that you added to the /etc/hosts file takes precedence, and the PSC IP address 100.100.10.10 is used in the data path.
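
The override described above can be checked by reading the hosts file directly. On a typical Linux VM, name resolution consults /etc/hosts before DNS (the order is configured in /etc/nsswitch.conf). The following is a minimal sketch that mimics only the hosts-file lookup, assuming the file layout shown in step 5. The lookup_hosts helper is hypothetical.

```python
def lookup_hosts(hosts_text, hostname):
    """Return the first IP address mapped to hostname, or None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into IP address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return fields[0]
    return None


# Abridged copy of the edited /etc/hosts file from step 5.
HOSTS_SAMPLE = """\
127.0.0.1       localhost
::1             localhost ip6-localhost ip6-loopback

100.100.10.10 us-central1-aiplatform.googleapis.com # Added by you
169.254.169.254 metadata.google.internal  # Added by Google
"""

VERTEX_HOST = "us-central1-aiplatform.googleapis.com"
print(lookup_hosts(HOSTS_SAMPLE, VERTEX_HOST))  # 100.100.10.10
```

On the VM itself, the helper could be given the real file contents, for example lookup_hosts(open("/etc/hosts").read(), VERTEX_HOST).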

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

You can delete the individual resources in the project as follows:

  1. Delete the Vertex AI Workbench instance as follows:

    1. In the Google Cloud console, in the Vertex AI section, go to the Instances tab in the Workbench page.

      Go to Vertex AI Workbench

    2. Select the workbench-tutorial Vertex AI Workbench instance and click Delete.

  2. Delete the container image as follows:

    1. In the Google Cloud console, go to the Artifact Registry page.

      Go to Artifact Registry

    2. Select the diamonds Docker container, and click Delete.

  3. Delete the storage bucket as follows:

    1. In the Google Cloud console, go to the Cloud Storage page.

      Go to Cloud Storage

    2. Select your storage bucket, and click Delete.

  4. Undeploy the model from the endpoint as follows:

    1. In the Google Cloud console, in the Vertex AI section, go to the Endpoints page.

      Go to Endpoints

    2. Click diamonds-cpr_endpoint to go to the endpoint details page.

    3. On the row for your model, diamonds-cpr, click Undeploy model.

    4. In the Undeploy model from endpoint dialog, click Undeploy.

  5. Delete the model as follows:

    1. In the Google Cloud console, in the Vertex AI section, go to the Model Registry page.

      Go to Model Registry

    2. Select the diamonds-cpr model.

    3. To delete the model, click Actions, and then click Delete model.

  6. Delete the online prediction endpoint as follows:

    1. In the Google Cloud console, in the Vertex AI section, go to the Online prediction page.

      Go to Online prediction

    2. Select the diamonds-cpr_endpoint endpoint.

    3. To delete the endpoint, click Actions, and then click Delete endpoint.

  7. In the Cloud Shell, delete the remaining resources by running the following commands, replacing PROJECT_ID with your project ID:

    Go to Cloud Shell

    projectid=PROJECT_ID
    gcloud config set project ${projectid}
    
    gcloud compute forwarding-rules delete pscvertex \
        --global \
        --quiet
    
    gcloud compute addresses delete psc-ip \
        --global \
        --quiet
    
    gcloud compute networks subnets delete workbench-subnet \
        --region=us-central1 \
        --quiet
    
    gcloud compute vpn-tunnels delete aiml-vpc-tunnel0 aiml-vpc-tunnel1 on-prem-tunnel0 on-prem-tunnel1 \
        --region=us-central1 \
        --quiet
    
    gcloud compute vpn-gateways delete aiml-vpn-gw on-prem-vpn-gw \
        --region=us-central1 \
        --quiet
    
    gcloud compute routers delete aiml-cr-us-central1 cloud-router-us-central1-aiml-nat \
        --region=us-central1 \
        --quiet
    
    gcloud compute routers delete cloud-router-us-central1-on-prem-nat on-prem-cr-us-central1 \
        --region=us-central1 \
        --quiet
    
    gcloud compute instances delete nat-client private-client \
        --zone=us-central1-a \
        --quiet
    
    gcloud compute firewall-rules delete ssh-iap-on-prem-vpc \
        --quiet
    
    gcloud compute networks subnets delete nat-subnet private-ip-subnet \
        --region=us-central1 \
        --quiet
    
    gcloud compute networks delete on-prem-vpc \
        --quiet
    
    gcloud compute networks delete aiml-vpc \
        --quiet
    

What's next