Migrate Custom Prediction Routines to Vertex AI

This page describes how to migrate your Custom Prediction Routine (CPR) deployments from AI Platform to Vertex AI.

Specifically, given a CPR deployment on AI Platform, this page shows you how to:

  • Create a corresponding custom container for deployment on Vertex AI. This custom container works like any custom container created with CPR on Vertex AI.
  • Run and test the custom container locally.
  • Upload it to the Vertex AI Model Registry.
  • Deploy the model to a Vertex AI Endpoint to serve online predictions.

Before you begin

  • Make sure that the following software is installed: Docker (used to build and run the custom container locally), Python 3 with pip, and the Google Cloud CLI (which provides the gcloud and gsutil commands used on this page).

  • Have the model artifacts and custom code from your CPR on AI Platform deployment that you want to migrate to Vertex AI.

  • Have a Cloud Storage bucket to store the model artifacts.

  • Make sure you have the Vertex AI API enabled in your project. You can enable it in the Google Cloud console or from the command line, as shown below.

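    To enable the Vertex AI API with the Google Cloud CLI:

    gcloud services enable aiplatform.googleapis.com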

Prepare the source folder for Vertex AI deployment

  1. Create a local folder called model_artifacts and copy the model artifacts from your CPR on AI Platform deployment. These should be the same model artifacts that you specified in deployment_uri (or --origin if you used gcloud) when you deployed your CPR on AI Platform model.
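
    For example, assuming your artifacts are still at the Cloud Storage path you passed as deployment_uri (shown here as a placeholder), you could copy them down with:

    mkdir model_artifacts
    gsutil cp -r gs://DEPLOYMENT_URI_PATH/* model_artifacts/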

  2. Create a local folder called cpr_src_dir. This folder holds your source distribution packages, adapter.py, and requirements.txt (described below), which are used to build your custom container for deployment on Vertex AI.

  3. Copy into cpr_src_dir all of the packages you supplied in package_uris when you deployed your CPR on AI Platform model, including the one that contains your Predictor class.
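
    After you add adapter.py and requirements.txt in the next steps, cpr_src_dir will look something like the following (the source distribution filename is an example taken from the build script later on this page; yours may differ):

    cpr_src_dir/
    ├── adapter.py
    ├── requirements.txt
    └── my_custom_code-0.1.tar.gz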

  4. Create an adapter.py file that contains the AdapterPredictor (shown below) and set PREDICTION_CLASS to the fully qualified name of your Predictor class. This is the same value you specified for prediction_class when you deployed your CPR on AI Platform model.

    The adapter wraps the CPR on AI Platform Predictor interface so that it is compatible with the CPR on Vertex AI interface.

    import pydoc

    from google.cloud.aiplatform.utils import prediction_utils
    from google.cloud.aiplatform.prediction.predictor import Predictor

    # Fully qualified name of your CPR on CAIP Predictor class.
    PREDICTION_CLASS = "predictor.MyPredictor"

    class AdapterPredictor(Predictor):
        """Predictor implementation for adapting CPR on CAIP predictors."""

        def __init__(self):
            return

        def load(self, artifacts_uri: str):
            """Loads the model artifact.

            Args:
                artifacts_uri (str):
                    Required. The model artifacts path (may be local or on Cloud Storage).
            """
            prediction_utils.download_model_artifacts(artifacts_uri)
            custom_class = pydoc.locate(PREDICTION_CLASS)
            self._predictor = custom_class.from_path(".")

        def predict(self, instances):
            """Performs prediction.

            Args:
                instances (Any):
                    Required. The instance(s) used for performing prediction.

            Returns:
                Prediction results.
            """
            return self._predictor.predict(**instances)
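
    For reference, the CPR on AI Platform Predictor that the adapter wraps implements from_path and predict. A minimal sketch follows (your actual predictor.MyPredictor and its model.pkl artifact name will differ):

    # predictor.py -- sketch of a CPR on AI Platform Predictor.
    import os
    import pickle

    class MyPredictor(object):
        """Example Predictor loaded by the adapter's load() method."""

        def __init__(self, model):
            self._model = model

        def predict(self, instances, **kwargs):
            # Run prediction on the provided instances.
            return self._model.predict(instances).tolist()

        @classmethod
        def from_path(cls, model_dir):
            # Load the pickled model from the artifacts directory.
            with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
                model = pickle.load(f)
            return cls(model)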
    
  5. Create a requirements.txt file that contains your model's dependencies, for example:

    # Required for model serving
    google-cloud-storage>=1.26.0,<2.0.0dev
    google-cloud-aiplatform[prediction]>=1.16.0
    
    # ML dependencies
    numpy>=1.16.0
    scikit-learn==0.20.2
    

    The first section lists the dependencies required for model serving.

    The second section lists the machine learning packages required for model serving (for example, scikit-learn, xgboost, or tensorflow). Be sure to pin the same versions of these libraries that were included in the runtime version you chose when you previously deployed your model version.

  6. Install the dependencies in your local environment:

    pip install -U --user -r cpr_src_dir/requirements.txt 
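
    To check that the Vertex AI SDK and its prediction components installed correctly, you can try the import used by the build script later on this page:

    python -c "from google.cloud.aiplatform.prediction import LocalModel"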
    

Upload your model artifacts to Cloud Storage

Upload the model artifacts to Cloud Storage:

gsutil cp model_artifacts/* gs://BUCKET_NAME/MODEL_ARTIFACT_DIR
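
You can confirm the upload with:

gsutil ls gs://BUCKET_NAME/MODEL_ARTIFACT_DIR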

Set up Artifact Registry

Artifact Registry is used to store and manage your Docker container images.

  1. Make sure you have the Artifact Registry API enabled in your project.


  2. Create your repository if you don't already have one.

    gcloud artifacts repositories create {REPOSITORY} \
        --repository-format=docker \
        --location={REGION}
    
  3. Before you can push or pull images, configure Docker to use the Google Cloud CLI to authenticate requests to Artifact Registry.

    gcloud auth configure-docker {REGION}-docker.pkg.dev
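
    To confirm the repository is set up as expected, you can inspect it with:

    gcloud artifacts repositories describe {REPOSITORY} --location={REGION}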
    

Build, test, and deploy your custom container

The following Python script demonstrates how to build, test, and deploy your custom container by using the APIs in the Vertex AI SDK. Be sure to set the variables at the top of the script.

import json
import logging
import os

from google.cloud import aiplatform
from google.cloud.aiplatform.prediction import LocalModel
from cpr_src_dir.adapter import AdapterPredictor


##########################################################################
# CONFIGURE THE FOLLOWING
##########################################################################
# We recommend that you choose the region closest to you.
REGION = …
# Your project ID.
PROJECT_ID = …
# Name of the Artifact Repository to create or use.
REPOSITORY = …
# Name of the container image that will be pushed.
IMAGE = …
# Cloud Storage bucket where your model artifacts will be stored.
BUCKET_NAME = …
# Directory within the bucket where your model artifacts are stored.
MODEL_ARTIFACT_DIR = …
# Your model's input instances.
INSTANCES = …
# Display name of the model in the Vertex AI Model Registry.
MODEL_DISPLAY_NAME = …


##########################################################################
# Build the CPR custom container
##########################################################################
local_model = LocalModel.build_cpr_model(
    "cpr_src_dir",
    f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
    predictor=AdapterPredictor,
    requirements_path="cpr_src_dir/requirements.txt",
    extra_packages=["cpr_src_dir/my_custom_code-0.1.tar.gz"],
)

##########################################################################
# Run and test the custom container locally
##########################################################################
logging.basicConfig(level=logging.INFO)

local_endpoint = local_model.deploy_to_local_endpoint(artifact_uri="model_artifacts")
local_endpoint.serve()

health_check_response = local_endpoint.run_health_check()

predict_response = local_endpoint.predict(
    request=json.dumps({"instances": INSTANCES}),
    headers={"Content-Type": "application/json"},
)

local_endpoint.stop()

print(predict_response, predict_response.content)
print(health_check_response, health_check_response.content)
local_endpoint.print_container_logs(show_all=True)


##########################################################################
# Upload and deploy to Vertex
##########################################################################
local_model.push_image()

# Initialize the SDK so the upload and deployment target your project and region.
aiplatform.init(project=PROJECT_ID, location=REGION)

model = aiplatform.Model.upload(
    local_model=local_model,
    display_name=MODEL_DISPLAY_NAME,
    artifact_uri=f"gs://{BUKCET_NAME}/{MODEL_ARTIFACT_DIR}",
)

endpoint = model.deploy(machine_type="n1-standard-4")

endpoint.predict(instances=INSTANCES)
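

##########################################################################
# Clean up (optional)
##########################################################################
# A sketch: once you're done testing, undeploy the model and delete the
# endpoint and model resources to avoid ongoing charges.
endpoint.undeploy_all()
endpoint.delete()
model.delete()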

Learn more about Vertex AI Prediction.