This legacy version of AI Platform Prediction is deprecated and will no longer be available on Google Cloud after January 31, 2025. All models, associated metadata, and deployments will be deleted after January 31, 2025. Migrate your resources to Vertex AI to get new machine learning features that are unavailable in AI Platform.

Custom prediction routines

Custom prediction routines allow you to determine what code runs when you send an online prediction request to AI Platform Prediction.

When you deploy a version resource to AI Platform Prediction without using a custom prediction routine, it handles prediction requests by performing the prediction operation of the machine learning framework you used for training.

But when you deploy a custom prediction routine as your version resource, you can tell AI Platform Prediction to run custom Python code in response to every prediction request it receives. You can preprocess prediction input before your trained model performs prediction, or you can postprocess the model's prediction before sending the prediction result.

To create a custom prediction routine, you must provide two parts to AI Platform Prediction when you create your model version:

A model directory in Cloud Storage, which contains any artifacts that need to be used for prediction.
A .tar.gz Python source distribution package in Cloud Storage containing your implementation of the Predictor interface and any other custom code you want AI Platform Prediction to use at prediction time.

You can only deploy a custom prediction routine when you use a legacy (MLS1) machine type for your model version.

Upload model artifacts to your model directory

Follow the guide to deploying models to upload your trained model to Cloud Storage, along with any other files that provide data or statefulness for AI Platform Prediction to use during prediction.

The total file size of the model artifacts that you deploy to AI Platform Prediction must be 500 MB or less.
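A quick local check before uploading can catch an oversized model directory early. The following is a minimal sketch, not part of AI Platform Prediction: the helper names are hypothetical, and it assumes "500 MB" means 500 × 1000 × 1000 bytes, which is the stricter of the two common readings.

```python
import os
import tempfile

# Assumption: "500 MB" is read as 500 * 1000 * 1000 bytes (the stricter reading).
MAX_ARTIFACT_BYTES = 500 * 1000 * 1000


def total_size_bytes(model_dir):
    """Returns the combined size of every file under model_dir."""
    total = 0
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def fits_size_limit(model_dir, limit=MAX_ARTIFACT_BYTES):
    """True if the local copy of the model directory is within the limit."""
    return total_size_bytes(model_dir) <= limit


# Usage with a throwaway directory standing in for a real model directory.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "model.joblib"), "wb") as f:
        f.write(b"\0" * 1024)
    demo_size = total_size_bytes(demo_dir)
    demo_ok = fits_size_limit(demo_dir)
```

Run the same check against the local directory you intend to copy to Cloud Storage.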

You can upload your trained machine learning model to your model directory as a TensorFlow SavedModel, a model.joblib file, a model.pkl file, or a model.bst file, but you can also provide your model as an HDF5 file containing a trained tf.keras model or in a different serialized format.

You can additionally include a pickle file with an instance of a custom preprocessor class that holds serialized state from training.

For example, consider the following preprocessor, defined in a file called preprocess.py:

import numpy as np


class ZeroCenterer(object):
    """Stores means of each column of a matrix and uses them for preprocessing."""

    def __init__(self):
        """On initialization, is not tied to any distribution."""
        self._means = None

    def preprocess(self, data):
        """Transforms a matrix.

        The first time this is called, it stores the means of each column of
        the input. Then it transforms the input so each column has mean 0. For
        subsequent calls, it subtracts the stored means from each column. This
        lets you 'center' data at prediction time based on the distribution of
        the original training data.

        Args:
            data: A NumPy matrix of numerical data.

        Returns:
            A transformed matrix with the same dimensions as the input.
        """
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)
        return data - self._means

During training on a numerical dataset, the preprocessor centers the data around 0 by subtracting the mean of each column from every value in the column. Then, you can export the preprocessor instance as a pickle file, preprocessor.pkl, which preserves the means of each column calculated from the training data.

During prediction, a custom prediction routine can load the preprocessor from this file to perform an identical transformation on prediction input.
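The round trip described above can be sketched as follows. This is an illustration under stated assumptions rather than the training code from the tutorials: the training matrix and request values are made up, and the class is repeated here only so that the snippet runs on its own.

```python
import os
import pickle
import tempfile

import numpy as np


class ZeroCenterer(object):
    """Same preprocessor as in preprocess.py, repeated so this runs alone."""

    def __init__(self):
        self._means = None

    def preprocess(self, data):
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)
        return data - self._means


# Hypothetical training data: two columns with means 2.0 and 20.0.
train = np.array([[1.0, 10.0], [3.0, 30.0]])
preprocessor = ZeroCenterer()
centered_train = preprocessor.preprocess(train)  # stores the column means

# Export the fitted instance, then reload it as a prediction routine would.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "preprocessor.pkl")
    with open(path, "wb") as f:
        pickle.dump(preprocessor, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# Prediction input is centered with the training means, not its own mean.
centered_request = restored.preprocess(np.array([[2.0, 25.0]]))
```

In a real project, import ZeroCenterer from preprocess.py in your training script instead of redefining it. Pickle records the module path of the class, and that path must also resolve at prediction time.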

To learn how to use a stateful preprocessor like this in your custom prediction routine, read the next section, which describes how to implement the Predictor interface.

To work through a full example of using a stateful preprocessor during training and prediction, read Creating a custom prediction routine with Keras or Creating a custom prediction routine with scikit-learn.

Create your Predictor

Tell AI Platform Prediction how to handle prediction requests by providing it with a Predictor class. This is a class that implements the following interface:

class Predictor(object):
    """Interface for constructing custom predictors."""

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Instances are the decoded values from the request. They have already
        been deserialized from JSON.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results. This list must
            be JSON serializable.
        """
        raise NotImplementedError()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of Predictor using the given path.

        Loading of the predictor should be done in this method.

        Args:
            model_dir: The local directory that contains the exported model
                file along with any additional files uploaded when creating the
                version resource.

        Returns:
            An instance implementing this Predictor class.
        """
        raise NotImplementedError()

Prediction nodes in AI Platform Prediction use the from_path class method to load an instance of your Predictor. This method should load the artifacts you saved in your model directory, the contents of which are copied from Cloud Storage to a location surfaced by the model_dir argument.

Whenever your deployment receives an online prediction request, the instance of the Predictor class returned by from_path uses its predict method to generate predictions. AI Platform Prediction serializes the return value of this method to JSON and sends it as the response to the prediction request.

Note that the predict method does not need to convert input from JSON into Python objects or convert output to JSON; AI Platform Prediction handles this outside of the predict method.

AI Platform Prediction provides the instances argument by parsing the instances field from the body of the predict request to the AI Platform Training and Prediction API. It parses any other fields of the request body and provides them to the predict method as entries in the **kwargs dictionary. To learn more, read about how to structure a predict request to the AI Platform Training and Prediction API.
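To make the division of labor concrete, the following sketch imitates the flow described above with a local stand-in for the service. Everything in it is hypothetical: the EchoScalePredictor class, the handle_request function, and the scale request field are invented for illustration, and the "predictions" response envelope is an assumption about the response shape, not something this page specifies.

```python
import json


class EchoScalePredictor(object):
    """Hypothetical Predictor that multiplies each value by an optional factor."""

    def predict(self, instances, **kwargs):
        # "scale" is a made-up extra field, not part of the AI Platform API.
        scale = kwargs.get("scale", 1)
        return [[value * scale for value in row] for row in instances]

    @classmethod
    def from_path(cls, model_dir):
        # A real implementation would load artifacts from model_dir here.
        return cls()


def handle_request(predictor, body_text):
    """Rough stand-in for the service: decode JSON, call predict, encode JSON."""
    body = json.loads(body_text)
    instances = body.pop("instances")
    # Every remaining top-level field is passed along as a keyword argument.
    predictions = predictor.predict(instances, **body)
    return json.dumps({"predictions": predictions})


predictor = EchoScalePredictor.from_path("/unused/path")  # loaded once
response = handle_request(
    predictor, '{"instances": [[1, 2], [3, 4]], "scale": 10}'
)
```

The predict method only ever sees Python lists and keyword arguments; the JSON decoding and encoding happen around it.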

Continuing the example from the previous section, suppose your model directory contains preprocessor.pkl (the pickled instance of the ZeroCenterer class) and either a trained tf.keras model exported as model.h5 or a trained scikit-learn model exported as model.joblib.

Depending on which machine learning framework you use, implement one of the following Predictor classes in a file called predictor.py:

TensorFlow

import os
import pickle

import numpy as np
from tensorflow import keras


class MyPredictor(object):
    """An example Predictor for an AI Platform custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only initialized via `from_path`."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Preprocesses inputs, then performs prediction using the trained Keras
        model.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results.
        """
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory in
        Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained Keras
                model and the pickled preprocessor instance. These are copied
                from the Cloud Storage model directory you provide when you
                deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        model_path = os.path.join(model_dir, "model.h5")
        model = keras.models.load_model(model_path)

        preprocessor_path = os.path.join(model_dir, "preprocessor.pkl")
        with open(preprocessor_path, "rb") as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)

scikit-learn

import os
import pickle

import numpy as np
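try:
    import joblib  # standalone package; used by recent scikit-learn releases
except ImportError:  # older runtimes bundle joblib inside scikit-learn
    from sklearn.externals import joblib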


class MyPredictor(object):
    """An example Predictor for an AI Platform custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only initialized via `from_path`."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Preprocesses inputs, then performs prediction using the trained
        scikit-learn model.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results.
        """
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory in
        Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained
                scikit-learn model and the pickled preprocessor instance. These
                are copied from the Cloud Storage model directory you provide
                when you deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        model_path = os.path.join(model_dir, "model.joblib")
        model = joblib.load(model_path)

        preprocessor_path = os.path.join(model_dir, "preprocessor.pkl")
        with open(preprocessor_path, "rb") as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)

Notice the predict method converts the prediction results to a list with the tolist method before returning them. NumPy arrays are not JSON serializable, so you must convert them to lists of numbers (which are JSON serializable). Otherwise, AI Platform Prediction will fail to send the prediction response.
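The serialization failure is easy to reproduce locally with the standard json module. This short sketch uses made-up output values and is only meant to show why the tolist call matters:

```python
import json

import numpy as np

# Stand-in for the array a model's predict method might return.
outputs = np.array([[0.25, 0.75], [0.5, 0.5]])

try:
    json.dumps(outputs)
    array_is_serializable = True
except TypeError:
    # json.dumps does not know how to encode numpy.ndarray objects.
    array_is_serializable = False

# Converting to nested Python lists first makes the result serializable.
serialized = json.dumps(outputs.tolist())
```

The tolist method also converts NumPy scalar types to plain Python numbers, so the nested lists it returns contain only values that the json module can encode.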

Package your Predictor and its dependencies

You must package your Predictor as a .tar.gz source distribution package. Since NumPy, TensorFlow, and scikit-learn are included in the AI Platform Prediction runtime image, you don't need to include these dependencies in the tarball. However, make sure to include any of the Predictor's dependencies that are not pre-installed on AI Platform Prediction.

For the example above, you must include preprocess.py in your source distribution package, even though your Predictor doesn't explicitly import it. Otherwise, preprocessor = pickle.load(f) will fail because Python won't recognize the class of the ZeroCenterer instance in the pickle file.

The following setup.py shows one way to package these scripts:

from setuptools import setup

setup(name="my_custom_code", version="0.1", scripts=["predictor.py", "preprocess.py"])

To package and upload the custom code example described on this page, do the following:

Create the preprocess.py, predictor.py, and setup.py files described in previous sections, all in the same directory. Navigate to that directory in your shell.
Run python setup.py sdist --formats=gztar to create dist/my_custom_code-0.1.tar.gz.
Upload this tarball to a staging location in Cloud Storage.

This staging location does not need to be the same as your model directory. If you plan to iterate and deploy multiple versions of your custom prediction routine, you may want to upload your custom code packages to a designated staging directory. You can increment the version argument in setup.py when you update the code to keep track of different versions.

The following command shows one way to upload your source distribution package to Cloud Storage:
```
gsutil cp dist/my_custom_code-0.1.tar.gz gs://YOUR_BUCKET/PATH_TO_STAGING_DIR/
```
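Because a missing preprocess.py only surfaces when the pickle fails to load, it can be worth confirming that the tarball contains every module you expect before you rely on it. The following is one possible check, not an official tool: missing_modules is a hypothetical helper, and the demo builds a small stand-in tarball instead of reading a real dist/ output.

```python
import io
import os
import tarfile
import tempfile


def missing_modules(tarball_path, required):
    """Returns the required file names that do not appear in the tarball."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        present = {os.path.basename(name) for name in tar.getnames()}
    return sorted(set(required) - present)


# Demo: build a tiny stand-in tarball that forgot preprocess.py.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "my_custom_code-0.1.tar.gz")
    with tarfile.open(demo_path, "w:gz") as tar:
        data = b"# placeholder\n"
        info = tarfile.TarInfo("my_custom_code-0.1/predictor.py")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    missing = missing_modules(demo_path, ["predictor.py", "preprocess.py"])
```

For the example on this page, you would call the helper on dist/my_custom_code-0.1.tar.gz and expect an empty list.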

You can provide code for your custom prediction routine in one or more packages.

Deploy your custom prediction routine

First, select a region where online prediction is available and use gcloud to create a model resource:

gcloud ai-platform models create MODEL_NAME \
  --regions CHOSEN_REGION

Ensure your gcloud beta component is updated, then create a version resource with special attention to the following gcloud flags:

--origin: The path to your model directory in Cloud Storage.
--package-uris: A comma-separated list of user code tarballs in Cloud Storage, including the one containing your Predictor class.
--prediction-class: The fully qualified name of your Predictor class (MODULE_NAME.CLASS_NAME).
--framework: Do not specify a framework when deploying a custom prediction routine.
--runtime-version: Custom prediction routines are available in runtime versions 1.4 through 2.11.
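The MODULE_NAME.CLASS_NAME format for --prediction-class follows the usual Python convention for naming a class by its import path. The sketch below illustrates what such a name means in plain Python. It is not a description of how AI Platform Prediction loads your class internally, and resolve_class is a hypothetical helper.

```python
import importlib


def resolve_class(qualified_name):
    """Imports MODULE_NAME and returns its CLASS_NAME attribute."""
    module_name, _, class_name = qualified_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Demo with a standard-library class, since predictor.py is not importable here.
resolved = resolve_class("collections.OrderedDict")
```

For the example on this page, predictor.MyPredictor names the MyPredictor class defined in predictor.py, so that file must be importable as the top-level module predictor once your package is installed.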

The following command shows how to create a version resource based on the example files described in previous sections:

gcloud components install beta

gcloud beta ai-platform versions create VERSION_NAME \
  --model MODEL_NAME \
  --runtime-version 2.11 \
  --python-version 3.7 \
  --origin gs://YOUR_BUCKET/PATH_TO_MODEL_DIR \
  --package-uris gs://YOUR_BUCKET/PATH_TO_STAGING_DIR/my_custom_code-0.1.tar.gz \
  --prediction-class predictor.MyPredictor

To learn more about creating models and versions in detail, or to learn how to create them by using the Google Cloud console rather than the gcloud CLI, see the guide to deploying models.

Specify a service account

When creating a version resource, you may optionally specify a service account for your custom prediction routine to use during prediction. This lets you customize its permissions to access other Google Cloud resources. Learn more about specifying a service account for your custom prediction routine.

What's next

Work through a tutorial on using custom prediction routines with Keras or with scikit-learn to see a more complete example of how to train and deploy a model using a custom prediction routine.
Read the guide to exporting models to learn about exporting artifacts for prediction without using a custom prediction routine.
Read the guide to deploying models to learn more details about deploying model and version resources to AI Platform Prediction to serve predictions.