Custom prediction routines

Custom prediction routines allow you to determine what code runs when you send an online prediction request to AI Platform.

When you deploy a version resource to AI Platform without using a custom prediction routine, AI Platform handles prediction requests by performing the prediction operation of the machine learning framework you used for training.

But when you deploy a custom prediction routine as your version resource, you can tell AI Platform to run custom Python code in response to every prediction request it receives. You can preprocess prediction input before your trained model performs prediction, or you can postprocess the model's prediction before sending the prediction result.

To create a custom prediction routine, you must provide two parts to AI Platform when you create your model version:

  • A model directory in Cloud Storage, which contains any artifacts that need to be used for prediction.

  • A .tar.gz Python source distribution package in Cloud Storage containing your implementation of the Predictor interface and any other custom code you want AI Platform to use at prediction time.

Upload model artifacts to your model directory

Follow the guide to deploying models to upload your trained model to Cloud Storage, along with any other files that provide data or statefulness for AI Platform to use during prediction.

The total file size of the model artifacts that you deploy to AI Platform Prediction must be 250 MB or less. You can request a higher quota to deploy larger models.
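Before uploading, it can help to check the size of your artifacts locally. The following is a minimal sketch, not an official tool: the helper names are hypothetical, and it assumes "250 MB" means 250 * 1024 * 1024 bytes, which this page does not specify.

```python
import os

# Assumption: "250 MB" is treated here as 250 * 1024 * 1024 bytes.
SIZE_LIMIT_BYTES = 250 * 1024 * 1024


def directory_size_bytes(path):
    """Returns the total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def fits_size_limit(path, limit_bytes=SIZE_LIMIT_BYTES):
    """Returns True if the files under `path` total `limit_bytes` or less."""
    return directory_size_bytes(path) <= limit_bytes
```

Calling fits_size_limit on the local copy of your model directory gives a quick indication of whether you need to request a higher quota.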

Your model directory could include a TensorFlow SavedModel, but it could also include an HDF5 file containing a trained tf.keras model instead. You could additionally include a pickle file with an instance of a custom preprocessor class that holds serialized state from training.

For example, consider the following preprocessor, defined in its own Python file:

import numpy as np


class ZeroCenterer(object):
    """Stores means of each column of a matrix and uses them for preprocessing."""

    def __init__(self):
        """On initialization, is not tied to any distribution."""
        self._means = None

    def preprocess(self, data):
        """Transforms a matrix.

        The first time this is called, it stores the means of each column of
        the input. Then it transforms the input so each column has mean 0. For
        subsequent calls, it subtracts the stored means from each column. This
        lets you 'center' data at prediction time based on the distribution of
        the original training data.

        Args:
            data: A NumPy matrix of numerical data.

        Returns:
            A transformed matrix with the same dimensions as the input.
        """
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)
        return data - self._means

During training on a numerical dataset, the preprocessor centers the data around 0 by subtracting the mean of each column from every value in the column. Then, you can export the preprocessor instance as a pickle file, preprocessor.pkl, which preserves the means of each column calculated from the training data.

During prediction, a custom prediction routine can load the preprocessor from this file to perform an identical transformation on prediction input.
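The export and reload steps described above might look like the following sketch. The helper names export_preprocessor and load_preprocessor are hypothetical, and the ZeroCenterer class is repeated in condensed form only so the example is self-contained.

```python
import pickle

import numpy as np


class ZeroCenterer(object):
    """Condensed copy of the preprocessor shown earlier on this page."""

    def __init__(self):
        self._means = None

    def preprocess(self, data):
        if self._means is None:  # first call: remember the column means
            self._means = np.mean(data, axis=0)
        return data - self._means


def export_preprocessor(preprocessor, path):
    """Writes the fitted preprocessor to `path` as a pickle file."""
    with open(path, 'wb') as f:
        pickle.dump(preprocessor, f)


def load_preprocessor(path):
    """Reads a pickled preprocessor back from `path`."""
    with open(path, 'rb') as f:
        return pickle.load(f)
```

In this sketch, you would call preprocess once on the training data, pass the instance to export_preprocessor with a path ending in preprocessor.pkl, and upload that file to your model directory. An instance returned by load_preprocessor subtracts the training means from any later input.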

To learn how to use a stateful preprocessor like this in your custom prediction routine, read the next section, which describes how to implement the Predictor interface.

To work through a full example of using a stateful preprocessor during training and prediction, read the tutorial on using a custom prediction routine for preprocessing.

Create your Predictor

Tell AI Platform how to handle prediction requests by providing it with a Predictor class. This is a class that implements the following interface:

class Predictor(object):
    """Interface for constructing custom predictors."""

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Instances are the decoded values from the request. They have already
        been deserialized from JSON.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results. This list must
            be JSON serializable.
        """
        raise NotImplementedError()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of Predictor using the given path.

        Loading of the predictor should be done in this method.

        Args:
            model_dir: The local directory that contains the exported model
                file along with any additional files uploaded when creating the
                version resource.

        Returns:
            An instance implementing this Predictor class.
        """
        raise NotImplementedError()

AI Platform prediction nodes use the from_path class method to load an instance of your Predictor. This method should load the artifacts you saved in your model directory, the contents of which are copied from Cloud Storage to a location surfaced by the model_dir argument.

Whenever your deployment receives an online prediction request, the instance of the Predictor class returned by from_path uses its predict method to generate predictions. AI Platform serializes the return value of this method to JSON and sends it as the response to the prediction request.

Note that the predict method does not need to convert input from JSON into Python objects or convert output to JSON; AI Platform handles this outside of the predict method.

AI Platform provides the instances argument by parsing the instances field from the body of the predict request to the AI Platform Training and Prediction API. It parses any other fields of the request body and provides them to the predict method as entries in the **kwargs dictionary. To learn more, read about how to structure a predict request to the AI Platform Training and Prediction API.
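To make the split between instances and **kwargs concrete, the following sketch imitates the steps described above. It is an illustration only, not AI Platform's serving code: EchoPredictor, handle_request, and the extra request field scale are all hypothetical, and the exact response envelope is not shown.

```python
import json


class EchoPredictor(object):
    """Hypothetical Predictor used only to show which arguments arrive."""

    def predict(self, instances, **kwargs):
        # Scale each instance by an optional extra request field.
        scale = kwargs.get('scale', 1)
        return [[value * scale for value in instance] for instance in instances]

    @classmethod
    def from_path(cls, model_dir):
        # A real Predictor would load artifacts from model_dir here.
        return cls()


def handle_request(predictor, request_body):
    """Illustrative stand-in for the serving layer described on this page."""
    fields = json.loads(request_body)
    instances = fields.pop('instances')
    # Every remaining top-level field is passed as a keyword argument.
    predictions = predictor.predict(instances, **fields)
    return json.dumps(predictions)
```

With a request body of {"instances": [[1, 2], [3, 4]], "scale": 10}, the predict method in this sketch receives the two instances as a list and scale=10 as a keyword argument.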

Continuing the example from the previous section, suppose your model directory contains preprocessor.pkl (the pickled instance of the ZeroCenterer class) and a trained tf.keras model exported as model.h5.

Then, you could implement the following Predictor class in a file called predictor.py:

import os
import pickle

import numpy as np
from tensorflow import keras


class MyPredictor(object):
    """An example Predictor for an AI Platform custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only initialized via `from_path`."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Preprocesses inputs, then performs prediction using the trained Keras
        model.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results.
        """
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory in
        Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained Keras
                model and the pickled preprocessor instance. These are copied
                from the Cloud Storage model directory you provide when you
                deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        model_path = os.path.join(model_dir, 'model.h5')
        model = keras.models.load_model(model_path)

        preprocessor_path = os.path.join(model_dir, 'preprocessor.pkl')
        with open(preprocessor_path, 'rb') as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)

Notice the predict method converts the prediction results to a list with the tolist method before returning them. NumPy arrays are not JSON serializable, so you must convert them to lists of numbers (which are JSON serializable). Otherwise, AI Platform will fail to send the prediction response.
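The following short sketch shows the serialization problem in isolation, using Python's standard json module as a stand-in for whatever serializer AI Platform uses internally. The array values are made up for illustration.

```python
import json

import numpy as np

outputs = np.array([[0.25, 0.75], [0.5, 0.5]])

try:
    json.dumps(outputs)
    array_is_serializable = True
except TypeError:
    # json.dumps raises TypeError for numpy.ndarray objects.
    array_is_serializable = False

# tolist() converts the array, and the NumPy scalars inside it, to plain
# Python lists and floats, which json.dumps accepts.
serialized = json.dumps(outputs.tolist())
```

Here array_is_serializable ends up False, while the call on outputs.tolist() succeeds.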

Package your Predictor and its dependencies

You must package your Predictor as a .tar.gz source distribution package. Since NumPy and TensorFlow are included in the AI Platform runtime image, you don't need to include these dependencies in the tarball. However, make sure to include any of the Predictor's dependencies that are not pre-installed on AI Platform.

For the example above, you must include the file that defines the ZeroCenterer class in your source distribution package, even though your Predictor doesn't explicitly import it. Otherwise, preprocessor = pickle.load(f) will fail because Python won't recognize the class of the ZeroCenterer instance in the pickle file.

The following shows one way to package these scripts:

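from setuptools import setup

setup(
    name='my_custom_code',
    version='0.1',
    scripts=['predictor.py', 'preprocess.py'])  # 'preprocess.py' is an assumed name for the ZeroCenterer file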

To package and upload the custom code example described on this page, do the following:

  1. Create the preprocessor file, predictor.py, and setup.py described in previous sections, all in the same directory. Navigate to that directory in your shell.

  2. Run python setup.py sdist --formats=gztar to create dist/my_custom_code-0.1.tar.gz.

  3. Upload this tarball to a staging location in Cloud Storage.

    This does not need to be the same as your model directory. If you plan to iterate and deploy multiple versions of your custom prediction routine, you may want to upload your custom code packages to a designated staging directory. You can increment the version argument in setup.py when you update the code to keep track of different versions.

    The following command shows one way to upload your source distribution package to Cloud Storage:

    gsutil cp dist/my_custom_code-0.1.tar.gz gs://your-bucket/path-to-staging-dir/
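Before uploading, you may want to confirm that the tarball actually contains every module your Predictor needs, including the one that defines any pickled class. The following is a minimal sketch using Python's standard tarfile module. The helper names are hypothetical, and the sketch only inspects file names, not their contents.

```python
import tarfile


def list_python_modules(tarball_path):
    """Returns the base names of the .py files inside a .tar.gz archive."""
    with tarfile.open(tarball_path, 'r:gz') as archive:
        return sorted(
            member.name.rsplit('/', 1)[-1]
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith('.py'))


def missing_modules(tarball_path, required):
    """Returns the names in `required` that the archive does not contain."""
    present = set(list_python_modules(tarball_path))
    return [name for name in required if name not in present]
```

For example, missing_modules('dist/my_custom_code-0.1.tar.gz', ['predictor.py']) returns an empty list when predictor.py was packaged.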

You can provide code for your custom prediction routine in one or more packages. The total file size of these packages must be 250 MB or less. You can request a higher quota to deploy more custom code.

Deploy your custom prediction routine

First, select a region where online prediction is available and use gcloud to create a model resource:

gcloud ai-platform models create model-name \
  --regions chosen-region

Ensure your gcloud beta component is updated, then create a version resource with special attention to the following gcloud flags:

  • --origin: The path to your model directory in Cloud Storage.
  • --package-uris: A comma-separated list of user code tarballs in Cloud Storage, including the one containing your Predictor class.
  • --prediction-class: The fully qualified name of your Predictor class (module_name.class_name).
  • --framework: Do not specify a framework when deploying a custom prediction routine.
  • --runtime-version: Custom prediction routines are available in runtime versions 1.4 and above.
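A value such as predictor.MyPredictor for --prediction-class names a module and a class inside it. The following sketch shows one way to check locally that such a name resolves to a class with the two required methods. It uses Python's standard importlib and is only an illustration: the helper names are hypothetical, and this page does not describe how AI Platform itself loads the class.

```python
import importlib


def resolve_class(qualified_name):
    """Imports `module_name.class_name` and returns the class object."""
    module_name, _, class_name = qualified_name.rpartition('.')
    if not module_name:
        raise ValueError(
            'expected module_name.class_name, got %r' % qualified_name)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def looks_like_predictor(cls):
    """Checks for the two methods the Predictor interface requires."""
    return callable(getattr(cls, 'predict', None)) and callable(
        getattr(cls, 'from_path', None))
```

Running looks_like_predictor(resolve_class('predictor.MyPredictor')) from the directory that contains predictor.py is a quick check to make before you deploy. Note that it imports the module, so the module's own dependencies must be installed locally.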

The following command shows how to create a version resource based on the example files described in previous sections:

gcloud components install beta

gcloud beta ai-platform versions create version-name \
  --model model-name \
  --runtime-version 1.14 \
  --python-version 3.5 \
  --origin gs://your-bucket/path-to-model-dir \
  --package-uris gs://your-bucket/path-to-staging-dir/my_custom_code-0.1.tar.gz \
  --prediction-class predictor.MyPredictor

To learn more about creating models and versions in detail, or to learn how to create them by using the Google Cloud Platform Console rather than the gcloud tool, see the guide to deploying models.

Specify a service account

When creating a version resource, you may optionally specify a service account for your custom prediction routine to use during prediction. This lets you customize its permissions to access other Google Cloud Platform resources. Learn more about specifying a service account for your custom prediction routine.

