Custom prediction routines allow you to determine what code runs when you send an online prediction request to AI Platform.
When you deploy a version resource to AI Platform without using a custom prediction routine, it handles prediction requests by performing the prediction operation of the machine learning framework you used for training.
But when you deploy a custom prediction routine as your version resource, you can tell AI Platform to run custom Python code in response to every prediction request it receives. You can preprocess prediction input before your trained model performs prediction, or you can postprocess the model's prediction before sending the prediction result.
To create a custom prediction routine, you must provide two parts to AI Platform when you create your model version:
- A model directory in Cloud Storage, which contains any artifacts that need to be used for prediction.
- A .tar.gz Python source distribution package in Cloud Storage containing your implementation of the Predictor interface and any other custom code you want AI Platform to use at prediction time.
Upload model artifacts to your model directory
Follow the guide to deploying models to upload your trained model to Cloud Storage, along with any other files that provide data or statefulness for AI Platform to use during prediction.
The total file size of the model artifacts that you deploy to AI Platform Prediction must be 250 MB or less. You can request a higher quota to deploy larger models.
Your model directory could include a TensorFlow SavedModel, but it could also include an HDF5 file containing a trained tf.keras model instead. You could additionally include a pickle file with an instance of a custom preprocessor class that holds serialized state from training.
For example, consider the following preprocessor, defined in a file called preprocess.py:
```python
import numpy as np


class ZeroCenterer(object):
    """Stores means of each column of a matrix and uses them for preprocessing."""

    def __init__(self):
        """On initialization, is not tied to any distribution."""
        self._means = None

    def preprocess(self, data):
        """Transforms a matrix.

        The first time this is called, it stores the means of each column of
        the input. Then it transforms the input so each column has mean 0. For
        subsequent calls, it subtracts the stored means from each column. This
        lets you 'center' data at prediction time based on the distribution of
        the original training data.

        Args:
            data: A NumPy matrix of numerical data.

        Returns:
            A transformed matrix with the same dimensions as the input.
        """
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)
        return data - self._means
```
During training on a numerical dataset, the preprocessor centers the data around
0 by subtracting the mean of each column from every value in the column. Then,
you can export the preprocessor instance as a pickle file,
which preserves the means of each column calculated from the training data.
During prediction, a custom prediction routine can load the preprocessor from this file to perform an identical transformation on prediction input.
To learn how to use a stateful preprocessor like this in your custom prediction routine, read the next section, which describes how to implement the Predictor interface.
To work through a full example of using a stateful preprocessor during training and prediction, read the tutorial on using a custom prediction routine for preprocessing.
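For a rough sketch of what this looks like at training time, the following code fits the preprocessor, trains a small tf.keras model, and exports both artifacts. The file names model.h5 and preprocessor.pkl match the deployment example later on this page; the dataset here is a random placeholder.

```python
import pickle

import numpy as np
from tensorflow import keras

from preprocess import ZeroCenterer

# Placeholder training data; substitute your own dataset.
train_x = np.random.rand(100, 3)
train_y = np.random.rand(100, 1)

# Fit the preprocessor so it stores the column means of the training data.
preprocessor = ZeroCenterer()
train_x_centered = preprocessor.preprocess(train_x)

# Train a minimal Keras model on the centered data.
model = keras.Sequential([keras.layers.Dense(1, input_shape=(3,))])
model.compile(optimizer='adam', loss='mse')
model.fit(train_x_centered, train_y, epochs=1, verbose=0)

# Export both artifacts; upload them to your Cloud Storage model directory.
model.save('model.h5')
with open('preprocessor.pkl', 'wb') as f:
    pickle.dump(preprocessor, f)
```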
Create your Predictor
Tell AI Platform how to handle prediction requests by providing it with a Predictor class. This is a class that implements the following interface:
```python
class Predictor(object):
    """Interface for constructing custom predictors."""

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Instances are the decoded values from the request. They have already
        been deserialized from JSON.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results. This list
            must be JSON serializable.
        """
        raise NotImplementedError()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of Predictor using the given path.

        Loading of the predictor should be done in this method.

        Args:
            model_dir: The local directory that contains the exported model
                file along with any additional files uploaded when creating
                the version resource.

        Returns:
            An instance implementing this Predictor class.
        """
        raise NotImplementedError()
```
AI Platform prediction nodes use the from_path class method to load an instance of your Predictor. This method should load the artifacts you saved in your model directory, the contents of which are copied from Cloud Storage to a location surfaced by the model_dir argument.
Whenever your deployment receives an online prediction request, the instance of
the Predictor class returned by
from_path uses its
predict method to generate
predictions. AI Platform serializes the return value of this method
to JSON and sends it as the response to the prediction request.
Note that the predict method does not need to convert input from JSON into Python objects or convert output to JSON; AI Platform handles this outside of the predict method.
AI Platform provides the instances argument by parsing the instances field from the body of the request to the AI Platform Training and Prediction API. It parses any other fields of the request body and provides them to the predict method as entries in the **kwargs dictionary. To learn more, read about how to structure a predict request to the AI Platform Training and Prediction API.
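As a hypothetical illustration (the threshold field below is made up), here is how an extra request body field reaches your Predictor:

```python
# Hypothetical request body for a predict request. The "instances" field
# becomes the `instances` argument; any other top-level field (here, the
# made-up "threshold") arrives as an entry in **kwargs.
request_body = {
    "instances": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    "threshold": 0.5,
}

# AI Platform would then call, in effect:
#   predictor.predict([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], threshold=0.5)
```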
Continuing the example from the previous section, suppose your model directory contains preprocessor.pkl (the pickled instance of ZeroCenterer) and a trained tf.keras model exported as model.h5. Then, you could implement the following Predictor class in a file called predictor.py:
```python
import os
import pickle

import numpy as np
from tensorflow import keras


class MyPredictor(object):
    """An example Predictor for an AI Platform custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only initialized via `from_path`."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Preprocesses inputs, then performs prediction using the trained Keras
        model.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results.
        """
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory
        in Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained Keras
                model and the pickled preprocessor instance. These are copied
                from the Cloud Storage model directory you provide when you
                deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        model_path = os.path.join(model_dir, 'model.h5')
        model = keras.models.load_model(model_path)

        preprocessor_path = os.path.join(model_dir, 'preprocessor.pkl')
        with open(preprocessor_path, 'rb') as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)
```
Notice that the predict method converts the prediction results to a list with the tolist method before returning them. NumPy arrays are not JSON serializable, so you must convert them to lists of numbers (which are JSON serializable). Otherwise, AI Platform will fail to send the prediction response.
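Before packaging your code, you can sanity-check your Predictor locally by mimicking what a prediction node does. This is a minimal smoke test, assuming model.h5 and preprocessor.pkl sit in the current directory; the json.dumps call verifies that the output is JSON serializable.

```python
import json

from predictor import MyPredictor

# Mimic what an AI Platform prediction node does: load the predictor from
# a local copy of the model directory, then call predict on some instances.
predictor = MyPredictor.from_path('.')
outputs = predictor.predict([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Raises a TypeError if the outputs are not JSON serializable.
print(json.dumps(outputs))
```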
Package your Predictor and its dependencies
You must package your Predictor as a
.tar.gz source distribution package.
Since NumPy and TensorFlow are included in the AI Platform runtime version image, you don't need to include these dependencies in the tarball. However, make sure to include any of the Predictor's dependencies that are not pre-installed on AI Platform.
For the example above, you must include preprocess.py in your source distribution package, even though your Predictor doesn't explicitly import it. Otherwise, preprocessor = pickle.load(f) will fail because Python won't recognize the class of the ZeroCenterer instance in the pickle file.

The following setup.py shows one way to package these scripts:
```python
from setuptools import setup

setup(
    name='my_custom_code',
    version='0.1',
    scripts=['predictor.py', 'preprocess.py'])
```
To package and upload the custom code example described on this page, do the following:
1. Create the setup.py, predictor.py, and preprocess.py files described in previous sections, all in the same directory. Navigate to that directory in your shell.
2. Run python setup.py sdist --formats=gztar to create dist/my_custom_code-0.1.tar.gz.
3. Upload this tarball to a staging location in Cloud Storage. This does not need to be the same as your model directory. If you plan to iterate and deploy multiple versions of your custom prediction routine, you may want to upload your custom code packages to a designated staging directory. You can increment the version number in setup.py when you update the code to keep track of different versions.
The following command shows one way to upload your source distribution package to Cloud Storage:
```
gsutil cp dist/my_custom_code-0.1.tar.gz gs://your-bucket/path-to-staging-dir/
```
You can provide code for your custom prediction routine in one or more packages. The total file size of these packages must be 250 MB or less. You can request a higher quota to deploy more custom code.
Deploy your custom prediction routine
First, select a region where online prediction is
available and use
gcloud to create a model resource:
```
gcloud ai-platform models create model-name \
  --regions chosen-region
```
Next, make sure your gcloud beta component is updated, then create a version resource with special attention to the following flags:
- --origin: The path to your model directory in Cloud Storage.
- --package-uris: A comma-separated list of user code tarballs in Cloud Storage, including the one containing your Predictor class.
- --prediction-class: The fully qualified name of your Predictor class (module_name.class_name).
- --framework: Do not specify a framework when deploying a custom prediction routine.
- --runtime-version: Custom prediction routines are available in runtime versions 1.4 and above.
The following command shows how to create a version resource based on the example files described in previous sections:
```
gcloud components install beta

gcloud beta ai-platform versions create version-name \
  --model model-name \
  --runtime-version 1.13 \
  --python-version 3.5 \
  --origin gs://your-bucket/path-to-model-dir \
  --package-uris gs://your-bucket/path-to-staging-dir/my_custom_code-0.1.tar.gz \
  --prediction-class predictor.MyPredictor
```
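After the version is created, you can send it online prediction requests. One way, sketched here with the Google API Client Library for Python, calls the API's projects.predict method; the project, model, and version names are placeholders you would replace with your own.

```python
from googleapiclient import discovery

# Placeholders: substitute your own project, model, and version names.
project = 'your-project'
model = 'model-name'
version = 'version-name'

# Build a client for the AI Platform Training and Prediction API.
service = discovery.build('ml', 'v1')
name = 'projects/{}/models/{}/versions/{}'.format(project, model, version)

# The "instances" field is passed to your Predictor's predict method.
response = service.projects().predict(
    name=name,
    body={'instances': [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]}
).execute()

print(response['predictions'])
```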
To learn more about creating models and versions in detail, or to learn how to
create them by using the Google Cloud Platform Console rather than the
gcloud tool, see the
guide to deploying models.
Specify a service account
When creating a version resource, you may optionally specify a service account for your custom prediction routine to use during prediction. This lets you customize its permissions to access other Google Cloud Platform resources. Learn more about specifying a service account for your custom prediction routine.
What's next

- Work through a tutorial on custom prediction routines to see a more complete example of how to train and deploy a model using a custom prediction routine.
- Read the guide to exporting models to learn about exporting a SavedModel for prediction, rather than a custom prediction routine.
- Read the guide to deploying models to learn more details about deploying version resources to AI Platform to serve predictions.