Custom prediction routines allow you to determine what code runs when you send an online prediction request to AI Platform Prediction.
When you deploy a version resource to AI Platform Prediction without using a custom prediction routine, it handles prediction requests by performing the prediction operation of the machine learning framework you used for training.
But when you deploy a custom prediction routine as your version resource, you can tell AI Platform Prediction to run custom Python code in response to every prediction request it receives. You can preprocess prediction input before your trained model performs prediction, or you can postprocess the model's prediction before sending the prediction result.
To create a custom prediction routine, you must provide two parts to AI Platform Prediction when you create your model version:
- A model directory in Cloud Storage, which contains any artifacts that need to be used for prediction.
- A .tar.gz Python source distribution package in Cloud Storage containing your implementation of the Predictor interface and any other custom code you want AI Platform Prediction to use at prediction time.
You can only deploy a custom prediction routine when you use a legacy (MLS1) machine type for your model version.
Upload model artifacts to your model directory
Follow the guide to deploying models to upload your trained model to Cloud Storage, along with any other files that provide data or statefulness for AI Platform Prediction to use during prediction.
The total file size of the model artifacts that you deploy to AI Platform Prediction must be 500 MB or less.
You can upload your trained machine learning model to your model directory as a TensorFlow SavedModel, a model.joblib file, a model.pkl file, or a model.bst file, but you can also provide your model as an HDF5 file containing a trained tf.keras model, or in a different serialized format.
You can additionally include a pickle file with an instance of a custom preprocessor class that holds serialized state from training.
For example, consider the following preprocessor, defined in a file called preprocess.py:
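What follows is a minimal sketch consistent with the behavior described below; it defines the ZeroCenterer class referenced later on this page, and your own implementation may differ.

```python
import numpy as np


class ZeroCenterer(object):
    """Stores the mean of each column of a matrix and uses them for preprocessing."""

    def __init__(self):
        """On initialization, the preprocessor is not tied to any distribution."""
        self._means = None

    def preprocess(self, data):
        """Transforms a matrix so that every column is centered around 0.

        The first time this is called (during training), it stores the mean of
        each column of the input. On every call, it subtracts the stored means
        from the corresponding columns.

        Args:
            data: A NumPy matrix of numerical data.

        Returns:
            A transformed matrix with the same dimensions as the input.
        """
        if self._means is None:  # Only computed during training.
            self._means = np.mean(data, axis=0)
        return data - self._means
```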
During training on a numerical dataset, the preprocessor centers the data around 0 by subtracting the mean of each column from every value in the column. Then, you can export the preprocessor instance as a pickle file, preprocessor.pkl, which preserves the means of each column calculated from the training data.
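As an illustration (the training data and variable names here are made up), exporting the fitted preprocessor might look like this:

```python
import pickle

import numpy as np
from preprocess import ZeroCenterer

# Illustrative training matrix; in practice this is your real training data.
train_data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

preprocessor = ZeroCenterer()
preprocessor.preprocess(train_data)  # Stores the column means.

# Export the fitted preprocessor so prediction can reuse the training means.
with open('preprocessor.pkl', 'wb') as f:
    pickle.dump(preprocessor, f)
```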
During prediction, a custom prediction routine can load the preprocessor from this file to perform an identical transformation on prediction input.
To learn how to use a stateful preprocessor like this in your custom prediction routine, read the next section, which describes how to implement the Predictor interface.
To work through a full example of using a stateful preprocessor during training and prediction, read Creating a custom prediction routine with Keras or Creating a custom prediction routine with scikit-learn.
Create your Predictor
Tell AI Platform Prediction how to handle prediction requests by providing it with a Predictor class. This is a class that implements the following interface:
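A sketch of that interface, with method names and arguments matching the description in the rest of this section, looks like the following:

```python
class Predictor(object):
    """Interface for constructing custom predictors."""

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results. This list
            must be JSON serializable.
        """
        raise NotImplementedError()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of Predictor using the given path.

        Loading of the predictor should be done in this method.

        Args:
            model_dir: The local directory that contains the exported model
                file along with any additional files uploaded when creating
                the version resource.

        Returns:
            An instance implementing this Predictor class.
        """
        raise NotImplementedError()
```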
AI Platform Prediction prediction nodes
use the from_path
class method to load an instance of your Predictor. This
method should load the artifacts you saved in your model directory, the contents
of which are copied from Cloud Storage to a location surfaced by the
model_dir
argument.
Whenever your deployment receives an online prediction request, the instance of
the Predictor class returned by from_path
uses its predict
method to generate
predictions. AI Platform Prediction serializes the return value of this method
to JSON and sends it as the response to the prediction request.
Note that the predict
method does not need to convert input from JSON into
Python objects or convert output to JSON; AI Platform Prediction handles this
outside of the predict
method.
AI Platform Prediction provides the instances
argument by parsing the
instances
field from the body of the predict
request to the AI Platform Training and Prediction API. It parses any other
fields of the request body and provides them to the predict
method as entries
in the **kwargs
dictionary. To learn more, read about how to structure a
predict
request to the AI Platform Training and Prediction API.
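For instance, a hypothetical request body like the following (shown here as a Python dictionary; the probabilities field is made up for illustration) results in predict being called with the parsed instances list, with every other field passed through **kwargs:

```python
# Hypothetical body of a predict request.
request_body = {
    "instances": [[1.0, 2.0], [3.0, 4.0]],  # passed as the `instances` argument
    "probabilities": True,                   # any other field arrives in **kwargs
}

# AI Platform Prediction would then call your Predictor roughly like this:
# predictor.predict([[1.0, 2.0], [3.0, 4.0]], probabilities=True)
```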
Continuing the example from the previous section, suppose your model directory
contains preprocessor.pkl
(the pickled instance of the ZeroCenterer
class)
and either a trained tf.keras
model exported as model.h5
or a trained
scikit-learn model exported as model.joblib
.
Depending on which machine learning framework you use, implement one of the
following Predictor classes in a file called predictor.py
:
TensorFlow
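A minimal sketch of a tf.keras-based Predictor, assuming the artifacts are named model.h5 and preprocessor.pkl as described above; your implementation may differ in detail:

```python
import os
import pickle

import numpy as np
from tensorflow import keras


class MyPredictor(object):
    """An example Predictor for an AI Platform Prediction custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only instantiated via from_path."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Preprocesses inputs, then runs prediction with the trained Keras model."""
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        # NumPy arrays are not JSON serializable, so convert to a list.
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Loads the Keras model and the pickled preprocessor from model_dir."""
        model_path = os.path.join(model_dir, 'model.h5')
        model = keras.models.load_model(model_path)

        preprocessor_path = os.path.join(model_dir, 'preprocessor.pkl')
        with open(preprocessor_path, 'rb') as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)
```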
scikit-learn
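A corresponding sketch for a scikit-learn model, assuming it was exported as model.joblib and that joblib is available in the runtime:

```python
import os
import pickle

import joblib
import numpy as np


class MyPredictor(object):
    """An example Predictor for an AI Platform Prediction custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only instantiated via from_path."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Preprocesses inputs, then runs prediction with the trained scikit-learn model."""
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        # NumPy arrays are not JSON serializable, so convert to a list.
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Loads the exported model and the pickled preprocessor from model_dir."""
        model_path = os.path.join(model_dir, 'model.joblib')
        model = joblib.load(model_path)

        preprocessor_path = os.path.join(model_dir, 'preprocessor.pkl')
        with open(preprocessor_path, 'rb') as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)
```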
Notice the predict
method converts the prediction results to a list with the
tolist
method
before returning them. NumPy arrays are not JSON serializable, so you must
convert them to lists of numbers (which are JSON serializable). Otherwise,
AI Platform Prediction will fail to send the prediction response.
Package your Predictor and its dependencies
You must package your Predictor as a .tar.gz
source distribution package.
Since NumPy, TensorFlow, and scikit-learn are included in the
AI Platform Prediction runtime image, you
don't need to include these dependencies in the tarball. However, make sure to
include any of the Predictor's dependencies that are not pre-installed on
AI Platform Prediction.
For the example above, you must include preprocess.py
in your source
distribution package, even though your Predictor doesn't explicitly import it.
Otherwise, preprocessor = pickle.load(f)
will fail because Python won't
recognize the class of the ZeroCenterer
instance in the pickle file.
The following setup.py
shows one way to package these scripts:
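A minimal version along these lines would work; the package name and version number are illustrative, chosen to match the tarball name used elsewhere on this page:

```python
from setuptools import setup

setup(
    name='my_custom_code',
    version='0.1',
    scripts=['predictor.py', 'preprocess.py'])
```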
To package and upload the custom code example described on this page, do the following:
1. Create the preprocess.py, predictor.py, and setup.py files described in previous sections, all in the same directory. Navigate to that directory in your shell.
2. Run python setup.py sdist --formats=gztar to create dist/my_custom_code-0.1.tar.gz.
3. Upload this tarball to a staging location in Cloud Storage. This does not need to be the same as your model directory. If you plan to iterate and deploy multiple versions of your custom prediction routine, you may want to upload your custom code packages in a designated staging directory. You can increment the version argument in setup.py when you update the code to keep track of different versions. The following command shows one way to upload your source distribution package to Cloud Storage:
gcloud storage cp dist/my_custom_code-0.1.tar.gz gs://YOUR_BUCKET/PATH_TO_STAGING_DIR/
You can provide code for your custom prediction routine in one or more packages.
Deploy your custom prediction routine
First, select a region where online prediction is
available and use gcloud
to create a model resource:
gcloud ai-platform models create MODEL_NAME \
--regions CHOSEN_REGION
Ensure your gcloud beta
component is updated, then create a version resource
with special attention to the following gcloud
flags:
- --origin: The path to your model directory in Cloud Storage.
- --package-uris: A comma-separated list of user code tarballs in Cloud Storage, including the one containing your Predictor class.
- --prediction-class: The fully qualified name of your Predictor class (MODULE_NAME.CLASS_NAME).
- --framework: Do not specify a framework when deploying a custom prediction routine.
- --runtime-version: Custom prediction routines are available in runtime versions 1.4 through 2.11.
The following command shows how to create a version resource based on the example files described in previous sections:
gcloud components install beta
gcloud beta ai-platform versions create VERSION_NAME \
--model MODEL_NAME \
--runtime-version 2.11 \
--python-version 3.7 \
--origin gs://YOUR_BUCKET/PATH_TO_MODEL_DIR \
--package-uris gs://YOUR_BUCKET/PATH_TO_STAGING_DIR/my_custom_code-0.1.tar.gz \
--prediction-class predictor.MyPredictor
To learn more about creating models and versions in detail, or to learn how to create them by using the Google Cloud console rather than the gcloud CLI, see the guide to deploying models.
Specify a service account
When creating a version resource, you may optionally specify a service account for your custom prediction routine to use during prediction. This lets you customize its permissions to access other Google Cloud resources. Learn more about specifying a service account for your custom prediction routine.
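For example, assuming your gcloud release supports the --service-account flag on gcloud beta ai-platform versions create, you might add a line like the following to the version-creation command shown above (verify the exact flag in the gcloud reference):

--service-account SERVICE_ACCOUNT_EMAIL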
What's next
- Work through a tutorial on using custom prediction routines with Keras or with scikit-learn to see a more complete example of how to train and deploy a model using a custom prediction routine.
- Read the guide to exporting models to learn about exporting artifacts for prediction without using a custom prediction routine.
- Read the guide to deploying models to learn more details about deploying model and version resources to AI Platform Prediction to serve predictions.