AI in Depth: Serving a PyTorch text classifier on AI Platform Serving using custom online prediction
Vijay Reddy
ML Solutions Engineer
Khalid Salama
Staff Machine Learning Solutions Architect
Earlier this week, we explained in detail how you might build and serve a text classifier in TensorFlow. In today’s blog post, we’ll explain how to implement the same model using PyTorch, another machine learning framework, and deploy it to AI Platform Serving for online prediction. We will reuse the preprocessing implemented in Keras in the previous blog post. The code for this example can be found in this Notebook.
AI Platform is a serverless, NoOps product that lets you train and serve machine learning models at scale. Trained models can be served as REST APIs for online prediction. AI Platform Serving automatically scales to adjust to any throughput, and provides secure authentication to its REST endpoints.
To help keep preprocessing consistent between training and serving, AI Platform Serving now enables users to customize the prediction routine that gets called when sending prediction requests to their model deployed on AI Platform Serving. This feature allows you to upload a Custom Model Prediction class, along with your exported model, to apply custom logic before or after invoking the model for prediction.
In other words, we can now leverage AI Platform Serving to execute arbitrary Python code, breaking the previous tight coupling with TensorFlow. This change enables you to pick the best framework for the job, or even combine multiple frameworks into a single application. For example, we can use the Keras APIs for their easy-to-use text preprocessing methods, and combine them with PyTorch for the actual machine learning model. This combination of frameworks is precisely what we’ll discuss in this blog post.
For more details on text classification, the Hacker News dataset used in the example, and the text preprocessing logic, refer to the Serving a Text Classifier with Preprocessing using AI Platform Serving blog post.
Building a PyTorch text classification model
You can begin by implementing your TorchTextClassifier model class in the torch_model.py module. As shown in the following code block, we implement the same text classification model architecture described in the previous post, which consists of an Embedding layer and a Dropout layer, followed by two Conv1d and Pooling layers, then a Dense layer with Softmax activation at the end.
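The following is a minimal sketch of what such a module might look like. The layer sizes, constructor arguments, and default values shown here are illustrative assumptions rather than the original code.

```python
# torch_model.py -- a minimal sketch of the model class; layer sizes and
# argument defaults are illustrative assumptions, not the original code.
import torch.nn as nn
import torch.nn.functional as F


class TorchTextClassifier(nn.Module):

    def __init__(self, vocab_size, embedding_dim=64, filters=64,
                 kernel_size=3, pool_size=3, num_classes=3, dropout_rate=0.2):
        super(TorchTextClassifier, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.dropout = nn.Dropout(dropout_rate)
        self.conv1 = nn.Conv1d(embedding_dim, filters, kernel_size)
        self.pool1 = nn.MaxPool1d(pool_size)
        self.conv2 = nn.Conv1d(filters, filters * 2, kernel_size)
        self.pool2 = nn.AdaptiveMaxPool1d(1)  # global pooling over the sequence
        self.fc = nn.Linear(filters * 2, num_classes)

    def forward(self, x):
        # x: LongTensor of token indices, shape (batch_size, sequence_length)
        x = self.dropout(self.embedding(x))   # (batch, seq_len, embedding_dim)
        x = x.transpose(1, 2)                 # Conv1d expects (batch, channels, seq_len)
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x))).squeeze(-1)
        return F.softmax(self.fc(x), dim=1)   # class probabilities
```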
Loading and preprocessing data
The following code prepares both the training and evaluation data. Note that you use both fit() and transform() with the training data, while you only use transform() with the evaluation data, to make use of the tokenizer generated from the training data. The created train_texts_vectorized and eval_texts_vectorized objects will be used to train and evaluate our text classification model, respectively.
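A sketch of this step is shown below. The CSV file names, column layout, and TextPreprocessor constructor arguments are assumptions based on the earlier Keras-based post rather than the exact original code.

```python
import pandas as pd
from preprocess import TextPreprocessor  # described in the previous post

CLASSES = {'github': 0, 'nytimes': 1, 'techcrunch': 2}  # label -> integer index
TOP_K = 20000              # vocabulary size kept by the tokenizer
MAX_SEQUENCE_LENGTH = 50   # titles are padded/truncated to this length


def load_data(csv_path):
    df = pd.read_csv(csv_path, names=['text', 'label'])
    return list(df['text']), [CLASSES[label] for label in df['label']]


train_texts, train_labels = load_data('train.csv')
eval_texts, eval_labels = load_data('eval.csv')

# Fit the tokenizer on the training data only, then transform both splits.
processor = TextPreprocessor(TOP_K, MAX_SEQUENCE_LENGTH)
processor.fit(train_texts)
train_texts_vectorized = processor.transform(train_texts)
eval_texts_vectorized = processor.transform(eval_texts)
```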
The implementation of the TextPreprocessor class, which uses Keras APIs, is described in the Serving a Text Classifier with Preprocessing using AI Platform Serving blog post.
Now you need to save the processor object, which includes the tokenizer generated from the training data, to be used when serving the model for prediction. The following code dumps the object to a new processor_state.pkl file.
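For example, assuming the fitted processor object from the previous step:

```python
import pickle

# Persist the fitted preprocessor so the exact same tokenizer state can be
# loaded again at serving time.
with open('./processor_state.pkl', 'wb') as f:
    pickle.dump(processor, f)
```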
Training and saving the PyTorch model
The following code snippet shows you how to train your PyTorch model. First, you create an object of the TorchTextClassifier class, according to your parameters. Second, you implement a training loop in which, at each iteration, you compute predictions from your model (y_pred) given the current training batch, compute the loss using cross_entropy, and backpropagate using loss.backward() and optimizer.step(). After NUM_EPOCH epochs, the trained model is saved to a torch_saved_model.pt file.
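A condensed sketch of such a training loop is shown below; the hyperparameter values and the simple batch slicing are illustrative assumptions, not the original code.

```python
import numpy as np
import torch
import torch.nn.functional as F
from torch_model import TorchTextClassifier

NUM_EPOCH = 10
BATCH_SIZE = 128
LEARNING_RATE = 0.001

model = TorchTextClassifier(vocab_size=TOP_K)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

features = torch.from_numpy(np.asarray(train_texts_vectorized)).long()
labels = torch.from_numpy(np.asarray(train_labels)).long()

for epoch in range(NUM_EPOCH):
    for i in range(0, len(features), BATCH_SIZE):
        x_batch = features[i:i + BATCH_SIZE]
        y_batch = labels[i:i + BATCH_SIZE]

        y_pred = model(x_batch)                  # forward pass
        loss = F.cross_entropy(y_pred, y_batch)  # cross-entropy loss

        optimizer.zero_grad()
        loss.backward()                          # backpropagation
        optimizer.step()                         # parameter update
    print('Epoch {}: loss = {:.4f}'.format(epoch + 1, loss.item()))

# Save the trained model for deployment.
torch.save(model, 'torch_saved_model.pt')
```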
Implementing the Custom Prediction class
In order to apply a custom prediction routine, which includes both preprocessing and postprocessing, you need to wrap this logic in a Custom Model Prediction class. This class, along with the trained model and its corresponding preprocessing object, will be used to deploy the AI Platform Serving microservices. The following code shows how the Custom Model Prediction class (CustomModelPrediction) for our text classification example is implemented in the model_prediction.py module.
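A sketch of what this class might look like follows. The helper method name, the hard-coded label list, and the response format are illustrative; only the predict() and from_path() methods are part of the custom prediction routine contract.

```python
# model_prediction.py -- a sketch of the custom prediction class.
import os
import pickle

import numpy as np
import torch


class CustomModelPrediction(object):

    def __init__(self, model, processor):
        self._model = model          # the trained TorchTextClassifier
        self._processor = processor  # the fitted TextPreprocessor

    def _postprocess(self, probabilities):
        labels = ['github', 'nytimes', 'techcrunch']
        return [
            {'label': labels[int(probs.argmax())], 'confidence': float(probs.max())}
            for probs in probabilities
        ]

    def predict(self, instances, **kwargs):
        # Preprocess the raw text with the same tokenizer used at training time.
        preprocessed = np.asarray(self._processor.transform(instances))
        inputs = torch.from_numpy(preprocessed).long()
        with torch.no_grad():
            probabilities = self._model(inputs).numpy()
        return self._postprocess(probabilities)

    @classmethod
    def from_path(cls, model_dir):
        # AI Platform Serving calls this with the directory passed as --origin.
        model = torch.load(os.path.join(model_dir, 'torch_saved_model.pt'))
        model.eval()
        with open(os.path.join(model_dir, 'processor_state.pkl'), 'rb') as f:
            processor = pickle.load(f)
        return cls(model, processor)
```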
Deploying to AI Platform Serving
Uploading the artifacts to Cloud Storage
Next, you’ll want to upload your artifacts to Cloud Storage, as follows:
- Your saved (trained) model file: torch_saved_model.pt (see Training and saving the PyTorch model).
- Your pickled preprocessing object (which contains the state needed for data transformation prior to prediction): processor_state.pkl.
As described in the previous, Keras-based post, the processor_state.pkl object includes the tokenizer generated from the training data.
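For example, you could copy both files with gsutil; the bucket name and destination path below are placeholders.

```bash
BUCKET_NAME=your-bucket-name
GCS_MODEL_DIR=gs://${BUCKET_NAME}/models/torch_text_classifier/

gsutil cp torch_saved_model.pt ${GCS_MODEL_DIR}
gsutil cp processor_state.pkl ${GCS_MODEL_DIR}
```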
Second, you need to upload a Python package that includes all the classes you’ll need for prediction (preprocessing, model classes, and post-processing, if any). In this example, you need to create a `pip`-installable tar file that includes torch_model.py, model_prediction.py, and preprocess.py. To begin, create the following setup.py file:
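A minimal sketch of such a setup.py is shown below; the package name and version are placeholders.

```python
from setuptools import setup

REQUIRED_PACKAGES = ['torch']  # PyPI dependencies installed on the serving nodes

setup(
    name='pytorch_model',
    version='0.1',
    scripts=['torch_model.py', 'model_prediction.py', 'preprocess.py'],
    install_requires=REQUIRED_PACKAGES
)
```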
The setup.py file includes, in the REQUIRED_PACKAGES variable, a list of the PyPI packages you need to `pip install` and use for prediction. Because we are deploying a model implemented in PyTorch, we need to include ‘torch’ in REQUIRED_PACKAGES. Now, you can create the package by running the following command:
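For example, using the standard setuptools sdist command:

```bash
python setup.py sdist
```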
This will create a `.tar.gz` package under the /dist directory. The name of the package will be `$name-$version.tar.gz`, where `$name` and `$version` are the ones specified in setup.py.
Once you have successfully created the package, you can upload it to Cloud Storage:
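For example, using the placeholder package name and bucket from the earlier sketches:

```bash
gsutil cp ./dist/pytorch_model-0.1.tar.gz gs://${BUCKET_NAME}/packages/
```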
Deploying the model to AI Platform Serving
Let’s define the model name, the model version, and the AI Platform Serving runtime (which corresponds to a TensorFlow version) required for deploying the model.
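For example, as shell variables; the values below are placeholders, and the runtime version should be one that supports custom prediction routines.

```bash
MODEL_NAME='torch_text_classifier'
MODEL_VERSION='v1'
RUNTIME_VERSION='1.13'   # AI Platform Serving runtime (corresponds to a TensorFlow version)
REGION='us-central1'
```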
First, you create a model in AI Platform Serving by running the following gcloud command:
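A sketch of this command, using the variables defined above:

```bash
gcloud ai-platform models create ${MODEL_NAME} --regions ${REGION}
```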
Second, you create a model version using the following gcloud command, in which you specify the location of the model and preprocessing object (--origin), the location of the package(s) containing the scripts needed for your prediction (--package-uris), and a pointer to your Custom Model Prediction class (--prediction-class). This should take 1-2 minutes.
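A sketch of this command follows; the Cloud Storage paths, package file name, and Python version are assumptions you should adjust to your own setup.

```bash
gcloud beta ai-platform versions create ${MODEL_VERSION} \
  --model ${MODEL_NAME} \
  --origin gs://${BUCKET_NAME}/models/torch_text_classifier/ \
  --package-uris gs://${BUCKET_NAME}/packages/pytorch_model-0.1.tar.gz \
  --prediction-class model_prediction.CustomModelPrediction \
  --runtime-version ${RUNTIME_VERSION} \
  --python-version 3.5
```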
After deploying the model to AI Platform Serving, you can invoke the model for prediction using the code described in the previous Keras-based blog post.
Note that the client of our REST API does not need to know whether the service was implemented in TensorFlow or in PyTorch. In either case, the client should send the same request, and receive a response of the same form.
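For example, a client might call the endpoint as follows; the project ID and instance texts are placeholders, and the call follows the same pattern shown in the previous post.

```python
import googleapiclient.discovery

PROJECT = 'your-gcp-project-id'
MODEL_NAME = 'torch_text_classifier'
MODEL_VERSION = 'v1'

instances = [
    'Tech giant releases new machine learning framework',
    'Museum exhibition explores the history of modern art'
]

service = googleapiclient.discovery.build('ml', 'v1')
name = 'projects/{}/models/{}/versions/{}'.format(PROJECT, MODEL_NAME, MODEL_VERSION)

# Send the raw text; preprocessing happens server-side in the custom prediction class.
response = service.projects().predict(name=name, body={'instances': instances}).execute()
print(response['predictions'])
```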
Conclusion
Although AI Platform initially provided support only for TensorFlow, it is now evolving into a platform that supports multiple frameworks. You can now deploy models built with TensorFlow, PyTorch, or any Python-based ML framework, since AI Platform Serving supports custom prediction Python code, available in beta. This post demonstrates that you can flexibly deploy a PyTorch text classifier, which utilizes text preprocessing logic implemented using Keras.
Feel free to reach out to @GCPcloud if there are features or other frameworks you’d like to train or deploy on AI Platform Serving.
Next steps
To learn more about AI Platform Serving custom online prediction, read this blog post.
To learn more about machine learning on GCP, take this course.
To try out the code, run this Notebook.