AI & Machine Learning

AI in Depth: Serving a PyTorch text classifier on AI Platform Serving using custom online prediction

Earlier this week, we explained in detail how you might build and serve a text classifier in TensorFlow. In today’s post, we’ll explain how to implement the same model in PyTorch, another machine learning framework, and deploy it to AI Platform Serving for online prediction. We will reuse the preprocessing implemented in Keras in the previous blog post. The code for this example can be found in this Notebook.

AI Platform ML Engine is a serverless, NoOps product that lets you train and serve machine learning models at scale. These models can then be served as REST APIs for online prediction. AI Platform Serving scales automatically to match request throughput, and provides secure authentication to its REST endpoints.

To help maintain affinity of preprocessing between training and serving, AI Platform Serving now enables users to customize the prediction routine that gets called when sending prediction requests to their model deployed on AI Platform Serving. This feature allows you to upload a Custom Model Prediction class, along with your exported model, to apply custom logic before or after invoking the model for prediction.

In other words, we can now use AI Platform Serving to execute arbitrary Python code, breaking the earlier coupling with TensorFlow. This change enables you to pick the best framework for the job, or even combine multiple frameworks into a single application. For example, we can use Keras APIs for their easy-to-use text preprocessing methods, and combine them with PyTorch for the actual machine learning model. This combination of frameworks is precisely what we’ll discuss in this blog post.

For more details on text classification, the Hacker News dataset used in the example, and the text preprocessing logic, refer to the Serving a Text Classifier with Preprocessing using AI Platform Serving blog post.

Building a PyTorch text classification model

You can begin by implementing your TorchTextClassifier model class in the torch_model module. As shown in the following code block, we implement the same text classification model architecture described in this post, which consists of an Embedding layer and a Dropout layer, followed by two Conv1d and pooling layers, then a Dense layer with Softmax activation at the end.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TorchTextClassifier(nn.Module):
    def __init__(self, vocab_size, embedding_dim, seq_length, num_classes,
                 num_filters, kernel_size, pool_size, dropout_rate):
        super(TorchTextClassifier, self).__init__()

        self.embeddings = nn.Embedding(num_embeddings=vocab_size,
                                       embedding_dim=embedding_dim)
        # The embedding output has shape (batch, seq_length, embedding_dim),
        # so conv1 treats the sequence positions as its input channels.
        self.conv1 = nn.Conv1d(seq_length, num_filters, kernel_size)
        self.max_pool1 = nn.MaxPool1d(pool_size)
        self.conv2 = nn.Conv1d(num_filters, num_filters*2, kernel_size)
        self.dropout = nn.Dropout(dropout_rate)
        self.dense = nn.Linear(num_filters*2, num_classes)

    def forward(self, x):
        x = self.embeddings(x)
        x = self.dropout(x)
        x = self.conv1(x)
        x = F.relu(x)
        x = self.max_pool1(x)
        x = self.conv2(x)
        x = F.relu(x)
        # Global max pooling over the last dimension.
        x = F.max_pool1d(x, x.size()[2]).squeeze(2)
        x = self.dropout(x)
        x = self.dense(x)
        x = F.softmax(x, 1)
        return x
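
To see how tensor shapes move through this architecture, the sketch below traces the per-example shape after each layer. It uses the standard output-length formulas for an unpadded, undilated, stride-1 Conv1d and for a MaxPool1d whose stride equals its pool size. The helper names (`conv1d_out_len`, `max_pool1d_out_len`, `trace_shapes`) are hypothetical, and the hyperparameter values are illustrative assumptions rather than the ones used in the notebook.

```python
def conv1d_out_len(length, kernel_size):
    """Output length of an unpadded, stride-1, undilated 1-D convolution."""
    return length - kernel_size + 1


def max_pool1d_out_len(length, pool_size):
    """Output length of max pooling whose stride equals the pool size."""
    return (length - pool_size) // pool_size + 1


def trace_shapes(seq_length, embedding_dim, num_classes,
                 num_filters, kernel_size, pool_size):
    """Return (layer name, per-example shape) pairs for the forward pass."""
    shapes = [("embeddings", (seq_length, embedding_dim))]
    # conv1 uses seq_length as its input channels and slides over embedding_dim.
    length = conv1d_out_len(embedding_dim, kernel_size)
    shapes.append(("conv1", (num_filters, length)))
    length = max_pool1d_out_len(length, pool_size)
    shapes.append(("max_pool1", (num_filters, length)))
    length = conv1d_out_len(length, kernel_size)
    shapes.append(("conv2", (num_filters * 2, length)))
    # The global max pool collapses the last dimension entirely.
    shapes.append(("global_max_pool", (num_filters * 2,)))
    shapes.append(("dense", (num_classes,)))
    return shapes


# Illustrative values only (assumed for this sketch).
for name, shape in trace_shapes(seq_length=50, embedding_dim=200, num_classes=3,
                                num_filters=64, kernel_size=3, pool_size=3):
    print(name, shape)
```

Dropout and ReLU are omitted from the trace because they do not change tensor shapes.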

Loading and preprocessing data

The following code prepares both the training and evaluation data. Note that you use both fit() and transform() with the training data, but only transform() with the evaluation data, so that the evaluation data is encoded with the tokenizer generated from the training data. The resulting train_texts_vectorized and eval_texts_vectorized objects will be used to train and evaluate our text classification model, respectively.

The implementation of the TextPreprocessor class, which uses Keras APIs, is described in the Serving a Text Classifier with Preprocessing using AI Platform Serving blog post.

((train_texts, train_labels), (eval_texts, eval_labels)) = load_data('train.tsv', 'eval.tsv')

# Create vocabulary from training corpus.
processor = TextPreprocessor(VOCAB_SIZE, MAX_SEQUENCE_LENGTH)
processor.fit(train_texts)

# Preprocess the data
train_texts_vectorized = processor.transform(train_texts)
eval_texts_vectorized = processor.transform(eval_texts)
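
The fit()/transform() split matters because the vocabulary has to come from the training data only. The sketch below is a simplified, pure-Python stand-in for that contract. It is not the Keras-based TextPreprocessor from the earlier post: `SimpleTextPreprocessor`, its whitespace tokenization, and its padding and truncation choices are all assumptions made for illustration.

```python
from collections import Counter


class SimpleTextPreprocessor:
    """Illustrative stand-in for a fit/transform text preprocessor."""

    def __init__(self, vocab_size, max_sequence_length):
        self._vocab_size = vocab_size
        self._max_sequence_length = max_sequence_length
        self._word_index = {}

    def fit(self, texts):
        """Build the word-to-id mapping from the training texts only."""
        counts = Counter(word for text in texts for word in text.lower().split())
        # Id 0 is reserved for padding, so at most vocab_size - 1 words get an id.
        most_common = counts.most_common(self._vocab_size - 1)
        self._word_index = {word: i + 1 for i, (word, _) in enumerate(most_common)}

    def transform(self, texts):
        """Map texts to fixed-length id sequences; unknown words are dropped."""
        sequences = []
        for text in texts:
            ids = [self._word_index[w] for w in text.lower().split()
                   if w in self._word_index]
            ids = ids[:self._max_sequence_length]
            padding = [0] * (self._max_sequence_length - len(ids))
            sequences.append(padding + ids)
        return sequences


processor = SimpleTextPreprocessor(vocab_size=10, max_sequence_length=5)
processor.fit(["pytorch model serving", "serving a text classifier"])
# "unseen" never appeared during fit(), so it contributes no id.
print(processor.transform(["serving unseen model"]))
```

Because transform() only looks words up, calling it on evaluation or serving data can never change the vocabulary learned during fit().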

Now you need to save the processor object—which includes the tokenizer generated from the training data—to be used when serving the model for prediction. The following code dumps the object to a new processor_state.pkl file.

import pickle
with open('./processor_state.pkl', 'wb') as f:
    pickle.dump(processor, f)

Training and saving the PyTorch model

The following code snippet shows you how to train your PyTorch model. First, you create an object of the TorchTextClassifier class, according to your parameters. Second, you implement a training loop in which, at each iteration, you get predictions from your model (y_pred) for the current training batch, compute the loss using cross_entropy, and backpropagate using loss.backward() and optimizer.step(). After NUM_EPOCH epochs, the trained model is saved to a file.

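import torch
from torch.autograd import Variable
import torch.nn.functional as F

from torch_model import TorchTextClassifier

train_size = len(train_texts)
steps_per_epoch = int(len(train_labels)/BATCH_SIZE)

# NUM_CLASSES, NUM_FILTERS, KERNEL_SIZE, POOL_SIZE, and DROPOUT_RATE are
# assumed names for the remaining hyperparameters; define them for your setup.
model = TorchTextClassifier(VOCAB_SIZE, EMBEDDING_DIM, MAX_SEQUENCE_LENGTH,
                            NUM_CLASSES, NUM_FILTERS, KERNEL_SIZE,
                            POOL_SIZE, DROPOUT_RATE)

loss_metric = F.cross_entropy
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

for epoch in range(NUM_EPOCH):
    for step in range(steps_per_epoch):
        x, y = get_batch(step)
        optimizer.zero_grad()   # clear gradients left over from the previous step
        y_pred = model(x)
        loss = loss_metric(y_pred, y)
        loss.backward()         # backpropagate the loss
        optimizer.step()        # update the model weights

# MODEL_FILE is an assumed name for the path of the saved model file.
torch.save(model, MODEL_FILE)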
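
The training loop relies on a get_batch helper that is not shown in this post. A minimal sketch of the slicing such a helper would need to do, assuming the vectorized texts and the labels are index-aligned sequences, might look like the following. `batch_slice` is a hypothetical name, this version of get_batch takes the data explicitly instead of reading it from notebook globals, and the conversion of each slice to a PyTorch tensor is left out to keep the sketch dependency-free.

```python
def batch_slice(data, step, batch_size):
    """Return the items that belong to the given training step."""
    start = step * batch_size
    return data[start:start + batch_size]


def get_batch(step, inputs, labels, batch_size):
    """Return the aligned (inputs, labels) slices for one step."""
    return (batch_slice(inputs, step, batch_size),
            batch_slice(labels, step, batch_size))


# Toy data standing in for train_texts_vectorized and train_labels.
inputs = [[0, 1], [0, 2], [3, 4], [5, 6], [7, 8]]
labels = [0, 1, 2, 1, 0]
batch_size = 2

# Integer division drops the final partial batch, as in the training code.
steps_per_epoch = len(labels) // batch_size
for step in range(steps_per_epoch):
    x, y = get_batch(step, inputs, labels, batch_size)
    print(step, x, y)
```

Note that with this step count the last, incomplete batch is never visited, so a few training examples go unused in each epoch.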


Implementing the Custom Prediction class

In order to apply a custom prediction routine, which includes both preprocessing and postprocessing, you need to wrap this logic in a Custom Model Prediction class. This class, along with the trained model and its corresponding preprocessing object, will be used to deploy the AI Platform Serving microservices. The following code shows how the Custom Model Prediction class (CustomModelPrediction) for our text classification example is implemented.

import os
import pickle
import numpy as np
import torch
from torch.autograd import Variable

class CustomModelPrediction(object):
    def __init__(self, model, processor):
        self._model = model
        self._processor = processor

    def _postprocess(self, predictions):
        # Map the highest-scoring class index of each prediction to its label.
        labels = ['github', 'nytimes', 'techcrunch']
        label_indexes = [np.argmax(prediction) for prediction in predictions.detach().numpy()]
        return [labels[label_index] for label_index in label_indexes]

    def predict(self, instances, **kwargs):
        preprocessed_data = self._processor.transform(instances)
        predictions = self._model(Variable(torch.Tensor(preprocessed_data).long()))
        labels = self._postprocess(predictions)
        return labels

    @classmethod
    def from_path(cls, model_dir):
        # torch_model must be importable so that torch.load can rebuild the model class.
        import torch_model
        model = torch.load(os.path.join(model_dir,''))
        model.eval()  # switch off dropout for inference
        with open(os.path.join(model_dir, 'processor_state.pkl'), 'rb') as f:
            processor = pickle.load(f)
        return cls(model, processor)
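
To make the control flow of such a class easier to follow, the sketch below reproduces the preprocess, predict, postprocess sequence with plain-Python stand-ins, so it runs without PyTorch or a trained model. `StubProcessor`, `stub_model`, and `SketchPrediction` are hypothetical names and the scores are made up; only the label list comes from the example above. The service calls from_path on the class itself, which is why the sketch declares it as a classmethod.

```python
class StubProcessor:
    """Stand-in for the pickled preprocessor: maps each text to its length."""

    def transform(self, instances):
        return [[len(text)] for text in instances]


def stub_model(batch):
    """Stand-in for the trained model: returns one score per class."""
    scores = []
    for (length,) in batch:
        if length < 20:
            scores.append([0.7, 0.2, 0.1])
        else:
            scores.append([0.1, 0.3, 0.6])
    return scores


class SketchPrediction:
    LABELS = ['github', 'nytimes', 'techcrunch']

    def __init__(self, model, processor):
        self._model = model
        self._processor = processor

    def _postprocess(self, predictions):
        # Pick the index of the highest score in each row, then map it to a label.
        indexes = [max(range(len(row)), key=row.__getitem__) for row in predictions]
        return [self.LABELS[i] for i in indexes]

    def predict(self, instances, **kwargs):
        preprocessed = self._processor.transform(instances)
        return self._postprocess(self._model(preprocessed))

    @classmethod
    def from_path(cls, model_dir):
        # A real implementation would load the artifacts from model_dir;
        # this sketch ignores it and wires in the stubs.
        return cls(stub_model, StubProcessor())


predictor = SketchPrediction.from_path('unused-model-dir')
print(predictor.predict(['short title', 'a considerably longer headline here']))
```

Exercising the class this way, with stubs in place of the real artifacts, is a cheap check that predict() returns a plain list of JSON-serializable values before you deploy anything.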

Deploying to AI Platform Serving

Uploading the artifacts to Cloud Storage

First, upload your artifacts to Cloud Storage:

  • Your saved (trained) model file (see Training and saving the PyTorch model).

  • Your pickled preprocessing objects (which contain the state needed for data transformation prior to prediction): processor_state.pkl. As described in the previous, Keras-based post, the processor_state.pkl object includes the tokenizer generated from the training data.

  !gsutil cp gs://{BUCKET}/{MODEL_DIR}/
!gsutil cp processor_state.pkl gs://{BUCKET}/{MODEL_DIR}/

Second, you need to upload a Python package that includes all the classes you’ll need for prediction (preprocessing, model classes, and post-processing, if any). In this example, you need to create a `pip`-installable tar file that includes the modules containing the preprocessing class, the model class, and the Custom Model Prediction class. To begin, create the following setup.py file:

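from setuptools import setup

REQUIRED_PACKAGES = ['torch']

setup(name="my_package", version="0.1",
      scripts=["", "", ""],
      install_requires=REQUIRED_PACKAGES)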

The setup.py file lists the PyPI packages you need to `pip install` and use for prediction in the REQUIRED_PACKAGES variable. Because we are deploying a model implemented in PyTorch, we need to include 'torch' in REQUIRED_PACKAGES. Now, you can create the package by running the following command:

  !python setup.py sdist

This will create a `.tar.gz` package under the dist/ directory. The name of the package will be `$name-$version.tar.gz`, where `$name` and `$version` are the ones specified in setup.py.

Once you have successfully created the package, you can upload it to Cloud Storage:

  !gsutil cp ./dist/my_package-0.1.tar.gz gs://{BUCKET}/{PACKAGES_DIR}/my_package-0.1.tar.gz

Deploying the model to AI Platform Serving

Let’s define the model name, the model version, and the AI Platform Serving runtime (which corresponds to a TensorFlow version) required for deploying the model.
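
A sketch of such definitions follows. Every value is a placeholder to replace with your own, and the runtime version in particular should be one that supports custom prediction. The region variable is included because the commands below reference {REGION}. IPython substitutes {NAME} expressions in lines that start with !, which the str.format call at the end imitates.

```python
# Placeholder values; replace each with your own.
MODEL_NAME = 'my_text_classifier'
VERSION_NAME = 'v1'
RUNTIME_VERSION = '1.13'
REGION = 'us-central1'

# Imitate how the notebook fills in a templated shell command.
command = 'gcloud ml-engine models create {MODEL_NAME} --regions {REGION}'.format(
    MODEL_NAME=MODEL_NAME, REGION=REGION)
print(command)
```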


First, you create a model in AI Platform Serving by running the following gcloud command:

  !gcloud ml-engine models create {MODEL_NAME} --regions {REGION}

Second, you create a model version using the following gcloud command, in which you specify the location of the model and preprocessing object (--origin), the location of the package(s) including the scripts needed for your prediction (--package-uris), and a pointer to your Custom Model Prediction class (--prediction-class), written as the module name followed by the class name. In the command below, {PREDICTION_MODULE} is a placeholder for the name of the module that defines CustomModelPrediction. This should take 1-2 minutes.

  !gcloud alpha ml-engine versions create {VERSION_NAME} --model {MODEL_NAME} \
    --origin=gs://{BUCKET}/{MODEL_DIR}/ \
    --python-version=3.5 \
    --runtime-version={RUNTIME_VERSION} \
    --framework='SCIKIT_LEARN' \
    --package-uris=gs://{BUCKET}/{PACKAGES_DIR}/my_package-0.1.tar.gz \
    --machine-type=mls1-c4-m4 \
    --prediction-class={PREDICTION_MODULE}.CustomModelPrediction

After deploying the model to AI Platform Serving, you can invoke the model for prediction using the code described in the previous Keras-based blog post.

Note that the client of our REST API does not need to know whether the service was implemented in TensorFlow or in PyTorch. In either case, the client should send the same request, and receive a response of the same form.
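
As an illustration of that framework-agnostic contract, the sketch below builds the resource name and JSON body for an online prediction request. The project, model, and version values are placeholders, the headline is invented, and `build_predict_request` is a hypothetical helper. Sending the request (for example with the Google API client library) is out of scope here.

```python
import json


def build_predict_request(project, model, version, instances):
    """Return the resource name and JSON body for an online prediction call."""
    name = 'projects/{}/models/{}/versions/{}'.format(project, model, version)
    body = json.dumps({'instances': instances})
    return name, body


name, body = build_predict_request(
    project='my-project',          # placeholder project id
    model='my_model',              # placeholder model name
    version='v1',                  # placeholder version name
    instances=['An invented headline about open source tooling'])
print(name)
print(body)

# A successful response is a JSON object with a "predictions" list,
# one entry per instance, holding whatever predict() returned.
```

Nothing in the request mentions PyTorch, Keras, or TensorFlow: the client only sees raw text going in and labels coming back.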


Although AI Platform initially provided only support for TensorFlow, it is now evolving into a platform that supports multiple frameworks. You can now deploy models using TensorFlow, PyTorch, or any Python-based ML framework, since AI Platform Serving supports custom prediction Python code, available in beta. This post demonstrates that you can flexibly deploy a PyTorch text classifier that uses text preprocessing logic implemented using Keras.

Feel free to reach out @GCPcloud if there are still features or other frameworks you’d like to train or deploy on AI Platform Serving.

Next steps

  • To learn more about AI Platform Serving custom online prediction, read this blog post.

  • To learn more about machine learning on GCP, take this course.

  • To try out the code, run this Notebook.