Architecture of a Serverless Machine Learning Model

This series of articles explores the architecture of a serverless machine learning (ML) model to enrich support tickets with metadata before they reach a support agent.

When support agents make business decisions, they need access to relevant data. Logs are a good source of basic insight, but enriched data changes the game. Implementing such a system can be difficult; this series offers a possible solution.

Requirements and architecture

Managing incoming support tickets can be challenging. Before an agent can start work on a problem, they need to do the following:

  • Understand the context of the support ticket.
  • Determine how serious the problem is for the customer.
  • Decide how many resources to use to resolve the problem.

A support agent typically receives minimal information from the customer who opened the support ticket. Often, a few back-and-forth exchanges with the customer garner additional details. If you add automated intelligence that is based on ticket data, you can help agents make strategic decisions when they handle support requests.

Usually, a user logs a ticket after filling out a form containing several fields. For this use case, assume that none of the support tickets have been enriched by machine learning. Also assume that the current support system has been processing tickets for a few months.

To start enriching support tickets, you must train an ML model on pre-existing labeled data. In this case, the training dataset consists of historical data from closed support tickets. The data you need resides in two types of fields:

  • Input fields, which contain form data that the user fills in.
  • Target fields, which are filled in when the ticket is processed.

When combined, the data in these fields forms examples that serve to train a model capable of making accurate predictions. Predictions in this use case include how long the ticket is likely to remain open and what priority to assign to the ticket.
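As a minimal sketch, a closed ticket can be split into one training example. Every field name below is a hypothetical placeholder for your own helpdesk schema:

```python
# Split a closed ticket into a training example.
# All field names are hypothetical placeholders for your schema.

INPUT_FIELDS = ["category", "product", "description", "severity"]
TARGET_FIELDS = ["resolution_time_hours", "priority"]

def to_training_example(closed_ticket):
    """Pair the user-supplied input fields with the target fields
    that were filled in while the ticket was processed."""
    inputs = {f: closed_ticket.get(f) for f in INPUT_FIELDS}
    targets = {f: closed_ticket.get(f) for f in TARGET_FIELDS}
    return inputs, targets

ticket = {
    "category": "billing",
    "product": "mobile-app",
    "description": "Charged twice for one order",
    "severity": "high",
    "resolution_time_hours": 6.5,
    "priority": "P2",
}
inputs, targets = to_training_example(ticket)
```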

This series explores four ML enrichments to accomplish these goals:

  • Analyzing sentiment based on the ticket description.
  • Autotagging based on the ticket description.
  • Predicting how long the ticket remains open.
  • Predicting the priority to assign to the ticket.

During the ML enrichment workflow:

  1. A user creates a ticket.
  2. Ticket creation triggers a function that calls machine learning models to make predictions.
  3. The ticket data is enriched with the prediction returned by the ML models.
  4. The support agent uses the enriched support ticket to make efficient decisions.

The following diagram illustrates this workflow.

ML enrichment workflow

Ticketing system

Whether you build your system from scratch, use open source code, or purchase a commercial solution, this article assumes the following:

  • A branded, customer-facing UI generates support tickets. Not all helpdesk tools offer such an option, so you create one using a simple form page.
  • The third-party helpdesk tool is accessible through a RESTful API that can create tickets. Your system uses this API to update the ticket backend.
  • When events occur, your system updates your custom-made customer UI in real time.

Firebase is an excellent choice for this type of implementation:

  • Firebase is a real-time database that a client can update, and it displays real-time updates to other subscribed clients.
  • Firebase can use Cloud Functions to call an external API, such as one that your helpdesk platform makes available.
  • Firebase works on desktop and mobile platforms and supports development in various languages. When a client's internet connection is unreliable, Firebase can cache data locally.
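As a sketch of the first point, a backend client could push a new ticket with the Firebase Admin SDK. The "tickets" path and the ticket fields are assumptions; subscribed clients would then receive the write in real time:

```python
def new_ticket(description, severity="medium"):
    """Build the minimal record the customer-facing form would submit.
    The field names are assumptions about the ticket schema."""
    return {"description": description, "severity": severity, "status": "open"}

def submit_ticket(description, severity="medium"):
    # Lazy import so the sketch runs without the Firebase Admin SDK installed.
    # A real deployment first calls firebase_admin.initialize_app() with a
    # databaseURL pointing at your project.
    from firebase_admin import db
    return db.reference("tickets").push(new_ticket(description, severity))
```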

Serverless technology and event-based triggering

"Serverless technology" can be defined in various ways, but most descriptions include the following assumptions:

  • Servers should be a distant concept and invisible to customers.
  • Actions are usually performed by functions triggered by events.
  • Functions run tasks that are usually short lived (lasting a few seconds or minutes).
  • Most of the time, functions have a single purpose.

Combined, Firebase and Cloud Functions streamline DevOps by minimizing infrastructure management. The operational flow works as follows:

  1. You configure a Cloud Function to trigger on Firebase database updates.
  2. A client writes a ticket to the Firebase database.
  3. The triggered Cloud Function performs a few main tasks:

    • Runs predictions using deployed machine learning algorithms.
    • Updates the Firebase real-time database with enriched data.
    • Creates a ticket in your helpdesk system with the consolidated data.
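The steps above can be sketched as a Python background function. The event payload key, the helper names, and the placeholder predictions are all assumptions, shown only to illustrate the flow:

```python
def merge_predictions(ticket, predictions):
    """Attach each model's prediction to the ticket under an 'ml_' prefix
    so enriched fields never collide with the user-supplied ones."""
    enriched = dict(ticket)
    enriched.update({"ml_" + name: value for name, value in predictions.items()})
    return enriched

def on_ticket_created(event, context):
    """Background Cloud Function fired when a client writes a ticket."""
    ticket = event["data"]  # the ticket as written to Firebase (assumed key)
    predictions = {
        "priority": "P2",          # placeholder: would come from AI Platform
        "resolution_hours": 8.0,   # placeholder: would come from AI Platform
        "sentiment": -0.4,         # placeholder: Natural Language API score
    }
    enriched = merge_predictions(ticket, predictions)
    # 1. Write `enriched` back to the Firebase real-time database.
    # 2. Create a ticket in the helpdesk system through its RESTful API.
    return enriched
```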

Enriching support tickets

You can group autotagging, sentiment analysis, priority prediction, and resolution-time prediction into two categories. These categories are based on the way the machine learning tasks are performed:

  • Sentiment analysis and autotagging use machine learning APIs already trained and built by Google. Pretrained models might offer less customization than building your own, but they are ready to use.
  • Predicting ticket resolution time and priority requires that you build a model, or use a canned one, and train it with custom data, such as the input and target fields.

Sentiment analysis and autotagging

When logging a support ticket, agents might like to know how the customer feels. Running a sentiment analysis on the ticket description helps supply this information.

It's also important to get a general idea of what's mentioned in the ticket. When creating a support ticket, the customer typically supplies some parameters from a drop-down list, but often adds more information when describing the problem. By using a tool that identifies the most important words in the description, the agent can narrow down the subject matter. This process is known as autotagging.

Both enrichments are easy to describe but challenging to build from scratch. Using a powerful, pretrained text-analysis model at scale is a clear advantage: it reduces development time and simplifies infrastructure management.

A good solution for both of these enrichments is the Cloud Natural Language API, which is easily accessible from Cloud Functions as a RESTful API. The Natural Language API is a pretrained model, built on Google's extensive datasets, that can perform several operations:

  • Sentiment analysis
  • Entity analysis with salience calculation
  • Syntax analysis

This article leverages both sentiment and entity analysis. You handle autotagging by retaining words with a salience above a custom-defined threshold.
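A sketch of that autotagging rule follows, with the Natural Language API call shown for shape only (it needs the `google-cloud-language` library and application-default credentials to run). The 0.2 threshold is an arbitrary assumption:

```python
def autotag(entities, threshold=0.2):
    """Keep entity names whose salience clears the custom threshold.
    `entities` is a list of (name, salience) pairs; 0.2 is arbitrary."""
    return [name for name, salience in entities if salience >= threshold]

def analyze_description(text):
    # Lazy import: requires google-cloud-language and credentials.
    from google.cloud import language_v1
    client = language_v1.LanguageServiceClient()
    doc = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    sentiment = client.analyze_sentiment(request={"document": doc}).document_sentiment
    entities = client.analyze_entities(request={"document": doc}).entities
    return sentiment.score, autotag([(e.name, e.salience) for e in entities])
```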

If you want a model that automatically returns specific, predefined tags, you need to create and train a custom natural language processing (NLP) model. The salience-based approach, by contrast, is open to any tag, because the goal is to quickly analyze the description, not to fully categorize the ticket.

Predicting ticket resolution time and priority

The resolution time of a ticket and its priority status depend on inputs (ticket fields) that are specific to each helpdesk system. Consequently, you can't use a pretrained model as you did for tagging and sentiment analysis of English-language text; you must train your own machine learning models.

While the workflow for predicting resolution time and priority is similar, the two actions represent two different types of values:

  • Resolution time is a continuous value.
  • Priority has several predefined options based on the helpdesk system.

The machine learning section of "Smartening Up Support Tickets with a Serverless Machine Learning Model" explains how you can solve both problems through regression and classification.

Choose an architecture that enables you to do the following:

  • Train models with custom data.
  • Deploy models and make them available as a RESTful API for your Cloud Function.
  • Scale models as needed.

Cloud Datalab is a Google-managed tool that runs Jupyter notebooks in the cloud and integrates with other Google Cloud Platform (GCP) products. Cloud Datalab can also run ML Workbench, a Python library that facilitates the use of two key technologies: TensorFlow and AI Platform.

TensorFlow features include:

  • TensorFlow-built graphs (executables) are portable and can run on various hardware.
  • The Estimator API allows you to use prebuilt or custom models. This article uses ML Workbench, which makes concepts such as mapping data to a model approachable.
  • The Experiment feature lets you take a model and train and evaluate it in a distributed environment.

AI Platform is a managed service that can execute TensorFlow graphs. The service eases machine learning tasks such as:

  • Training models in a distributed environment with minimal DevOps.
  • Integrating with other GCP products.
  • Tuning hyperparameters to improve model training.
  • Deploying models as RESTful APIs to make predictions at scale.
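The last point is the one this series relies on at prediction time. As a sketch, the request body for online prediction holds one instance per ticket, restricted to the model's input fields; the project name, model name, and field list are assumptions:

```python
def predict_request_body(tickets, input_fields):
    """Build the JSON body that AI Platform online prediction expects:
    a list of instances, one per ticket, limited to the model inputs."""
    return {"instances": [{f: t.get(f) for f in input_fields} for t in tickets]}

def predict(project, model, body):
    # Shown for shape only: needs google-api-python-client and credentials.
    from googleapiclient import discovery
    service = discovery.build("ml", "v1")
    name = "projects/{}/models/{}".format(project, model)
    return service.projects().predict(name=name, body=body).execute()
```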

ML Workbench uses the Estimator API behind the scenes but simplifies much of the boilerplate code involved in structured data prediction problems. The Estimator API adds several interesting options, such as feature crossing, discretization to improve accuracy, and the capability to create custom models. However, the current use case requires only a regressor and a classifier, with little need for feature engineering. This fits well with ML Workbench, which also supports distributed training, reading data in batches, and scaling up as needed using AI Platform.

Depending on how deep you want to get into TensorFlow and coding, you can choose between ML Workbench and the TensorFlow Estimator API. The rest of this series focuses on ML Workbench because the main goal is to learn how to call ML models in a serverless environment. The series also supplies additional information about TensorFlow and AI Platform.

Synchronization with the helpdesk platform

Synchronization between the two systems flows in both directions:

  • When a customer opens or updates a ticket, the Firebase write triggers a Cloud Function, which also updates the third-party helpdesk platform by calling its RESTful API.
  • When an agent uses the helpdesk platform to update a ticket, the platform triggers its own event and calls a Cloud Functions HTTP handler, which updates Firebase and the client UI in real time.
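The second direction can be sketched as an HTTP-triggered Python function. The webhook payload keys are guesses about what a third-party platform might send:

```python
def helpdesk_event_to_update(payload):
    """Map a helpdesk webhook payload to the fields mirrored in Firebase.
    The payload keys are assumptions about the third-party platform."""
    return {
        "status": payload.get("status"),
        "assignee": payload.get("assignee"),
        "updated_at": payload.get("updated_at"),
    }

def on_helpdesk_event(request):
    """HTTP Cloud Function the helpdesk platform calls on ticket updates."""
    payload = request.get_json()
    update = helpdesk_event_to_update(payload)
    # db.reference("tickets/" + payload["ticket_id"]).update(update)
    # Firebase then pushes the change to subscribed client UIs in real time.
    return "ok"
```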


The architecture has the following flow:

  1. A user writes a ticket to Firebase, which triggers a Cloud Function.
  2. The Cloud Function calls three different endpoints to enrich the ticket:

    • An AI Platform endpoint, where the function can predict the priority.
    • An AI Platform endpoint, where the function can predict the resolution time.
    • The Natural Language API, to run sentiment analysis and compute word salience.
  3. For each response, the Cloud Function updates the Firebase real-time database.
  4. The Cloud Function then creates a ticket in the helpdesk platform using the RESTful API.

The following diagram illustrates this architecture.

Serverless architecture

Next steps