MLOps with Intelligent Products Essentials

Last reviewed 2022-06-28 UTC

This document describes a reference architecture for implementing MLOps using Intelligent Products Essentials and Vertex AI. These tools can help manufacturers to continuously improve their products by doing the following:

  • Adding intelligent capabilities to more effectively meet customer needs.
  • Monetizing new product features.

With these objectives in mind, this document is intended for data scientists, machine learning (ML) engineers, and solution architects who want to learn about an MLOps solution architecture for connected products.

MLOps

As described in Hidden technical debt in ML systems, the ML code is only a small part of mature ML systems. In addition to the ML code and high-quality data, you need a way to put your ML processes into operation.

MLOps is a practice that helps companies build, deploy, and operate ML systems in a rapid, repeatable, and reliable manner. MLOps applies DevOps principles to ML systems: it's an engineering culture and practice that unifies ML system development (Dev) and ML system operation (Ops), and it provides a set of standardized processes and technology capabilities for building, deploying, and putting ML systems into operation rapidly and reliably.

The following sections discuss how MLOps can be implemented with Intelligent Products Essentials and Vertex AI.

MLOps personas

High-level architecture of Intelligent Products Essentials and core MLOps user personas.

The preceding diagram shows the following component and core MLOps user personas:

  • Intelligent Products Essentials: stores customer data, device data, device telemetry, and ownership data across BigQuery and Cloud Storage.
  • Data scientists: responsible for analyzing data stored in Intelligent Products Essentials, feature engineering, model development, model evaluation, and building an ML pipeline.
  • ML engineers: responsible for orchestrating and hosting model deployment at scale.

The following sections describe the MLOps architecture from the perspective of data scientists and ML engineers.

Data scientists

For any ML problem, the objective of data scientists is to apply advanced analytical and ML techniques to identify patterns in data and output predictions. Because data is the foundation of ML, data scientists need easy access to datasets and a flexible development environment for data analysis.

The following diagram shows the MLOps architecture of Intelligent Products Essentials from the perspective of data scientists.

Detailed MLOps architecture of Intelligent Products Essentials from the data scientist perspective.

The preceding diagram shows the following MLOps components for data scientists:

  • Vertex AI Workbench: offers a Jupyter-based, fully managed, scalable, enterprise-ready compute infrastructure to connect to all the Google Cloud data in the organization. Data scientists can use this infrastructure as their development environment.

  • Vertex AI Feature Store: provides a centralized repository for organizing, storing, and serving ML features. Data scientists can use Vertex AI Feature Store to store and share features across their organization, as shown in the first sketch after this list.

  • Kubeflow Pipelines SDK: lets data scientists build and deploy portable, scalable ML workflows based on Docker containers. After data scientists produce an ML model, they can package their training procedure into an ML pipeline by using the Kubeflow Pipelines SDK.

  • Vertex AI Pipelines: provides an execution environment for ML pipelines built using the Kubeflow Pipelines SDK or TensorFlow Extended. For Intelligent Products Essentials, we recommend that you use the Kubeflow Pipelines SDK, which also offers prebuilt components, such as Google Cloud Pipeline Components, for simple and rapid deployment. For the full list of prebuilt components, see the Google Cloud Pipeline Components list. The second sketch after this list shows how a pipeline definition is compiled and submitted to Vertex AI Pipelines.

  • Cloud Source Repositories: are fully featured, private Git repositories hosted on Google Cloud. After data scientists define their continuous training ML pipeline, they can store the pipeline definition in a source repository, like Cloud Source Repositories. Committing the pipeline definition triggers the continuous integration and continuous deployment (CI/CD) pipeline to run.
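
The following is a minimal sketch of how features could be registered in Vertex AI Feature Store by using the Vertex AI SDK for Python (google-cloud-aiplatform). The project ID, featurestore ID, entity type, and feature names are illustrative assumptions, not part of Intelligent Products Essentials.

```python
from google.cloud import aiplatform

# Assumed project and region for illustration.
aiplatform.init(project="my-project", location="us-central1")

# Create a featurestore with a small online serving node pool.
featurestore = aiplatform.Featurestore.create(
    featurestore_id="intelligent_products",
    online_store_fixed_node_count=1,
)

# An entity type groups related features, for example one per device.
device = featurestore.create_entity_type(
    entity_type_id="device",
    description="Features derived from device telemetry",
)

# Define features that data scientists want to share across teams.
device.create_feature(feature_id="avg_daily_usage_hours", value_type="DOUBLE")
device.create_feature(feature_id="error_count_7d", value_type="INT64")
```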
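
The next sketch shows, assuming the Kubeflow Pipelines (KFP) SDK v2 and the Vertex AI SDK for Python, how a toy pipeline definition could be compiled and submitted to Vertex AI Pipelines. The component logic, project, bucket, and parameter names are placeholders.

```python
from kfp import dsl, compiler
from google.cloud import aiplatform


@dsl.component(base_image="python:3.10")
def train_model(train_data_uri: str) -> str:
    # Placeholder training step; real components would load data,
    # train a model, and write artifacts to Cloud Storage.
    return f"trained on {train_data_uri}"


@dsl.pipeline(name="device-telemetry-training")
def training_pipeline(train_data_uri: str):
    train_model(train_data_uri=train_data_uri)


# Compile the pipeline into a spec that Vertex AI Pipelines can run;
# the spec file is what you would store in Cloud Source Repositories.
compiler.Compiler().compile(
    pipeline_func=training_pipeline,
    package_path="training_pipeline.json",
)

# Submit a pipeline run; the pipeline root and parameter values are
# illustrative assumptions.
aiplatform.init(project="my-project", location="us-central1")
job = aiplatform.PipelineJob(
    display_name="device-telemetry-training",
    template_path="training_pipeline.json",
    pipeline_root="gs://my-bucket/pipeline-root",
    parameter_values={"train_data_uri": "bq://my-project.telemetry.training"},
)
job.run()
```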

ML engineers

Intelligent Products Essentials helps ML engineers to automate the operation of ML models in a timely and reliable fashion. ML engineers manage the CI/CD pipelines that support the deployment of the ML pipeline, model, and in some cases, the prediction service.

The following diagram shows the MLOps architecture of Intelligent Products Essentials from the perspective of ML engineers.

Detailed MLOps architecture of Intelligent Products Essentials from the perspective of ML engineers.

The preceding diagram shows the following MLOps components for ML engineers:

  • A CI pipeline: builds, tests, and packages the components of the ML pipeline.
  • A CD pipeline: deploys the ML pipeline to appropriate environments, like staging or production environments.
  • ML pipeline: prepares training data and trains ML models. It includes the following steps:
    • Data extraction: pulls training datasets from predefined data sources.
    • Data validation: identifies anomalies in the data schema and in the distribution of data values.
    • Data preparation: involves data cleaning, data transformation, and feature engineering.
    • Model training: creates trained models by using training data and ML techniques such as hyperparameter optimization.
    • Model evaluation: assesses the performance of the trained model (from the previous model training step) on the test dataset.
    • Model validation: confirms whether the trained model meets the predictive performance benchmark for deployment.
  • ML pipeline triggers: events published to Pub/Sub that trigger the ML pipeline for continuous training, as shown in the first sketch after this list.
  • Vertex AI Model Registry: stores different versions of trained models and their associated metadata.
  • Batch prediction: generates predictions in batches for input data stored in Cloud Storage or BigQuery (available with AutoML Tables). The batch prediction operation can output the prediction results to Cloud Storage or BigQuery (available with AutoML Tables) for downstream systems to consume, as shown in the second sketch after this list.
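
As one hedged example of an ML pipeline trigger, the following sketch shows a Pub/Sub-triggered Cloud Function (1st gen event signature) that submits a pipeline run with the Vertex AI SDK for Python. The message payload fields, project, bucket, and pipeline spec path are assumptions for illustration.

```python
import base64
import json

from google.cloud import aiplatform


def trigger_training_pipeline(event, context):
    """Entry point for a Pub/Sub-triggered continuous training run.

    The project, bucket, and spec location below are assumptions; the
    Pub/Sub message payload could carry run parameters.
    """
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))

    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.PipelineJob(
        display_name="continuous-training",
        template_path="gs://my-bucket/specs/training_pipeline.json",
        pipeline_root="gs://my-bucket/pipeline-root",
        parameter_values={"train_data_uri": payload.get("train_data_uri", "")},
    )
    # submit() returns immediately instead of blocking the function.
    job.submit()
```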
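The following sketch illustrates how a model that the training pipeline registered in the Vertex AI Model Registry might be used for batch prediction with the Vertex AI SDK for Python. The model display name, machine type, and Cloud Storage paths are assumptions.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Look up a model that the training pipeline registered in the
# Vertex AI Model Registry; the display name is an assumption.
model = aiplatform.Model.list(filter='display_name="device-failure-model"')[0]

# Run batch prediction on input stored in Cloud Storage and write the
# results back to Cloud Storage for downstream systems to consume.
batch_job = model.batch_predict(
    job_display_name="device-failure-batch-scoring",
    gcs_source="gs://my-bucket/batch-inputs/devices.jsonl",
    gcs_destination_prefix="gs://my-bucket/batch-outputs/",
    machine_type="n1-standard-4",
)
batch_job.wait()
```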

What's next