Introduction to AI Platform Pipelines

Machine learning (ML) workflows include steps to prepare and analyze data, train and evaluate models, deploy trained models to production, track ML artifacts and understand their dependencies, etc. Managing these steps in an ad-hoc manner can be difficult and time-consuming.

MLOps is the practice of applying DevOps strategies to help automate, manage, and audit ML workflows. AI Platform Pipelines helps you implement MLOps by providing a platform where you can orchestrate the steps in your workflow as a pipeline. ML pipelines are portable and reproducible definitions of ML workflows.

AI Platform Pipelines makes it easier to get started with MLOps by saving you the difficulty of setting up Kubeflow Pipelines with TensorFlow Extended (TFX). Kubeflow Pipelines is an open source platform for running, monitoring, auditing, and managing ML pipelines on Kubernetes. TFX is an open source project for building ML pipelines that orchestrate end-to-end ML workflows.

About Kubeflow and the Kubeflow Pipelines platform

Kubeflow is an open source toolkit for running ML workloads on Kubernetes. Kubeflow Pipelines is a component of Kubeflow that provides a platform for building and deploying ML workflows, called pipelines.

About TensorFlow Extended

TFX is an open source project that you can use to define your TensorFlow based ML workflows as a pipeline. TFX provides components that you can reuse to ingest and transform data, train and evaluate a model, deploy a trained model for inference, etc. By reusing TFX components, you can orchestrate your ML process without the need to build custom components for each step.

About AI Platform Pipelines

AI Platform Pipelines saves you the difficulty of setting up Kubeflow Pipelines with TensorFlow Extended (TFX) yourself.

With AI Platform Pipelines, you can set up a Kubeflow Pipelines cluster in 15 minutes, so you can quickly get started with ML pipelines. AI Platform Pipelines also creates a Cloud Storage bucket, to make it easier to run pipeline tutorials and get started with TFX pipeline templates.

Understanding ML pipelines

ML pipelines are portable, scalable ML workflows, based on containers. ML pipelines are composed of a set of input parameters and a list of tasks. Each task is an instance of a pipeline component.

You can use ML pipelines to:

  • Apply MLOps strategies to automate repeatable processes.
  • Experiment by running an ML workflow with different sets of hyperparameters, number of training steps or iterations, etc.
  • Reuse a pipeline's workflow to train a new model.

You can use TensorFlow Extended pipeline templates or the Kubeflow Pipelines SDK to build pipelines.
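The structure described above (a set of pipeline input parameters plus a list of tasks, each an instance of a component) can be sketched in plain Python. This is an illustrative data model only, not the Kubeflow Pipelines SDK or TFX API; the pipeline, component, and parameter names here are hypothetical.

```python
# Illustrative sketch of a pipeline definition. A task's inputs are wired
# either to a pipeline input parameter or to the output of another task.
pipeline = {
    "name": "taxi-tip-training",  # hypothetical pipeline name
    "parameters": {"learning_rate": 0.01, "train_steps": 1000},
    "tasks": [
        {
            "name": "preprocess",
            "component": "preprocess-component",
            "inputs": {},  # no upstream dependencies
        },
        {
            "name": "train",
            "component": "train-component",
            "inputs": {
                # bound to a pipeline input parameter
                "learning_rate": {"pipeline_param": "learning_rate"},
                # bound to the output of another task
                "training_data": {"task_output": ("preprocess", "examples")},
            },
        },
    ],
}

def upstream_tasks(task):
    """Names of the tasks whose outputs this task consumes."""
    return sorted(
        binding["task_output"][0]
        for binding in task["inputs"].values()
        if "task_output" in binding
    )

print(upstream_tasks(pipeline["tasks"][1]))  # → ['preprocess']
```

In this model, the wiring of task inputs is also what encodes the pipeline's dependency structure: `train` depends on `preprocess` precisely because one of its inputs is bound to a `preprocess` output.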

Understanding pipeline components

Pipeline components are self-contained sets of code that perform one step in a pipeline's workflow, such as data preprocessing, data transformation, model training, etc.

Components are composed of a set of input parameters, a set of outputs, and the location of a container image. A component's container image is a package that includes the component's executable code and a definition of the environment that the code runs in.
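As a rough sketch of the three parts just described (inputs, outputs, and a container image), a component definition could be modeled as follows. This is not the actual Kubeflow Pipelines component specification; the names, types, and image path are all illustrative.

```python
# Illustrative component model: input parameters, outputs, and the
# container image that packages the step's executable code along with
# a definition of the environment the code runs in.
component = {
    "name": "train-component",  # hypothetical component name
    "inputs": [
        {"name": "training_data", "type": "Dataset"},
        {"name": "learning_rate", "type": "Float"},
    ],
    "outputs": [
        {"name": "model", "type": "Model"},
    ],
    "implementation": {
        "container_image": "gcr.io/example-project/train:latest",
    },
}
```

Because everything a component needs is captured in its container image, the same component definition can run on any Kubernetes cluster, which is what makes pipelines built from components portable.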

Understanding pipeline workflow

Each task in a pipeline performs a step in the pipeline's workflow. Since tasks are instances of pipeline components, tasks have input parameters, outputs, and a container image. A task's input parameters can be set from the pipeline's input parameters or from the outputs of other tasks in the pipeline. Kubeflow Pipelines uses these dependencies to define the pipeline's workflow as a directed acyclic graph.

For example, consider a pipeline with the following tasks:

  • Preprocess: This task prepares the training data.
  • Train: This task uses the preprocessed training data to train the model.
  • Predict: This task deploys the trained model as an ML service and gets predictions for the testing dataset.
  • Confusion matrix: This task uses the output of the prediction task to build a confusion matrix.
  • ROC: This task uses the output of the prediction task to perform receiver operating characteristic (ROC) curve analysis.

To create the workflow graph, the Kubeflow Pipelines SDK analyzes the task dependencies.

  • The preprocessing task does not depend on any other tasks, so it can be the first task in the workflow or it can run concurrently with other tasks.
  • The training task relies on data produced by the preprocessing task, so training must occur after preprocessing.
  • The prediction task relies on the trained model produced by the training task, so prediction must occur after training.
  • Building the confusion matrix and performing ROC analysis both rely on the output of the prediction task, so they must occur after prediction is complete. Since neither depends on the other, they can run concurrently.

Based on this analysis, the Kubeflow Pipelines system runs the preprocessing, training, and prediction tasks sequentially, and then runs the confusion matrix and ROC tasks concurrently.
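The dependency analysis described above amounts to computing a topological ordering of the task graph. The following sketch (plain Python, not Kubeflow Pipelines code) groups the example's tasks into levels, where all tasks in a level have their dependencies satisfied and can run concurrently.

```python
from collections import defaultdict, deque

# Task dependencies from the example pipeline: each task maps to the
# tasks whose outputs it consumes.
deps = {
    "preprocess": [],
    "train": ["preprocess"],
    "predict": ["train"],
    "confusion_matrix": ["predict"],
    "roc": ["predict"],
}

def execution_levels(deps):
    """Group tasks into sequential levels (Kahn's algorithm); tasks
    within the same level can run concurrently."""
    indegree = {task: len(d) for task, d in deps.items()}
    dependents = defaultdict(list)
    for task, d in deps.items():
        for dep in d:
            dependents[dep].append(task)
    ready = deque(sorted(t for t, n in indegree.items() if n == 0))
    levels = []
    while ready:
        level = sorted(ready)
        ready.clear()
        levels.append(level)
        for task in level:
            for nxt in dependents[task]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
    return levels

print(execution_levels(deps))
# → [['preprocess'], ['train'], ['predict'], ['confusion_matrix', 'roc']]
```

The result matches the analysis above: three sequential steps, followed by one level in which the confusion matrix and ROC tasks run concurrently.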

What's next