A critical part of the scientific method is recording the parameters of an experiment along with your observations. In data science, rigorously tracking the parameters, artifacts, and metrics used in a machine learning (ML) experiment is just as critical. This metadata helps you to perform tasks such as the following:
- Analyzing runs of a production ML system to understand changes in the quality of predictions.
- Analyzing ML experiments to compare the effectiveness of different sets of hyperparameters.
- Tracking the lineage of ML artifacts, such as datasets and models, to understand what contributed to the creation of an artifact or how that artifact was used to create descendant artifacts.
- Rerunning an ML workflow with the same artifacts and parameters.
- Tracking the downstream usage of ML artifacts for governance.
Managing this metadata in an ad-hoc manner can be difficult and time-consuming.
Vertex ML Metadata lets you record the metadata and artifacts produced by your ML system and query that metadata to help analyze, debug, and audit the performance of your ML system or the artifacts that it produces.
Vertex ML Metadata builds on the concepts used in the open source ML Metadata (MLMD) library that was developed by Google's TensorFlow Extended team.
Overview of Vertex ML Metadata
Vertex ML Metadata describes your ML system's metadata as a graph. Before you use Vertex ML Metadata, it is important to understand the following concepts.
- Artifacts: Artifacts are pieces of data that ML systems consume or produce, such as datasets, models, or logs. For large artifacts like datasets or models, the artifact record includes the URI where the data is stored.
- Executions: Executions describe a single step in your ML system's workflow.
- Events: Executions can depend on artifacts as inputs or produce artifacts as outputs. Events describe the relationship between artifacts and executions to help you determine the lineage of artifacts. For example, an event is created to record that a dataset is used by an execution, and another event is created to record that this execution produced a model.
- Contexts: Contexts let you group artifacts and executions together in a single, queryable, and typed category. For example, you can use contexts to represent sets of metadata such as:
- A Vertex AI Pipelines pipeline run. In this case, the context represents one run and each execution represents a step in the ML pipeline.
- An experiment run from a Jupyter notebook. In this case, the context could represent the notebook and each execution could represent a cell in that notebook.
In the metadata graph, artifacts and executions are nodes, and events are edges that link artifacts as inputs or outputs of executions. Contexts represent subgraphs that are used to logically group sets of artifacts and executions.
You can apply key-value pair metadata to artifacts, executions, and contexts. For example, a model could have metadata that describes the framework used to train the model and performance metrics, such as the model's accuracy, precision, and recall.
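The relationships above can be sketched as a small in-memory model. The classes and values below are hypothetical stand-ins to illustrate the graph structure, not the Vertex ML Metadata API:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the metadata graph: artifacts and executions are
# nodes, input/output links play the role of events (edges), and a context
# groups executions into a logical run. All names here are made up.

@dataclass
class Artifact:
    name: str
    uri: str = ""                                  # where large data lives
    metadata: dict = field(default_factory=dict)   # key-value pairs

@dataclass
class Execution:
    name: str
    inputs: list = field(default_factory=list)     # input events
    outputs: list = field(default_factory=list)    # output events

@dataclass
class Context:
    name: str                                      # e.g. one pipeline run
    executions: list = field(default_factory=list)

# A dataset is consumed by a training step, which produces a model whose
# record carries key-value metadata such as the framework and accuracy.
dataset = Artifact("my-dataset", uri="gs://my-bucket/data.csv")
model = Artifact("my-model", uri="gs://my-bucket/model",
                 metadata={"framework": "tensorflow", "accuracy": 0.93})
train = Execution("train-step", inputs=[dataset], outputs=[model])
run = Context("pipeline-run-1", executions=[train])

print(model.metadata["accuracy"])  # → 0.93
```

Following the edges from `run` through `train` to `dataset` and `model` is exactly the kind of traversal that lineage queries perform over the real metadata graph.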
ML artifact lineage
In order to understand changes in the performance of your ML system, you must be able to analyze the metadata produced by your ML workflow and the lineage of its artifacts. An artifact's lineage includes all the factors that contributed to its creation, as well as artifacts and metadata that descend from this artifact.
For example, a model's lineage could include the following:
- The training, test, and evaluation data used to create the model.
- The hyperparameters used during model training.
- The code that was used to train the model.
- Metadata recorded from the training and evaluation process, such as the model's accuracy.
- Artifacts that descend from this model, such as the results of batch predictions.
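Conceptually, lineage is a graph walk: upstream over input edges to find what contributed to an artifact, and downstream over output edges to find its descendants. The following is a minimal sketch over hypothetical execution records (the names are made up for illustration):

```python
# Each execution records which artifacts it consumed and produced.
executions = [
    {"name": "train",         "inputs": ["training-data"], "outputs": ["model"]},
    {"name": "batch-predict", "inputs": ["model"],         "outputs": ["predictions"]},
]

def upstream(artifact):
    """Artifacts that (transitively) contributed to creating `artifact`."""
    found, frontier = set(), [artifact]
    while frontier:
        a = frontier.pop()
        for ex in executions:
            if a in ex["outputs"]:
                for parent in ex["inputs"]:
                    if parent not in found:
                        found.add(parent)
                        frontier.append(parent)
    return found

def downstream(artifact):
    """Artifacts (transitively) derived from `artifact`."""
    found, frontier = set(), [artifact]
    while frontier:
        a = frontier.pop()
        for ex in executions:
            if a in ex["inputs"]:
                for child in ex["outputs"]:
                    if child not in found:
                        found.add(child)
                        frontier.append(child)
    return found

print(upstream("predictions"))     # the model and its training data
print(downstream("training-data")) # the model and its batch predictions
```

The same two directions of traversal answer "what created this model?" and "where was this dataset used?" over a real metadata store.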
By tracking your ML system's metadata using Vertex ML Metadata, you can use that metadata to help answer questions like the following:
- Which dataset was used to train a certain model?
- Which of my organization's models have been trained using a certain dataset?
- Which run produced the most accurate model, and what hyperparameters were used to train the model?
- Which deployment targets was a certain model deployed to and when was it deployed?
- Which version of your model was used to create a prediction at a given point in time?
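Once run metadata is recorded, questions like these reduce to filtering and ranking over the records. The sketch below uses hypothetical run records (all names and values invented for illustration):

```python
# Hypothetical recorded runs: each links a dataset, hyperparameters,
# a produced model, and a logged accuracy metric.
runs = [
    {"run": "run-1", "dataset": "sales-2023", "hyperparameters": {"lr": 0.01},
     "model": "model-a", "accuracy": 0.89},
    {"run": "run-2", "dataset": "sales-2023", "hyperparameters": {"lr": 0.001},
     "model": "model-b", "accuracy": 0.93},
]

# Which models have been trained using a certain dataset?
models = [r["model"] for r in runs if r["dataset"] == "sales-2023"]

# Which run produced the most accurate model, and which hyperparameters
# were used to train it?
best = max(runs, key=lambda r: r["accuracy"])

print(models)                             # ['model-a', 'model-b']
print(best["run"], best["hyperparameters"])
```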
Learn more about analyzing your ML system's metadata.