Introduction to Vertex AI Model Monitoring

This page provides an overview of Vertex AI Model Monitoring. To enable Vertex AI Model Monitoring, see Using Model Monitoring.

Overview

A model deployed in production performs best on prediction input data that is similar to the training data. When the input data deviates from the data used to train the model, the model's performance can deteriorate, even if the model itself hasn't changed.

To help you maintain a model's performance, Model Monitoring monitors the model's prediction input data for feature skew and drift:

  • Training-serving skew occurs when the feature data distribution in production deviates from the feature data distribution used to train the model. If the original training data is available, you can enable skew detection to monitor your models for training-serving skew.

  • Prediction drift occurs when the feature data distribution in production changes significantly over time. If the original training data isn't available, you can enable drift detection to monitor the input data for changes over time.

You can enable both skew and drift detection.

Model Monitoring supports feature skew and drift detection for categorical and numerical features:

  • Categorical features are data limited to a fixed number of possible values, typically grouped by qualitative properties. Examples include product type, country, and customer type.

  • Numerical features are data that can take any numeric value. Examples include weight and height.

Calculate training-serving skew and prediction drift

Model Monitoring uses the following process to monitor a feature for training-serving skew or prediction drift:

  1. Calculate the baseline statistical distribution:

    • For skew detection, the baseline is the statistical distribution of the feature's values in the training data.

    • For drift detection, the baseline is the statistical distribution of the feature's values seen in production in the recent past.

    The distributions for categorical and numerical features are calculated as follows:

    • For categorical features, the computed distribution is the number or percentage of instances of each possible value of the feature.

    • For numerical features, Model Monitoring divides the range of possible feature values into equal intervals and computes the number or percentage of feature values that fall in each interval.

  2. Calculate the statistical distribution of the latest feature values seen in production.

  3. Compare the distribution of the latest feature values in production against the baseline distribution by calculating a distance score.

  4. When the distance score between two statistical distributions exceeds the threshold you specify, Model Monitoring identifies the anomaly as skew or drift.
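The four steps above can be sketched in plain Python. This is a minimal illustration, not the service's implementation: the text above does not say which distance score is used, so the sketch assumes L-infinity distance for categorical features and Jensen-Shannon divergence for numerical features. All function names are hypothetical.

```python
import math
from collections import Counter


def categorical_distribution(values):
    """Fraction of instances for each observed category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def numerical_distribution(values, low, high, num_bins):
    """Split [low, high] into equal intervals; return the fraction per interval."""
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp so that v == high and out-of-range values land in an edge bin.
        index = min(max(int((v - low) / width), 0), num_bins - 1)
        counts[index] += 1
    return [c / len(values) for c in counts]


def l_infinity_distance(baseline, latest):
    """Assumed categorical score: largest absolute difference across categories."""
    categories = set(baseline) | set(latest)
    return max(abs(baseline.get(c, 0.0) - latest.get(c, 0.0)) for c in categories)


def jensen_shannon_divergence(p, q):
    """Assumed numerical score: Jensen-Shannon divergence (log base 2)."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def is_anomaly(score, threshold):
    """Step 4: flag skew or drift when the score exceeds the threshold."""
    return score > threshold


# Usage: compare a baseline against the latest production values.
baseline = categorical_distribution(["shoes", "shoes", "hats", "hats"])
latest = categorical_distribution(["shoes", "shoes", "shoes", "hats"])
score = l_infinity_distance(baseline, latest)
flagged = is_anomaly(score, threshold=0.2)
```

For numerical features, both distributions need the same `low`, `high`, and `num_bins` so that the intervals line up before the scores are compared.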

The following example shows skew or drift between the baseline and latest distributions of a categorical feature:

Baseline distribution (figure): an example feature distribution of the baseline dataset.

Latest distribution (figure): an example feature distribution of the latest dataset.

The following example shows skew or drift between the baseline and latest distributions of a numerical feature:

Baseline distribution (figure): an example feature distribution of the baseline dataset.

Latest distribution (figure): an example feature distribution of the latest dataset.

Considerations when using Model Monitoring

  • For cost efficiency, you can set a prediction request sampling rate to monitor a subset of the production inputs to a model.

  • You can set a frequency at which a deployed model's recently logged inputs are monitored for skew or drift. Monitoring frequency determines the timespan, or monitoring window size, of logged data that is analyzed in each monitoring run.

  • You can specify alerting thresholds for each feature you want to monitor. An alert is logged when the statistical distance between the input feature distribution and its corresponding baseline exceeds the specified threshold. By default, every categorical and numerical feature is monitored with a threshold value of 0.3.

  • An online prediction endpoint can host multiple models. When you enable skew or drift detection on an endpoint, the following configuration parameters are shared across all models hosted in that endpoint:

    • Type of detection
    • Monitoring frequency
    • Fraction of input requests monitored

    For the other configuration parameters, you can set different values for each model.
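The split between shared and per-model settings described above can be modeled as follows. This is a hypothetical data structure for illustration only, not the Vertex AI SDK: the class names, field names, and `features_to_alert` helper are assumptions. The 0.3 default threshold comes from the text above.

```python
from dataclasses import dataclass, field

DEFAULT_THRESHOLD = 0.3  # default threshold stated in the considerations above


@dataclass
class ModelSettings:
    """Per-model settings (hypothetical): thresholds keyed by feature name."""
    feature_thresholds: dict = field(default_factory=dict)

    def threshold_for(self, feature):
        return self.feature_thresholds.get(feature, DEFAULT_THRESHOLD)


@dataclass
class EndpointMonitoringSettings:
    """Settings shared by every model on one endpoint (hypothetical)."""
    detection_type: str             # type of detection, e.g. "skew" or "drift"
    monitoring_interval_hours: int  # monitoring frequency / window size
    sampling_rate: float            # fraction of input requests monitored
    models: dict = field(default_factory=dict)  # model ID -> ModelSettings

    def __post_init__(self):
        if not 0.0 < self.sampling_rate <= 1.0:
            raise ValueError("sampling_rate must be in (0, 1]")


def features_to_alert(settings, model_id, distance_scores):
    """Return the features whose distance score exceeds that model's threshold."""
    model = settings.models.get(model_id, ModelSettings())
    return sorted(
        feature
        for feature, score in distance_scores.items()
        if score > model.threshold_for(feature)
    )


# Usage: two models share endpoint settings but use different thresholds.
settings = EndpointMonitoringSettings(
    detection_type="drift",
    monitoring_interval_hours=24,
    sampling_rate=0.5,
    models={"model_a": ModelSettings({"age": 0.1})},
)
alerts_a = features_to_alert(settings, "model_a", {"age": 0.2, "BMI": 0.2})
alerts_b = features_to_alert(settings, "model_b", {"age": 0.2, "BMI": 0.2})
```

With the same scores, `model_a` flags `age` because its custom threshold is lower, while `model_b` falls back to the default for every feature.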

Parsing data using input schemas

To get feature values, Model Monitoring parses the payload of the online prediction requests made to the model. Input schemas tell Model Monitoring how to correctly parse the input payload.

Input schemas are automatically parsed for AutoML models, but you may need to provide an input schema for custom-trained models that don't use a key-value input format.

Automatic schema parsing

After skew or drift detection is enabled, Model Monitoring can usually automatically parse the input schema. For automatic schema parsing, Model Monitoring analyzes the first 1,000 input requests to determine the schema.

Automatic schema parsing works best when the input requests are formatted as key-value pairs, where "key" is the name of the feature and "value" is the value of the feature. For example:

"key":"value"
{"TenYearCHD":"0", "glucose":"5.4", "heartRate":"1", "age":"30",
"prevalentStroke":"0", "gender":"f", "ethnicity":"latin american"}

If the inputs are not in "key":"value" format, Model Monitoring tries to identify the data type of each feature, and automatically assigns a default feature name for each input.
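As a rough illustration of the idea, the following sketch infers feature names and types from a sample of requests. It is not how Model Monitoring is implemented. The `feature_N` default names and the type labels are assumptions; the text above does not state the actual naming scheme.

```python
def infer_type(value):
    """Map a Python value to a coarse type label (labels are assumptions)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, list):
        return "array"
    return "string"


def infer_schema(instances, max_instances=1000):
    """Infer {feature name: type} from the first max_instances requests."""
    schema = {}
    for instance in instances[:max_instances]:
        if isinstance(instance, dict):
            # Key-value input: the key is the feature name.
            items = instance.items()
        else:
            # Array-like input: assign a default name by position (hypothetical).
            items = ((f"feature_{i}", v) for i, v in enumerate(instance))
        for name, value in items:
            # Keep the first type seen for each feature.
            schema.setdefault(name, infer_type(value))
    return schema


# Usage with a key-value request like the example above.
schema = infer_schema([{"glucose": "5.4", "age": "30", "gender": "f"}])
```

Because every value in the example request is quoted, each feature is inferred as a string here, which is one reason an explicit schema can be preferable.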

Custom instance schemas for parsing input

You can provide your own input schema when you create a Model Monitoring job to guarantee that Model Monitoring correctly parses your model's inputs.

This schema is called the analysis instance schema. The schema file specifies the format of the input payload, the name of each feature, and the type of each feature.

The schema must be written as a YAML file in the OpenAPI schema format. For example:

type: object
properties:
  BMI:
    type: number
  BPMeds:
    type: string
  TenYearCHD:
    type: string
  age:
    type: string
  cigsPerDay:
    type: array
    items:
      type: string
required:
- age
- BMI
- TenYearCHD
- cigsPerDay
- BPMeds

  • type indicates whether your prediction request is in one of the following formats:

    • object: key-value pairs
    • array: array-like
    • string: csv-string

  • properties indicates the type of each individual feature.

  • If the request is in array or csv-string format, specify the order in which the features are listed in each request under the required field.

If your prediction request is in array or csv-string format, represent any missing features as null values. For example, consider a prediction request with five features:

[feature_a, feature_b, feature_c, feature_d, feature_e]

If feature_c allows missing values, a sample request missing feature_c would be: {[1, 2, , 4, 6]}. The list length is still 5, with one null value in the middle.
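The positional mapping described above can be sketched as follows. This is a hypothetical helper, not part of Model Monitoring: `parse_instance` and its error handling are assumptions, and `feature_order` stands in for the list under the schema's required field.

```python
def parse_instance(instance, feature_order):
    """Map an array or csv-string request to {feature name: value}.

    Empty csv cells become None so the values stay aligned with feature_order.
    """
    if isinstance(instance, str):
        values = [cell.strip() or None for cell in instance.split(",")]
    else:
        values = list(instance)
    if len(values) != len(feature_order):
        raise ValueError(
            f"expected {len(feature_order)} values, got {len(values)}"
        )
    return dict(zip(feature_order, values))


feature_order = ["feature_a", "feature_b", "feature_c", "feature_d", "feature_e"]

# csv-string request with feature_c missing: the empty cell is kept as None.
parsed_csv = parse_instance("1,2,,4,6", feature_order)

# Array request with feature_c missing, represented as an explicit null.
parsed_array = parse_instance([1, 2, None, 4, 6], feature_order)
```

Dropping the missing value instead of sending a null would shorten the list to four entries, and every feature after `feature_b` would be assigned to the wrong name. The length check in the sketch rejects that case.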

What's next