Monitor feature attribution skew and drift

Vertex AI Model Monitoring supports feature attributions (feature importance scores) based on Vertex Explainable AI, and uses them to detect skew and drift for categorical and numerical input features.

This page describes the following features of Model Monitoring:

  • Feature attribution skew or drift detection for models deployed to Vertex AI online prediction endpoints
  • Alerts for changes in feature attributions
  • Visualization and analysis of feature attribution skew or drift

Introduction to feature attribution-based monitoring

Feature attributions explain a model's prediction on a given input by attributing the prediction to the features of the individual input. The attribution scores are proportional to the contribution of the feature to a model's prediction. They are typically signed, indicating whether a feature helps push the prediction up or down. Attributions across all features must add up to the model's prediction score.

Monitoring feature attributions involves tracking the feature attributions associated with a model's predictions in production over time. The key idea is to track changes in a feature's contribution to predictions rather than the feature values themselves. A change in a key feature's attribution score is often a strong signal that the feature has changed in a way that can impact the accuracy of the model's predictions.

There are several benefits to monitoring feature attributions, including the ability to:

  • Track the most important features. A large change in attribution to a feature means that the feature's contribution to the prediction has changed. Because the prediction is equal to the sum of the feature contributions, large attribution drift (of most important features) usually indicates large drift in the model predictions.
  • Monitor all feature representations. Feature attributions are always numeric, regardless of the underlying feature type. Also, due to their additive nature, attributions to a multi-dimensional feature (for example, embeddings) can be reduced to a single numeric value by adding up the attributions across dimensions. This allows using standard univariate drift detection methods for all feature types.
  • Account for feature interactions. Attribution to a feature accounts for the feature's contribution to the prediction, both individually and through its interactions with other features. If a feature's interactions with other features change, the distribution of attributions to that feature changes, even if the marginal distribution of the feature remains the same.
  • Monitor feature groups. Because attributions are additive, we can add up attributions to related features to obtain the attribution of a feature group. For instance, in a credit lending model, we can combine the attribution to all features related to the loan type (for example, "grade", "sub_grade", "purpose") to obtain a single loan attribution. This group-level attribution can then be tracked to monitor for changes in the feature group.

Enable feature attribution skew or drift detection

To enable feature attribution skew or drift monitoring for your model, complete the following steps:

  1. To enable Explainable AI for your model, configure explanations. Specifically, the model must be deployed by using ExplanationParameters.

  2. After your model is configured for explanations, enable Attribution Score Monitoring by doing the following:

    • Set the enableFeatureAttributes field to true in ExplanationConfig.
    • Optional: Specify an explanationBaseline in ExplanationConfig.

  3. Set the type of monitoring to skew or drift.

    To enable feature attribution skew detection for a model, you need to provide either the TrainingDataset that was used to train the model or the explanationBaseline in ExplanationConfig.

    To enable drift detection, you don't need training data or an explanation baseline.

  4. Specify an email address at which to receive notifications. This parameter is required.

  5. Set the prediction request sampling rate.

    For cost efficiency, it is usually sufficient to monitor a subset of the production inputs to a model. This parameter controls the fraction of the incoming prediction requests that are logged and analyzed for monitoring purposes.

    This parameter is optional. If you don't configure it, the Model Monitoring service logs all prediction requests.

  6. Set monitoring frequency.

    Monitoring frequency determines how often a deployed model's inputs are monitored for skew or drift. At the specified frequency, a monitoring job runs and analyzes the recently logged inputs. Each monitoring job covers the inputs logged between the previous cutoff time (the current cutoff time minus the monitoring window) and the current cutoff time. In other words, the monitoring frequency determines the timespan, or monitoring window size, of logged data that is analyzed in each monitoring run. In the Google Cloud Console, you can see when each monitoring job ran, and also visualize the data analyzed in each job.

    The minimum granularity is 1 hour. If you use the Cloud SDK to set up a Model Monitoring job, the default value is 24 hours.

  7. Specify a list of features to monitor, along with their alerting thresholds.

    You can specify which input features to monitor, along with the alerting threshold for each feature. The alerting threshold determines when an alert is sent: it is compared against the statistical distance computed between the input feature's distribution and its corresponding baseline distribution. You can configure a separate threshold value for each monitored feature.

    If you don't provide this list, every categorical and numerical feature is monitored, with the following default threshold values:

    • Categorical feature: 0.3
    • Numerical feature: 0.3

Configuration parameters at the endpoint scope

An online prediction endpoint can host multiple models. When you enable skew or drift detection on an endpoint, the following configuration parameters are shared across all models hosted in that endpoint:

  • Type of detection
  • Monitoring frequency
  • Fraction of input requests monitored

For the other configuration parameters, different values can be set for each model.

You can view your configuration parameters in the Google Cloud Console.

Create a model monitoring job

To set up either skew detection or drift detection, you can create a model deployment monitoring job using the Cloud SDK.

Skew detection

If the training dataset is available, you can create a model monitoring job with skew detection for all the deployed models under the endpoint ENDPOINT_ID by running gcloud beta ai model-monitoring-jobs create:

gcloud beta ai model-monitoring-jobs create \
    --project=PROJECT_ID \
    --region=REGION \
    --display-name=MONITORING_JOB_NAME \
    --emails=EMAIL_ADDRESS_1,EMAIL_ADDRESS_2 \
    --endpoint=ENDPOINT_ID \
    --feature-thresholds=FEATURE_1=THRESHOLD_1,FEATURE_2=THRESHOLD_2 \
    --feature-attribution-thresholds=FEATURE_1=THRESHOLD_1,FEATURE_2=THRESHOLD_2 \
    --prediction-sampling-rate=SAMPLING_RATE \
    --monitoring-frequency=MONITORING_FREQUENCY \
    --target-field=TARGET_FIELD \
    --bigquery-uri=BIGQUERY_URI

The preceding command reads the training dataset from BigQuery. The BigQuery URI is in the following format:

"bq://PROJECT_ID.DATASET.TABLE"

You can also specify the training dataset from Cloud Storage in CSV or TFRecord format.

To use CSV, replace the --bigquery-uri flag with --data-format=csv --gcs-uris=gs://some_bucket/some_file.

To use TFRecord, replace the --bigquery-uri flag with --data-format=tf-record --gcs-uris=gs://some_bucket/some_file.

You can also use a tabular AutoML managed dataset by replacing the --bigquery-uri flag with --dataset=DATASET_ID.
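
As an illustration, the same hypothetical job could read its training data from a CSV file in Cloud Storage instead of BigQuery; only the data-source flags change (the bucket and file names below are placeholders):

gcloud beta ai model-monitoring-jobs create \
    --project=my-project \
    --region=us-central1 \
    --display-name=lending-model-skew-monitoring \
    --emails=ml-alerts@example.com \
    --endpoint=1234567890 \
    --feature-thresholds=grade=0.3,purpose=0.3 \
    --feature-attribution-thresholds=grade=0.2,purpose=0.2 \
    --prediction-sampling-rate=0.8 \
    --monitoring-frequency=24 \
    --target-field=loan_status \
    --data-format=csv \
    --gcs-uris=gs://my-bucket/training_data.csv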

Drift detection

If the training dataset is not available, you can create a Model Monitoring job with drift detection for all the deployed models under the endpoint ENDPOINT_ID by running gcloud beta ai model-monitoring-jobs create:

gcloud beta ai model-monitoring-jobs create \
    --project=PROJECT_ID \
    --region=REGION \
    --display-name=MONITORING_JOB_NAME \
    --emails=EMAIL_ADDRESS_1,EMAIL_ADDRESS_2 \
    --endpoint=ENDPOINT_ID \
    --feature-thresholds=FEATURE_1=THRESHOLD_1,FEATURE_2=THRESHOLD_2 \
    --feature-attribution-thresholds=FEATURE_1=THRESHOLD_1,FEATURE_2=THRESHOLD_2 \
    --prediction-sampling-rate=SAMPLING_RATE \
    --monitoring-frequency=MONITORING_FREQUENCY
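
For example, a hypothetical drift-detection job that samples half of the incoming prediction requests and runs every 12 hours might be created as follows. As with the skew example, the project, endpoint ID, email address, and feature names are placeholders:

gcloud beta ai model-monitoring-jobs create \
    --project=my-project \
    --region=us-central1 \
    --display-name=lending-model-drift-monitoring \
    --emails=ml-alerts@example.com \
    --endpoint=1234567890 \
    --feature-thresholds=grade=0.3,purpose=0.3 \
    --feature-attribution-thresholds=grade=0.2,purpose=0.2 \
    --prediction-sampling-rate=0.5 \
    --monitoring-frequency=12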

Model Monitoring SDK commands

You can update, pause, and delete a Model Monitoring job by using Cloud SDK.

For example, to update the monitoring frequency of Model Monitoring job 123 under project example in region us-central1, run:

gcloud beta ai model-monitoring-jobs update 123 \
    --monitoring-frequency=1 --project=example --region=us-central1

To pause the job, run:

gcloud beta ai model-monitoring-jobs pause 123 --project=example \
    --region=us-central1

To resume the job, run:

gcloud beta ai model-monitoring-jobs resume 123 --project=example \
    --region=us-central1

To delete the job, run:

gcloud beta ai model-monitoring-jobs delete 123 --project=example \
    --region=us-central1
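
You can also inspect existing jobs with the list and describe commands of the same command group. For example, assuming the same project and region:

gcloud beta ai model-monitoring-jobs list --project=example \
    --region=us-central1

gcloud beta ai model-monitoring-jobs describe 123 --project=example \
    --region=us-central1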

For more information, see model-monitoring-jobs in Cloud SDK.

Model Monitoring jobs API

For more information about skew or drift detection, see the Model Monitoring API docs.

Email alerts

For the following events, Model Monitoring sends an email alert to each email address specified when the Model Monitoring job was created:

  • Each time an alerting threshold is crossed
  • Each time skew or drift detection is set up
  • Each time an existing Model Monitoring job configuration is updated

The email alerts contain pertinent information, including:

  • The time at which the monitoring job ran
  • The name of the feature that has skew or drift
  • The alerting threshold as well as the attribution score distance

Analyze skew and drift data

You can use the Google Cloud Console to visualize the feature attributions of each monitored feature and learn which changes led to skew or drift. You can view the feature value distributions as a time series or histogram.

Figure: scorecard showing an example prediction data feature attribution and training data feature attribution for skew detection.

In a stable machine learning system, the relative importance of features generally remains stable over time. If an important feature drops in importance, it might signal that something about that feature has changed. Common causes of feature importance drift or skew include the following:

  • Data source changes
  • Data schema and logging changes
  • Changes in end user mix or behavior (for example, due to seasonal changes or outlier events)
  • Upstream changes in features generated by another machine learning model. Some examples are:
    • Model updates that cause an increase or decrease in coverage (overall or for an individual classification value)
    • A change in performance of the model (which changes the meaning of the feature)
    • Updates to the data pipeline, which can cause a decrease in overall coverage

What's next