Vertex AI Model Monitoring for batch inferences

This page describes how to configure batch inference job requests to include one-time Model Monitoring analysis. For batch inferences, Model Monitoring supports feature skew detection for categorical and numerical input features.

To create a batch inference job with Model Monitoring skew analysis, you must include both your batch inference input data and original training data for your model in the request. You can only add Model Monitoring analysis when creating new batch inference jobs.

For more information about skew, see Introduction to Model Monitoring.

For instructions on how to set up Model Monitoring for online (real-time) inferences, see Using Model Monitoring.

Prerequisites

To use Model Monitoring with batch inferences, complete the following:

Have an available model in Vertex AI Model Registry that is either a tabular AutoML or tabular custom training type.
Upload your training data to Cloud Storage or BigQuery and obtain the URI link to the data.
- For models trained with AutoML, you can use the dataset ID for your training dataset instead.

Model Monitoring compares the training data to the batch inference output. Make sure you use supported file formats for the training data and batch inference output:

Model type	Training data	Batch inference output
Custom-trained	CSV, JSONL, BigQuery, TfRecord(tf.train.Example)	JSONL
AutoML tabular	CSV, JSONL, BigQuery, TfRecord(tf.train.Example)	CSV, JSONL, BigQuery, TfRecord(Protobuf.Value)

Optional: For custom-trained models, upload the schema for your model to Cloud Storage. Model Monitoring requires the schema to calculate the baseline distribution for skew detection.
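
The format table above can be encoded as a small lookup to check a job before you submit it. The following is a minimal sketch, not part of any Vertex AI library: the dictionary keys, format labels, and the check_formats helper are hypothetical names chosen for illustration.

```python
# Supported formats, transcribed from the table above.
# The keys and format labels are hypothetical names used only in this sketch.
SUPPORTED_FORMATS = {
    "custom-trained": {
        "training": {"csv", "jsonl", "bigquery", "tfrecord"},
        "output": {"jsonl"},
    },
    "automl-tabular": {
        "training": {"csv", "jsonl", "bigquery", "tfrecord"},
        "output": {"csv", "jsonl", "bigquery", "tfrecord"},
    },
}


def check_formats(model_type, training_format, output_format):
    """Return a list of problems; an empty list means the combination is supported."""
    if model_type not in SUPPORTED_FORMATS:
        return [f"unknown model type: {model_type}"]
    supported = SUPPORTED_FORMATS[model_type]
    problems = []
    if training_format.lower() not in supported["training"]:
        problems.append(f"unsupported training data format: {training_format}")
    if output_format.lower() not in supported["output"]:
        problems.append(f"unsupported batch inference output format: {output_format}")
    return problems


print(check_formats("custom-trained", "CSV", "JSONL"))  # supported combination
print(check_formats("custom-trained", "CSV", "CSV"))    # output format is flagged
```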

Request a batch inference

You can use the following methods to add Model Monitoring configurations to batch inference jobs:

Console

Follow the instructions to make a batch inference request with Model Monitoring enabled:

REST API

Follow the instructions to make a batch inference request using the REST API:

When you create the batch inference request, add the following Model Monitoring configuration to the request JSON body:

"modelMonitoringConfig": {
 "alertConfig": {
   "emailAlertConfig": {
     "userEmails": ["EMAIL_ADDRESS"]
   },
   "notificationChannels": [NOTIFICATION_CHANNELS]
 },
 "objectiveConfigs": [
   {
     "trainingDataset": {
       "dataFormat": "csv",
       "gcsSource": {
         "uris": [
           "TRAINING_DATASET"
         ]
       }
     },
     "trainingPredictionSkewDetectionConfig": {
       "skewThresholds": {
         "FEATURE_1": {
           "value": VALUE_1
         },
         "FEATURE_2": {
           "value": VALUE_2
         }
       }
     }
   }
 ]
}

where:

EMAIL_ADDRESS is the email address where you want to receive alerts from Model Monitoring. For example, example@example.com.
NOTIFICATION_CHANNELS is a list of Cloud Monitoring notification channels where you want to receive alerts from Model Monitoring. Use the resource names for the notification channels, which you can retrieve by listing the notification channels in your project. For example, "projects/my-project/notificationChannels/1355376463305411567", "projects/my-project/notificationChannels/1355376463305411568".
TRAINING_DATASET is the link to the training dataset stored in Cloud Storage.
- To use a link to a BigQuery training dataset, replace the gcsSource field with the following:
```
"bigquerySource": {
  "inputUri": "TRAINING_DATASET"
}
```
- To use the dataset ID of an AutoML training dataset, replace the gcsSource field with the following:
```
"dataset": "TRAINING_DATASET"
```
FEATURE_1:VALUE_1 and FEATURE_2:VALUE_2 are the alerting thresholds for the features you want to monitor. For example, if you specify Age=0.4, Model Monitoring logs an alert when the statistical distance between the input and baseline distributions for the Age feature exceeds 0.4. By default, every categorical and numerical feature is monitored with a threshold value of 0.3.

For more information about Model Monitoring configurations, see the Monitoring job reference.
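
If you generate request bodies programmatically, the configuration above can be assembled as a plain dictionary and serialized to JSON. The following is a minimal sketch for a CSV training dataset in Cloud Storage. The build_monitoring_config helper and the gs://my-bucket/training.csv URI are hypothetical, and the sketch writes userEmails as a list of strings on the assumption that the field accepts multiple addresses.

```python
import json


def build_monitoring_config(training_uri, emails, thresholds, notification_channels=()):
    """Assemble the modelMonitoringConfig fragment for a CSV training set in Cloud Storage.

    thresholds maps a feature name to its alerting threshold, for example {"Age": 0.4}.
    """
    return {
        "alertConfig": {
            # Assumption: userEmails takes a list of addresses.
            "emailAlertConfig": {"userEmails": list(emails)},
            "notificationChannels": list(notification_channels),
        },
        "objectiveConfigs": [
            {
                "trainingDataset": {
                    "dataFormat": "csv",
                    "gcsSource": {"uris": [training_uri]},
                },
                "trainingPredictionSkewDetectionConfig": {
                    "skewThresholds": {
                        name: {"value": value} for name, value in thresholds.items()
                    }
                },
            }
        ],
    }


config = build_monitoring_config(
    training_uri="gs://my-bucket/training.csv",  # hypothetical URI
    emails=["example@example.com"],
    thresholds={"Age": 0.4},
)
print(json.dumps({"modelMonitoringConfig": config}, indent=2))
```

The printed fragment is meant to be merged into the rest of the batch inference request body, which this sketch does not build.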

Python

See the example notebook to run a batch inference job with Model Monitoring for a custom tabular model.

Model Monitoring automatically notifies you of job updates and alerts through email.

Access skew metrics

You can use the following methods to access skew metrics for batch inference jobs:

Console (Histogram)

Use the Google Cloud console to view the feature distribution histograms for each monitored feature and learn which changes led to skew over time:

Go to the Batch predictions page:

On the Batch predictions page, click the batch inference job you want to analyze.
Click the Model Monitoring Alerts tab to view a list of the model's input features, along with pertinent information, such as the alert threshold for each feature.
To analyze a feature, click the name of the feature. A page shows the feature distribution histograms for that feature.

Visualizing data distribution as histograms lets you quickly focus on the changes that occurred in the data. Afterward, you might decide to adjust your feature generation pipeline or retrain the model.
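
To build intuition for how two histograms turn into a single skew number, the following sketch compares the category frequencies of one feature in the training data and in the batch input. This page does not state which statistical distance Model Monitoring uses. The sketch assumes the L-infinity distance (the largest absolute difference between category frequencies) purely for illustration, and the function names and sample counts are hypothetical.

```python
def normalize(counts):
    """Convert raw category counts into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("histogram has no observations")
    return {category: count / total for category, count in counts.items()}


def l_infinity_distance(baseline_counts, batch_counts):
    """Largest absolute difference in frequency across all categories.

    Assumption: this metric is chosen for illustration only; the service
    may compute skew differently.
    """
    baseline = normalize(baseline_counts)
    batch = normalize(batch_counts)
    categories = set(baseline) | set(batch)
    return max(abs(baseline.get(c, 0.0) - batch.get(c, 0.0)) for c in categories)


# Hypothetical counts for an operating_system feature.
training_counts = {"android": 60, "ios": 40}
batch_counts = {"android": 30, "ios": 50, "other": 20}
print(round(l_infinity_distance(training_counts, batch_counts), 6))
```

In this example the android share falls from 60% to 30%, which is the largest change among the categories and therefore determines the distance.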

Console (JSON file)

Use the Google Cloud console to access the metrics in JSON format:

Go to the Batch predictions page:

Click the name of the batch inference monitoring job.
Click the Monitoring properties tab.
Click the Monitoring output directory link, which directs you to a Cloud Storage bucket.
Click the metrics/ folder.
Click the skew/ folder.
Click the feature_skew.json file, which directs you to the Object details page.
Open the JSON file using either option:

Click Download and open the file in your local text editor.
Use the gsutil URI path to run gcloud storage cat gsutil_URI in the Cloud Shell or your local terminal.

The feature_skew.json file includes a dictionary where the key is the feature name and the value is the feature skew. For example:

{
  "cnt_ad_reward": 0.670936,
  "cnt_challenge_a_friend": 0.737924,
  "cnt_completed_5_levels": 0.549467,
  "month": 0.293332,
  "operating_system": 0.05758,
  "user_pseudo_id": 0.1
}
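
After you download the file, you can compare each value against its alerting threshold yourself. The following is a minimal sketch that uses the example values above and the default threshold of 0.3. The features_over_threshold helper is a hypothetical name, and the sketch assumes that a feature is flagged only when its skew strictly exceeds the threshold.

```python
import json

DEFAULT_THRESHOLD = 0.3  # default threshold stated on this page


def features_over_threshold(skew_by_feature, thresholds=None, default=DEFAULT_THRESHOLD):
    """Return features whose skew exceeds their threshold, largest skew first."""
    thresholds = thresholds or {}
    flagged = {
        name: skew
        for name, skew in skew_by_feature.items()
        if skew > thresholds.get(name, default)
    }
    return dict(sorted(flagged.items(), key=lambda item: item[1], reverse=True))


# Contents of feature_skew.json, copied from the example above.
feature_skew_json = """
{
  "cnt_ad_reward": 0.670936,
  "cnt_challenge_a_friend": 0.737924,
  "cnt_completed_5_levels": 0.549467,
  "month": 0.293332,
  "operating_system": 0.05758,
  "user_pseudo_id": 0.1
}
"""

skew = json.loads(feature_skew_json)
for name, value in features_over_threshold(skew).items():
    print(f"{name}: {value}")
```

With the default threshold, the three cnt_ features are flagged, while month (0.293332) stays just under the limit. Pass a thresholds dictionary to apply per-feature values such as the ones set in skewThresholds.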

Python

See the example notebook to access skew metrics for a custom tabular model after running a batch inference job with Model Monitoring.

Debug batch inference monitoring failures

If your batch inference monitoring job fails, you can find debugging logs in the Google Cloud console:

Go to the Batch predictions page.

Click the name of the failed batch inference monitoring job.
Click the Monitoring properties tab.
Click the Monitoring output directory link, which directs you to a Cloud Storage bucket.
Click the logs/ folder.
Click either of the .INFO files, which directs you to the Object details page.
Open the logs file using either option:
- Click Download and open the file in your local text editor.
- Use the gsutil URI path to run gcloud storage cat gsutil_URI in the Cloud Shell or your local terminal.

Notebook tutorials

Learn more about how to use Vertex AI Model Monitoring to get visualizations and statistics for models with these end-to-end tutorials.

AutoML

Custom

What's next

Learn how to use Model Monitoring.
Learn how Model Monitoring calculates training-serving skew and inference drift.
