REST Resource: projects.locations.modelDeploymentMonitoringJobs

Resource: ModelDeploymentMonitoringJob

Represents a job that runs periodically to monitor the deployed models in an endpoint. It analyzes the logged training and prediction data to detect abnormal behavior.

JSON representation
{
  "name": string,
  "displayName": string,
  "endpoint": string,
  "state": enum (JobState),
  "scheduleState": enum (MonitoringScheduleState),
  "latestMonitoringPipelineMetadata": {
    object (LatestMonitoringPipelineMetadata)
  },
  "modelDeploymentMonitoringObjectiveConfigs": [
    {
      object (ModelDeploymentMonitoringObjectiveConfig)
    }
  ],
  "modelDeploymentMonitoringScheduleConfig": {
    object (ModelDeploymentMonitoringScheduleConfig)
  },
  "loggingSamplingStrategy": {
    object (SamplingStrategy)
  },
  "modelMonitoringAlertConfig": {
    object (ModelMonitoringAlertConfig)
  },
  "predictInstanceSchemaUri": string,
  "samplePredictInstance": value,
  "analysisInstanceSchemaUri": string,
  "bigqueryTables": [
    {
      object (ModelDeploymentMonitoringBigQueryTable)
    }
  ],
  "logTtl": string,
  "labels": {
    string: string,
    ...
  },
  "createTime": string,
  "updateTime": string,
  "nextScheduleTime": string,
  "statsAnomaliesBaseDirectory": {
    object (GcsDestination)
  },
  "encryptionSpec": {
    object (EncryptionSpec)
  },
  "enableMonitoringPipelineLogs": boolean,
  "error": {
    object (Status)
  }
}
Fields
name

string

Output only. Resource name of a ModelDeploymentMonitoringJob.

displayName

string

Required. The user-defined display name of the ModelDeploymentMonitoringJob. The name can be up to 128 characters long and can consist of any UTF-8 characters.

endpoint

string

Required. Endpoint resource name. Format: projects/{project}/locations/{location}/endpoints/{endpoint}

state

enum (JobState)

Output only. The detailed state of the monitoring job. While the job is still being created, the state is 'PENDING'. Once the job is successfully created, the state is 'RUNNING'. Pausing the job sets the state to 'PAUSED'. Resuming the job returns the state to 'RUNNING'.

scheduleState

enum (MonitoringScheduleState)

Output only. Schedule state when the monitoring job is in Running state.

latestMonitoringPipelineMetadata

object (LatestMonitoringPipelineMetadata)

Output only. Latest triggered monitoring pipeline metadata.

modelDeploymentMonitoringObjectiveConfigs[]

object (ModelDeploymentMonitoringObjectiveConfig)

Required. The config for monitoring objectives. This is a per DeployedModel config. Each DeployedModel needs to be configured separately.

modelDeploymentMonitoringScheduleConfig

object (ModelDeploymentMonitoringScheduleConfig)

Required. Schedule config for running the monitoring job.

loggingSamplingStrategy

object (SamplingStrategy)

Required. Sampling strategy for logging.

modelMonitoringAlertConfig

object (ModelMonitoringAlertConfig)

Alert config for model monitoring.

predictInstanceSchemaUri

string

URI of the YAML schema file describing the format of a single instance, as given to this Endpoint's prediction (and explanation) requests. If not set, the predict schema is generated from collected predict requests.

samplePredictInstance

value (Value format)

Sample predict instance, in the same format as PredictRequest.instances. This can be set as a replacement for ModelDeploymentMonitoringJob.predict_instance_schema_uri. If not set, the predict schema is generated from collected predict requests.

analysisInstanceSchemaUri

string

URI of the YAML schema file describing the format of a single instance that you want TensorFlow Data Validation (TFDV) to analyze.

If this field is empty, all the feature data types are inferred from predictInstanceSchemaUri, meaning that TFDV will use the data in the exact format (data type) of the prediction request/response. If there are any data type differences between the predict instance and the TFDV instance, this field can be used to override the schema. For models trained with Vertex AI, this field must be set with all the fields in the predict instance formatted as string.

bigqueryTables[]

object (ModelDeploymentMonitoringBigQueryTable)

Output only. The BigQuery tables created for the job under the customer project. Customers can run their own queries and analysis on them. There can be at most 4 log tables:

1. Training data logging predict request/response
2. Serving data logging predict request/response

logTtl

string (Duration format)

The TTL of the BigQuery tables in user projects that store logs. A day is the basic unit of the TTL, and the ceiling of TTL/86400 (one day) is taken. For example, { second: 3600 } indicates a TTL of 1 day.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
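
The rounding rule for logTtl can be sketched as follows. This is a minimal illustration rather than the service's implementation: the helper name ttl_days is hypothetical, and it assumes the TTL is supplied as a Duration string such as "3600s".

```python
import math

SECONDS_PER_DAY = 86400


def ttl_days(log_ttl: str) -> int:
    """Return the effective TTL in whole days for a Duration string like "3600s"."""
    if not log_ttl.endswith("s"):
        raise ValueError("Duration strings must end with 's'")
    seconds = float(log_ttl[:-1])
    # A day is the basic unit, so any partial day rounds up to a full day.
    return math.ceil(seconds / SECONDS_PER_DAY)


print(ttl_days("3600s"))    # 1
print(ttl_days("172800s"))  # 2
```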

labels

map (key: string, value: string)

The labels with user-defined metadata to organize your ModelDeploymentMonitoringJob.

Label keys and values can be no longer than 64 characters (Unicode codepoints) and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed.

See https://goo.gl/xmQnxf for more information and examples of labels.
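
As a rough client-side check of the label constraints stated above, the sketch below validates keys and values before a request is sent. The function names are hypothetical, and the check covers only the rules listed in this section (length and allowed characters); any further rules described at the linked page are not modeled.

```python
def is_valid_label_part(text: str) -> bool:
    """Check one label key or value against the constraints listed above."""
    if len(text) > 64:  # len() counts Unicode code points in Python 3
        return False
    for ch in text:
        if ch in "_-" or ch.isdigit():
            continue
        # Letters, including international ones, are accepted unless uppercase.
        if ch.isalpha() and not ch.isupper():
            continue
        return False
    return True


def invalid_label_keys(labels: dict) -> list:
    """Return the keys whose key or value violates the constraints."""
    return [
        key
        for key, value in labels.items()
        if not (is_valid_label_part(key) and is_valid_label_part(value))
    ]
```

For example, invalid_label_keys({"env": "prod", "Team": "ml"}) reports "Team" because of the uppercase letter.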

createTime

string (Timestamp format)

Output only. Timestamp when this ModelDeploymentMonitoringJob was created.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

updateTime

string (Timestamp format)

Output only. Timestamp when this ModelDeploymentMonitoringJob was most recently updated.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

nextScheduleTime

string (Timestamp format)

Output only. Timestamp when this monitoring pipeline will be scheduled to run for the next round.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
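
The Timestamp format used by createTime, updateTime, and nextScheduleTime carries up to nine fractional digits, which is finer than the microsecond resolution of Python's datetime. The sketch below is one possible way to read such a value on the client side: it returns a second-resolution UTC datetime plus the nanoseconds separately. The function name parse_timestamp is hypothetical, and only the "Z" form shown in the examples is handled.

```python
import re
from datetime import datetime, timezone

# Matches e.g. "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_timestamp(text: str):
    """Return (UTC datetime truncated to seconds, nanoseconds) for a timestamp."""
    match = _TIMESTAMP.match(text)
    if match is None:
        raise ValueError(f"not a UTC 'Zulu' timestamp: {text!r}")
    whole = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    whole = whole.replace(tzinfo=timezone.utc)
    # Right-pad the fraction to nine digits so it can be read as nanoseconds.
    nanos = int((match.group(2) or "").ljust(9, "0"))
    return whole, nanos
```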

statsAnomaliesBaseDirectory

object (GcsDestination)

Stats anomalies base folder path.

encryptionSpec

object (EncryptionSpec)

Customer-managed encryption key spec for a ModelDeploymentMonitoringJob. If set, this ModelDeploymentMonitoringJob and all sub-resources of this ModelDeploymentMonitoringJob will be secured by this key.

enableMonitoringPipelineLogs

boolean

If true, the scheduled monitoring pipeline logs are sent to Google Cloud Logging, including pipeline status and detected anomalies. Note that these logs incur cost, which is subject to Cloud Logging pricing.

error

object (Status)

Output only. Only populated when the job's state is JOB_STATE_FAILED or JOB_STATE_CANCELLED.
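
To show how the fields above fit together, here is a minimal sketch of a request body containing only the fields marked Required, written as a Python dictionary. Every concrete value (project, location, IDs, display name, threshold, sample rate) is a placeholder chosen for illustration, and the helper names are hypothetical. Output-only fields are omitted because the service populates them.

```python
import json

# Fields marked "Required" in the field list above.
REQUIRED_FIELDS = (
    "displayName",
    "endpoint",
    "modelDeploymentMonitoringObjectiveConfigs",
    "modelDeploymentMonitoringScheduleConfig",
    "loggingSamplingStrategy",
)


def build_job_body(endpoint: str, deployed_model_id: str) -> dict:
    """Assemble a job body; all concrete values here are placeholders."""
    return {
        "displayName": "example-monitoring-job",
        "endpoint": endpoint,
        # One entry per DeployedModel that should be monitored.
        "modelDeploymentMonitoringObjectiveConfigs": [
            {
                "deployedModelId": deployed_model_id,
                "objectiveConfig": {
                    "predictionDriftDetectionConfig": {
                        "defaultDriftThreshold": {"value": 0.3},
                    },
                },
            }
        ],
        "modelDeploymentMonitoringScheduleConfig": {"monitorInterval": "3600s"},
        "loggingSamplingStrategy": {"randomSampleConfig": {"sampleRate": 0.8}},
    }


def missing_required_fields(body: dict) -> list:
    """Return the names of required fields that are absent from the body."""
    return [name for name in REQUIRED_FIELDS if name not in body]


endpoint_name = "projects/my-project/locations/us-central1/endpoints/1234567890"
body = build_job_body(endpoint_name, "987654321")
assert missing_required_fields(body) == []
print(json.dumps(body, indent=2))
```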

MonitoringScheduleState

The schedule state of the monitoring pipeline.

Enums
MONITORING_SCHEDULE_STATE_UNSPECIFIED Unspecified state.
PENDING The pipeline has been picked up and is waiting to run.
OFFLINE The pipeline is offline and will be scheduled for next run.
RUNNING The pipeline is running.

LatestMonitoringPipelineMetadata

All metadata of the most recent monitoring pipeline.

JSON representation
{
  "runTime": string,
  "status": {
    object (Status)
  }
}
Fields
runTime

string (Timestamp format)

The time of the most recent monitoring pipeline that is related to this run.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

status

object (Status)

The status of the most recent monitoring pipeline.

ModelDeploymentMonitoringObjectiveConfig

ModelDeploymentMonitoringObjectiveConfig contains the pair of deployedModelId to ModelMonitoringObjectiveConfig.

JSON representation
{
  "deployedModelId": string,
  "objectiveConfig": {
    object (ModelMonitoringObjectiveConfig)
  }
}
Fields
deployedModelId

string

The DeployedModel ID of the objective config.

objectiveConfig

object (ModelMonitoringObjectiveConfig)

The objective config for the model monitoring job of this deployed model.

ModelMonitoringObjectiveConfig

The objective configuration for model monitoring, including the information needed to detect anomalies for one particular model.

JSON representation
{
  "trainingDataset": {
    object (TrainingDataset)
  },
  "trainingPredictionSkewDetectionConfig": {
    object (TrainingPredictionSkewDetectionConfig)
  },
  "predictionDriftDetectionConfig": {
    object (PredictionDriftDetectionConfig)
  },
  "explanationConfig": {
    object (ExplanationConfig)
  }
}
Fields
trainingDataset

object (TrainingDataset)

Training dataset for models. This field has to be set only if TrainingPredictionSkewDetectionConfig is specified.

trainingPredictionSkewDetectionConfig

object (TrainingPredictionSkewDetectionConfig)

The config for skew between training data and prediction data.

predictionDriftDetectionConfig

object (PredictionDriftDetectionConfig)

The config for drift of prediction data.

explanationConfig

object (ExplanationConfig)

The config for integrating with Vertex Explainable AI.

TrainingDataset

Training Dataset information.

JSON representation
{
  "dataFormat": string,
  "targetField": string,
  "loggingSamplingStrategy": {
    object (SamplingStrategy)
  },

  // Union field data_source can be only one of the following:
  "dataset": string,
  "gcsSource": {
    object (GcsSource)
  },
  "bigquerySource": {
    object (BigQuerySource)
  }
  // End of list of possible types for union field data_source.
}
Fields
dataFormat

string

Data format of the dataset, only applicable if the input is from Google Cloud Storage. The possible formats are:

"tf-record" The source file is a TFRecord file.

"csv" The source file is a CSV file.

"jsonl" The source file is a JSONL file.

targetField

string

The target field name the model is to predict. This field will be excluded when doing Predict and (or) Explain for the training data.

loggingSamplingStrategy

object (SamplingStrategy)

Strategy to sample data from Training Dataset. If not set, we process the whole dataset.

Union field data_source.

data_source can be only one of the following:

dataset

string

The resource name of the Dataset used to train this Model.

gcsSource

object (GcsSource)

The Google Cloud Storage uri of the unmanaged Dataset used to train this Model.

bigquerySource

object (BigQuerySource)

The BigQuery table of the unmanaged Dataset used to train this Model.
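
Because data_source is a union field, a TrainingDataset should carry at most one of dataset, gcsSource, and bigquerySource. The sketch below is a hypothetical client-side check for that constraint. The helper name and the sample values are illustrative, and the "uris" key inside gcsSource is an assumption about the GcsSource object, which this section does not define.

```python
# The members of the data_source union, as listed above.
DATA_SOURCE_FIELDS = ("dataset", "gcsSource", "bigquerySource")


def data_source_of(training_dataset: dict):
    """Return the name of the data_source member that is set, or None."""
    present = [name for name in DATA_SOURCE_FIELDS if name in training_dataset]
    if len(present) > 1:
        raise ValueError(f"only one data_source may be set, got: {present}")
    return present[0] if present else None


training_dataset = {
    "dataFormat": "csv",      # only applicable for Cloud Storage input
    "targetField": "label",   # placeholder target column name
    "gcsSource": {"uris": ["gs://example-bucket/train.csv"]},  # assumed shape
}
print(data_source_of(training_dataset))
```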

SamplingStrategy

Sampling strategy for logging. It can be used for both training and prediction datasets.

JSON representation
{
  "randomSampleConfig": {
    object (RandomSampleConfig)
  }
}
Fields
randomSampleConfig

object (RandomSampleConfig)

Random sample config. Will support more sampling strategies later.

RandomSampleConfig

Requests are randomly selected.

JSON representation
{
  "sampleRate": number
}
Fields
sampleRate

number

Sample rate, in the range (0, 1].
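
The half-open range means 0 is excluded and 1 is allowed. The sketch below validates a sampleRate and shows one way random selection at that rate could work. It illustrates the idea only and is not the service's sampling logic; both function names are hypothetical.

```python
import random


def validate_sample_rate(rate: float) -> float:
    """Accept rates in the half-open interval (0, 1]; reject anything else."""
    if not (0 < rate <= 1):
        raise ValueError(f"sampleRate must be in (0, 1], got {rate}")
    return rate


def should_log(rate: float, rng=random) -> bool:
    """Randomly decide whether to log one request at the given rate."""
    # rng.random() returns a float in [0.0, 1.0), so a rate of 1 always logs.
    return rng.random() < validate_sample_rate(rate)
```

Passing a seeded random.Random instance as rng makes the selection reproducible in tests.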

TrainingPredictionSkewDetectionConfig

The config for Training & Prediction data skew detection. It specifies the training dataset sources and the skew detection parameters.

JSON representation
{
  "skewThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "attributionScoreSkewThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "defaultSkewThreshold": {
    object (ThresholdConfig)
  }
}
Fields
skewThresholds

map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. If a feature needs to be monitored for skew, a value threshold must be configured for that feature. The threshold here is against feature distribution distance between the training and prediction feature.

attributionScoreSkewThresholds

map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. The threshold here is against attribution score distance between the training and prediction feature.

defaultSkewThreshold

object (ThresholdConfig)

Skew anomaly detection threshold used by all features. When the per-feature thresholds are not set, this field can be used to specify a threshold for all features.
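
The relationship between skewThresholds and defaultSkewThreshold can be pictured as a lookup with a fallback. The sketch below assumes that a per-feature entry takes precedence and that the default applies to any feature without one; this section does not spell out that precedence, so treat it as an assumption. The function name is hypothetical.

```python
from typing import Optional


def resolve_skew_threshold(feature: str, config: dict) -> Optional[float]:
    """Pick the threshold value for one feature from a skew detection config."""
    per_feature = config.get("skewThresholds", {})
    if feature in per_feature:
        return per_feature[feature].get("value")
    # Assumed fallback: use the default threshold when no per-feature one exists.
    default = config.get("defaultSkewThreshold")
    return default.get("value") if default else None


skew_config = {
    "skewThresholds": {"age": {"value": 0.2}},
    "defaultSkewThreshold": {"value": 0.3},
}
print(resolve_skew_threshold("age", skew_config))
print(resolve_skew_threshold("income", skew_config))
```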

ThresholdConfig

The config for feature monitoring threshold.

JSON representation
{

  // Union field threshold can be only one of the following:
  "value": number
  // End of list of possible types for union field threshold.
}
Fields

Union field threshold.

threshold can be only one of the following:

value

number

Specify a threshold value that can trigger the alert. If this threshold config is for feature distribution distance:

1. For a categorical feature, the distribution distance is calculated by the L-infinity norm.
2. For a numerical feature, the distribution distance is calculated by the Jensen–Shannon divergence.

Each feature must have a non-zero threshold if it needs to be monitored. Otherwise, no alert will be triggered for that feature.
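
To give a feel for the two distance measures named above, here is a minimal sketch over already-normalized distributions. It is not the service's implementation. How numerical features are binned into histograms and which logarithm base is used are not stated in this section, so the base-2 logarithm (which bounds the divergence to [0, 1]) and the pre-binned inputs are assumptions.

```python
import math


def l_infinity_distance(p: dict, q: dict) -> float:
    """Largest absolute difference in probability across all categories."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def _kl_divergence(p: list, q: list) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p: list, q: list) -> float:
    """Jensen-Shannon divergence between two histograms over the same bins."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl_divergence(p, mixture) + 0.5 * _kl_divergence(q, mixture)


training = {"red": 0.5, "blue": 0.5}
serving = {"red": 0.7, "blue": 0.3}
print(l_infinity_distance(training, serving))
print(jensen_shannon_divergence([0.5, 0.5], [0.7, 0.3]))
```

A distance computed this way would then be compared against the configured threshold value for the feature.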

PredictionDriftDetectionConfig

The config for Prediction data drift detection.

JSON representation
{
  "driftThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "attributionScoreDriftThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "defaultDriftThreshold": {
    object (ThresholdConfig)
  }
}
Fields
driftThresholds

map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. If a feature needs to be monitored for drift, a value threshold must be configured for that feature. The threshold here is against feature distribution distance between different time windows.

attributionScoreDriftThresholds

map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. The threshold here is against attribution score distance between different time windows.

defaultDriftThreshold

object (ThresholdConfig)

Drift anomaly detection threshold used by all features. When the per-feature thresholds are not set, this field can be used to specify a threshold for all features.

ExplanationConfig

The config for integrating with Vertex Explainable AI. Only applicable if the Model has explanationSpec populated.

JSON representation
{
  "enableFeatureAttributes": boolean,
  "explanationBaseline": {
    object (ExplanationBaseline)
  }
}
Fields
enableFeatureAttributes

boolean

Whether to analyze the Vertex Explainable AI feature attribution scores. If set to true, Vertex AI will log the feature attributions from the explain response and run skew/drift detection on them.

explanationBaseline

object (ExplanationBaseline)

Predictions generated by the BatchPredictionJob using baseline dataset.

ExplanationBaseline

Output from BatchPredictionJob for Model Monitoring baseline dataset, which can be used to generate baseline attribution scores.

JSON representation
{
  "predictionFormat": enum (PredictionFormat),

  // Union field destination can be only one of the following:
  "gcs": {
    object (GcsDestination)
  },
  "bigquery": {
    object (BigQueryDestination)
  }
  // End of list of possible types for union field destination.
}
Fields
predictionFormat

enum (PredictionFormat)

The storage format of the predictions generated by the BatchPrediction job.

Union field destination. The configuration specifying the BatchExplain job output. This can be used to generate the baseline of feature attribution scores.

destination can be only one of the following:

gcs

object (GcsDestination)

Cloud Storage location for BatchExplain output.

bigquery

object (BigQueryDestination)

BigQuery location for BatchExplain output.

PredictionFormat

The storage format of the predictions generated by the BatchPrediction job.

Enums
PREDICTION_FORMAT_UNSPECIFIED Should not be set.
JSONL Predictions are in JSONL files.
BIGQUERY Predictions are in BigQuery.

ModelDeploymentMonitoringScheduleConfig

The config for scheduling the monitoring job.

JSON representation
{
  "monitorInterval": string,
  "monitorWindow": string
}
Fields
monitorInterval

string (Duration format)

Required. The model monitoring job scheduling interval. It will be rounded up to the next full hour. This defines how often the monitoring jobs are triggered.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
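
The hour rounding applied to monitorInterval can be sketched as below. This assumes that an interval which is already a whole number of hours is left unchanged, which this section does not state explicitly. The function name is hypothetical.

```python
import math

SECONDS_PER_HOUR = 3600


def effective_interval_seconds(monitor_interval: str) -> int:
    """Round a Duration string such as "5400s" up to a whole number of hours."""
    if not monitor_interval.endswith("s"):
        raise ValueError("Duration strings must end with 's'")
    seconds = float(monitor_interval[:-1])
    # Assumption: exact multiples of an hour are kept as they are.
    return math.ceil(seconds / SECONDS_PER_HOUR) * SECONDS_PER_HOUR


print(effective_interval_seconds("5400s"))  # 7200
```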

monitorWindow

string (Duration format)

The time window of the prediction data included in each prediction dataset. This window specifies how far back data is collected from historical model results for each run. If not set, ModelDeploymentMonitoringScheduleConfig.monitor_interval will be used. For example, if the cutoff time is 2022-01-08 14:30:00 and monitorWindow is set to 3600 seconds, then data from 2022-01-08 13:30:00 to 2022-01-08 14:30:00 will be retrieved and aggregated to calculate the monitoring statistics.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
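
The monitorWindow example above amounts to subtracting the window from the cutoff time. A minimal sketch of that arithmetic follows; the function name is hypothetical and the window is assumed to arrive as a Duration string such as "3600s".

```python
from datetime import datetime, timedelta


def monitoring_window(cutoff: datetime, monitor_window: str):
    """Return the (start, end) of the data window that ends at the cutoff time."""
    if not monitor_window.endswith("s"):
        raise ValueError("Duration strings must end with 's'")
    length = timedelta(seconds=float(monitor_window[:-1]))
    return cutoff - length, cutoff


start, end = monitoring_window(datetime(2022, 1, 8, 14, 30, 0), "3600s")
print(start, end)  # 2022-01-08 13:30:00 2022-01-08 14:30:00
```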

ModelMonitoringAlertConfig

The alert config for model monitoring.

JSON representation
{
  "enableLogging": boolean,
  "notificationChannels": [
    string
  ],

  // Union field alert can be only one of the following:
  "emailAlertConfig": {
    object (EmailAlertConfig)
  }
  // End of list of possible types for union field alert.
}
Fields
enableLogging

boolean

Dump the anomalies to Cloud Logging. The anomalies are written as a JSON payload encoded from the proto google.cloud.aiplatform.logging.ModelMonitoringAnomaliesLogEntry. The logs can be further routed to Pub/Sub or any other service supported by Cloud Logging.

notificationChannels[]

string

Resource names of the NotificationChannels to send alerts to. Must be of the format projects/<project_id_or_number>/notificationChannels/<channelId>

Union field alert.

alert can be only one of the following:

emailAlertConfig

object (EmailAlertConfig)

Email alert config.

EmailAlertConfig

The config for email alert.

JSON representation
{
  "userEmails": [
    string
  ]
}
Fields
userEmails[]

string

The email addresses to send the alert to.

ModelDeploymentMonitoringBigQueryTable

ModelDeploymentMonitoringBigQueryTable specifies the BigQuery table name as well as some information of the logs stored in this table.

JSON representation
{
  "logSource": enum (LogSource),
  "logType": enum (LogType),
  "bigqueryTablePath": string,
  "requestResponseLoggingSchemaVersion": string
}
Fields
logSource

enum (LogSource)

The source of log.

logType

enum (LogType)

The type of log.

bigqueryTablePath

string

The created BigQuery table to store logs. Customers can run their own queries and analysis on it. Format: bq://<projectId>.model_deployment_monitoring_<endpointId>.<tolower(logSource)>_<tolower(logType)>
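
The path format above can be reproduced on the client side, for instance to locate a log table before querying it. The sketch below simply fills in the template. The function name and the sample project and endpoint IDs are placeholders.

```python
def bigquery_table_path(
    project_id: str, endpoint_id: str, log_source: str, log_type: str
) -> str:
    """Build the table path from the documented format string."""
    return (
        f"bq://{project_id}.model_deployment_monitoring_{endpoint_id}"
        f".{log_source.lower()}_{log_type.lower()}"
    )


print(bigquery_table_path("my-project", "1234567890", "SERVING", "PREDICT"))
# bq://my-project.model_deployment_monitoring_1234567890.serving_predict
```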

requestResponseLoggingSchemaVersion

string

Output only. The schema version of the request/response logging BigQuery table. Defaults to v1 if unset.

LogSource

Indicates where the log comes from.

Enums
LOG_SOURCE_UNSPECIFIED Unspecified source.
TRAINING Logs coming from the training dataset.
SERVING Logs coming from serving traffic.

LogType

Indicates what type of traffic the log belongs to.

Enums
LOG_TYPE_UNSPECIFIED Unspecified type.
PREDICT Predict logs.
EXPLAIN Explain logs.

Methods

create

Creates a ModelDeploymentMonitoringJob.

delete

Deletes a ModelDeploymentMonitoringJob.

get

Gets a ModelDeploymentMonitoringJob.

list

Lists ModelDeploymentMonitoringJobs in a Location.

patch

Updates a ModelDeploymentMonitoringJob.

pause

Pauses a ModelDeploymentMonitoringJob.

resume

Resumes a paused ModelDeploymentMonitoringJob.

searchModelDeploymentMonitoringStatsAnomalies

Searches Model Monitoring Statistics generated within a given time window.