Resource: ModelDeploymentMonitoringJob
Represents a job that runs periodically to monitor the deployed models in an endpoint. It analyzes the logged training and prediction data to detect abnormal behavior.
JSON representation |
---|
{ "name": string, "displayName": string, "endpoint": string, "state": enum ( |
Fields | |
---|---|
name |
Output only. Resource name of a ModelDeploymentMonitoringJob. |
displayName |
Required. The user-defined display name of the ModelDeploymentMonitoringJob. The name can be up to 128 characters long and can consist of any UTF-8 characters. |
endpoint |
Required. Endpoint resource name. Format: |
state |
Output only. The detailed state of the monitoring job. While the job is still being created, the state is 'PENDING'. Once the job is successfully created, the state is 'RUNNING'. When the job is paused, the state is 'PAUSED'. When the job is resumed, the state returns to 'RUNNING'. |
scheduleState |
Output only. Schedule state when the monitoring job is in Running state. |
latestMonitoringPipelineMetadata |
Output only. Latest triggered monitoring pipeline metadata. |
modelDeploymentMonitoringObjectiveConfigs[] |
Required. The config for monitoring objectives. This is a per DeployedModel config. Each DeployedModel needs to be configured separately. |
modelDeploymentMonitoringScheduleConfig |
Required. Schedule config for running the monitoring job. |
loggingSamplingStrategy |
Required. Sampling strategy for logging. |
modelMonitoringAlertConfig |
Alert config for model monitoring. |
predictInstanceSchemaUri |
YAML schema file URI describing the format of a single instance given to this Endpoint's prediction (and explanation) requests. If not set, the predict schema is generated from collected predict requests. |
samplePredictInstance |
Sample Predict instance, same format as |
analysisInstanceSchemaUri |
YAML schema file URI describing the format of a single instance that you want TensorFlow Data Validation (TFDV) to analyze. If this field is empty, all the feature data types are inferred from |
bigqueryTables[] |
Output only. The BigQuery tables created for the job under the customer project. Customers can run their own queries and analysis. There can be at most 4 log tables: 1. Training data logging predict request/response 2. Serving data logging predict request/response |
logTtl |
The TTL of the BigQuery tables in user projects that store logs. A day is the basic unit of the TTL, and the value is rounded up: ceil(TTL / 86400), where 86400 is the number of seconds in a day. For example, { second: 3600 } indicates a TTL of 1 day. A duration in seconds with up to nine fractional digits, ending with ' |
labels |
The labels with user-defined metadata to organize your ModelDeploymentMonitoringJob. Label keys and values can be no longer than 64 characters (Unicode codepoints), and can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels. |
createTime |
Output only. Timestamp when this ModelDeploymentMonitoringJob was created. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
updateTime |
Output only. Timestamp when this ModelDeploymentMonitoringJob was updated most recently. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
nextScheduleTime |
Output only. Timestamp when this monitoring pipeline will be scheduled to run for the next round. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
statsAnomaliesBaseDirectory |
Stats anomalies base folder path. |
encryptionSpec |
Customer-managed encryption key spec for a ModelDeploymentMonitoringJob. If set, this ModelDeploymentMonitoringJob and all sub-resources of this ModelDeploymentMonitoringJob will be secured by this key. |
enableMonitoringPipelineLogs |
If true, the scheduled monitoring pipeline logs are sent to Google Cloud Logging, including pipeline status and anomalies detected. Note that the logs incur costs, which are subject to Cloud Logging pricing. |
error |
Output only. Only populated when the job's state is |
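The day-based rounding described for logTtl can be illustrated with a small calculation. This is a minimal sketch of the ceil(TTL / 86400) rule only; the helper name `log_ttl_days` is hypothetical, and rejecting non-positive values is an assumption that the field description does not state.

```python
import math

SECONDS_PER_DAY = 86400


def log_ttl_days(ttl_seconds: float) -> int:
    """Return the TTL in whole days, rounded up as ceil(TTL / 86400)."""
    # Assumption: a TTL must be positive; the reference does not say
    # how zero or negative durations are handled.
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return math.ceil(ttl_seconds / SECONDS_PER_DAY)


print(log_ttl_days(3600))   # 1, matching the example in the logTtl description
print(log_ttl_days(86400))  # 1
print(log_ttl_days(86401))  # 2
```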
MonitoringScheduleState
The state of the monitoring pipeline.
Enums | |
---|---|
MONITORING_SCHEDULE_STATE_UNSPECIFIED |
Unspecified state. |
PENDING |
The pipeline has been picked up and is waiting to run. |
OFFLINE |
The pipeline is offline and will be scheduled for next run. |
RUNNING |
The pipeline is running. |
LatestMonitoringPipelineMetadata
All metadata of the most recent monitoring pipelines.
JSON representation |
---|
{
"runTime": string,
"status": {
object ( |
Fields | |
---|---|
runTime |
The time of the most recent monitoring pipeline that is related to this run. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
status |
The status of the most recent monitoring pipeline. |
ModelDeploymentMonitoringObjectiveConfig
ModelDeploymentMonitoringObjectiveConfig contains the pair of deployedModelId to ModelMonitoringObjectiveConfig.
JSON representation |
---|
{
"deployedModelId": string,
"objectiveConfig": {
object ( |
Fields | |
---|---|
deployedModelId |
The DeployedModel id of the objective config. |
objectiveConfig |
The objective config for the model monitoring job of this deployed model. |
ModelMonitoringObjectiveConfig
The objective configuration for model monitoring, including the information needed to detect anomalies for one particular model.
JSON representation |
---|
{ "trainingDataset": { object ( |
Fields | |
---|---|
trainingDataset |
Training dataset for models. This field has to be set only if TrainingPredictionSkewDetectionConfig is specified. |
trainingPredictionSkewDetectionConfig |
The config for skew between training data and prediction data. |
predictionDriftDetectionConfig |
The config for drift of prediction data. |
explanationConfig |
The config for integrating with Vertex Explainable AI. |
TrainingDataset
Training Dataset information.
JSON representation |
---|
{ "dataFormat": string, "targetField": string, "loggingSamplingStrategy": { object ( |
Fields | |
---|---|
dataFormat |
Data format of the dataset, only applicable if the input is from Google Cloud Storage. The possible formats are: "tf-record" (the source file is a TFRecord file), "csv" (the source file is a CSV file), and "jsonl" (the source file is a JSONL file). |
targetField |
The target field name the model is to predict. This field will be excluded when doing Predict and (or) Explain for the training data. |
loggingSamplingStrategy |
Strategy to sample data from Training Dataset. If not set, we process the whole dataset. |
Union field
|
|
dataset |
The resource name of the Dataset used to train this Model. |
gcsSource |
The Google Cloud Storage uri of the unmanaged Dataset used to train this Model. |
bigquerySource |
The BigQuery table of the unmanaged Dataset used to train this Model. |
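Because dataset, gcsSource, and bigquerySource belong to a union field, a client may want to check that exactly one of them is present before sending a TrainingDataset. The following is a minimal sketch of such a check, assuming the usual union semantics (exactly one member set); the function name `training_source` is hypothetical and the nested source values are placeholders, since their structure is not shown in this section.

```python
SOURCE_FIELDS = ("dataset", "gcsSource", "bigquerySource")


def training_source(training_dataset: dict) -> str:
    """Return the name of the single source field set on a TrainingDataset."""
    present = [field for field in SOURCE_FIELDS if field in training_dataset]
    # Assumption: exactly one union member must be set.
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {SOURCE_FIELDS}, found {present}"
        )
    return present[0]


# Placeholder value: the structure of gcsSource is not described here.
example = {"dataFormat": "csv", "targetField": "label", "gcsSource": {}}
print(training_source(example))  # gcsSource
```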
SamplingStrategy
Sampling strategy for logging. It can apply to both training and prediction datasets.
JSON representation |
---|
{
"randomSampleConfig": {
object ( |
Fields | |
---|---|
randomSampleConfig |
Random sample config. Will support more sampling strategies later. |
RandomSampleConfig
Requests are randomly selected.
JSON representation |
---|
{ "sampleRate": number } |
Fields | |
---|---|
sampleRate |
Sample rate, in the range (0, 1]. |
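To illustrate what a sampleRate in (0, 1] means, the sketch below selects requests at random with the given probability. It is an illustration of random sampling in general, not the service's implementation; the function name `should_log` and the use of a seeded `random.Random` are assumptions made for the example.

```python
import random


def should_log(sample_rate: float, rng: random.Random) -> bool:
    """Decide whether to log one request, given a sample rate in (0, 1]."""
    if not 0.0 < sample_rate <= 1.0:
        raise ValueError("sample_rate must be in the range (0, 1]")
    # rng.random() returns a float in [0.0, 1.0), so a rate of 1.0
    # selects every request.
    return rng.random() < sample_rate


rng = random.Random(42)  # seeded only to make the example repeatable
kept = sum(should_log(0.2, rng) for _ in range(10_000))
print(f"kept {kept} of 10000 requests")
```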
TrainingPredictionSkewDetectionConfig
The config for Training & Prediction data skew detection. It specifies the training dataset sources and the skew detection parameters.
JSON representation |
---|
{ "skewThresholds": { string: { object ( |
Fields | |
---|---|
skewThresholds |
Key is the feature name and value is the threshold. If a feature needs to be monitored for skew, a value threshold must be configured for that feature. The threshold here is against feature distribution distance between the training and prediction feature. |
attributionScoreSkewThresholds |
Key is the feature name and value is the threshold. The threshold here is against attribution score distance between the training and prediction feature. |
defaultSkewThreshold |
Skew anomaly detection threshold used by all features. When the per-feature thresholds are not set, this field can be used to specify a threshold for all features. |
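The relationship between skewThresholds and defaultSkewThreshold can be sketched as a lookup with a fallback. This assumes that a per-feature entry takes precedence over the default, which is one reading of the field descriptions above rather than documented behavior; the function name `effective_skew_threshold` and the feature names are hypothetical.

```python
from typing import Optional


def effective_skew_threshold(feature: str, config: dict) -> Optional[float]:
    """Return the threshold value that applies to a feature, if any."""
    per_feature = config.get("skewThresholds", {})
    # Assumption: a per-feature ThresholdConfig overrides the default one.
    threshold = per_feature.get(feature) or config.get("defaultSkewThreshold")
    if not threshold:
        return None
    return threshold.get("value")


skew_config = {
    "skewThresholds": {"age": {"value": 0.3}},
    "defaultSkewThreshold": {"value": 0.1},
}
print(effective_skew_threshold("age", skew_config))     # 0.3
print(effective_skew_threshold("income", skew_config))  # 0.1
```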
ThresholdConfig
The config for feature monitoring threshold.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field
|
|
value |
Specify a threshold value that can trigger the alert. If this threshold config is for feature distribution distance: 1. For a categorical feature, the distribution distance is calculated by the L-infinity norm. 2. For a numerical feature, the distribution distance is calculated by Jensen–Shannon divergence. Each feature must have a non-zero threshold if it needs to be monitored. Otherwise no alert will be triggered for that feature. |
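The two distance measures named above can be sketched in plain Python. This is a textbook illustration, not the service's implementation: the base-2 logarithm, the representation of distributions as normalized probabilities, and the way numerical features are binned are all assumptions that this reference does not specify.

```python
import math
from typing import Dict, Sequence


def l_infinity_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Largest absolute difference in probability across all categories."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def jensen_shannon_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence of two distributions over the same bins."""

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


training = {"red": 0.5, "blue": 0.5}
serving = {"red": 0.8, "blue": 0.2}
print(l_infinity_distance(training, serving))
print(jensen_shannon_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0 for identical inputs
```

With a base-2 logarithm the divergence is bounded between 0 and 1, which makes a threshold easier to interpret.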
PredictionDriftDetectionConfig
The config for Prediction data drift detection.
JSON representation |
---|
{ "driftThresholds": { string: { object ( |
Fields | |
---|---|
driftThresholds |
Key is the feature name and value is the threshold. If a feature needs to be monitored for drift, a value threshold must be configured for that feature. The threshold here is against feature distribution distance between different time windows. |
attributionScoreDriftThresholds |
Key is the feature name and value is the threshold. The threshold here is against attribution score distance between different time windows. |
defaultDriftThreshold |
Drift anomaly detection threshold used by all features. When the per-feature thresholds are not set, this field can be used to specify a threshold for all features. |
ExplanationConfig
The config for integrating with Vertex Explainable AI. Only applicable if the Model has explanationSpec populated.
JSON representation |
---|
{
"enableFeatureAttributes": boolean,
"explanationBaseline": {
object ( |
Fields | |
---|---|
enableFeatureAttributes |
Whether to analyze the Vertex Explainable AI feature attribution scores. If set to true, Vertex AI will log the feature attributions from the explain response and run skew/drift detection on them. |
explanationBaseline |
Predictions generated by the BatchPredictionJob using baseline dataset. |
ExplanationBaseline
Output from BatchPredictionJob for Model Monitoring baseline dataset, which can be used to generate baseline attribution scores.
JSON representation |
---|
{ "predictionFormat": enum ( |
Fields | |
---|---|
predictionFormat |
The storage format of the predictions generated by the BatchPrediction job. |
Union field destination. The configuration specifying the BatchExplain job output. This can be used to generate the baseline of feature attribution scores. destination can be only one of the following: |
|
gcs |
Cloud Storage location for BatchExplain output. |
bigquery |
BigQuery location for BatchExplain output. |
PredictionFormat
The storage format of the predictions generated by the BatchPrediction job.
Enums | |
---|---|
PREDICTION_FORMAT_UNSPECIFIED |
Should not be set. |
JSONL |
Predictions are in JSONL files. |
BIGQUERY |
Predictions are in BigQuery. |
ModelDeploymentMonitoringScheduleConfig
The config for scheduling the monitoring job.
JSON representation |
---|
{ "monitorInterval": string, "monitorWindow": string } |
Fields | |
---|---|
monitorInterval |
Required. The model monitoring job scheduling interval. It will be rounded up to the next full hour. This defines how often the monitoring jobs are triggered. A duration in seconds with up to nine fractional digits, ending with ' |
monitorWindow |
The time window of the prediction data being included in each prediction dataset. This window specifies how long the data should be collected from historical model results for each run. If not set, A duration in seconds with up to nine fractional digits, ending with ' |
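The rounding applied to monitorInterval can be illustrated with integer arithmetic. This is a minimal sketch of "rounded up to the next full hour" as described above; the helper name `rounded_monitor_interval` is hypothetical, and it assumes that an interval already on a full hour is left unchanged.

```python
SECONDS_PER_HOUR = 3600


def rounded_monitor_interval(seconds: int) -> int:
    """Round an interval in seconds up to a whole number of hours."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Ceiling division without floating point: -(-a // b) == ceil(a / b).
    hours = -(-seconds // SECONDS_PER_HOUR)
    return hours * SECONDS_PER_HOUR


print(rounded_monitor_interval(3600))  # 3600
print(rounded_monitor_interval(3601))  # 7200
print(rounded_monitor_interval(5400))  # 7200
```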
ModelMonitoringAlertConfig
The alert config for model monitoring.
JSON representation |
---|
{ "enableLogging": boolean, "notificationChannels": [ string ], // Union field |
Fields | |
---|---|
enableLogging |
Dump the anomalies to Cloud Logging. The anomalies are written as a JSON payload encoded from the proto google.cloud.aiplatform.logging.ModelMonitoringAnomaliesLogEntry. This can be further routed to Pub/Sub or any other service supported by Cloud Logging. |
notificationChannels[] |
Resource names of the NotificationChannels to send alerts to. Must be of the format |
Union field
|
|
emailAlertConfig |
Email alert config. |
EmailAlertConfig
The config for email alert.
JSON representation |
---|
{ "userEmails": [ string ] } |
Fields | |
---|---|
userEmails[] |
The email addresses to send the alert to. |
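A ModelMonitoringAlertConfig with an email alert could be assembled as a plain dictionary and serialized to JSON. The sketch below uses only the field names listed above; the helper name `build_alert_config` and the email address are hypothetical, and notificationChannels is omitted because its required format is not shown in this section.

```python
import json
from typing import Iterable


def build_alert_config(user_emails: Iterable[str], enable_logging: bool = False) -> dict:
    """Build a ModelMonitoringAlertConfig body that uses emailAlertConfig."""
    emails = list(user_emails)
    if not emails:
        raise ValueError("at least one email address is expected")
    return {
        "enableLogging": enable_logging,
        # emailAlertConfig is a member of the alert union field.
        "emailAlertConfig": {"userEmails": emails},
    }


config = build_alert_config(["ml-oncall@example.com"], enable_logging=True)
print(json.dumps(config, indent=2))
```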
ModelDeploymentMonitoringBigQueryTable
ModelDeploymentMonitoringBigQueryTable specifies the BigQuery table name as well as some information of the logs stored in this table.
JSON representation |
---|
{ "logSource": enum ( |
Fields | |
---|---|
logSource |
The source of log. |
logType |
The type of log. |
bigqueryTablePath |
The BigQuery table created to store logs. Customers can run their own queries and analysis. Format: |
requestResponseLoggingSchemaVersion |
Output only. The schema version of the request/response logging BigQuery table. Defaults to v1 if unset. |
LogSource
Indicates where the log comes from.
Enums | |
---|---|
LOG_SOURCE_UNSPECIFIED |
Unspecified source. |
TRAINING |
Logs coming from Training dataset. |
SERVING |
Logs coming from Serving traffic. |
LogType
Indicates what type of traffic the log belongs to.
Enums | |
---|---|
LOG_TYPE_UNSPECIFIED |
Unspecified type. |
PREDICT |
Predict logs. |
EXPLAIN |
Explain logs. |
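Given the bigqueryTables[] entries returned on a job, the logSource and logType enums can be used to pick out a particular log table. The following is a minimal sketch under that assumption; the function name `find_table_path` is hypothetical, and the table paths are placeholders because the path format is not shown in this section.

```python
from typing import List, Optional


def find_table_path(tables: List[dict], log_source: str, log_type: str) -> Optional[str]:
    """Return the bigqueryTablePath of the first table matching source and type."""
    for table in tables:
        if table.get("logSource") == log_source and table.get("logType") == log_type:
            return table.get("bigqueryTablePath")
    return None


# Placeholder paths: substitute the values returned by the service.
bigquery_tables = [
    {"logSource": "TRAINING", "logType": "PREDICT", "bigqueryTablePath": "<training-predict-table>"},
    {"logSource": "SERVING", "logType": "PREDICT", "bigqueryTablePath": "<serving-predict-table>"},
]
print(find_table_path(bigquery_tables, "SERVING", "PREDICT"))  # <serving-predict-table>
print(find_table_path(bigquery_tables, "SERVING", "EXPLAIN"))  # None
```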
Methods |
|
---|---|
|
Creates a ModelDeploymentMonitoringJob. |
|
Deletes a ModelDeploymentMonitoringJob. |
|
Gets a ModelDeploymentMonitoringJob. |
|
Lists ModelDeploymentMonitoringJobs in a Location. |
|
Updates a ModelDeploymentMonitoringJob. |
|
Pauses a ModelDeploymentMonitoringJob. |
|
Resumes a paused ModelDeploymentMonitoringJob. |
|
Searches Model Monitoring Statistics generated within a given time window. |