FeatureStatsAnomaly

Stats and anomalies generated at a specific timestamp for a specific feature. startTime and endTime define the time range of the dataset that the current stats belong to; for example, prediction traffic is bucketed into prediction datasets by time window. If the dataset is not defined by a time window, startTime = endTime. The timestamp of the stats and anomalies always refers to endTime. Raw stats and anomalies are stored at statsUri and anomalyUri in TensorFlow-defined protos. The dataStats field contains almost the same information as the raw stats, in a Vertex AI-defined proto, for the UI to display.

JSON representation
{
  "score": number,
  "statsUri": string,
  "anomalyUri": string,
  "distributionDeviation": number,
  "anomalyDetectionThreshold": number,
  "startTime": string,
  "endTime": string
}
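As a minimal sketch, a JSON body with the shape above could be loaded into a typed Python object as follows. The class name `FeatureStatsAnomalyRecord`, the helper `record_from_json`, and all sample values are hypothetical and used only for illustration; they are not part of the API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureStatsAnomalyRecord:
    """Hypothetical container mirroring the JSON fields listed above."""
    score: Optional[float] = None
    stats_uri: Optional[str] = None
    anomaly_uri: Optional[str] = None
    distribution_deviation: Optional[float] = None
    anomaly_detection_threshold: Optional[float] = None
    start_time: Optional[str] = None
    end_time: Optional[str] = None


def record_from_json(text: str) -> FeatureStatsAnomalyRecord:
    """Parse a JSON object; fields absent from the payload stay None."""
    raw = json.loads(text)
    return FeatureStatsAnomalyRecord(
        score=raw.get("score"),
        stats_uri=raw.get("statsUri"),
        anomaly_uri=raw.get("anomalyUri"),
        distribution_deviation=raw.get("distributionDeviation"),
        anomaly_detection_threshold=raw.get("anomalyDetectionThreshold"),
        start_time=raw.get("startTime"),
        end_time=raw.get("endTime"),
    )


# Illustrative payload only; the numeric values are invented for this sketch.
sample_payload = """
{
  "statsUri": "gs://monitoring_bucket/featureName/stats",
  "anomalyUri": "gs://monitoring_bucket/featureName/anomalies",
  "distributionDeviation": 0.42,
  "anomalyDetectionThreshold": 0.3,
  "startTime": "2014-10-02T14:01:23Z",
  "endTime": "2014-10-02T15:01:23Z"
}
"""
sample_record = record_from_json(sample_payload)
```

Every field is optional in this sketch because, for example, score is populated only when cross-feature monitoring is enabled.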
Fields
score

number

Feature importance score, populated only when cross-feature monitoring is enabled. Currently it is used only to represent the feature attribution score, in the range [0, 1], for ModelDeploymentMonitoringObjectiveType.FEATURE_ATTRIBUTION_SKEW and ModelDeploymentMonitoringObjectiveType.FEATURE_ATTRIBUTION_DRIFT.

statsUri

string

Path of the stats file for the current feature values in a Cloud Storage bucket. Format: gs:////stats. Example: gs://monitoring_bucket/featureName/stats. Stats are stored in binary format as the Protobuf message tensorflow.metadata.v0.FeatureNameStatistics.

anomalyUri

string

Path of the anomaly file for the current feature values in a Cloud Storage bucket. Format: gs:////anomalies. Example: gs://monitoring_bucket/featureName/anomalies. Anomalies are stored in binary format as the Protobuf message tensorflow.metadata.v0.AnomalyInfo.

distributionDeviation

number

Deviation of the current stats from the baseline stats. 1. For a categorical feature, the distribution distance is calculated with the L-infinity norm. 2. For a numerical feature, the distribution distance is calculated with the Jensen–Shannon divergence.
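The two distance measures named above can be sketched in plain Python. This is an illustration of the general definitions only: how the service bins numerical values, which logarithm base it uses, and how it normalizes category frequencies are not stated here, so the base-2 logarithm and the pre-binned probability inputs below are assumptions, and the function names are hypothetical.

```python
import math
from typing import Mapping, Sequence


def l_infinity_distance(p: Mapping[str, float], q: Mapping[str, float]) -> float:
    """Largest absolute difference in probability over all categories.

    Categories missing from one distribution are treated as probability 0.
    """
    categories = set(p) | set(q)
    return max((abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories), default=0.0)


def _kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence of two histograms over the same bins.

    Assumes p and q are already normalized to sum to 1. With a base-2
    logarithm the result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same bins")
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl_divergence(p, mixture) + 0.5 * _kl_divergence(q, mixture)


# Categorical example: baseline vs. current category frequencies.
baseline_categories = {"a": 0.5, "b": 0.5}
current_categories = {"a": 0.2, "b": 0.7, "c": 0.1}
categorical_deviation = l_infinity_distance(baseline_categories, current_categories)

# Numerical example: two histograms over the same (assumed) bins.
numerical_deviation = jensen_shannon_divergence([0.1, 0.4, 0.5], [0.3, 0.3, 0.4])
```

The mixture term in the Jensen-Shannon computation is never zero wherever either input is positive, which is why the logarithm stays finite even when the two histograms do not overlap.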

anomalyDetectionThreshold

number

The threshold used when detecting anomalies. Because the user can change the threshold, this value might differ from ThresholdConfig.value.
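A client that wants to flag features on its own side could compare distributionDeviation with the anomalyDetectionThreshold recorded on the same object, rather than with the currently configured ThresholdConfig.value. The sketch below assumes a strict "greater than" comparison, which is not specified here; the function name and the sample values are hypothetical.

```python
from typing import Any, Mapping


def deviation_exceeds_threshold(feature_stats_anomaly: Mapping[str, Any]) -> bool:
    """Return True when the recorded deviation is above the recorded threshold.

    Assumption: strict comparison. Returns False when either field is absent.
    """
    deviation = feature_stats_anomaly.get("distributionDeviation")
    threshold = feature_stats_anomaly.get("anomalyDetectionThreshold")
    if deviation is None or threshold is None:
        return False
    return deviation > threshold


# Illustrative values only.
flagged = deviation_exceeds_threshold(
    {"distributionDeviation": 0.42, "anomalyDetectionThreshold": 0.3}
)
not_flagged = deviation_exceeds_threshold(
    {"distributionDeviation": 0.1, "anomalyDetectionThreshold": 0.3}
)
```

Using the threshold stored with the stats keeps the check consistent with the value that was in effect when the stats were generated.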

startTime

string (Timestamp format)

The start timestamp of the window in which stats were generated. For objectives where a time window doesn't make sense (e.g. Featurestore Snapshot Monitoring), startTime only indicates the monitoring interval, so it always equals (endTime - monitoringInterval).
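The relationship startTime = endTime - monitoringInterval for snapshot-style objectives can be sketched with the standard library. The one-hour interval and the function name below are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def snapshot_start_time(end_time: datetime, monitoring_interval: timedelta) -> datetime:
    """Derive startTime for objectives without a real time window."""
    return end_time - monitoring_interval


# Illustrative values: a snapshot taken at 15:01:23 UTC with a 1-hour interval.
snapshot_end = datetime(2014, 10, 2, 15, 1, 23, tzinfo=timezone.utc)
snapshot_start = snapshot_start_time(snapshot_end, timedelta(hours=1))
```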

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

endTime

string (Timestamp format)

The end timestamp of the window in which stats were generated. For objectives where a time window doesn't make sense (e.g. Featurestore Snapshot Monitoring), endTime indicates the timestamp of the data used to generate stats (e.g. the time at which the feature values were snapshotted).

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
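Python's datetime type stores at most microseconds, so a client that needs the full nanosecond resolution of startTime or endTime has to keep the fractional part separately. The following is a minimal sketch under that constraint; the function name is hypothetical, and it accepts only the "Z" suffix form shown in the examples above.

```python
import re
from datetime import datetime, timezone
from typing import Tuple

# Whole seconds, then an optional fraction of one to nine digits, then "Z".
_ZULU_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_zulu_timestamp(value: str) -> Tuple[datetime, int]:
    """Split a Zulu timestamp into (UTC datetime at whole seconds, nanoseconds)."""
    match = _ZULU_TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 Zulu timestamp: {value!r}")
    seconds_part, fraction = match.groups()
    whole_seconds = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    # Right-pad the fraction to nine digits so "045" means 45,000,000 ns.
    nanoseconds = int(fraction.ljust(9, "0")) if fraction else 0
    return whole_seconds, nanoseconds


parsed_plain = parse_zulu_timestamp("2014-10-02T15:01:23Z")
parsed_nanos = parse_zulu_timestamp("2014-10-02T15:01:23.045123456Z")
```

Returning the nanoseconds as a separate integer avoids silently truncating the last three fractional digits.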