The legacy versions of AutoML Tables, AutoML Video Intelligence, AutoML Vision, and AutoML Natural Language are deprecated and will no longer be available on Google Cloud after their shutdown date. All the functionality of legacy AutoML and new features are available on the Vertex AI platform. See Migrate to Vertex AI to learn how to migrate your resources.

REST Resource: projects.locations.models.modelEvaluations

Resource: ModelEvaluation

Evaluation results of a model.

JSON representation

{
  "name": string,
  "annotationSpecId": string,
  "displayName": string,
  "createTime": string,
  "evaluatedExampleCount": integer,

  // Union field metrics can be only one of the following:
  "classificationEvaluationMetrics": {
    object (ClassificationEvaluationMetrics)
  },
  "regressionEvaluationMetrics": {
    object (RegressionEvaluationMetrics)
  },
  "translationEvaluationMetrics": {
    object (TranslationEvaluationMetrics)
  },
  "imageObjectDetectionEvaluationMetrics": {
    object (ImageObjectDetectionEvaluationMetrics)
  },
  "videoObjectTrackingEvaluationMetrics": {
    object (VideoObjectTrackingEvaluationMetrics)
  },
  "textSentimentEvaluationMetrics": {
    object (TextSentimentEvaluationMetrics)
  },
  "textExtractionEvaluationMetrics": {
    object (TextExtractionEvaluationMetrics)
  }
  // End of list of possible types for union field metrics.
}

Fields
`name`	`string` Output only. Resource name of the model evaluation. Format: `projects/{project_id}/locations/{locationId}/models/{modelId}/modelEvaluations/{model_evaluation_id}`
`annotationSpecId`	`string` Output only. The ID of the annotation spec that the model evaluation applies to. The ID is empty for the overall model evaluation. For Tables, annotation specs do not exist in the dataset and this ID is never set, but for CLASSIFICATION `predictionType-s` the `displayName` field is used.
`displayName`	`string` Output only. The value of `displayName` at the moment when the model was trained. Because this field returns a value at model training time, the values may differ for different models trained from the same dataset, since display names could have been changed between the two models' trainings. For Tables CLASSIFICATION `predictionType-s`, distinct values of the target column at the moment of the model evaluation are populated here. The `displayName` is empty for the overall model evaluation.
`createTime`	`string (Timestamp format)` Output only. Timestamp when this model evaluation was created. A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: `"2014-10-02T15:01:23.045123456Z"`.
`evaluatedExampleCount`	`integer` Output only. The number of examples used for model evaluation, i.e. for which ground truth from time of model creation is compared against the predicted annotations created by the model. For overall ModelEvaluation (i.e. with annotationSpecId not set) this is the total number of all examples used for evaluation. Otherwise, this is the count of examples that according to the ground truth were annotated by the `annotationSpecId`.
Union field `metrics`. Output only. Problem type specific evaluation metrics. `metrics` can be only one of the following:
`classificationEvaluationMetrics`	`object (ClassificationEvaluationMetrics)` Model evaluation metrics for image, text, video and tables classification. Tables problem is considered a classification when the target column is CATEGORY DataType.
`regressionEvaluationMetrics`	`object (RegressionEvaluationMetrics)` Model evaluation metrics for Tables regression. Tables problem is considered a regression when the target column has FLOAT64 DataType.
`translationEvaluationMetrics`	`object (TranslationEvaluationMetrics)` Model evaluation metrics for translation.
`imageObjectDetectionEvaluationMetrics`	`object (ImageObjectDetectionEvaluationMetrics)` Model evaluation metrics for image object detection.
`videoObjectTrackingEvaluationMetrics`	`object (VideoObjectTrackingEvaluationMetrics)` Model evaluation metrics for video object tracking.
`textSentimentEvaluationMetrics`	`object (TextSentimentEvaluationMetrics)` Evaluation metrics for text sentiment models.
`textExtractionEvaluationMetrics`	`object (TextExtractionEvaluationMetrics)` Evaluation metrics for text extraction models.
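
Because `metrics` is a union field, a client typically has to check which member is present before reading any metric. The following is a minimal sketch of that check on an already-parsed JSON response. The helper names (`metrics_kind`, `parse_name`, `is_overall`) are hypothetical and not part of any client library; the field names and the resource name format come from the table above.

```python
import re
from typing import Optional

# Members of the `metrics` union, as listed in the field table.
METRICS_FIELDS = (
    "classificationEvaluationMetrics",
    "regressionEvaluationMetrics",
    "translationEvaluationMetrics",
    "imageObjectDetectionEvaluationMetrics",
    "videoObjectTrackingEvaluationMetrics",
    "textSentimentEvaluationMetrics",
    "textExtractionEvaluationMetrics",
)

# Regular expression for the documented resource name format.
NAME_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/models/(?P<model>[^/]+)/modelEvaluations/(?P<evaluation>[^/]+)$"
)


def metrics_kind(evaluation: dict) -> Optional[str]:
    """Return the name of the union member that is set, or None."""
    present = [field for field in METRICS_FIELDS if field in evaluation]
    if len(present) > 1:
        raise ValueError(f"more than one metrics field set: {present}")
    return present[0] if present else None


def parse_name(name: str) -> dict:
    """Split a model evaluation resource name into its ID components."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected resource name: {name!r}")
    return match.groupdict()


def is_overall(evaluation: dict) -> bool:
    """An empty or absent annotationSpecId marks the overall evaluation."""
    return not evaluation.get("annotationSpecId")
```

For example, `metrics_kind(response)` can be used to dispatch to a handler per problem type, and `is_overall(response)` to separate the model-level evaluation from the per-label ones.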

ClassificationEvaluationMetrics

Model evaluation metrics for classification problems. Note: For Video Classification, these metrics only describe the quality of the Video Classification predictions of "segment_classification" type.

JSON representation

{
  "auPrc": number,
  "baseAuPrc": number,
  "auRoc": number,
  "logLoss": number,
  "confidenceMetricsEntry": [
    {
      object (ConfidenceMetricsEntry)
    }
  ],
  "confusionMatrix": {
    object (ConfusionMatrix)
  },
  "annotationSpecId": [
    string
  ]
}

Fields
`auPrc`	`number` Output only. The Area Under Precision-Recall Curve metric. Micro-averaged for the overall evaluation.
`baseAuPrc (deprecated)`	`number` Output only. Deprecated. The Area Under Precision-Recall Curve metric based on priors. Micro-averaged for the overall evaluation.
`auRoc`	`number` Output only. The Area Under Receiver Operating Characteristic curve metric. Micro-averaged for the overall evaluation.
`logLoss`	`number` Output only. The Log Loss metric.
`confidenceMetricsEntry[]`	`object (ConfidenceMetricsEntry)` Output only. Metrics for each confidenceThreshold in 0.00,0.05,0.10,...,0.95,0.96,0.97,0.98,0.99 and positionThreshold = INT32_MAX_VALUE. ROC and precision-recall curves, and other aggregated metrics are derived from them. The confidence metrics entries may also be supplied for additional values of positionThreshold, but from these no aggregated metrics are computed.
`confusionMatrix`	`object (ConfusionMatrix)` Output only. Confusion matrix of the evaluation. Only set for MULTICLASS classification problems where number of labels is no more than 10. Only set for model level evaluation, not for evaluation per label.
`annotationSpecId[]`	`string` Output only. The annotation spec ids used for this evaluation.
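
The description of `confidenceMetricsEntry[]` states that ROC curves are derived from the entries computed with `positionThreshold = INT32_MAX_VALUE`. The sketch below shows one way a client might rebuild ROC points from those entries and approximate the area under them with the trapezoid rule. It is an illustration only: the helper names are hypothetical, treating absent numeric fields as 0 is an assumption, and the result is not guaranteed to equal the service's `auRoc`.

```python
INT32_MAX = 2**31 - 1


def roc_points(entries):
    """Collect (falsePositiveRate, recall) points, sorted by false positive rate."""
    # Anchor the curve at (0, 0) and (1, 1) so the area covers the full range.
    points = {(0.0, 0.0), (1.0, 1.0)}
    for entry in entries:
        # Skip entries computed for other position thresholds.
        if entry.get("positionThreshold") != INT32_MAX:
            continue
        points.add((entry.get("falsePositiveRate", 0.0), entry.get("recall", 0.0)))
    return sorted(points)


def trapezoid_area(points):
    """Approximate the area under a piecewise-linear curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```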

ConfidenceMetricsEntry

Metrics for a single confidence threshold.

JSON representation

{
  "confidenceThreshold": number,
  "positionThreshold": integer,
  "recall": number,
  "precision": number,
  "falsePositiveRate": number,
  "f1Score": number,
  "recallAt1": number,
  "precisionAt1": number,
  "falsePositiveRateAt1": number,
  "f1ScoreAt1": number,
  "truePositiveCount": string,
  "falsePositiveCount": string,
  "falseNegativeCount": string,
  "trueNegativeCount": string
}

Fields
`confidenceThreshold`	`number` Output only. Metrics are computed with an assumption that the model never returns predictions with score lower than this value.
`positionThreshold`	`integer` Output only. Metrics are computed with an assumption that the model always returns at most this many predictions (ordered by their score, descendingly), but they all still need to meet the confidenceThreshold.
`recall`	`number` Output only. Recall (True Positive Rate) for the given confidence threshold.
`precision`	`number` Output only. Precision for the given confidence threshold.
`falsePositiveRate`	`number` Output only. False Positive Rate for the given confidence threshold.
`f1Score`	`number` Output only. The harmonic mean of recall and precision.
`recallAt1`	`number` Output only. The Recall (True Positive Rate) when only considering the label that has the highest prediction score and not below the confidence threshold for each example.
`precisionAt1`	`number` Output only. The precision when only considering the label that has the highest prediction score and not below the confidence threshold for each example.
`falsePositiveRateAt1`	`number` Output only. The False Positive Rate when only considering the label that has the highest prediction score and not below the confidence threshold for each example.
`f1ScoreAt1`	`number` Output only. The harmonic mean of `recallAt1` and `precisionAt1`.
`truePositiveCount`	`string (int64 format)` Output only. The number of model created labels that match a ground truth label.
`falsePositiveCount`	`string (int64 format)` Output only. The number of model created labels that do not match a ground truth label.
`falseNegativeCount`	`string (int64 format)` Output only. The number of ground truth labels that are not matched by a model created label.
`trueNegativeCount`	`string (int64 format)` Output only. The number of labels that the model did not create and that, had they been created, would not have matched a ground truth label.
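
The four count fields are encoded as JSON strings (int64 format), so they need converting before any arithmetic. The sketch below recomputes precision, recall, false positive rate, and F1 from the counts using the textbook definitions. The function name is hypothetical, and whether the service derives its rate fields in exactly this way is an assumption.

```python
def rates_from_counts(entry: dict) -> dict:
    """Recompute standard rates from the int64-as-string count fields."""
    tp = int(entry.get("truePositiveCount", "0"))
    fp = int(entry.get("falsePositiveCount", "0"))
    fn = int(entry.get("falseNegativeCount", "0"))
    tn = int(entry.get("trueNegativeCount", "0"))

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    return {
        "precision": precision,
        "recall": recall,
        "falsePositiveRate": false_positive_rate,
        "f1Score": f1,
    }
```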

ConfusionMatrix

Confusion matrix of the model running the classification.

JSON representation
{
  "annotationSpecId": [
    string
  ],
  "displayName": [
    string
  ],
  "row": [
    {
      object (Row)
    }
  ]
}

Fields
`annotationSpecId[]`	`string` Output only. IDs of the annotation specs used in the confusion matrix. For Tables CLASSIFICATION `predictionType`, only the list of annotation spec display names (`displayName`) is populated.
`displayName[]`	`string` Output only. Display names of the annotation specs used in the confusion matrix, as they were at the moment of the evaluation. For Tables CLASSIFICATION `predictionType-s`, distinct values of the target column at the moment of the model evaluation are populated here.
`row[]`	`object (Row)` Output only. Rows in the confusion matrix. The number of rows is equal to the size of `annotationSpecId`. `row[i].exampleCount[j]` is the number of examples that have a ground truth of `annotationSpecId[i]` and are predicted as `annotationSpecId[j]` by the model being evaluated.

Row

Output only. A row in the confusion matrix.

JSON representation
{
  "exampleCount": [
    integer
  ]
}

Fields
`exampleCount[]`	`integer` Output only. Value of the specific cell in the confusion matrix. The number of values each row has (i.e. the length of the row) is equal to the length of the `annotationSpecId` field or, if that one is not populated, length of the `displayName` field.
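
Since row `i` holds the ground truth label and column `j` the predicted label, per-label recall and overall accuracy can be read straight off a `ConfusionMatrix` object. The following sketch assumes a parsed JSON dict and falls back to `displayName` when `annotationSpecId` is not populated, as described above. The function names are hypothetical.

```python
def matrix_labels(matrix: dict) -> list:
    """Prefer annotationSpecId; fall back to displayName if it is not populated."""
    return matrix.get("annotationSpecId") or matrix.get("displayName") or []


def per_label_recall(matrix: dict) -> dict:
    """Recall per label: diagonal cell divided by the row total."""
    labels = matrix_labels(matrix)
    rows = [[int(c) for c in row.get("exampleCount", [])]
            for row in matrix.get("row", [])]
    recall = {}
    for i, label in enumerate(labels):
        total = sum(rows[i])
        recall[label] = rows[i][i] / total if total else 0.0
    return recall


def accuracy(matrix: dict) -> float:
    """Share of examples on the diagonal of the confusion matrix."""
    rows = [[int(c) for c in row.get("exampleCount", [])]
            for row in matrix.get("row", [])]
    total = sum(sum(row) for row in rows)
    correct = sum(rows[i][i] for i in range(len(rows)))
    return correct / total if total else 0.0
```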

RegressionEvaluationMetrics

Metrics for regression problems.

JSON representation
{
  "rootMeanSquaredError": number,
  "meanAbsoluteError": number,
  "meanAbsolutePercentageError": number,
  "rSquared": number,
  "rootMeanSquaredLogError": number
}

Fields
`rootMeanSquaredError`	`number` Output only. Root Mean Squared Error (RMSE).
`meanAbsoluteError`	`number` Output only. Mean Absolute Error (MAE).
`meanAbsolutePercentageError`	`number` Output only. Mean absolute percentage error. Only set if all ground truth values are positive.
`rSquared`	`number` Output only. R squared.
`rootMeanSquaredLogError`	`number` Output only. Root mean squared log error.
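
For orientation, the sketch below computes the five regression metrics from lists of ground truth and predicted values using common textbook definitions. These definitions are assumptions: the table above does not state the exact formulas the service uses (for example, whether the percentage error is reported as a fraction or a percentage, or whether R squared is the coefficient of determination or a squared correlation). The function name is hypothetical.

```python
import math


def regression_metrics(truth, predicted):
    """Textbook regression metrics; undefined values are returned as None."""
    if not truth or len(truth) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(truth)
    errors = [p - t for t, p in zip(truth, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # Percentage error is only defined here when every ground truth value is positive.
    mape = None
    if all(t > 0 for t in truth):
        mape = sum(abs(e) / t for e, t in zip(errors, truth)) / n  # as a fraction

    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_truth = sum(truth) / n
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    ss_res = sum(e * e for e in errors)
    r_squared = 1 - ss_res / ss_tot if ss_tot else None

    # Log error needs log(1 + x) to be defined for every value.
    rmsle = None
    if all(v > -1 for v in list(truth) + list(predicted)):
        rmsle = math.sqrt(
            sum((math.log1p(p) - math.log1p(t)) ** 2
                for t, p in zip(truth, predicted)) / n
        )

    return {
        "rootMeanSquaredError": rmse,
        "meanAbsoluteError": mae,
        "meanAbsolutePercentageError": mape,
        "rSquared": r_squared,
        "rootMeanSquaredLogError": rmsle,
    }
```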

TranslationEvaluationMetrics

Evaluation metrics for the dataset.

JSON representation
{
  "bleuScore": number,
  "baseBleuScore": number
}

Fields
`bleuScore`	`number` Output only. BLEU score.
`baseBleuScore`	`number` Output only. BLEU score for base model.

ImageObjectDetectionEvaluationMetrics

Model evaluation metrics for image object detection problems. Evaluates prediction quality of labeled bounding boxes.

JSON representation
{
  "evaluatedBoundingBoxCount": integer,
  "boundingBoxMetricsEntries": [
    {
      object (BoundingBoxMetricsEntry)
    }
  ],
  "boundingBoxMeanAveragePrecision": number
}

Fields
`evaluatedBoundingBoxCount`	`integer` Output only. The total number of bounding boxes (i.e. summed over all images) the ground truth used to create this evaluation had.
`boundingBoxMetricsEntries[]`	`object (BoundingBoxMetricsEntry)` Output only. The bounding boxes match metrics for each Intersection-over-union threshold 0.05,0.10,...,0.95,0.96,0.97,0.98,0.99 and each label confidence threshold 0.05,0.10,...,0.95,0.96,0.97,0.98,0.99 pair.
`boundingBoxMeanAveragePrecision`	`number` Output only. The single metric for bounding boxes evaluation: the meanAveragePrecision averaged over all boundingBoxMetricsEntries.
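
The description of `boundingBoxMeanAveragePrecision` suggests a plain average of `meanAveragePrecision` across all `boundingBoxMetricsEntries`. A minimal sketch of that calculation on a parsed response follows. The function name is hypothetical, and an unweighted arithmetic mean is an assumption about how the service averages.

```python
from typing import Optional


def mean_average_precision(metrics: dict) -> Optional[float]:
    """Average meanAveragePrecision over all IoU-threshold entries."""
    entries = metrics.get("boundingBoxMetricsEntries", [])
    if not entries:
        return None  # nothing to average
    total = sum(entry.get("meanAveragePrecision", 0.0) for entry in entries)
    return total / len(entries)
```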

BoundingBoxMetricsEntry

Bounding box matching model metrics for a single intersection-over-union threshold and multiple label match confidence thresholds.

JSON representation
{
  "iouThreshold": number,
  "meanAveragePrecision": number,
  "confidenceMetricsEntries": [
    {
      object (ConfidenceMetricsEntry)
    }
  ]
}

Fields
`iouThreshold`	`number` Output only. The intersection-over-union threshold value used to compute this metrics entry.
`meanAveragePrecision`	`number` Output only. The mean average precision, most often close to auPrc.
`confidenceMetricsEntries[]`	`object (ConfidenceMetricsEntry)` Output only. Metrics for each label-match confidenceThreshold from 0.05,0.10,...,0.95,0.96,0.97,0.98,0.99. Precision-recall curve is derived from them.
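
As background for `iouThreshold`, intersection-over-union is the overlap area of two boxes divided by the area of their union; a predicted box is usually counted as matching a ground truth box when this ratio meets the threshold. The sketch below uses the standard definition for axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` tuple layout and the function name are assumptions made for illustration, not the API's box representation.

```python
def iou(box_a, box_b) -> float:
    """Intersection-over-union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    overlap_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    overlap_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = overlap_w * overlap_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0
```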

ConfidenceMetricsEntry

Metrics for a single confidence threshold.

JSON representation
{
  "confidenceThreshold": number,
  "recall": number,
  "precision": number,
  "f1Score": number
}

Fields
`confidenceThreshold`	`number` Output only. The confidence threshold value used to compute the metrics.
`recall`	`number` Output only. Recall under the given confidence threshold.
`precision`	`number` Output only. Precision under the given confidence threshold.
`f1Score`	`number` Output only. The harmonic mean of recall and precision.

VideoObjectTrackingEvaluationMetrics

Model evaluation metrics for video object tracking problems. Evaluates prediction quality of both labeled bounding boxes and labeled tracks (i.e. series of bounding boxes sharing same label and instance ID).

JSON representation
{
  "evaluatedFrameCount": integer,
  "evaluatedBoundingBoxCount": integer,
  "boundingBoxMetricsEntries": [
    {
      object (BoundingBoxMetricsEntry)
    }
  ],
  "boundingBoxMeanAveragePrecision": number
}

Fields
`evaluatedFrameCount`	`integer` Output only. The number of video frames used to create this evaluation.
`evaluatedBoundingBoxCount`	`integer` Output only. The total number of bounding boxes (i.e. summed over all frames) the ground truth used to create this evaluation had.
`boundingBoxMetricsEntries[]`	`object (BoundingBoxMetricsEntry)` Output only. The bounding boxes match metrics for each Intersection-over-union threshold 0.05,0.10,...,0.95,0.96,0.97,0.98,0.99 and each label confidence threshold 0.05,0.10,...,0.95,0.96,0.97,0.98,0.99 pair.
`boundingBoxMeanAveragePrecision`	`number` Output only. The single metric for bounding boxes evaluation: the meanAveragePrecision averaged over all boundingBoxMetricsEntries.

TextSentimentEvaluationMetrics

Model evaluation metrics for text sentiment problems.

JSON representation

{
  "precision": number,
  "recall": number,
  "f1Score": number,
  "meanAbsoluteError": number,
  "meanSquaredError": number,
  "linearKappa": number,
  "quadraticKappa": number,
  "confusionMatrix": {
    object (ConfusionMatrix)
  },
  "annotationSpecId": [
    string
  ]
}

Fields
`precision`	`number` Output only. Precision.
`recall`	`number` Output only. Recall.
`f1Score`	`number` Output only. The harmonic mean of recall and precision.
`meanAbsoluteError`	`number` Output only. Mean absolute error. Only set for the overall model evaluation, not for evaluation of a single annotation spec.
`meanSquaredError`	`number` Output only. Mean squared error. Only set for the overall model evaluation, not for evaluation of a single annotation spec.
`linearKappa`	`number` Output only. Linear weighted kappa. Only set for the overall model evaluation, not for evaluation of a single annotation spec.
`quadraticKappa`	`number` Output only. Quadratic weighted kappa. Only set for the overall model evaluation, not for evaluation of a single annotation spec.
`confusionMatrix`	`object (ConfusionMatrix)` Output only. Confusion matrix of the evaluation. Only set for the overall model evaluation, not for evaluation of a single annotation spec.
`annotationSpecId[] (deprecated)`	`string` Output only. Deprecated. The annotation spec IDs used for this evaluation.
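
To make `linearKappa` and `quadraticKappa` more concrete, the sketch below computes Cohen's weighted kappa from a square matrix of counts, such as the `exampleCount` rows of a `ConfusionMatrix`. It uses the standard definition (one minus the ratio of weighted observed disagreement to weighted expected disagreement) and assumes the rows and columns are ordered by sentiment score. Whether the service uses exactly this formulation is an assumption, and the function name is hypothetical.

```python
def weighted_kappa(matrix, weighting="quadratic") -> float:
    """Cohen's weighted kappa for a k x k matrix of counts (rows: truth, columns: predicted)."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    if k < 2 or n == 0:
        raise ValueError("need at least two categories and one example")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")

    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            distance = abs(i - j) / (k - 1)
            weight = distance if weighting == "linear" else distance ** 2
            observed += weight * matrix[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / (n * n)

    if expected == 0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - observed / expected
```

A value of 1 indicates perfect agreement between predictions and ground truth, and 0 indicates agreement no better than chance.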

TextExtractionEvaluationMetrics

Model evaluation metrics for text extraction problems.

JSON representation
{
  "auPrc": number,
  "confidenceMetricsEntries": [
    {
      object (ConfidenceMetricsEntry)
    }
  ]
}

Fields
`auPrc`	`number` Output only. The area under the precision-recall curve metric.
`confidenceMetricsEntries[]`	`object (ConfidenceMetricsEntry)` Output only. Metrics at different confidence thresholds. A precision-recall curve can be derived from them.

ConfidenceMetricsEntry

Metrics for a single confidence threshold.

JSON representation
{
  "confidenceThreshold": number,
  "recall": number,
  "precision": number,
  "f1Score": number
}

Fields
`confidenceThreshold`	`number` Output only. The confidence threshold value used to compute the metrics. Only annotations with score of at least this threshold are considered to be ones the model would return.
`recall`	`number` Output only. Recall under the given confidence threshold.
`precision`	`number` Output only. Precision under the given confidence threshold.
`f1Score`	`number` Output only. The harmonic mean of recall and precision.
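
A common use of these per-threshold entries is choosing an operating threshold. The sketch below picks the entry with the highest `f1Score` among those that meet a minimum precision. It operates on a parsed list of `ConfidenceMetricsEntry` objects; the function name and the `min_precision` parameter are hypothetical, and treating absent numeric fields as 0 is an assumption.

```python
from typing import Optional


def best_threshold(entries, min_precision: float = 0.0) -> Optional[dict]:
    """Return the entry with the highest f1Score whose precision is high enough."""
    candidates = [
        entry for entry in entries
        if entry.get("precision", 0.0) >= min_precision
    ]
    if not candidates:
        return None  # no threshold satisfies the precision requirement
    return max(candidates, key=lambda entry: entry.get("f1Score", 0.0))
```

The selected entry's `confidenceThreshold` can then be applied as a score cutoff when filtering predictions.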

Methods
`get`	Gets a model evaluation.
`list`	Lists model evaluations.
