Evaluating models

After training a model, AutoML Video Intelligence Classification uses items from the TEST set to evaluate the quality and accuracy of the new model.

AutoML Video Intelligence Classification provides an aggregate set of evaluation metrics indicating how well the model performs overall, as well as evaluation metrics for each category label, indicating how well the model performs for that label.

  • AuPRC : Area under Precision/Recall curve, also referred to as "average precision." Generally between 0.5 and 1.0. Higher values indicate more accurate models.

  • The Confidence threshold curves show how different confidence thresholds would affect precision, recall, true and false positive rates. Read about the relationship of precision and recall.

Use this data to evaluate your model's readiness. Low AUC scores, or low precision and recall scores can indicate that your model needs additional training data or has inconsistent labels. A very high AUC score and perfect precision and recall could indicate that the data is too easy and may not generalize well.

Get model evaluation values

Web UI

  1. Open the Models page in the AutoML Video Classification UI.

  2. Click the row for the model you want to evaluate.

  3. Click the Evaluate tab.

    If training has completed for the model, AutoML Video Classification shows its evaluation metrics.

    Evaluate tab with model information
  4. To view metrics for a specific label, select the label name from the list of labels in the lower part of the page.


Before using any of the request data below, make the following replacements:

  • model-name: the full name of your model, from the response when you created the model. The full name has the format: projects/project-number/locations/location-id/models/model-id.

HTTP method and URL:

GET  https://automl.googleapis.com/v1beta1/model-name:modelEvaluations

To send your request, choose one of these options:


Execute the following command:

curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \


Execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri " https://automl.googleapis.com/v1beta1/model-name:modelEvaluations" | Select-Object -Expand Content
The response includes a ModelEvaluation resource for the overall model, for example: 2308399322008336298.


import com.google.cloud.automl.v1beta1.AutoMlClient;
import com.google.cloud.automl.v1beta1.ModelEvaluation;
import com.google.cloud.automl.v1beta1.ModelEvaluationName;
import java.io.IOException;

class GetModelEvaluation {

  static void getModelEvaluation() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    String modelEvaluationId = "YOUR_MODEL_EVALUATION_ID";
    getModelEvaluation(projectId, modelId, modelEvaluationId);

  // Get a model evaluation
  static void getModelEvaluation(String projectId, String modelId, String modelEvaluationId)
      throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the model evaluation.
      ModelEvaluationName modelEvaluationFullId =
          ModelEvaluationName.of(projectId, "us-central1", modelId, modelEvaluationId);

      // Get complete detail of the model evaluation.
      ModelEvaluation modelEvaluation = client.getModelEvaluation(modelEvaluationFullId);

      System.out.format("Model Evaluation Name: %s%n", modelEvaluation.getName());
      System.out.format("Model Annotation Spec Id: %s", modelEvaluation.getAnnotationSpecId());
      System.out.println("Create Time:");
      System.out.format("\tseconds: %s%n", modelEvaluation.getCreateTime().getSeconds());
      System.out.format("\tnanos: %s", modelEvaluation.getCreateTime().getNanos() / 1e9);
          "Evalution Example Count: %d%n", modelEvaluation.getEvaluatedExampleCount());

          "Classification Model Evaluation Metrics: %s%n",


 * TODO(developer): Uncomment these variables before running the sample.
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';
// const modelEvaluationId = 'YOUR_MODEL_EVALUATION_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1beta1;

// Instantiates a client
const client = new AutoMlClient();

async function getModelEvaluation() {
  // Construct request
  const request = {
    name: client.modelEvaluationPath(

  const [response] = await client.getModelEvaluation(request);

  console.log(`Model evaluation name: ${response.name}`);
  console.log(`Model annotation spec id: ${response.annotationSpecId}`);
  console.log(`Model display name: ${response.displayName}`);
  console.log('Model create time');
  console.log(`\tseconds ${response.createTime.seconds}`);
  console.log(`\tnanos ${response.createTime.nanos / 1e9}`);
  console.log(`Evaluation example count: ${response.evaluatedExampleCount}`);

    `Video classification model evaluation metrics: ${response.videoClassificationEvaluationMetrics}`



from google.cloud import automl_v1beta1 as automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"
# model_evaluation_id = "YOUR_MODEL_EVALUATION_ID"

client = automl.AutoMlClient()
# Get the full path of the model evaluation.
model_evaluation_full_id = client.model_evaluation_path(
    project_id, "us-central1", model_id, model_evaluation_id

# Get complete detail of the model evaluation.
response = client.get_model_evaluation(model_evaluation_full_id)

print("Model evaluation name: {}".format(response.name))
print("Model annotation spec id: {}".format(response.annotation_spec_id))
print("Create Time:")
print("\tseconds: {}".format(response.create_time.seconds))
print("\tnanos: {}".format(response.create_time.nanos / 1e9))
    "Evaluation example count: {}".format(response.evaluated_example_count)

    "Classification model evaluation metrics: {}".format(

Iterate on your model

If you're not happy with the quality levels, you can go back to earlier steps to improve the quality:

  • You may need to add different types of videos (e.g. wider angle, higher or lower resolution, different points of view).
  • Consider removing labels altogether if you don't have enough training videos.
  • Remember that machines can’t read your label name; it's just a random string of letters to them. If you have one label that says "door" and another that says "door_with_knob" the machine has no way of figuring out the nuance other than the videos you provide it.
  • Augment your data with more examples of true positives and negatives. Especially important examples are the ones that are close to the decision boundary.

Once you've made changes, train and evaluate a new model until you reach a high enough quality level.