Train a video object tracking model

This page shows you how to train an AutoML object tracking model from a video dataset using either the Google Cloud console or the Vertex AI API.

Train an AutoML model

Google Cloud console

  1. In the Google Cloud console, in the Vertex AI section, go to the Datasets page.

    Go to the Datasets page

  2. Click the name of the dataset you want to use to train your model to open its details page.

  3. Click Train new model.

  4. Enter the display name for your new model.

  5. If you want manually set how your training data is split, expand Advanced options and select a data split option. Learn more.

  6. Click Continue.

  7. Select the model training method.

    • AutoML is a good choice for a wide range of use cases.
    • Seq2seq+ is a good choice for experimentation. The algorithm is likely to converge faster than AutoML because its architecture is simpler and it uses a smaller search space. Our experiments find that Seq2Seq+ performs well with a small time budget and on datasets smaller than 1 GB in size.
    Click Continue.

  8. Click Start Training.

    Model training can take many hours, depending on the size and complexity of your data and your training budget, if you specified one. You can close this tab and return to it later. You will receive an email when your model has completed training.

    Several minutes after training starts, you can check the training node hour estimation from the model's properties information. If you cancel the training, there is no charge on the current product.

API

Select the tab below for your language or environment:

REST

Before using any of the request data, make the following replacements:

  • LOCATION: Region where Dataset is located and Model will be stored. For example, us-central1.
  • PROJECT: Your project ID.
  • MODEL_DISPLAY_NAME: Display name for the newly trained model.
  • DATASET_ID: ID for the training Dataset.
  • The filterSplit object is optional; you use it to control your data split. For more information about controlling data split, see Controlling the data split using REST.
  • PROJECT_NUMBER: Your project's automatically generated project number

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/trainingPipelines

Request JSON body:

{
    "displayName": "MODE_DISPLAY_NAME",
    "trainingTaskDefinition": "gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_video_object_tracking_1.0.0.yaml",
    "trainingTaskInputs": {},
    "modelToUpload": {"displayName": "MODE_DISPLAY_NAME"},
    "inputDataConfig": {
      "datasetId": "DATASET_ID",
      "filterSplit": {
        "trainingFilter": "labels.ml_use = training",
        "validationFilter": "labels.ml_use = -",
        "testFilter": "labels.ml_use = test"
      }
    }
}

To send your request, expand one of these options:

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/us-central1/trainingPipelines/2307109646608891904",
  "displayName": "myModelName",
  "trainingTaskDefinition": "gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_video_object_tracking_1.0.0.yaml",
  "modelToUpload": {
    "displayName": "myModelName"
  },
  "state": "PIPELINE_STATE_PENDING",
  "createTime": "2020-04-18T01:22:57.479336Z",
  "updateTime": "2020-04-18T01:22:57.479336Z"
}

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


import com.google.cloud.aiplatform.util.ValueConverter;
import com.google.cloud.aiplatform.v1.FilterSplit;
import com.google.cloud.aiplatform.v1.FractionSplit;
import com.google.cloud.aiplatform.v1.InputDataConfig;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.Model;
import com.google.cloud.aiplatform.v1.PipelineServiceClient;
import com.google.cloud.aiplatform.v1.PipelineServiceSettings;
import com.google.cloud.aiplatform.v1.PredefinedSplit;
import com.google.cloud.aiplatform.v1.TimestampSplit;
import com.google.cloud.aiplatform.v1.TrainingPipeline;
import com.google.cloud.aiplatform.v1.schema.trainingjob.definition.AutoMlVideoObjectTrackingInputs;
import com.google.cloud.aiplatform.v1.schema.trainingjob.definition.AutoMlVideoObjectTrackingInputs.ModelType;
import com.google.rpc.Status;
import java.io.IOException;

public class CreateTrainingPipelineVideoObjectTrackingSample {

  public static void main(String[] args) throws IOException {
    String trainingPipelineVideoObjectTracking =
        "YOUR_TRAINING_PIPELINE_VIDEO_OBJECT_TRACKING_DISPLAY_NAME";
    String datasetId = "YOUR_DATASET_ID";
    String modelDisplayName = "YOUR_MODEL_DISPLAY_NAME";
    String project = "YOUR_PROJECT_ID";
    createTrainingPipelineVideoObjectTracking(
        trainingPipelineVideoObjectTracking, datasetId, modelDisplayName, project);
  }

  static void createTrainingPipelineVideoObjectTracking(
      String trainingPipelineVideoObjectTracking,
      String datasetId,
      String modelDisplayName,
      String project)
      throws IOException {
    PipelineServiceSettings pipelineServiceSettings =
        PipelineServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PipelineServiceClient pipelineServiceClient =
        PipelineServiceClient.create(pipelineServiceSettings)) {
      String location = "us-central1";
      String trainingTaskDefinition =
          "gs://google-cloud-aiplatform/schema/trainingjob/definition/"
              + "automl_video_object_tracking_1.0.0.yaml";
      LocationName locationName = LocationName.of(project, location);

      AutoMlVideoObjectTrackingInputs trainingTaskInputs =
          AutoMlVideoObjectTrackingInputs.newBuilder().setModelType(ModelType.CLOUD).build();

      InputDataConfig inputDataConfig =
          InputDataConfig.newBuilder().setDatasetId(datasetId).build();
      Model modelToUpload = Model.newBuilder().setDisplayName(modelDisplayName).build();
      TrainingPipeline trainingPipeline =
          TrainingPipeline.newBuilder()
              .setDisplayName(trainingPipelineVideoObjectTracking)
              .setTrainingTaskDefinition(trainingTaskDefinition)
              .setTrainingTaskInputs(ValueConverter.toValue(trainingTaskInputs))
              .setInputDataConfig(inputDataConfig)
              .setModelToUpload(modelToUpload)
              .build();

      TrainingPipeline createTrainingPipelineResponse =
          pipelineServiceClient.createTrainingPipeline(locationName, trainingPipeline);

      System.out.println("Create Training Pipeline Video Object Tracking Response");
      System.out.format("Name: %s\n", createTrainingPipelineResponse.getName());
      System.out.format("Display Name: %s\n", createTrainingPipelineResponse.getDisplayName());

      System.out.format(
          "Training Task Definition %s\n",
          createTrainingPipelineResponse.getTrainingTaskDefinition());
      System.out.format(
          "Training Task Inputs: %s\n",
          createTrainingPipelineResponse.getTrainingTaskInputs().toString());
      System.out.format(
          "Training Task Metadata: %s\n",
          createTrainingPipelineResponse.getTrainingTaskMetadata().toString());

      System.out.format("State: %s\n", createTrainingPipelineResponse.getState().toString());
      System.out.format(
          "Create Time: %s\n", createTrainingPipelineResponse.getCreateTime().toString());
      System.out.format("StartTime %s\n", createTrainingPipelineResponse.getStartTime().toString());
      System.out.format("End Time: %s\n", createTrainingPipelineResponse.getEndTime().toString());
      System.out.format(
          "Update Time: %s\n", createTrainingPipelineResponse.getUpdateTime().toString());
      System.out.format("Labels: %s\n", createTrainingPipelineResponse.getLabelsMap().toString());

      InputDataConfig inputDataConfigResponse = createTrainingPipelineResponse.getInputDataConfig();
      System.out.println("Input Data config");
      System.out.format("Dataset Id: %s\n", inputDataConfigResponse.getDatasetId());
      System.out.format("Annotations Filter: %s\n", inputDataConfigResponse.getAnnotationsFilter());

      FractionSplit fractionSplit = inputDataConfigResponse.getFractionSplit();
      System.out.println("Fraction split");
      System.out.format("Training Fraction: %s\n", fractionSplit.getTrainingFraction());
      System.out.format("Validation Fraction: %s\n", fractionSplit.getValidationFraction());
      System.out.format("Test Fraction: %s\n", fractionSplit.getTestFraction());

      FilterSplit filterSplit = inputDataConfigResponse.getFilterSplit();
      System.out.println("Filter Split");
      System.out.format("Training Filter: %s\n", filterSplit.getTrainingFilter());
      System.out.format("Validation Filter: %s\n", filterSplit.getValidationFilter());
      System.out.format("Test Filter: %s\n", filterSplit.getTestFilter());

      PredefinedSplit predefinedSplit = inputDataConfigResponse.getPredefinedSplit();
      System.out.println("Predefined Split");
      System.out.format("Key: %s\n", predefinedSplit.getKey());

      TimestampSplit timestampSplit = inputDataConfigResponse.getTimestampSplit();
      System.out.println("Timestamp Split");
      System.out.format("Training Fraction: %s\n", timestampSplit.getTrainingFraction());
      System.out.format("Validation Fraction: %s\n", timestampSplit.getValidationFraction());
      System.out.format("Test Fraction: %s\n", timestampSplit.getTestFraction());
      System.out.format("Key: %s\n", timestampSplit.getKey());

      Model modelResponse = createTrainingPipelineResponse.getModelToUpload();
      System.out.println("Model To Upload");
      System.out.format("Name: %s\n", modelResponse.getName());
      System.out.format("Display Name: %s\n", modelResponse.getDisplayName());
      System.out.format("Description: %s\n", modelResponse.getDescription());
      System.out.format("Metadata Schema Uri: %s\n", modelResponse.getMetadataSchemaUri());
      System.out.format("Metadata: %s\n", modelResponse.getMetadata());

      System.out.format("Training Pipeline: %s\n", modelResponse.getTrainingPipeline());
      System.out.format("Artifact Uri: %s\n", modelResponse.getArtifactUri());

      System.out.format(
          "Supported Deployment Resources Types: %s\n",
          modelResponse.getSupportedDeploymentResourcesTypesList().toString());
      System.out.format(
          "Supported Input Storage Formats: %s\n",
          modelResponse.getSupportedInputStorageFormatsList().toString());
      System.out.format(
          "Supported Output Storage Formats: %s\n",
          modelResponse.getSupportedOutputStorageFormatsList().toString());

      System.out.format("Create Time: %s\n", modelResponse.getCreateTime());
      System.out.format("Update Time: %s\n", modelResponse.getUpdateTime());
      System.out.format("Labels: %s\n", modelResponse.getLabelsMap());

      Status status = createTrainingPipelineResponse.getError();
      System.out.println("Error");
      System.out.format("Code: %s\n", status.getCode());
      System.out.format("Message: %s\n", status.getMessage());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)
 */

// const datasetId = 'YOUR_DATASET_ID';
// const modelDisplayName = 'YOUR_MODEL_DISPLAY_NAME';
// const trainingPipelineDisplayName = 'YOUR_TRAINING_PIPELINE_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {definition} =
  aiplatform.protos.google.cloud.aiplatform.v1.schema.trainingjob;
const ModelType = definition.AutoMlVideoObjectTrackingInputs.ModelType;

// Imports the Google Cloud Pipeline Service Client library
const {PipelineServiceClient} = aiplatform.v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const pipelineServiceClient = new PipelineServiceClient(clientOptions);

async function createTrainingPipelineVideoObjectTracking() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;

  const trainingTaskInputsObj =
    new definition.AutoMlVideoObjectTrackingInputs({
      modelType: ModelType.CLOUD,
    });
  const trainingTaskInputs = trainingTaskInputsObj.toValue();

  const modelToUpload = {displayName: modelDisplayName};
  const inputDataConfig = {datasetId: datasetId};
  const trainingPipeline = {
    displayName: trainingPipelineDisplayName,
    trainingTaskDefinition:
      'gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_video_object_tracking_1.0.0.yaml',
    trainingTaskInputs,
    inputDataConfig,
    modelToUpload,
  };
  const request = {
    parent,
    trainingPipeline,
  };

  // Create training pipeline request
  const [response] =
    await pipelineServiceClient.createTrainingPipeline(request);

  console.log('Create training pipeline video object tracking response');
  console.log(`Name : ${response.name}`);
  console.log('Raw response:');
  console.log(JSON.stringify(response, null, 2));
}
createTrainingPipelineVideoObjectTracking();

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform
from google.cloud.aiplatform.gapic.schema import trainingjob


def create_training_pipeline_video_object_tracking_sample(
    project: str,
    display_name: str,
    dataset_id: str,
    model_display_name: str,
    location: str = "us-central1",
    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
):
    # The AI Platform services require regional API endpoints.
    client_options = {"api_endpoint": api_endpoint}
    # Initialize client that will be used to create and send requests.
    # This client only needs to be created once, and can be reused for multiple requests.
    client = aiplatform.gapic.PipelineServiceClient(client_options=client_options)
    training_task_inputs = trainingjob.definition.AutoMlVideoObjectTrackingInputs(
        model_type="CLOUD",
    ).to_value()

    training_pipeline = {
        "display_name": display_name,
        "training_task_definition": "gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_video_object_tracking_1.0.0.yaml",
        "training_task_inputs": training_task_inputs,
        "input_data_config": {"dataset_id": dataset_id},
        "model_to_upload": {"display_name": model_display_name},
    }
    parent = f"projects/{project}/locations/{location}"
    response = client.create_training_pipeline(
        parent=parent, training_pipeline=training_pipeline
    )
    print("response:", response)

Control the data split using REST

You can control how your training data is split between the training, validation, and test sets. When using the Vertex AI API, use the Split object to determine your data split. The Split object can be included in the InputConfig object as one of several object types, each of which provides a different way to split the training data. You can select one method only.

  • FractionSplit:
    • TRAINING_FRACTION: The fraction of the training data to be used for the training set.
    • VALIDATION_FRACTION: The fraction of the training data to be used for the validation set. Not used for video data.
    • TEST_FRACTION: The fraction of the training data to be used for the test set.

    If any of the fractions are specified, all must be specified. The fractions must add up to 1.0. The default values for the fractions differ depending on your data type. Learn more.

    "fractionSplit": {
      "trainingFraction": TRAINING_FRACTION,
      "validationFraction": VALIDATION_FRACTION,
      "testFraction": TEST_FRACTION
    },
    
  • FilterSplit:
    • TRAINING_FILTER: Data items that match this filter are used for the training set.
    • VALIDATION_FILTER: Data items that match this filter are used for the validation set. Must be "-" for video data.
    • TEST_FILTER: Data items that match this filter are used for the test set.

    These filters can be used with the ml_use label, or with any labels you apply to your data. Learn more about using the ml-use label and other labels to filter your data.

    The following example shows how to use the filterSplit object with the ml_use label, with the validation set included:

    "filterSplit": {
    "trainingFilter": "labels.aiplatform.googleapis.com/ml_use=training",
    "validationFilter": "labels.aiplatform.googleapis.com/ml_use=validation",
    "testFilter": "labels.aiplatform.googleapis.com/ml_use=test"
    }