Get predictions from a video object tracking model

This page shows you how to get batch predictions from your video object tracking models using the Google Cloud console or the Vertex AI API. Batch predictions are asynchronous requests. You request batch predictions directly from the model resource without needing to deploy the model to an endpoint.

AutoML video models do not support online predictions.

Get batch predictions

To make a batch prediction request, you specify an input source and an output format where Vertex AI stores predictions results.

Input data requirements

The input for batch requests specifies the items to send to your model for prediction. Batch predictions for the AutoML video model type use a JSON Lines file to specify a list of videos to make predictions for, and then store the JSON Lines file in a Cloud Storage bucket. You can specify Infinity for the timeSegmentEnd field to specify the end of the video. The following sample shows a single line in an input JSON Lines file.

{'content': 'gs://sourcebucket/datasets/videos/source_video.mp4', 'mimeType': 'video/mp4', 'timeSegmentStart': '0.0s', 'timeSegmentEnd': '2.366667s'}

Request a batch prediction

For batch prediction requests, you can use the Google Cloud console or the Vertex AI API. Depending on the number of input items that you've submitted, a batch prediction task can take some time to complete.

Google Cloud console

Use the Google Cloud console to request a batch prediction.

  1. In the Google Cloud console, in the Vertex AI section, go to the Batch predictions page.

    Go to the Batch predictions page

  2. Click Create to open the New batch prediction window and complete the following steps:

    1. Enter a name for the batch prediction.
    2. For Model name, select the name of the model to use for this batch prediction.
    3. For Source path, specify the Cloud Storage location where your JSON Lines input file is located.
    4. For the Destination path, specify a Cloud Storage location where the batch prediction results are stored. The Output format is determined by your model's objective. AutoML models for image objectives output JSON Lines files.


Use the Vertex AI API to send batch prediction requests.


Before using any of the request data, make the following replacements:

  • LOCATION_ID: Region where Model is stored and batch prediction job is executed. For example, us-central1.
  • PROJECT_ID: Your project ID.
  • BATCH_JOB_NAME: Display name for the batch job
  • MODEL_ID: The ID for the model to use for making predictions
  • THRESHOLD_VALUE (optional): Vertex AI returns only predictions that have confidence scores with at least this value. The default is 0.0.
  • URI: Cloud Storage URI where your input JSON Lines file is located.
  • BUCKET: Your Cloud Storage bucket
  • PROJECT_NUMBER: Your project's automatically generated project number

HTTP method and URL:


Request JSON body:

    "displayName": "BATCH_JOB_NAME",
    "model": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID",
    "modelParameters": {
      "confidenceThreshold": THRESHOLD_VALUE,
    "inputConfig": {
        "instancesFormat": "jsonl",
        "gcsSource": {
            "uris": ["URI"],
    "outputConfig": {
        "predictionsFormat": "jsonl",
        "gcsDestination": {
            "outputUriPrefix": "OUTPUT_BUCKET",

To send your request, choose one of these options:


Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \


Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

  "name": "projects/PROJECT_NUMBER/locations/us-central1/batchPredictionJobs/BATCH_JOB_ID",
  "displayName": "BATCH_JOB_NAME",
  "model": "projects/PROJECT_NUMBER/locations/us-central1/models/MODEL_ID",
  "inputConfig": {
    "instancesFormat": "jsonl",
    "gcsSource": {
      "uris": [
  "outputConfig": {
    "predictionsFormat": "jsonl",
    "gcsDestination": {
      "outputUriPrefix": "BUCKET"
  "state": "JOB_STATE_PENDING",
  "createTime": "2020-05-30T02:58:44.341643Z",
  "updateTime": "2020-05-30T02:58:44.341643Z",
  "modelDisplayName": "MODEL_NAME",
  "modelObjective": "MODEL_OBJECTIVE"

You can poll for the status of the batch job using the BATCH_JOB_ID until the job state is JOB_STATE_SUCCEEDED.


Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import java.util.List;

public class CreateBatchPredictionJobVideoObjectTrackingSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String batchPredictionDisplayName = "YOUR_VIDEO_OBJECT_TRACKING_DISPLAY_NAME";
    String modelId = "YOUR_MODEL_ID";
    String gcsSourceUri =
    String gcsDestinationOutputUriPrefix =
    String project = "YOUR_PROJECT_ID";
        batchPredictionDisplayName, modelId, gcsSourceUri, gcsDestinationOutputUriPrefix, project);

  static void batchPredictionJobVideoObjectTracking(
      String batchPredictionDisplayName,
      String modelId,
      String gcsSourceUri,
      String gcsDestinationOutputUriPrefix,
      String project)
      throws IOException {
    JobServiceSettings jobServiceSettings =

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (JobServiceClient jobServiceClient = JobServiceClient.create(jobServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      ModelName modelName = ModelName.of(project, location, modelId);

      VideoObjectTrackingPredictionParams modelParamsObj =
              .setConfidenceThreshold(((float) 0.5))

      Value modelParameters = ValueConverter.toValue(modelParamsObj);

      GcsSource.Builder gcsSource = GcsSource.newBuilder();
      InputConfig inputConfig =

      GcsDestination gcsDestination =
      OutputConfig outputConfig =

      BatchPredictionJob batchPredictionJob =
      BatchPredictionJob batchPredictionJobResponse =
          jobServiceClient.createBatchPredictionJob(locationName, batchPredictionJob);

      System.out.println("Create Batch Prediction Job Video Object Tracking Response");
      System.out.format("\tName: %s\n", batchPredictionJobResponse.getName());
      System.out.format("\tDisplay Name: %s\n", batchPredictionJobResponse.getDisplayName());
      System.out.format("\tModel %s\n", batchPredictionJobResponse.getModel());
          "\tModel Parameters: %s\n", batchPredictionJobResponse.getModelParameters());

      System.out.format("\tState: %s\n", batchPredictionJobResponse.getState());
      System.out.format("\tCreate Time: %s\n", batchPredictionJobResponse.getCreateTime());
      System.out.format("\tStart Time: %s\n", batchPredictionJobResponse.getStartTime());
      System.out.format("\tEnd Time: %s\n", batchPredictionJobResponse.getEndTime());
      System.out.format("\tUpdate Time: %s\n", batchPredictionJobResponse.getUpdateTime());
      System.out.format("\tLabels: %s\n", batchPredictionJobResponse.getLabelsMap());

      InputConfig inputConfigResponse = batchPredictionJobResponse.getInputConfig();
      System.out.println("\tInput Config");
      System.out.format("\t\tInstances Format: %s\n", inputConfigResponse.getInstancesFormat());

      GcsSource gcsSourceResponse = inputConfigResponse.getGcsSource();
      System.out.println("\t\tGcs Source");
      System.out.format("\t\t\tUris %s\n", gcsSourceResponse.getUrisList());

      BigQuerySource bigQuerySource = inputConfigResponse.getBigquerySource();
      System.out.println("\t\tBigquery Source");
      System.out.format("\t\t\tInput_uri: %s\n", bigQuerySource.getInputUri());

      OutputConfig outputConfigResponse = batchPredictionJobResponse.getOutputConfig();
      System.out.println("\tOutput Config");
          "\t\tPredictions Format: %s\n", outputConfigResponse.getPredictionsFormat());

      GcsDestination gcsDestinationResponse = outputConfigResponse.getGcsDestination();
      System.out.println("\t\tGcs Destination");
          "\t\t\tOutput Uri Prefix: %s\n", gcsDestinationResponse.getOutputUriPrefix());

      BigQueryDestination bigQueryDestination = outputConfigResponse.getBigqueryDestination();
      System.out.println("\t\tBig Query Destination");
      System.out.format("\t\t\tOutput Uri: %s\n", bigQueryDestination.getOutputUri());

      BatchDedicatedResources batchDedicatedResources =
      System.out.println("\tBatch Dedicated Resources");
          "\t\tStarting Replica Count: %s\n", batchDedicatedResources.getStartingReplicaCount());
          "\t\tMax Replica Count: %s\n", batchDedicatedResources.getMaxReplicaCount());

      MachineSpec machineSpec = batchDedicatedResources.getMachineSpec();
      System.out.println("\t\tMachine Spec");
      System.out.format("\t\t\tMachine Type: %s\n", machineSpec.getMachineType());
      System.out.format("\t\t\tAccelerator Type: %s\n", machineSpec.getAcceleratorType());
      System.out.format("\t\t\tAccelerator Count: %s\n", machineSpec.getAcceleratorCount());

      ManualBatchTuningParameters manualBatchTuningParameters =
      System.out.println("\tManual Batch Tuning Parameters");
      System.out.format("\t\tBatch Size: %s\n", manualBatchTuningParameters.getBatchSize());

      OutputInfo outputInfo = batchPredictionJobResponse.getOutputInfo();
      System.out.println("\tOutput Info");
      System.out.format("\t\tGcs Output Directory: %s\n", outputInfo.getGcsOutputDirectory());
      System.out.format("\t\tBigquery Output Dataset: %s\n", outputInfo.getBigqueryOutputDataset());

      Status status = batchPredictionJobResponse.getError();
      System.out.format("\t\tCode: %s\n", status.getCode());
      System.out.format("\t\tMessage: %s\n", status.getMessage());
      List<Any> details = status.getDetailsList();

      for (Status partialFailure : batchPredictionJobResponse.getPartialFailuresList()) {
        System.out.println("\tPartial Failure");
        System.out.format("\t\tCode: %s\n", partialFailure.getCode());
        System.out.format("\t\tMessage: %s\n", partialFailure.getMessage());
        List<Any> partialFailureDetailsList = partialFailure.getDetailsList();

      ResourcesConsumed resourcesConsumed = batchPredictionJobResponse.getResourcesConsumed();
      System.out.println("\tResources Consumed");
      System.out.format("\t\tReplica Hours: %s\n", resourcesConsumed.getReplicaHours());

      CompletionStats completionStats = batchPredictionJobResponse.getCompletionStats();
      System.out.println("\tCompletion Stats");
      System.out.format("\t\tSuccessful Count: %s\n", completionStats.getSuccessfulCount());
      System.out.format("\t\tFailed Count: %s\n", completionStats.getFailedCount());
      System.out.format("\t\tIncomplete Count: %s\n", completionStats.getIncompleteCount());


Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

 * TODO(developer): Uncomment these variables before running the sample.\
 * (Not necessary if passing values as arguments)

// const batchPredictionDisplayName = 'YOUR_BATCH_PREDICTION_DISPLAY_NAME';
// const modelId = 'YOUR_MODEL_ID';
// const gcsSourceUri = 'YOUR_GCS_SOURCE_URI';
// const gcsDestinationOutputUriPrefix = 'YOUR_GCS_DEST_OUTPUT_URI_PREFIX';
//    eg. "gs://<your-gcs-bucket>/destination_path"
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');
const {params} =;

// Imports the Google Cloud Job Service Client library
const {JobServiceClient} = require('@google-cloud/aiplatform').v1;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: '',

// Instantiates a client
const jobServiceClient = new JobServiceClient(clientOptions);

async function createBatchPredictionJobVideoObjectTracking() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const modelName = `projects/${project}/locations/${location}/models/${modelId}`;

  // For more information on how to configure the model parameters object, see
  const modelParamsObj = new params.VideoObjectTrackingPredictionParams({
    confidenceThreshold: 0.5,

  const modelParameters = modelParamsObj.toValue();

  const inputConfig = {
    instancesFormat: 'jsonl',
    gcsSource: {uris: [gcsSourceUri]},
  const outputConfig = {
    predictionsFormat: 'jsonl',
    gcsDestination: {outputUriPrefix: gcsDestinationOutputUriPrefix},
  const batchPredictionJob = {
    displayName: batchPredictionDisplayName,
    model: modelName,
  const request = {

  // Create batch prediction job request
  const [response] = await jobServiceClient.createBatchPredictionJob(request);

  console.log('Create batch prediction job video object tracking response');
  console.log(`Name : ${}`);
  console.log('Raw response:');
  console.log(JSON.stringify(response, null, 2));


To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

def create_batch_prediction_job_sample(
    project: str,
    location: str,
    model_resource_name: str,
    job_display_name: str,
    gcs_source: Union[str, Sequence[str]],
    gcs_destination: str,
    sync: bool = True,
    aiplatform.init(project=project, location=location)

    my_model = aiplatform.Model(model_resource_name)

    batch_prediction_job = my_model.batch_predict(


    return batch_prediction_job

Retrieve batch prediction results

Vertex AI sends batch prediction output to your specified destination.

When a batch prediction task is complete, the output of the prediction is stored in the Cloud Storage bucket that you specified in your request.

Example batch prediction results

The following an example batch prediction results from a video object tracking model.

  "instance": {
   "content": "gs://bucket/video.mp4",
    "mimeType": "video/mp4",
    "timeSegmentStart": "1s",
    "timeSegmentEnd": "5s"
  "prediction": [{
    "id": "1",
    "displayName": "cat",
    "timeSegmentStart": "1.2s",
    "timeSegmentEnd": "3.4s",
    "frames": [{
      "timeOffset": "1.2s",
      "xMin": 0.1,
      "xMax": 0.2,
      "yMin": 0.3,
      "yMax": 0.4
    }, {
      "timeOffset": "3.4s",
      "xMin": 0.2,
      "xMax": 0.3,
      "yMin": 0.4,
      "yMax": 0.5,
    "confidence": 0.7
  }, {
    "id": "1",
    "displayName": "cat",
    "timeSegmentStart": "4.8s",
    "timeSegmentEnd": "4.8s",
    "frames": [{
      "timeOffset": "4.8s",
      "xMin": 0.2,
      "xMax": 0.3,
      "yMin": 0.4,
      "yMax": 0.5,
    "confidence": 0.6
  }, {
    "id": "2",
    "displayName": "dog",
    "timeSegmentStart": "1.2s",
    "timeSegmentEnd": "3.4s",
    "frames": [{
      "timeOffset": "1.2s",
      "xMin": 0.1,
      "xMax": 0.2,
      "yMin": 0.3,
      "yMax": 0.4
    }, {
      "timeOffset": "3.4s",
      "xMin": 0.2,
      "xMax": 0.3,
      "yMin": 0.4,
      "yMax": 0.5,
    "confidence": 0.5