
Track objects

Object tracking tracks multiple objects detected in an input video. To make an object tracking request, call the annotate method and specify OBJECT_TRACKING in the features field.

An object tracking request annotates a video with labels and spatial locations for entities detected in the video or in video segments. For example, a video of vehicles crossing a traffic signal might produce labels such as "car", "truck", "bike", "tires", "lights", "window", and so on. Each label can include a series of bounding boxes, where each bounding box has an associated time segment containing a time offset that indicates the duration offset from the beginning of the video. The annotation also contains additional entity information, including an entity ID that you can use to find more information about the entity in the Google Knowledge Graph Search API.
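
The Knowledge Graph Search API is a separate REST endpoint. As a minimal illustration (this sketch is not part of the official samples; it assumes the Python requests library and an API key of your own), an entity ID returned by object tracking could be looked up like this:

import requests

def lookup_entity(entity_id, api_key):
    """Query the Knowledge Graph Search API for a single entity ID."""
    response = requests.get(
        "https://kgsearch.googleapis.com/v1/entities:search",
        params={"ids": entity_id, "key": api_key, "limit": 1},
    )
    response.raise_for_status()
    return response.json()

# "/m/0k4j" is the Knowledge Graph ID for "car"; the key is a placeholder.
print(lookup_entity("/m/0k4j", "your-api-key"))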

Object tracking vs. label detection

Object tracking differs from label detection: label detection provides labels without bounding boxes, whereas object tracking provides labels for the individual objects present in a given video, along with the bounding box of each object instance at each time step.

In addition, object tracking attempts to assign the same ID to the same object instance for as long as that object has not left the scene. So if there is a red car and a blue car at second zero, the tracking request assigns one ID to the red car (for example, 0) and another to the blue car (for example, 1). If both cars are still visible at second one, the tracking request assigns them the same IDs they received at second zero.
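
In the v1 batch response, each element of objectAnnotations corresponds to one tracked object instance, so its index can serve as such an ID. A minimal Python sketch (assuming result was obtained with operation.result(), as in the Python sample later on this page):

# List each tracked instance with its entity label and time span.
annotation_result = result.annotation_results[0]
for track_id, annotation in enumerate(annotation_result.object_annotations):
    segment = annotation.segment
    start = segment.start_time_offset.seconds + segment.start_time_offset.microseconds / 1e6
    end = segment.end_time_offset.seconds + segment.end_time_offset.microseconds / 1e6
    print(f"Track {track_id}: {annotation.entity.description} from {start:.2f}s to {end:.2f}s")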

Request object tracking for a video on Cloud Storage

The following samples demonstrate object tracking on a file located in Cloud Storage.

REST & CMD LINE

Send the process request

The following shows how to send a POST request to the annotate method. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence quickstart.

Before using any of the request data below, make the following replacements:

  • input-uri: a Cloud Storage bucket that contains the file you want to annotate, including the file name. Must start with gs://.
    For example:
    "inputUri": "gs://cloud-videointelligence-demo/assistant.mp4",

HTTP method and URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

Request JSON body:

{
  "inputUri": "base-64-encoded-content",
  "features": ["OBJECT_TRACKING"]
}

To send your request, use curl or any other HTTP client, passing the access token in an Authorization header.
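
As a minimal sketch (an assumption rather than an official sample: it uses the Python requests library and shells out to the gcloud CLI for an access token), the request could be sent like this:

import subprocess

import requests

# Obtain an access token from the Cloud SDK.
token = subprocess.check_output(
    ["gcloud", "auth", "print-access-token"], text=True
).strip()

response = requests.post(
    "https://videointelligence.googleapis.com/v1/videos:annotate",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "inputUri": "gs://cloud-videointelligence-demo/assistant.mp4",
        "features": ["OBJECT_TRACKING"],
    },
)
response.raise_for_status()
print(response.json())  # {"name": "projects/.../operations/..."}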

You will receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/location-id/operations/operation-id"
}

If the request is successful, the Video Intelligence API returns the name of your operation. The above is an example of such a response, where project-number is your project number and operation-id is the ID of the long-running operation created for the request.

Get the results

To get the results of the request, send a GET request using the operation name returned from the call to videos:annotate, as shown in the following example.

Before using any of the request data below, make the following replacements:

  • operation-name: the name of the operation as returned by the Video Intelligence API. The operation name has the format projects/project-number/locations/location-id/operations/operation-id.

HTTP method and URL:

GET https://videointelligence.googleapis.com/v1/operation-name

To send your request, use curl or any other HTTP client, passing the access token in an Authorization header.
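
A minimal polling sketch (again an assumption, not an official sample; it reuses the requests library and a gcloud access token, and operation-name is the placeholder described above):

import subprocess
import time

import requests

token = subprocess.check_output(
    ["gcloud", "auth", "print-access-token"], text=True
).strip()

# Replace with the name returned by videos:annotate.
operation_name = "projects/project-number/locations/location-id/operations/operation-id"
url = f"https://videointelligence.googleapis.com/v1/{operation_name}"

# Poll until the long-running operation reports done.
while True:
    operation = requests.get(url, headers={"Authorization": f"Bearer {token}"}).json()
    if operation.get("done"):
        break
    time.sleep(10)

print(operation["response"]["annotationResults"][0]["objectAnnotations"][0])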

You will receive a JSON representation of the operation; once its done field is true, the response contains the annotationResults with the object tracking annotations.

Download the annotation results

Copy the annotations from the source to the destination bucket (see Copy files and objects):

gsutil cp gcs_uri gs://my-bucket

Note: If you provide an output gcs URI, the annotations are stored at that gcs URI.

C#

public static object TrackObjectGcs(string gcsUri)
{
    var client = VideoIntelligenceServiceClient.Create();
    var request = new AnnotateVideoRequest
    {
        InputUri = gcsUri,
        Features = { Feature.ObjectTracking },
        // It is recommended to use location_id as 'us-east1' for the
        // best latency due to different types of processors used in
        // this region and others.
        LocationId = "us-east1"
    };

    Console.WriteLine("\nProcessing video for object annotations.");
    var op = client.AnnotateVideo(request).PollUntilCompleted();

    Console.WriteLine("\nFinished processing.\n");

    // Retrieve first result because a single video was processed.
    var objectAnnotations = op.Result.AnnotationResults[0]
                              .ObjectAnnotations;

    // Get only the first annotation for demo purposes
    var objAnnotation = objectAnnotations[0];

    Console.WriteLine(
        $"Entity description: {objAnnotation.Entity.Description}");

    if (objAnnotation.Entity.EntityId != null)
    {
        Console.WriteLine(
            $"Entity id: {objAnnotation.Entity.EntityId}");
    }

    Console.Write($"Segment: ");
    Console.WriteLine(
        String.Format("{0}s to {1}s",
                      objAnnotation.Segment.StartTimeOffset.Seconds +
                      objAnnotation.Segment.StartTimeOffset.Nanos / 1e9,
                      objAnnotation.Segment.EndTimeOffset.Seconds +
                      objAnnotation.Segment.EndTimeOffset.Nanos / 1e9));

    Console.WriteLine($"Confidence: {objAnnotation.Confidence}");

    // Here we print only the bounding box of the first frame in this segment
    var frame = objAnnotation.Frames[0];
    var box = frame.NormalizedBoundingBox;
    Console.WriteLine(
        String.Format("Time offset of the first frame: {0}s",
                      frame.TimeOffset.Seconds +
                      frame.TimeOffset.Nanos / 1e9));
    Console.WriteLine("Bounding box positions:");
    Console.WriteLine($"\tleft   : {box.Left}");
    Console.WriteLine($"\ttop    : {box.Top}");
    Console.WriteLine($"\tright  : {box.Right}");
    Console.WriteLine($"\tbottom : {box.Bottom}");

    return 0;
}

Go


import (
	"context"
	"fmt"
	"io"

	video "cloud.google.com/go/videointelligence/apiv1"
	"github.com/golang/protobuf/ptypes"
	videopb "google.golang.org/genproto/googleapis/cloud/videointelligence/v1"
)

// objectTrackingGCS analyzes a video and extracts entities with their bounding boxes.
func objectTrackingGCS(w io.Writer, gcsURI string) error {
	// gcsURI := "gs://cloud-samples-data/video/cat.mp4"

	ctx := context.Background()

	// Creates a client.
	client, err := video.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("video.NewClient: %v", err)
	}

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		InputUri: gcsURI,
		Features: []videopb.Feature{
			videopb.Feature_OBJECT_TRACKING,
		},
	})
	if err != nil {
		return fmt.Errorf("AnnotateVideo: %v", err)
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		return fmt.Errorf("Wait: %v", err)
	}

	// Only one video was processed, so get the first result.
	result := resp.GetAnnotationResults()[0]

	for _, annotation := range result.ObjectAnnotations {
		fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
		if len(annotation.Entity.EntityId) > 0 {
			fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
		}

		segment := annotation.GetSegment()
		start, _ := ptypes.Duration(segment.GetStartTimeOffset())
		end, _ := ptypes.Duration(segment.GetEndTimeOffset())
		fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)

		fmt.Fprintf(w, "\tConfidence: %f\n", annotation.GetConfidence())

		// Here we print only the bounding box of the first frame in this segment.
		frame := annotation.GetFrames()[0]
		seconds := float32(frame.GetTimeOffset().GetSeconds())
		nanos := float32(frame.GetTimeOffset().GetNanos())
		fmt.Fprintf(w, "\tTime offset of the first frame: %fs\n", seconds+nanos/1e9)

		box := frame.GetNormalizedBoundingBox()
		fmt.Fprintf(w, "\tBounding box position:\n")
		fmt.Fprintf(w, "\t\tleft  : %f\n", box.GetLeft())
		fmt.Fprintf(w, "\t\ttop   : %f\n", box.GetTop())
		fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
		fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
	}

	return nil
}

Java

/**
 * Track objects in a video.
 *
 * @param gcsUri the path to the video file to analyze.
 */
public static VideoAnnotationResults trackObjectsGcs(String gcsUri) throws Exception {
  try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
    // Create the request
    AnnotateVideoRequest request =
        AnnotateVideoRequest.newBuilder()
            .setInputUri(gcsUri)
            .addFeatures(Feature.OBJECT_TRACKING)
            .setLocationId("us-east1")
            .build();

    // asynchronously perform object tracking on videos
    OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
        client.annotateVideoAsync(request);

    System.out.println("Waiting for operation to complete...");
    // The first result is retrieved because a single video was processed.
    AnnotateVideoResponse response = future.get(300, TimeUnit.SECONDS);
    VideoAnnotationResults results = response.getAnnotationResults(0);

    // Get only the first annotation for demo purposes.
    ObjectTrackingAnnotation annotation = results.getObjectAnnotations(0);
    System.out.println("Confidence: " + annotation.getConfidence());

    if (annotation.hasEntity()) {
      Entity entity = annotation.getEntity();
      System.out.println("Entity description: " + entity.getDescription());
      System.out.println("Entity id:: " + entity.getEntityId());
    }

    if (annotation.hasSegment()) {
      VideoSegment videoSegment = annotation.getSegment();
      Duration startTimeOffset = videoSegment.getStartTimeOffset();
      Duration endTimeOffset = videoSegment.getEndTimeOffset();
      // Display the segment time in seconds, 1e9 converts nanos to seconds
      System.out.println(
          String.format(
              "Segment: %.2fs to %.2fs",
              startTimeOffset.getSeconds() + startTimeOffset.getNanos() / 1e9,
              endTimeOffset.getSeconds() + endTimeOffset.getNanos() / 1e9));
    }

    // Here we print only the bounding box of the first frame in this segment.
    ObjectTrackingFrame frame = annotation.getFrames(0);
    // Display the offset time in seconds, 1e9 converts nanos to seconds
    Duration timeOffset = frame.getTimeOffset();
    System.out.println(
        String.format(
            "Time offset of the first frame: %.2fs",
            timeOffset.getSeconds() + timeOffset.getNanos() / 1e9));

    // Display the bounding box of the detected object
    NormalizedBoundingBox normalizedBoundingBox = frame.getNormalizedBoundingBox();
    System.out.println("Bounding box position:");
    System.out.println("\tleft: " + normalizedBoundingBox.getLeft());
    System.out.println("\ttop: " + normalizedBoundingBox.getTop());
    System.out.println("\tright: " + normalizedBoundingBox.getRight());
    System.out.println("\tbottom: " + normalizedBoundingBox.getBottom());
    return results;
  }
}

Node.js

// Imports the Google Cloud Video Intelligence library
const Video = require('@google-cloud/video-intelligence');

// Creates a client
const video = new Video.VideoIntelligenceServiceClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';

const request = {
  inputUri: gcsUri,
  features: ['OBJECT_TRACKING'],
  //recommended to use us-east1 for the best latency due to different types of processors used in this region and others
  locationId: 'us-east1',
};
// Detects objects in a video
const [operation] = await video.annotateVideo(request);
console.log('Waiting for operation to complete...');
const results = await operation.promise();
//Gets annotations for video
const annotations = results[0].annotationResults[0];
const objects = annotations.objectAnnotations;
objects.forEach(object => {
  console.log(`Entity description:  ${object.entity.description}`);
  console.log(`Entity id: ${object.entity.entityId}`);
  const time = object.segment;
  console.log(
    `Segment: ${time.startTimeOffset.seconds || 0}` +
      `.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s to ${
        time.endTimeOffset.seconds || 0
      }.` +
      `${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log(`Confidence: ${object.confidence}`);
  const frame = object.frames[0];
  const box = frame.normalizedBoundingBox;
  const timeOffset = frame.timeOffset;
  console.log(
    `Time offset for the first frame: ${timeOffset.seconds || 0}` +
      `.${(timeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log('Bounding box position:');
  console.log(` left   :${box.left}`);
  console.log(` top    :${box.top}`);
  console.log(` right  :${box.right}`);
  console.log(` bottom :${box.bottom}`);
});

PHP

use Google\Cloud\VideoIntelligence\V1\VideoIntelligenceServiceClient;
use Google\Cloud\VideoIntelligence\V1\Feature;

/** Uncomment and populate these variables in your code */
// $uri = 'The cloud storage object to analyze (gs://your-bucket-name/your-object-name)';
// $options = [];

# Instantiate a client.
$video = new VideoIntelligenceServiceClient();

# Execute a request.
$operation = $video->annotateVideo([
    'inputUri' => $uri,
    'features' => [Feature::OBJECT_TRACKING]
]);

# Wait for the request to complete.
$operation->pollUntilComplete($options);

# Print the results.
if ($operation->operationSucceeded()) {
    $results = $operation->getResult()->getAnnotationResults()[0];
    # Process video/segment level label annotations
    $objectEntity = $results->getObjectAnnotations()[0];

    printf('Video object entity: %s' . PHP_EOL, $objectEntity->getEntity()->getEntityId());
    printf('Video object description: %s' . PHP_EOL, $objectEntity->getEntity()->getDescription());

    $start = $objectEntity->getSegment()->getStartTimeOffset();
    $end = $objectEntity->getSegment()->getEndTimeOffset();
    printf('  Segment: %ss to %ss' . PHP_EOL,
        $start->getSeconds() + $start->getNanos()/1000000000.0,
        $end->getSeconds() + $end->getNanos()/1000000000.0);
    printf('  Confidence: %f' . PHP_EOL, $objectEntity->getConfidence());

    foreach ($objectEntity->getFrames() as $objectEntityFrame) {
        $offset = $objectEntityFrame->getTimeOffset();
        $boundingBox = $objectEntityFrame->getNormalizedBoundingBox();
        printf('  Time offset: %ss' . PHP_EOL,
            $offset->getSeconds() + $offset->getNanos()/1000000000.0);
        printf('  Bounding box position:' . PHP_EOL);
        printf('   Left: %s', $boundingBox->getLeft());
        printf('   Top: %s', $boundingBox->getTop());
        printf('   Right: %s', $boundingBox->getRight());
        printf('   Bottom: %s', $boundingBox->getBottom());
    }
    print(PHP_EOL);
} else {
    print_r($operation->getError());
}

Python

"""Object tracking in a video stored on GCS."""
from google.cloud import videointelligence

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.OBJECT_TRACKING]
operation = video_client.annotate_video(
    request={"features": features, "input_uri": gcs_uri}
)
print("\nProcessing video for object annotations.")

result = operation.result(timeout=300)
print("\nFinished processing.\n")

# The first result is retrieved because a single video was processed.
object_annotations = result.annotation_results[0].object_annotations

for object_annotation in object_annotations:
    print("Entity description: {}".format(object_annotation.entity.description))
    if object_annotation.entity.entity_id:
        print("Entity id: {}".format(object_annotation.entity.entity_id))

    print(
        "Segment: {}s to {}s".format(
            object_annotation.segment.start_time_offset.seconds
            + object_annotation.segment.start_time_offset.microseconds / 1e6,
            object_annotation.segment.end_time_offset.seconds
            + object_annotation.segment.end_time_offset.microseconds / 1e6,
        )
    )

    print("Confidence: {}".format(object_annotation.confidence))

    # Here we print only the bounding box of the first frame in the segment
    frame = object_annotation.frames[0]
    box = frame.normalized_bounding_box
    print(
        "Time offset of the first frame: {}s".format(
            frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
        )
    )
    print("Bounding box position:")
    print("\tleft  : {}".format(box.left))
    print("\ttop   : {}".format(box.top))
    print("\tright : {}".format(box.right))
    print("\tbottom: {}".format(box.bottom))
    print("\n")

Ruby

# path = "Path to a video file on Google Cloud Storage: gs://bucket/video.mp4"

require "google/cloud/video_intelligence"

video = Google::Cloud::VideoIntelligence.video_intelligence_service

# Register a callback during the method call
operation = video.annotate_video features: [:OBJECT_TRACKING], input_uri: path

puts "Processing video for object tracking:"
operation.wait_until_done!

raise operation.error.message if operation.error?
puts "Finished Processing."

object_annotations = operation.results.annotation_results.first.object_annotations
print_object_annotations object_annotations

Request object tracking for a video from a local file

The following samples demonstrate object tracking on a file stored locally.

REST & CMD LINE

Send the process request

To perform annotation on a local video file, base64-encode the contents of the video file. Include the base64-encoded content in the inputContent field of the request. For information on how to base64-encode the contents of a video file, see Base64 Encoding.

The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence quickstart.

Before using any of the request data below, make the following replacements:

  • inputContent: base-64-encoded-content
    Por exemplo: "UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA..."

HTTP method and URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

Request JSON body:

{
  "inputContent": "base-64-encoded-content",
  "features": ["OBJECT_TRACKING"]
}

To send your request, use curl or any other HTTP client, passing the access token in an Authorization header.
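
A minimal sketch (an assumption, not an official sample) that base64-encodes a local file with Python's standard library and sends it inline; the file path is hypothetical:

import base64
import subprocess

import requests

token = subprocess.check_output(
    ["gcloud", "auth", "print-access-token"], text=True
).strip()

# Read and base64-encode the local video file (hypothetical path).
with open("my-video.mp4", "rb") as video_file:
    encoded = base64.b64encode(video_file.read()).decode("utf-8")

response = requests.post(
    "https://videointelligence.googleapis.com/v1/videos:annotate",
    headers={"Authorization": f"Bearer {token}"},
    json={"inputContent": encoded, "features": ["OBJECT_TRACKING"]},
)
response.raise_for_status()
print(response.json()["name"])  # operation name to poll for results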

You will receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/location-id/operations/operation-id"
}

If the request is successful, Video Intelligence returns the name of your operation, where project-number is your project number and operation-id is the ID of the long-running operation created for the request.

Get the results

To get the results of the request, send a GET request using the operation name returned from the call to videos:annotate, as shown in the following example.

Before using any of the request data below, make the following replacements:

  • operation-name: the name of the operation as returned by the Video Intelligence API. The operation name has the format projects/project-number/locations/location-id/operations/operation-id.

HTTP method and URL:

GET https://videointelligence.googleapis.com/v1/operation-name

To send your request, use curl or any other HTTP client, passing the access token in an Authorization header, as shown earlier.

You will receive a JSON representation of the operation; once its done field is true, the response contains the annotationResults with the object tracking annotations.

C#

public static object TrackObject(string filePath)
{
    var client = VideoIntelligenceServiceClient.Create();
    var request = new AnnotateVideoRequest
    {
        InputContent = Google.Protobuf.ByteString.CopyFrom(File.ReadAllBytes(filePath)),
        Features = { Feature.ObjectTracking },
        // It is recommended to use location_id as 'us-east1' for the
        // best latency due to different types of processors used in
        // this region and others.
        LocationId = "us-east1"
    };

    Console.WriteLine("\nProcessing video for object annotations.");
    var op = client.AnnotateVideo(request).PollUntilCompleted();

    Console.WriteLine("\nFinished processing.\n");

    // Retrieve first result because a single video was processed.
    var objectAnnotations = op.Result.AnnotationResults[0]
                              .ObjectAnnotations;

    // Get only the first annotation for demo purposes
    var objAnnotation = objectAnnotations[0];

    Console.WriteLine(
        $"Entity description: {objAnnotation.Entity.Description}");

    if (objAnnotation.Entity.EntityId != null)
    {
        Console.WriteLine(
            $"Entity id: {objAnnotation.Entity.EntityId}");
    }

    Console.Write($"Segment: ");
    Console.WriteLine(
        String.Format("{0}s to {1}s",
                      objAnnotation.Segment.StartTimeOffset.Seconds +
                      objAnnotation.Segment.StartTimeOffset.Nanos / 1e9,
                      objAnnotation.Segment.EndTimeOffset.Seconds +
                      objAnnotation.Segment.EndTimeOffset.Nanos / 1e9));

    Console.WriteLine($"Confidence: {objAnnotation.Confidence}");

    // Here we print only the bounding box of the first frame in this segment
    var frame = objAnnotation.Frames[0];
    var box = frame.NormalizedBoundingBox;
    Console.WriteLine(
        String.Format("Time offset of the first frame: {0}s",
                      frame.TimeOffset.Seconds +
                      frame.TimeOffset.Nanos / 1e9));
    Console.WriteLine("Bounding box positions:");
    Console.WriteLine($"\tleft   : {box.Left}");
    Console.WriteLine($"\ttop    : {box.Top}");
    Console.WriteLine($"\tright  : {box.Right}");
    Console.WriteLine($"\tbottom : {box.Bottom}");

    return 0;
}

Go


import (
	"context"
	"fmt"
	"io"
	"io/ioutil"

	video "cloud.google.com/go/videointelligence/apiv1"
	"github.com/golang/protobuf/ptypes"
	videopb "google.golang.org/genproto/googleapis/cloud/videointelligence/v1"
)

// objectTracking analyzes a video and extracts entities with their bounding boxes.
func objectTracking(w io.Writer, filename string) error {
	// filename := "../testdata/cat.mp4"

	ctx := context.Background()

	// Creates a client.
	client, err := video.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("video.NewClient: %v", err)
	}

	fileBytes, err := ioutil.ReadFile(filename)
	if err != nil {
		return err
	}

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		InputContent: fileBytes,
		Features: []videopb.Feature{
			videopb.Feature_OBJECT_TRACKING,
		},
	})
	if err != nil {
		return fmt.Errorf("AnnotateVideo: %v", err)
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		return fmt.Errorf("Wait: %v", err)
	}

	// Only one video was processed, so get the first result.
	result := resp.GetAnnotationResults()[0]

	for _, annotation := range result.ObjectAnnotations {
		fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
		if len(annotation.Entity.EntityId) > 0 {
			fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
		}

		segment := annotation.GetSegment()
		start, _ := ptypes.Duration(segment.GetStartTimeOffset())
		end, _ := ptypes.Duration(segment.GetEndTimeOffset())
		fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)

		fmt.Fprintf(w, "\tConfidence: %f\n", annotation.GetConfidence())

		// Here we print only the bounding box of the first frame in this segment.
		frame := annotation.GetFrames()[0]
		seconds := float32(frame.GetTimeOffset().GetSeconds())
		nanos := float32(frame.GetTimeOffset().GetNanos())
		fmt.Fprintf(w, "\tTime offset of the first frame: %fs\n", seconds+nanos/1e9)

		box := frame.GetNormalizedBoundingBox()
		fmt.Fprintf(w, "\tBounding box position:\n")
		fmt.Fprintf(w, "\t\tleft  : %f\n", box.GetLeft())
		fmt.Fprintf(w, "\t\ttop   : %f\n", box.GetTop())
		fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
		fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
	}

	return nil
}

Java

/**
 * Track objects in a video.
 *
 * @param filePath the path to the video file to analyze.
 */
public static VideoAnnotationResults trackObjects(String filePath) throws Exception {
  try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
    // Read file
    Path path = Paths.get(filePath);
    byte[] data = Files.readAllBytes(path);

    // Create the request
    AnnotateVideoRequest request =
        AnnotateVideoRequest.newBuilder()
            .setInputContent(ByteString.copyFrom(data))
            .addFeatures(Feature.OBJECT_TRACKING)
            .setLocationId("us-east1")
            .build();

    // asynchronously perform object tracking on videos
    OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
        client.annotateVideoAsync(request);

    System.out.println("Waiting for operation to complete...");
    // The first result is retrieved because a single video was processed.
    AnnotateVideoResponse response = future.get(300, TimeUnit.SECONDS);
    VideoAnnotationResults results = response.getAnnotationResults(0);

    // Get only the first annotation for demo purposes.
    ObjectTrackingAnnotation annotation = results.getObjectAnnotations(0);
    System.out.println("Confidence: " + annotation.getConfidence());

    if (annotation.hasEntity()) {
      Entity entity = annotation.getEntity();
      System.out.println("Entity description: " + entity.getDescription());
      System.out.println("Entity id:: " + entity.getEntityId());
    }

    if (annotation.hasSegment()) {
      VideoSegment videoSegment = annotation.getSegment();
      Duration startTimeOffset = videoSegment.getStartTimeOffset();
      Duration endTimeOffset = videoSegment.getEndTimeOffset();
      // Display the segment time in seconds, 1e9 converts nanos to seconds
      System.out.println(
          String.format(
              "Segment: %.2fs to %.2fs",
              startTimeOffset.getSeconds() + startTimeOffset.getNanos() / 1e9,
              endTimeOffset.getSeconds() + endTimeOffset.getNanos() / 1e9));
    }

    // Here we print only the bounding box of the first frame in this segment.
    ObjectTrackingFrame frame = annotation.getFrames(0);
    // Display the offset time in seconds, 1e9 converts nanos to seconds
    Duration timeOffset = frame.getTimeOffset();
    System.out.println(
        String.format(
            "Time offset of the first frame: %.2fs",
            timeOffset.getSeconds() + timeOffset.getNanos() / 1e9));

    // Display the bounding box of the detected object
    NormalizedBoundingBox normalizedBoundingBox = frame.getNormalizedBoundingBox();
    System.out.println("Bounding box position:");
    System.out.println("\tleft: " + normalizedBoundingBox.getLeft());
    System.out.println("\ttop: " + normalizedBoundingBox.getTop());
    System.out.println("\tright: " + normalizedBoundingBox.getRight());
    System.out.println("\tbottom: " + normalizedBoundingBox.getBottom());
    return results;
  }
}

Node.js

// Imports the Google Cloud Video Intelligence library
const Video = require('@google-cloud/video-intelligence');
const fs = require('fs');
const util = require('util');
// Creates a client
const video = new Video.VideoIntelligenceServiceClient();
/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const path = 'Local file to analyze, e.g. ./my-file.mp4';

// Reads a local video file and converts it to base64
const file = await util.promisify(fs.readFile)(path);
const inputContent = file.toString('base64');

const request = {
  inputContent: inputContent,
  features: ['OBJECT_TRACKING'],
  //recommended to use us-east1 for the best latency due to different types of processors used in this region and others
  locationId: 'us-east1',
};
// Detects objects in a video
const [operation] = await video.annotateVideo(request);
console.log('Waiting for operation to complete...');
const results = await operation.promise();
//Gets annotations for video
const annotations = results[0].annotationResults[0];
const objects = annotations.objectAnnotations;
objects.forEach(object => {
  console.log(`Entity description:  ${object.entity.description}`);
  console.log(`Entity id: ${object.entity.entityId}`);
  const time = object.segment;
  console.log(
    `Segment: ${time.startTimeOffset.seconds || 0}` +
      `.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s to ${
        time.endTimeOffset.seconds || 0
      }.` +
      `${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log(`Confidence: ${object.confidence}`);
  const frame = object.frames[0];
  const box = frame.normalizedBoundingBox;
  const timeOffset = frame.timeOffset;
  console.log(
    `Time offset for the first frame: ${timeOffset.seconds || 0}` +
      `.${(timeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log('Bounding box position:');
  console.log(` left   :${box.left}`);
  console.log(` top    :${box.top}`);
  console.log(` right  :${box.right}`);
  console.log(` bottom :${box.bottom}`);
});

PHP

use Google\Cloud\VideoIntelligence\V1\VideoIntelligenceServiceClient;
use Google\Cloud\VideoIntelligence\V1\Feature;

/** Uncomment and populate these variables in your code */
// $path = 'File path to a video file to analyze';
// $options = [];

# Instantiate a client.
$video = new VideoIntelligenceServiceClient();

# Read the local video file
$inputContent = file_get_contents($path);

# Execute a request.
$operation = $video->annotateVideo([
    'inputContent' => $inputContent,
    'features' => [Feature::OBJECT_TRACKING]
]);

# Wait for the request to complete.
$operation->pollUntilComplete($options);

# Print the results.
if ($operation->operationSucceeded()) {
    $results = $operation->getResult()->getAnnotationResults()[0];
    # Process video/segment level label annotations
    $objectEntity = $results->getObjectAnnotations()[0];

    printf('Video object entity: %s' . PHP_EOL, $objectEntity->getEntity()->getEntityId());
    printf('Video object description: %s' . PHP_EOL, $objectEntity->getEntity()->getDescription());

    $start = $objectEntity->getSegment()->getStartTimeOffset();
    $end = $objectEntity->getSegment()->getEndTimeOffset();
    printf('  Segment: %ss to %ss' . PHP_EOL,
        $start->getSeconds() + $start->getNanos()/1000000000.0,
        $end->getSeconds() + $end->getNanos()/1000000000.0);
    printf('  Confidence: %f' . PHP_EOL, $objectEntity->getConfidence());

    foreach ($objectEntity->getFrames() as $objectEntityFrame) {
        $offset = $objectEntityFrame->getTimeOffset();
        $boundingBox = $objectEntityFrame->getNormalizedBoundingBox();
        printf('  Time offset: %ss' . PHP_EOL,
            $offset->getSeconds() + $offset->getNanos()/1000000000.0);
        printf('  Bounding box position:' . PHP_EOL);
        printf('   Left: %s', $boundingBox->getLeft());
        printf('   Top: %s', $boundingBox->getTop());
        printf('   Right: %s', $boundingBox->getRight());
        printf('   Bottom: %s', $boundingBox->getBottom());
    }
    print(PHP_EOL);
} else {
    print_r($operation->getError());
}

Python

"""Object tracking in a local video."""
from google.cloud import videointelligence

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.OBJECT_TRACKING]

with io.open(path, "rb") as file:
    input_content = file.read()

operation = video_client.annotate_video(
    request={"features": features, "input_content": input_content}
)
print("\nProcessing video for object annotations.")

result = operation.result(timeout=300)
print("\nFinished processing.\n")

# The first result is retrieved because a single video was processed.
object_annotations = result.annotation_results[0].object_annotations

# Get only the first annotation for demo purposes.
object_annotation = object_annotations[0]
print("Entity description: {}".format(object_annotation.entity.description))
if object_annotation.entity.entity_id:
    print("Entity id: {}".format(object_annotation.entity.entity_id))

print(
    "Segment: {}s to {}s".format(
        object_annotation.segment.start_time_offset.seconds
        + object_annotation.segment.start_time_offset.microseconds / 1e6,
        object_annotation.segment.end_time_offset.seconds
        + object_annotation.segment.end_time_offset.microseconds / 1e6,
    )
)

print("Confidence: {}".format(object_annotation.confidence))

# Here we print only the bounding box of the first frame in this segment
frame = object_annotation.frames[0]
box = frame.normalized_bounding_box
print(
    "Time offset of the first frame: {}s".format(
        frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
    )
)
print("Bounding box position:")
print("\tleft  : {}".format(box.left))
print("\ttop   : {}".format(box.top))
print("\tright : {}".format(box.right))
print("\tbottom: {}".format(box.bottom))
print("\n")

Ruby

# "Path to a local video file: path/to/file.mp4"

require "google/cloud/video_intelligence"

video = Google::Cloud::VideoIntelligence.video_intelligence_service

video_contents = File.binread path

# Register a callback during the method call
operation = video.annotate_video features: [:OBJECT_TRACKING], input_content: video_contents

puts "Processing video for object tracking:"
operation.wait_until_done!

raise operation.error.message if operation.error?
puts "Finished Processing."

object_annotations = operation.results.annotation_results.first.object_annotations
print_object_annotations object_annotations