Esta página foi traduzida pela API Cloud Translation.

Monitorize objetos

A funcionalidade de seguimento de objetos segue objetos detetados num vídeo de entrada. Para fazer um pedido de acompanhamento de objetos, chame o método annotate e especifique OBJECT_TRACKING no campo features.

Para entidades e localizações espaciais detetadas num vídeo ou em segmentos de vídeo, um pedido de acompanhamento de objetos anota o vídeo com as etiquetas adequadas para estas entidades e localizações espaciais. Por exemplo, um vídeo de veículos a atravessar um sinal de trânsito pode gerar etiquetas como "carro", "camião", "bicicleta", "pneus", "luzes", "janela" e assim sucessivamente. Cada etiqueta pode incluir uma série de caixas delimitadoras, com cada caixa delimitadora a ter um segmento de tempo associado que contém um desvio de tempo que indica o desvio de duração desde o início do vídeo. A anotação também contém informações adicionais sobre a entidade, incluindo um ID da entidade que pode usar para encontrar mais informações sobre a entidade na API Google Knowledge Graph Search.

Acompanhamento de objetos vs. deteção de etiquetas

A monitorização de objetos difere da deteção de etiquetas. A deteção de etiquetas fornece etiquetas sem caixas delimitadoras, enquanto a monitorização de objetos fornece as etiquetas dos objetos individuais presentes num determinado vídeo, juntamente com a caixa delimitadora de cada instância de objeto em cada passo de tempo.

São atribuídas várias instâncias do mesmo tipo de objeto a diferentes instâncias da mensagem ObjectTrackingAnnotation, onde todas as ocorrências de uma determinada faixa de objeto são mantidas na respetiva instância de ObjectTrackingAnnotation. Por exemplo, se houver um carro vermelho e um carro azul a aparecer durante 5 segundos num vídeo, o pedido de acompanhamento deve devolver duas instâncias de ObjectTrackingAnnotation. A primeira instância vai conter as localizações de um dos dois carros, por exemplo, o carro vermelho, enquanto a segunda vai conter as localizações do outro carro.

Peça o acompanhamento de objetos para um vídeo no Cloud Storage

Os exemplos seguintes demonstram a monitorização de objetos num ficheiro localizado no Cloud Storage.

REST

Envie o pedido de processamento

O exemplo seguinte mostra como enviar um pedido POST para o método annotate. O exemplo usa o token de acesso para uma conta de serviço configurada para o projeto com a CLI do Google Cloud. Para ver instruções sobre como instalar a Google Cloud CLI, configurar um projeto com uma conta de serviço e obter um token de acesso, consulte o início rápido do Video Intelligence.

Antes de usar qualquer um dos dados do pedido, faça as seguintes substituições:

INPUT_URI: STORAGE_URI
Por exemplo:
"inputUri": "gs://cloud-videointelligence-demo/assistant.mp4",
PROJECT_NUMBER: o identificador numérico do seu Google Cloud projeto

Método HTTP e URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

Corpo JSON do pedido:

{
  "inputUri": "STORAGE_URI",
  "features": ["OBJECT_TRACKING"]
}

Para enviar o seu pedido, expanda uma destas opções:

curl (Linux, macOS ou Cloud Shell)

Nota: O comando seguinte pressupõe que tem sessão iniciada na CLI gcloud com a sua conta de utilizador executando gcloud init ou gcloud auth login, ou usando o Cloud Shell, que inicia automaticamente sessão na CLI gcloud. Pode verificar a conta atualmente ativa executando o comando gcloud auth list.

Guarde o corpo do pedido num ficheiro com o nome request.json, e execute o seguinte comando:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://videointelligence.googleapis.com/v1/videos:annotate"

PowerShell (Windows)

Nota: O comando seguinte pressupõe que iniciou sessão na CLI do Google Cloud com a sua conta de utilizador executando gcloud init ou gcloud auth login .gcloud Pode verificar a conta atualmente ativa executando o comando gcloud auth list.

Guarde o corpo do pedido num ficheiro com o nome request.json, e execute o seguinte comando:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content

Deve receber uma resposta JSON semelhante à seguinte:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID"
}

Se o pedido for bem-sucedido, a Video Intelligence API devolve o name da sua operação. O exemplo acima mostra uma resposta deste tipo, em que PROJECT_NUMBER é o número do seu projeto e OPERATION_ID é o ID da operação de longa duração criada para o pedido.

Obtenha os resultados

Para obter os resultados do seu pedido, envie um GET, usando o nome da operação devolvido da chamada para videos:annotate, conforme mostrado no exemplo seguinte.

Antes de usar qualquer um dos dados do pedido, faça as seguintes substituições:

OPERATION_NAME: o nome da operação, conforme devolvido pela API Video Intelligence. O nome da operação tem o formato projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
PROJECT_NUMBER: o identificador numérico do seu Google Cloud projeto

Método HTTP e URL:

GET https://videointelligence.googleapis.com/v1/OPERATION_NAME

Para enviar o seu pedido, expanda uma destas opções:

curl (Linux, macOS ou Cloud Shell)

Execute o seguinte comando:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     "https://videointelligence.googleapis.com/v1/OPERATION_NAME"

PowerShell (Windows)

Execute o seguinte comando:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content

Deve receber uma resposta JSON semelhante à seguinte:

Resposta

// Object tracking annotations are returned as a objectAnnotations list.
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "/cloud-ml-sandbox/video/chicago.mp4",
        "progressPercent": 100,
        "startTime": "2019-12-21T16:56:46.755199Z",
        "updateTime": "2019-12-21T16:59:17.911197Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputUri": "/cloud-ml-sandbox/video/chicago.mp4",
        "objectAnnotations": [
          {
            "entity": {
              "entityId": "/m/0k4j",
              "description": "car",
              "languageCode": "en-US"
            },
            "frames": [
              {
                "normalizedBoundingBox": {
                  "left": 0.2672763,
                  "top": 0.5677657,
                  "right": 0.4388713,
                  "bottom": 0.7623171
                },
                "timeOffset": "0s"
              },
              {
                "normalizedBoundingBox": {
                  "left": 0.26920167,
                  "top": 0.5659805,
                  "right": 0.44331276,
                  "bottom": 0.76780635
                },
                "timeOffset": "0.100495s"
              },
           ...
              {
                "normalizedBoundingBox": {
                  "left": 0.83573246,
                  "top": 0.6645812,
                  "right": 1,
                  "bottom": 0.99865407
                },
                "timeOffset": "2.311402s"
              }
            ],
            "segment": {
              "startTimeOffset": "0s",
              "endTimeOffset": "2.311402s"
            },
            "confidence": 0.99488896
          },
        ...
          {
            "entity": {
              "entityId": "/m/0cgh4",
              "description": "building",
              "languageCode": "en-US"
            },
            "frames": [
              {
                "normalizedBoundingBox": {
                  "left": 0.12340179,
                  "top": 0.010383379,
                  "right": 0.21914443,
                  "bottom": 0.5591795
                },
                "timeOffset": "0s"
              },
              {
                "normalizedBoundingBox": {
                  "left": 0.12340179,
                  "top": 0.009684974,
                  "right": 0.22915152,
                  "bottom": 0.56070584
                },
                "timeOffset": "0.100495s"
              },
           ...
              {
                "normalizedBoundingBox": {
                  "left": 0.12340179,
                  "top": 0.008624528,
                  "right": 0.22723165,
                  "bottom": 0.56158626
                },
                "timeOffset": "0.401983s"
              }
            ],
            "segment": {
              "startTimeOffset": "0s",
              "endTimeOffset": "0.401983s"
            },
            "confidence": 0.33914912
          },
       ...
          {
            "entity": {
              "entityId": "/m/0cgh4",
              "description": "building",
              "languageCode": "en-US"
            },
            "frames": [
              {
                "normalizedBoundingBox": {
                  "left": 0.79324204,
                  "top": 0.0006896425,
                  "right": 0.99659824,
                  "bottom": 0.5324423
                },
                "timeOffset": "37.585421s"
              },
              {
                "normalizedBoundingBox": {
                  "left": 0.78935236,
                  "top": 0.0011992548,
                  "right": 0.99659824,
                  "bottom": 0.5374946
                },
                "timeOffset": "37.685917s"
              },
           ...
              {
                "normalizedBoundingBox": {
                  "left": 0.79404694,
                  "right": 0.99659824,
                  "bottom": 0.5280966
                },
                "timeOffset": "38.590379s"
              }
            ],
            "segment": {
              "startTimeOffset": "37.585421s",
              "endTimeOffset": "38.590379s"
            },
            "confidence": 0.3415429
          }
        ]
      }
    ]
  }
}

Transfira os resultados das anotações

Copie a anotação da origem para o contentor de destino: (consulte o artigo Copie ficheiros e objetos)

gcloud storage cp gcs_uri gs://my-bucket

Nota: se o URI do GCS de saída for fornecido pelo utilizador, a anotação é armazenada nesse URI do GCS.

Go


import (
	"context"
	"fmt"
	"io"

	video "cloud.google.com/go/videointelligence/apiv1"
	videopb "cloud.google.com/go/videointelligence/apiv1/videointelligencepb"
	"github.com/golang/protobuf/ptypes"
)

// objectTrackingGCS analyzes a video and extracts entities with their bounding boxes.
func objectTrackingGCS(w io.Writer, gcsURI string) error {
	// gcsURI := "gs://cloud-samples-data/video/cat.mp4"

	ctx := context.Background()

	// Creates a client.
	client, err := video.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("video.NewClient: %w", err)
	}
	defer client.Close()

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		InputUri: gcsURI,
		Features: []videopb.Feature{
			videopb.Feature_OBJECT_TRACKING,
		},
	})
	if err != nil {
		return fmt.Errorf("AnnotateVideo: %w", err)
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		return fmt.Errorf("Wait: %w", err)
	}

	// Only one video was processed, so get the first result.
	result := resp.GetAnnotationResults()[0]

	for _, annotation := range result.ObjectAnnotations {
		fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
		if len(annotation.Entity.EntityId) > 0 {
			fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
		}

		segment := annotation.GetSegment()
		start, _ := ptypes.Duration(segment.GetStartTimeOffset())
		end, _ := ptypes.Duration(segment.GetEndTimeOffset())
		fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)

		fmt.Fprintf(w, "\tConfidence: %f\n", annotation.GetConfidence())

		// Here we print only the bounding box of the first frame in this segment.
		frame := annotation.GetFrames()[0]
		seconds := float32(frame.GetTimeOffset().GetSeconds())
		nanos := float32(frame.GetTimeOffset().GetNanos())
		fmt.Fprintf(w, "\tTime offset of the first frame: %fs\n", seconds+nanos/1e9)

		box := frame.GetNormalizedBoundingBox()
		fmt.Fprintf(w, "\tBounding box position:\n")
		fmt.Fprintf(w, "\t\tleft  : %f\n", box.GetLeft())
		fmt.Fprintf(w, "\t\ttop   : %f\n", box.GetTop())
		fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
		fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
	}

	return nil
}

Java

/**
 * Track objects in a video.
 *
 * @param gcsUri the path to the video file to analyze.
 */
public static VideoAnnotationResults trackObjectsGcs(String gcsUri) throws Exception {
  try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
    // Create the request
    AnnotateVideoRequest request =
        AnnotateVideoRequest.newBuilder()
            .setInputUri(gcsUri)
            .addFeatures(Feature.OBJECT_TRACKING)
            .setLocationId("us-east1")
            .build();

    // asynchronously perform object tracking on videos
    OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
        client.annotateVideoAsync(request);

    System.out.println("Waiting for operation to complete...");
    // The first result is retrieved because a single video was processed.
    AnnotateVideoResponse response = future.get(450, TimeUnit.SECONDS);
    VideoAnnotationResults results = response.getAnnotationResults(0);

    // Get only the first annotation for demo purposes.
    ObjectTrackingAnnotation annotation = results.getObjectAnnotations(0);
    System.out.println("Confidence: " + annotation.getConfidence());

    if (annotation.hasEntity()) {
      Entity entity = annotation.getEntity();
      System.out.println("Entity description: " + entity.getDescription());
      System.out.println("Entity id:: " + entity.getEntityId());
    }

    if (annotation.hasSegment()) {
      VideoSegment videoSegment = annotation.getSegment();
      Duration startTimeOffset = videoSegment.getStartTimeOffset();
      Duration endTimeOffset = videoSegment.getEndTimeOffset();
      // Display the segment time in seconds, 1e9 converts nanos to seconds
      System.out.println(
          String.format(
              "Segment: %.2fs to %.2fs",
              startTimeOffset.getSeconds() + startTimeOffset.getNanos() / 1e9,
              endTimeOffset.getSeconds() + endTimeOffset.getNanos() / 1e9));
    }

    // Here we print only the bounding box of the first frame in this segment.
    ObjectTrackingFrame frame = annotation.getFrames(0);
    // Display the offset time in seconds, 1e9 converts nanos to seconds
    Duration timeOffset = frame.getTimeOffset();
    System.out.println(
        String.format(
            "Time offset of the first frame: %.2fs",
            timeOffset.getSeconds() + timeOffset.getNanos() / 1e9));

    // Display the bounding box of the detected object
    NormalizedBoundingBox normalizedBoundingBox = frame.getNormalizedBoundingBox();
    System.out.println("Bounding box position:");
    System.out.println("\tleft: " + normalizedBoundingBox.getLeft());
    System.out.println("\ttop: " + normalizedBoundingBox.getTop());
    System.out.println("\tright: " + normalizedBoundingBox.getRight());
    System.out.println("\tbottom: " + normalizedBoundingBox.getBottom());
    return results;
  }
}

Node.js

Para se autenticar no Video Intelligence, configure as Credenciais padrão da aplicação. Para mais informações, consulte o artigo Configure a autenticação para um ambiente de desenvolvimento local.

// Imports the Google Cloud Video Intelligence library
const Video = require('@google-cloud/video-intelligence');

// Creates a client
const video = new Video.VideoIntelligenceServiceClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';

const request = {
  inputUri: gcsUri,
  features: ['OBJECT_TRACKING'],
  //recommended to use us-east1 for the best latency due to different types of processors used in this region and others
  locationId: 'us-east1',
};
// Detects objects in a video
const [operation] = await video.annotateVideo(request);
const results = await operation.promise();
console.log('Waiting for operation to complete...');
//Gets annotations for video
const annotations = results[0].annotationResults[0];
const objects = annotations.objectAnnotations;
objects.forEach(object => {
  console.log(`Entity description:  ${object.entity.description}`);
  console.log(`Entity id: ${object.entity.entityId}`);
  const time = object.segment;
  console.log(
    `Segment: ${time.startTimeOffset.seconds || 0}` +
      `.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s to ${
        time.endTimeOffset.seconds || 0
      }.` +
      `${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log(`Confidence: ${object.confidence}`);
  const frame = object.frames[0];
  const box = frame.normalizedBoundingBox;
  const timeOffset = frame.timeOffset;
  console.log(
    `Time offset for the first frame: ${timeOffset.seconds || 0}` +
      `.${(timeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log('Bounding box position:');
  console.log(` left   :${box.left}`);
  console.log(` top    :${box.top}`);
  console.log(` right  :${box.right}`);
  console.log(` bottom :${box.bottom}`);
});

Python

"""Object tracking in a video stored on GCS."""
from google.cloud import videointelligence

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.OBJECT_TRACKING]
operation = video_client.annotate_video(
    request={"features": features, "input_uri": gcs_uri}
)
print("\nProcessing video for object annotations.")

result = operation.result(timeout=500)
print("\nFinished processing.\n")

# The first result is retrieved because a single video was processed.
object_annotations = result.annotation_results[0].object_annotations

for object_annotation in object_annotations:
    print("Entity description: {}".format(object_annotation.entity.description))
    if object_annotation.entity.entity_id:
        print("Entity id: {}".format(object_annotation.entity.entity_id))

    print(
        "Segment: {}s to {}s".format(
            object_annotation.segment.start_time_offset.seconds
            + object_annotation.segment.start_time_offset.microseconds / 1e6,
            object_annotation.segment.end_time_offset.seconds
            + object_annotation.segment.end_time_offset.microseconds / 1e6,
        )
    )

    print("Confidence: {}".format(object_annotation.confidence))

    # Here we print only the bounding box of the first frame in the segment
    frame = object_annotation.frames[0]
    box = frame.normalized_bounding_box
    print(
        "Time offset of the first frame: {}s".format(
            frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
        )
    )
    print("Bounding box position:")
    print("\tleft  : {}".format(box.left))
    print("\ttop   : {}".format(box.top))
    print("\tright : {}".format(box.right))
    print("\tbottom: {}".format(box.bottom))
    print("\n")

Idiomas adicionais

C#: Siga as instruções de configuração do C# na página das bibliotecas cliente e, em seguida, visite a documentação de referência da Video Intelligence API para .NET.

PHP: Siga as instruções de configuração do PHP na página das bibliotecas cliente e, em seguida, visite a documentação de referência da Video Intelligence API para PHP.

Ruby: Siga as instruções de configuração do Ruby na página das bibliotecas cliente e, em seguida, visite a documentação de referência da Video Intelligence API para Ruby.

Peça o acompanhamento de objetos para vídeo a partir de um ficheiro local

Os seguintes exemplos demonstram a deteção de objetos num ficheiro armazenado localmente.

REST

Envie o pedido de processamento

Para fazer anotações num ficheiro de vídeo local, codifique em base64 o conteúdo do ficheiro de vídeo. Inclua o conteúdo codificado em base64 no campo inputContent do pedido. Para obter informações sobre como codificar em base64 o conteúdo de um ficheiro de vídeo, consulte o artigo Codificação base64.

O exemplo seguinte mostra como enviar um pedido POST para o método videos:annotate. O exemplo usa a chave de acesso para uma conta de serviço configurada para o projeto através da CLI do Google Cloud. Para ver instruções sobre como instalar a CLI Google Cloud, configurar um projeto com uma conta de serviço e obter um token de acesso, consulte o início rápido do Video Intelligence.

Antes de usar qualquer um dos dados do pedido, faça as seguintes substituições:

inputContent: BASE64_ENCODED_CONTENT
Por exemplo: "UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA..."
PROJECT_NUMBER: o identificador numérico do seu Google Cloud projeto

Método HTTP e URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

Corpo JSON do pedido:

{
  "inputContent": "BASE64_ENCODED_CONTENT",
  "features": ["OBJECT_TRACKING"]
}

Para enviar o seu pedido, expanda uma destas opções:

curl (Linux, macOS ou Cloud Shell)

Guarde o corpo do pedido num ficheiro com o nome request.json, e execute o seguinte comando:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://videointelligence.googleapis.com/v1/videos:annotate"

PowerShell (Windows)

Guarde o corpo do pedido num ficheiro com o nome request.json, e execute o seguinte comando:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content

Deve receber uma resposta JSON semelhante à seguinte:

Resposta

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID"
}

Se o pedido for bem-sucedido, o Video Intelligence devolve o name para a sua operação. O exemplo seguinte mostra uma resposta deste tipo, em que PROJECT_NUMBER é o número do seu projeto e OPERATION_ID é o ID da operação de longa duração criada para o pedido.

Obtenha os resultados

Para receber os resultados do seu pedido, tem de enviar um GET, usando o nome da operação devolvido da chamada para videos:annotate, conforme mostrado no exemplo seguinte.

Antes de usar qualquer um dos dados do pedido, faça as seguintes substituições:

OPERATION_NAME: o nome da operação, conforme devolvido pela API Video Intelligence. O nome da operação tem o formato projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
PROJECT_NUMBER: o identificador numérico do seu Google Cloud projeto

Método HTTP e URL:

GET https://videointelligence.googleapis.com/v1/OPERATION_NAME

Para enviar o seu pedido, expanda uma destas opções:

curl (Linux, macOS ou Cloud Shell)

Execute o seguinte comando:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     "https://videointelligence.googleapis.com/v1/OPERATION_NAME"

PowerShell (Windows)

Execute o seguinte comando:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content

Deve receber uma resposta JSON semelhante à seguinte:

Resposta

// Object tracking annotations are returned as a objectAnnotations list.
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputContent": "UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA...",
        "progressPercent": 100,
        "startTime": "2018-06-21T16:56:46.755199Z",
        "updateTime": "2018-06-21T16:59:17.911197Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputContent": "/cloud-ml-sandbox/video/chicago.mp4",
        "objectAnnotations": [
          {
            "entity": {
              "entityId": "/m/0k4j",
              "description": "car",
              "languageCode": "en-US"
            },
            "frames": [
              {
                "normalizedBoundingBox": {
                  "left": 0.2672763,
                  "top": 0.5677657,
                  "right": 0.4388713,
                  "bottom": 0.7623171
                },
                "timeOffset": "0s"
              },
              {
                "normalizedBoundingBox": {
                  "left": 0.26920167,
                  "top": 0.5659805,
                  "right": 0.44331276,
                  "bottom": 0.76780635
                },
                "timeOffset": "0.100495s"
              },
           ...
              {
                "normalizedBoundingBox": {
                  "left": 0.83573246,
                  "top": 0.6645812,
                  "right": 1,
                  "bottom": 0.99865407
                },
                "timeOffset": "2.311402s"
              }
            ],
            "segment": {
              "startTimeOffset": "0s",
              "endTimeOffset": "2.311402s"
            },
            "confidence": 0.99488896
          },
        ...
          {
            "entity": {
              "entityId": "/m/0cgh4",
              "description": "building",
              "languageCode": "en-US"
            },
            "frames": [
              {
                "normalizedBoundingBox": {
                  "left": 0.12340179,
                  "top": 0.010383379,
                  "right": 0.21914443,
                  "bottom": 0.5591795
                },
                "timeOffset": "0s"
              },
              {
                "normalizedBoundingBox": {
                  "left": 0.12340179,
                  "top": 0.009684974,
                  "right": 0.22915152,
                  "bottom": 0.56070584
                },
                "timeOffset": "0.100495s"
              },
           ...
              {
                "normalizedBoundingBox": {
                  "left": 0.12340179,
                  "top": 0.008624528,
                  "right": 0.22723165,
                  "bottom": 0.56158626
                },
                "timeOffset": "0.401983s"
              }
            ],
            "segment": {
              "startTimeOffset": "0s",
              "endTimeOffset": "0.401983s"
            },
            "confidence": 0.33914912
          },
       ...
          {
            "entity": {
              "entityId": "/m/0cgh4",
              "description": "building",
              "languageCode": "en-US"
            },
            "frames": [
              {
                "normalizedBoundingBox": {
                  "left": 0.79324204,
                  "top": 0.0006896425,
                  "right": 0.99659824,
                  "bottom": 0.5324423
                },
                "timeOffset": "37.585421s"
              },
              {
                "normalizedBoundingBox": {
                  "left": 0.78935236,
                  "top": 0.0011992548,
                  "right": 0.99659824,
                  "bottom": 0.5374946
                },
                "timeOffset": "37.685917s"
              },
           ...
              {
                "normalizedBoundingBox": {
                  "left": 0.79404694,
                  "right": 0.99659824,
                  "bottom": 0.5280966
                },
                "timeOffset": "38.590379s"
              }
            ],
            "segment": {
              "startTimeOffset": "37.585421s",
              "endTimeOffset": "38.590379s"
            },
            "confidence": 0.3415429
          }
        ]
      }
    ]
  }
}

Go


import (
	"context"
	"fmt"
	"io"
	"os"

	video "cloud.google.com/go/videointelligence/apiv1"
	videopb "cloud.google.com/go/videointelligence/apiv1/videointelligencepb"
	"github.com/golang/protobuf/ptypes"
)

// objectTracking analyzes a video and extracts entities with their bounding boxes.
func objectTracking(w io.Writer, filename string) error {
	// filename := "../testdata/cat.mp4"

	ctx := context.Background()

	// Creates a client.
	client, err := video.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("video.NewClient: %w", err)
	}
	defer client.Close()

	fileBytes, err := os.ReadFile(filename)
	if err != nil {
		return err
	}

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		InputContent: fileBytes,
		Features: []videopb.Feature{
			videopb.Feature_OBJECT_TRACKING,
		},
	})
	if err != nil {
		return fmt.Errorf("AnnotateVideo: %w", err)
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		return fmt.Errorf("Wait: %w", err)
	}

	// Only one video was processed, so get the first result.
	result := resp.GetAnnotationResults()[0]

	for _, annotation := range result.ObjectAnnotations {
		fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
		if len(annotation.Entity.EntityId) > 0 {
			fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
		}

		segment := annotation.GetSegment()
		start, _ := ptypes.Duration(segment.GetStartTimeOffset())
		end, _ := ptypes.Duration(segment.GetEndTimeOffset())
		fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)

		fmt.Fprintf(w, "\tConfidence: %f\n", annotation.GetConfidence())

		// Here we print only the bounding box of the first frame in this segment.
		frame := annotation.GetFrames()[0]
		seconds := float32(frame.GetTimeOffset().GetSeconds())
		nanos := float32(frame.GetTimeOffset().GetNanos())
		fmt.Fprintf(w, "\tTime offset of the first frame: %fs\n", seconds+nanos/1e9)

		box := frame.GetNormalizedBoundingBox()
		fmt.Fprintf(w, "\tBounding box position:\n")
		fmt.Fprintf(w, "\t\tleft  : %f\n", box.GetLeft())
		fmt.Fprintf(w, "\t\ttop   : %f\n", box.GetTop())
		fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
		fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
	}

	return nil
}

Java

/**
 * Track objects in a video.
 *
 * @param filePath the path to the video file to analyze.
 */
public static VideoAnnotationResults trackObjects(String filePath) throws Exception {
  try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
    // Read file
    Path path = Paths.get(filePath);
    byte[] data = Files.readAllBytes(path);

    // Create the request
    AnnotateVideoRequest request =
        AnnotateVideoRequest.newBuilder()
            .setInputContent(ByteString.copyFrom(data))
            .addFeatures(Feature.OBJECT_TRACKING)
            .setLocationId("us-east1")
            .build();

    // asynchronously perform object tracking on videos
    OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
        client.annotateVideoAsync(request);

    System.out.println("Waiting for operation to complete...");
    // The first result is retrieved because a single video was processed.
    AnnotateVideoResponse response = future.get(450, TimeUnit.SECONDS);
    VideoAnnotationResults results = response.getAnnotationResults(0);

    // Get only the first annotation for demo purposes.
    ObjectTrackingAnnotation annotation = results.getObjectAnnotations(0);
    System.out.println("Confidence: " + annotation.getConfidence());

    if (annotation.hasEntity()) {
      Entity entity = annotation.getEntity();
      System.out.println("Entity description: " + entity.getDescription());
      System.out.println("Entity id:: " + entity.getEntityId());
    }

    if (annotation.hasSegment()) {
      VideoSegment videoSegment = annotation.getSegment();
      Duration startTimeOffset = videoSegment.getStartTimeOffset();
      Duration endTimeOffset = videoSegment.getEndTimeOffset();
      // Display the segment time in seconds, 1e9 converts nanos to seconds
      System.out.println(
          String.format(
              "Segment: %.2fs to %.2fs",
              startTimeOffset.getSeconds() + startTimeOffset.getNanos() / 1e9,
              endTimeOffset.getSeconds() + endTimeOffset.getNanos() / 1e9));
    }

    // Here we print only the bounding box of the first frame in this segment.
    ObjectTrackingFrame frame = annotation.getFrames(0);
    // Display the offset time in seconds, 1e9 converts nanos to seconds
    Duration timeOffset = frame.getTimeOffset();
    System.out.println(
        String.format(
            "Time offset of the first frame: %.2fs",
            timeOffset.getSeconds() + timeOffset.getNanos() / 1e9));

    // Display the bounding box of the detected object
    NormalizedBoundingBox normalizedBoundingBox = frame.getNormalizedBoundingBox();
    System.out.println("Bounding box position:");
    System.out.println("\tleft: " + normalizedBoundingBox.getLeft());
    System.out.println("\ttop: " + normalizedBoundingBox.getTop());
    System.out.println("\tright: " + normalizedBoundingBox.getRight());
    System.out.println("\tbottom: " + normalizedBoundingBox.getBottom());
    return results;
  }
}

Node.js

// Imports the Google Cloud Video Intelligence library
const Video = require('@google-cloud/video-intelligence');
const fs = require('fs');
const util = require('util');
// Creates a client
const video = new Video.VideoIntelligenceServiceClient();
/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const path = 'Local file to analyze, e.g. ./my-file.mp4';

// Reads a local video file and converts it to base64
const file = await util.promisify(fs.readFile)(path);
const inputContent = file.toString('base64');

const request = {
  inputContent: inputContent,
  features: ['OBJECT_TRACKING'],
  //recommended to use us-east1 for the best latency due to different types of processors used in this region and others
  locationId: 'us-east1',
};
// Detects objects in a video
const [operation] = await video.annotateVideo(request);
const results = await operation.promise();
console.log('Waiting for operation to complete...');
//Gets annotations for video
const annotations = results[0].annotationResults[0];
const objects = annotations.objectAnnotations;
objects.forEach(object => {
  console.log(`Entity description:  ${object.entity.description}`);
  console.log(`Entity id: ${object.entity.entityId}`);
  const time = object.segment;
  console.log(
    `Segment: ${time.startTimeOffset.seconds || 0}` +
      `.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s to ${
        time.endTimeOffset.seconds || 0
      }.` +
      `${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log(`Confidence: ${object.confidence}`);
  const frame = object.frames[0];
  const box = frame.normalizedBoundingBox;
  const timeOffset = frame.timeOffset;
  console.log(
    `Time offset for the first frame: ${timeOffset.seconds || 0}` +
      `.${(timeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log('Bounding box position:');
  console.log(` left   :${box.left}`);
  console.log(` top    :${box.top}`);
  console.log(` right  :${box.right}`);
  console.log(` bottom :${box.bottom}`);
});

Python

"""Object tracking in a local video."""
from google.cloud import videointelligence

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.OBJECT_TRACKING]

with io.open(path, "rb") as file:
    input_content = file.read()

operation = video_client.annotate_video(
    request={"features": features, "input_content": input_content}
)
print("\nProcessing video for object annotations.")

result = operation.result(timeout=500)
print("\nFinished processing.\n")

# The first result is retrieved because a single video was processed.
object_annotations = result.annotation_results[0].object_annotations

# Get only the first annotation for demo purposes.
object_annotation = object_annotations[0]
print("Entity description: {}".format(object_annotation.entity.description))
if object_annotation.entity.entity_id:
    print("Entity id: {}".format(object_annotation.entity.entity_id))

print(
    "Segment: {}s to {}s".format(
        object_annotation.segment.start_time_offset.seconds
        + object_annotation.segment.start_time_offset.microseconds / 1e6,
        object_annotation.segment.end_time_offset.seconds
        + object_annotation.segment.end_time_offset.microseconds / 1e6,
    )
)

print("Confidence: {}".format(object_annotation.confidence))

# Here we print only the bounding box of the first frame in this segment
frame = object_annotation.frames[0]
box = frame.normalized_bounding_box
print(
    "Time offset of the first frame: {}s".format(
        frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
    )
)
print("Bounding box position:")
print("\tleft  : {}".format(box.left))
print("\ttop   : {}".format(box.top))
print("\tright : {}".format(box.right))
print("\tbottom: {}".format(box.bottom))
print("\n")

Idiomas adicionais

C#: Siga as instruções de configuração do C# na página das bibliotecas cliente e, em seguida, visite a documentação de referência da Video Intelligence API para .NET.

PHP: Siga as instruções de configuração do PHP na página das bibliotecas cliente e, em seguida, visite a documentação de referência da Video Intelligence API para PHP.

Ruby: Siga as instruções de configuração do Ruby na página das bibliotecas cliente e, em seguida, visite a documentação de referência da Video Intelligence API para Ruby.

Monitorize objetos Mantenha tudo organizado com as coleções Salve e categorize o conteúdo com base nas suas preferências.

Acompanhamento de objetos vs. deteção de etiquetas

Peça o acompanhamento de objetos para um vídeo no Cloud Storage

REST

Envie o pedido de processamento

curl (Linux, macOS ou Cloud Shell)

PowerShell (Windows)

Obtenha os resultados

curl (Linux, macOS ou Cloud Shell)

PowerShell (Windows)

Resposta

Transfira os resultados das anotações

Go

Java

Node.js

Python

Idiomas adicionais

Peça o acompanhamento de objetos para vídeo a partir de um ficheiro local

REST

Envie o pedido de processamento

curl (Linux, macOS ou Cloud Shell)

PowerShell (Windows)

Resposta

Obtenha os resultados

curl (Linux, macOS ou Cloud Shell)

PowerShell (Windows)

Resposta

Go

Java

Node.js

Python

Idiomas adicionais

Monitorize objetos