Melacak objek

Pelacakan objek melacak objek yang terdeteksi di video input. Untuk membuat permintaan pelacakan objek, panggil metode annotate dan tentukan OBJECT_TRACKING di kolom features.

Untuk entity dan lokasi spasial yang terdeteksi dalam segmen video atau video, permintaan pelacakan objek akan menganotasi video dengan label yang sesuai untuk entity dan lokasi spasial tersebut. Misalnya, video kendaraan yang melintasi sinyal lalu lintas dapat menghasilkan label seperti "mobil", "truk", "sepeda", "ban", "lampu", "jendela", dan sebagainya. Setiap label dapat menyertakan serangkaian kotak pembatas, dengan setiap kotak pembatas memiliki segmen waktu terkait yang berisi offset waktu yang menunjukkan selisih durasi dari awal video. Anotasi juga berisi informasi entity tambahan, termasuk ID entity yang dapat Anda gunakan untuk menemukan informasi selengkapnya tentang entity di Search API Pustaka Pengetahuan Google.

Pelacakan objek vs. deteksi label

Pelacakan objek berbeda dengan deteksi label. Deteksi label memberikan label tanpa kotak pembatas, sedangkan pelacakan objek memberikan label setiap objek yang ada dalam video tertentu bersama dengan kotak pembatas setiap instance objek di setiap langkah waktu.

Beberapa instance dari jenis objek yang sama akan ditetapkan ke instance pesan ObjectTrackingAnnotation yang berbeda, dengan semua kemunculan jalur objek tertentu akan disimpan dalam instance ObjectTrackingAnnotation-nya sendiri. Misalnya, jika ada mobil merah dan mobil biru yang muncul selama 5 detik dalam video, permintaan pelacakan akan menampilkan dua instance ObjectTrackingAnnotation. Instance pertama akan berisi lokasi salah satu dari dua mobil, misalnya, mobil merah, sedangkan instance kedua akan berisi lokasi mobil lain.

Meminta pelacakan objek untuk video di Cloud Storage

Contoh berikut menunjukkan pelacakan objek pada file yang terletak di Cloud Storage.

REST

Mengirim permintaan proses

Berikut ini cara mengirim permintaan POST ke metode annotate. Contoh ini menggunakan token akses untuk akun layanan yang disiapkan untuk project menggunakan Google Cloud CLI. Untuk mengetahui petunjuk cara menginstal Google Cloud CLI, menyiapkan project dengan akun layanan, dan mendapatkan token akses, lihat Panduan memulai Video Intelligence.

Sebelum menggunakan salah satu data permintaan, lakukan penggantian berikut:

  • INPUT_URI: STORAGE_URI
    Contoh:
    "inputUri": "gs://cloud-videointelligence-demo/assistant.mp4",
  • PROJECT_NUMBER: ID numerik untuk project Google Cloud Anda

Metode HTTP dan URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

Meminta isi JSON:

{
  "inputUri": "STORAGE_URI",
  "features": ["OBJECT_TRACKING"]
}

Untuk mengirim permintaan Anda, perluas salah satu opsi berikut:

Anda akan melihat respons JSON seperti berikut:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID"
}

Jika permintaan berhasil, Video Intelligence API akan menampilkan name operasi Anda. Di atas menunjukkan contoh respons tersebut, dengan PROJECT_NUMBER adalah nomor project Anda dan OPERATION_ID adalah ID operasi yang berjalan lama yang dibuat untuk permintaan tersebut.

Mendapatkan hasil

Untuk mendapatkan hasil permintaan, kirim GET menggunakan nama operasi yang ditampilkan dari panggilan ke videos:annotate, seperti yang ditunjukkan dalam contoh berikut.

Sebelum menggunakan salah satu data permintaan, lakukan penggantian berikut:

  • OPERATION_NAME: nama operasi seperti yang ditampilkan oleh Video Intelligence API. Nama operasi memiliki format projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
  • PROJECT_NUMBER: ID numerik untuk project Google Cloud Anda

Metode HTTP dan URL:

GET https://videointelligence.googleapis.com/v1/OPERATION_NAME

Untuk mengirim permintaan, perluas salah satu opsi berikut:

Anda akan menerima respons JSON yang mirip dengan yang berikut ini:

Download hasil anotasi

Salin anotasi dari sumber ke bucket tujuan: (lihat Menyalin file dan objek)

gsutil cp gcs_uri gs://my-bucket

Catatan: Jika output gcs uri disediakan oleh pengguna, anotasi akan disimpan dalam uri gcs tersebut.

Go


import (
	"context"
	"fmt"
	"io"

	video "cloud.google.com/go/videointelligence/apiv1"
	videopb "cloud.google.com/go/videointelligence/apiv1/videointelligencepb"
	"github.com/golang/protobuf/ptypes"
)

// objectTrackingGCS analyzes a video and extracts entities with their bounding boxes.
func objectTrackingGCS(w io.Writer, gcsURI string) error {
	// gcsURI := "gs://cloud-samples-data/video/cat.mp4"

	ctx := context.Background()

	// Creates a client.
	client, err := video.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("video.NewClient: %w", err)
	}
	defer client.Close()

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		InputUri: gcsURI,
		Features: []videopb.Feature{
			videopb.Feature_OBJECT_TRACKING,
		},
	})
	if err != nil {
		return fmt.Errorf("AnnotateVideo: %w", err)
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		return fmt.Errorf("Wait: %w", err)
	}

	// Only one video was processed, so get the first result.
	result := resp.GetAnnotationResults()[0]

	for _, annotation := range result.ObjectAnnotations {
		fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
		if len(annotation.Entity.EntityId) > 0 {
			fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
		}

		segment := annotation.GetSegment()
		start, _ := ptypes.Duration(segment.GetStartTimeOffset())
		end, _ := ptypes.Duration(segment.GetEndTimeOffset())
		fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)

		fmt.Fprintf(w, "\tConfidence: %f\n", annotation.GetConfidence())

		// Here we print only the bounding box of the first frame in this segment.
		frame := annotation.GetFrames()[0]
		seconds := float32(frame.GetTimeOffset().GetSeconds())
		nanos := float32(frame.GetTimeOffset().GetNanos())
		fmt.Fprintf(w, "\tTime offset of the first frame: %fs\n", seconds+nanos/1e9)

		box := frame.GetNormalizedBoundingBox()
		fmt.Fprintf(w, "\tBounding box position:\n")
		fmt.Fprintf(w, "\t\tleft  : %f\n", box.GetLeft())
		fmt.Fprintf(w, "\t\ttop   : %f\n", box.GetTop())
		fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
		fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
	}

	return nil
}

Java

/**
 * Track objects in a video.
 *
 * @param gcsUri the path to the video file to analyze.
 */
public static VideoAnnotationResults trackObjectsGcs(String gcsUri) throws Exception {
  try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
    // Create the request
    AnnotateVideoRequest request =
        AnnotateVideoRequest.newBuilder()
            .setInputUri(gcsUri)
            .addFeatures(Feature.OBJECT_TRACKING)
            .setLocationId("us-east1")
            .build();

    // asynchronously perform object tracking on videos
    OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
        client.annotateVideoAsync(request);

    System.out.println("Waiting for operation to complete...");
    // The first result is retrieved because a single video was processed.
    AnnotateVideoResponse response = future.get(450, TimeUnit.SECONDS);
    VideoAnnotationResults results = response.getAnnotationResults(0);

    // Get only the first annotation for demo purposes.
    ObjectTrackingAnnotation annotation = results.getObjectAnnotations(0);
    System.out.println("Confidence: " + annotation.getConfidence());

    if (annotation.hasEntity()) {
      Entity entity = annotation.getEntity();
      System.out.println("Entity description: " + entity.getDescription());
      System.out.println("Entity id:: " + entity.getEntityId());
    }

    if (annotation.hasSegment()) {
      VideoSegment videoSegment = annotation.getSegment();
      Duration startTimeOffset = videoSegment.getStartTimeOffset();
      Duration endTimeOffset = videoSegment.getEndTimeOffset();
      // Display the segment time in seconds, 1e9 converts nanos to seconds
      System.out.println(
          String.format(
              "Segment: %.2fs to %.2fs",
              startTimeOffset.getSeconds() + startTimeOffset.getNanos() / 1e9,
              endTimeOffset.getSeconds() + endTimeOffset.getNanos() / 1e9));
    }

    // Here we print only the bounding box of the first frame in this segment.
    ObjectTrackingFrame frame = annotation.getFrames(0);
    // Display the offset time in seconds, 1e9 converts nanos to seconds
    Duration timeOffset = frame.getTimeOffset();
    System.out.println(
        String.format(
            "Time offset of the first frame: %.2fs",
            timeOffset.getSeconds() + timeOffset.getNanos() / 1e9));

    // Display the bounding box of the detected object
    NormalizedBoundingBox normalizedBoundingBox = frame.getNormalizedBoundingBox();
    System.out.println("Bounding box position:");
    System.out.println("\tleft: " + normalizedBoundingBox.getLeft());
    System.out.println("\ttop: " + normalizedBoundingBox.getTop());
    System.out.println("\tright: " + normalizedBoundingBox.getRight());
    System.out.println("\tbottom: " + normalizedBoundingBox.getBottom());
    return results;
  }
}

Node.js

Untuk mengautentikasi ke Video Intelligence, siapkan Kredensial Default Aplikasi. Untuk mengetahui informasi selengkapnya, baca Menyiapkan autentikasi untuk lingkungan pengembangan lokal.

// Imports the Google Cloud Video Intelligence library
const Video = require('@google-cloud/video-intelligence');

// Creates a client
const video = new Video.VideoIntelligenceServiceClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';

const request = {
  inputUri: gcsUri,
  features: ['OBJECT_TRACKING'],
  //recommended to use us-east1 for the best latency due to different types of processors used in this region and others
  locationId: 'us-east1',
};
// Detects objects in a video
const [operation] = await video.annotateVideo(request);
const results = await operation.promise();
console.log('Waiting for operation to complete...');
//Gets annotations for video
const annotations = results[0].annotationResults[0];
const objects = annotations.objectAnnotations;
objects.forEach(object => {
  console.log(`Entity description:  ${object.entity.description}`);
  console.log(`Entity id: ${object.entity.entityId}`);
  const time = object.segment;
  console.log(
    `Segment: ${time.startTimeOffset.seconds || 0}` +
      `.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s to ${
        time.endTimeOffset.seconds || 0
      }.` +
      `${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log(`Confidence: ${object.confidence}`);
  const frame = object.frames[0];
  const box = frame.normalizedBoundingBox;
  const timeOffset = frame.timeOffset;
  console.log(
    `Time offset for the first frame: ${timeOffset.seconds || 0}` +
      `.${(timeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log('Bounding box position:');
  console.log(` left   :${box.left}`);
  console.log(` top    :${box.top}`);
  console.log(` right  :${box.right}`);
  console.log(` bottom :${box.bottom}`);
});

Python

"""Object tracking in a video stored on GCS."""
from google.cloud import videointelligence

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.OBJECT_TRACKING]
operation = video_client.annotate_video(
    request={"features": features, "input_uri": gcs_uri}
)
print("\nProcessing video for object annotations.")

result = operation.result(timeout=500)
print("\nFinished processing.\n")

# The first result is retrieved because a single video was processed.
object_annotations = result.annotation_results[0].object_annotations

for object_annotation in object_annotations:
    print("Entity description: {}".format(object_annotation.entity.description))
    if object_annotation.entity.entity_id:
        print("Entity id: {}".format(object_annotation.entity.entity_id))

    print(
        "Segment: {}s to {}s".format(
            object_annotation.segment.start_time_offset.seconds
            + object_annotation.segment.start_time_offset.microseconds / 1e6,
            object_annotation.segment.end_time_offset.seconds
            + object_annotation.segment.end_time_offset.microseconds / 1e6,
        )
    )

    print("Confidence: {}".format(object_annotation.confidence))

    # Here we print only the bounding box of the first frame in the segment
    frame = object_annotation.frames[0]
    box = frame.normalized_bounding_box
    print(
        "Time offset of the first frame: {}s".format(
            frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
        )
    )
    print("Bounding box position:")
    print("\tleft  : {}".format(box.left))
    print("\ttop   : {}".format(box.top))
    print("\tright : {}".format(box.right))
    print("\tbottom: {}".format(box.bottom))
    print("\n")

Bahasa tambahan

C#: Ikuti Petunjuk penyiapan C# di halaman library klien lalu buka Dokumentasi referensi Video Intelligence untuk .NET.

PHP: Ikuti petunjuk penyiapan PHP di halaman library klien lalu kunjungi Dokumentasi referensi Video Intelligence untuk PHP.

Ruby: Ikuti petunjuk penyiapan Ruby di halaman library klien, lalu buka Dokumentasi referensi Video Intelligence untuk Ruby.

Meminta pelacakan objek untuk video dari file lokal

Contoh berikut menunjukkan pelacakan objek pada file yang disimpan secara lokal.

REST

Mengirim permintaan proses

Untuk menjalankan anotasi pada file video lokal, lakukan enkode base64 atas konten file video. Sertakan konten berenkode base64 di kolom inputContent pada permintaan. Untuk mengetahui informasi tentang cara mengenkode konten file video dengan base64, lihat Encoding Base64.

Berikut ini cara mengirim permintaan POST ke metode videos:annotate. Contoh ini menggunakan token akses untuk akun layanan yang disiapkan untuk project menggunakan Google Cloud CLI. Untuk mengetahui petunjuk cara menginstal Google Cloud CLI, menyiapkan project dengan akun layanan, dan mendapatkan token akses, lihat Panduan memulai Video Intelligence.

Sebelum menggunakan salah satu data permintaan, lakukan penggantian berikut:

  • inputContent: BASE64_ENCODED_CONTENT
    Contoh: "UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA..."
  • PROJECT_NUMBER: ID numerik untuk project Google Cloud Anda

Metode HTTP dan URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

Meminta isi JSON:

{
  "inputContent": "BASE64_ENCODED_CONTENT",
  "features": ["OBJECT_TRACKING"]
}

Untuk mengirim permintaan Anda, perluas salah satu opsi berikut:

Anda akan melihat respons JSON seperti berikut:

Jika permintaan berhasil, Video Intelligence name untuk operasi Anda. Berikut ini contoh respons tersebut, dengan PROJECT_NUMBER adalah nomor project Anda dan OPERATION_ID adalah ID operasi jangka panjang yang dibuat untuk permintaan tersebut.

Mendapatkan hasil

Untuk mendapatkan hasil permintaan, Anda harus mengirim GET, menggunakan nama operasi yang ditampilkan dari panggilan ke videos:annotate, seperti yang ditunjukkan dalam contoh berikut.

Sebelum menggunakan salah satu data permintaan, lakukan penggantian berikut:

  • OPERATION_NAME: nama operasi seperti yang ditampilkan oleh Video Intelligence API. Nama operasi memiliki format projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
  • PROJECT_NUMBER: ID numerik untuk project Google Cloud Anda

Metode HTTP dan URL:

GET https://videointelligence.googleapis.com/v1/OPERATION_NAME

Untuk mengirim permintaan, perluas salah satu opsi berikut:

Anda akan menerima respons JSON yang mirip dengan yang berikut ini:

Go


import (
	"context"
	"fmt"
	"io"
	"io/ioutil"

	video "cloud.google.com/go/videointelligence/apiv1"
	videopb "cloud.google.com/go/videointelligence/apiv1/videointelligencepb"
	"github.com/golang/protobuf/ptypes"
)

// objectTracking analyzes a video and extracts entities with their bounding boxes.
func objectTracking(w io.Writer, filename string) error {
	// filename := "../testdata/cat.mp4"

	ctx := context.Background()

	// Creates a client.
	client, err := video.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("video.NewClient: %w", err)
	}
	defer client.Close()

	fileBytes, err := ioutil.ReadFile(filename)
	if err != nil {
		return err
	}

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		InputContent: fileBytes,
		Features: []videopb.Feature{
			videopb.Feature_OBJECT_TRACKING,
		},
	})
	if err != nil {
		return fmt.Errorf("AnnotateVideo: %w", err)
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		return fmt.Errorf("Wait: %w", err)
	}

	// Only one video was processed, so get the first result.
	result := resp.GetAnnotationResults()[0]

	for _, annotation := range result.ObjectAnnotations {
		fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
		if len(annotation.Entity.EntityId) > 0 {
			fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
		}

		segment := annotation.GetSegment()
		start, _ := ptypes.Duration(segment.GetStartTimeOffset())
		end, _ := ptypes.Duration(segment.GetEndTimeOffset())
		fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)

		fmt.Fprintf(w, "\tConfidence: %f\n", annotation.GetConfidence())

		// Here we print only the bounding box of the first frame in this segment.
		frame := annotation.GetFrames()[0]
		seconds := float32(frame.GetTimeOffset().GetSeconds())
		nanos := float32(frame.GetTimeOffset().GetNanos())
		fmt.Fprintf(w, "\tTime offset of the first frame: %fs\n", seconds+nanos/1e9)

		box := frame.GetNormalizedBoundingBox()
		fmt.Fprintf(w, "\tBounding box position:\n")
		fmt.Fprintf(w, "\t\tleft  : %f\n", box.GetLeft())
		fmt.Fprintf(w, "\t\ttop   : %f\n", box.GetTop())
		fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
		fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
	}

	return nil
}

Java

/**
 * Track objects in a video.
 *
 * @param filePath the path to the video file to analyze.
 */
public static VideoAnnotationResults trackObjects(String filePath) throws Exception {
  try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
    // Read file
    Path path = Paths.get(filePath);
    byte[] data = Files.readAllBytes(path);

    // Create the request
    AnnotateVideoRequest request =
        AnnotateVideoRequest.newBuilder()
            .setInputContent(ByteString.copyFrom(data))
            .addFeatures(Feature.OBJECT_TRACKING)
            .setLocationId("us-east1")
            .build();

    // asynchronously perform object tracking on videos
    OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
        client.annotateVideoAsync(request);

    System.out.println("Waiting for operation to complete...");
    // The first result is retrieved because a single video was processed.
    AnnotateVideoResponse response = future.get(450, TimeUnit.SECONDS);
    VideoAnnotationResults results = response.getAnnotationResults(0);

    // Get only the first annotation for demo purposes.
    ObjectTrackingAnnotation annotation = results.getObjectAnnotations(0);
    System.out.println("Confidence: " + annotation.getConfidence());

    if (annotation.hasEntity()) {
      Entity entity = annotation.getEntity();
      System.out.println("Entity description: " + entity.getDescription());
      System.out.println("Entity id:: " + entity.getEntityId());
    }

    if (annotation.hasSegment()) {
      VideoSegment videoSegment = annotation.getSegment();
      Duration startTimeOffset = videoSegment.getStartTimeOffset();
      Duration endTimeOffset = videoSegment.getEndTimeOffset();
      // Display the segment time in seconds, 1e9 converts nanos to seconds
      System.out.println(
          String.format(
              "Segment: %.2fs to %.2fs",
              startTimeOffset.getSeconds() + startTimeOffset.getNanos() / 1e9,
              endTimeOffset.getSeconds() + endTimeOffset.getNanos() / 1e9));
    }

    // Here we print only the bounding box of the first frame in this segment.
    ObjectTrackingFrame frame = annotation.getFrames(0);
    // Display the offset time in seconds, 1e9 converts nanos to seconds
    Duration timeOffset = frame.getTimeOffset();
    System.out.println(
        String.format(
            "Time offset of the first frame: %.2fs",
            timeOffset.getSeconds() + timeOffset.getNanos() / 1e9));

    // Display the bounding box of the detected object
    NormalizedBoundingBox normalizedBoundingBox = frame.getNormalizedBoundingBox();
    System.out.println("Bounding box position:");
    System.out.println("\tleft: " + normalizedBoundingBox.getLeft());
    System.out.println("\ttop: " + normalizedBoundingBox.getTop());
    System.out.println("\tright: " + normalizedBoundingBox.getRight());
    System.out.println("\tbottom: " + normalizedBoundingBox.getBottom());
    return results;
  }
}

Node.js

Untuk mengautentikasi ke Video Intelligence, siapkan Kredensial Default Aplikasi. Untuk mengetahui informasi selengkapnya, baca Menyiapkan autentikasi untuk lingkungan pengembangan lokal.

// Imports the Google Cloud Video Intelligence library
const Video = require('@google-cloud/video-intelligence');
const fs = require('fs');
const util = require('util');
// Creates a client
const video = new Video.VideoIntelligenceServiceClient();
/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const path = 'Local file to analyze, e.g. ./my-file.mp4';

// Reads a local video file and converts it to base64
const file = await util.promisify(fs.readFile)(path);
const inputContent = file.toString('base64');

const request = {
  inputContent: inputContent,
  features: ['OBJECT_TRACKING'],
  //recommended to use us-east1 for the best latency due to different types of processors used in this region and others
  locationId: 'us-east1',
};
// Detects objects in a video
const [operation] = await video.annotateVideo(request);
const results = await operation.promise();
console.log('Waiting for operation to complete...');
//Gets annotations for video
const annotations = results[0].annotationResults[0];
const objects = annotations.objectAnnotations;
objects.forEach(object => {
  console.log(`Entity description:  ${object.entity.description}`);
  console.log(`Entity id: ${object.entity.entityId}`);
  const time = object.segment;
  console.log(
    `Segment: ${time.startTimeOffset.seconds || 0}` +
      `.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s to ${
        time.endTimeOffset.seconds || 0
      }.` +
      `${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log(`Confidence: ${object.confidence}`);
  const frame = object.frames[0];
  const box = frame.normalizedBoundingBox;
  const timeOffset = frame.timeOffset;
  console.log(
    `Time offset for the first frame: ${timeOffset.seconds || 0}` +
      `.${(timeOffset.nanos / 1e6).toFixed(0)}s`
  );
  console.log('Bounding box position:');
  console.log(` left   :${box.left}`);
  console.log(` top    :${box.top}`);
  console.log(` right  :${box.right}`);
  console.log(` bottom :${box.bottom}`);
});

Python

"""Object tracking in a local video."""
from google.cloud import videointelligence

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.OBJECT_TRACKING]

with io.open(path, "rb") as file:
    input_content = file.read()

operation = video_client.annotate_video(
    request={"features": features, "input_content": input_content}
)
print("\nProcessing video for object annotations.")

result = operation.result(timeout=500)
print("\nFinished processing.\n")

# The first result is retrieved because a single video was processed.
object_annotations = result.annotation_results[0].object_annotations

# Get only the first annotation for demo purposes.
object_annotation = object_annotations[0]
print("Entity description: {}".format(object_annotation.entity.description))
if object_annotation.entity.entity_id:
    print("Entity id: {}".format(object_annotation.entity.entity_id))

print(
    "Segment: {}s to {}s".format(
        object_annotation.segment.start_time_offset.seconds
        + object_annotation.segment.start_time_offset.microseconds / 1e6,
        object_annotation.segment.end_time_offset.seconds
        + object_annotation.segment.end_time_offset.microseconds / 1e6,
    )
)

print("Confidence: {}".format(object_annotation.confidence))

# Here we print only the bounding box of the first frame in this segment
frame = object_annotation.frames[0]
box = frame.normalized_bounding_box
print(
    "Time offset of the first frame: {}s".format(
        frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
    )
)
print("Bounding box position:")
print("\tleft  : {}".format(box.left))
print("\ttop   : {}".format(box.top))
print("\tright : {}".format(box.right))
print("\tbottom: {}".format(box.bottom))
print("\n")

Bahasa tambahan

C#: Ikuti Petunjuk penyiapan C# di halaman library klien lalu buka Dokumentasi referensi Video Intelligence untuk .NET.

PHP: Ikuti petunjuk penyiapan PHP di halaman library klien lalu kunjungi Dokumentasi referensi Video Intelligence untuk PHP.

Ruby: Ikuti petunjuk penyiapan Ruby di halaman library klien, lalu buka Dokumentasi referensi Video Intelligence untuk Ruby.