Analyzing videos for labels

Label analysis detects labels in videos.

This section shows ways to analyze a video for such labels.

The following example analyzes a local video file for labels.

Looking for something more in-depth? See our detailed Python tutorial.

Protocol

See the videos:annotate API endpoint for full details.

To perform label detection, make a POST request and provide the appropriate request body:

POST https://videointelligence.googleapis.com/v1/videos:annotate?key=YOUR_API_KEY
{
  "inputContent": "/9j/7QBEUGhvdG9zaG9...base64-encoded-video-content...fXNWzvDEeYxxxzj/Coa6Bax//Z",
  "features": ["LABEL_DETECTION"]
}
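
The request body above can be assembled from a local file with a few lines of Python. This is a minimal sketch, not part of the official samples: the helper name build_annotate_request is hypothetical, and the only assumption about the API is what the request above shows, namely that inputContent carries the video as base64 text.

```python
import base64
import json


def build_annotate_request(video_bytes: bytes) -> str:
    """Return the JSON body for a videos:annotate label-detection request."""
    body = {
        # JSON cannot carry raw bytes, so the video is sent as base64 text.
        "inputContent": base64.b64encode(video_bytes).decode("ascii"),
        "features": ["LABEL_DETECTION"],
    }
    return json.dumps(body)


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real video file.
    print(build_annotate_request(b"not a real video"))
```

In practice the bytes would come from reading the video file in binary mode, and the resulting string would be sent as the POST body.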

A successful Video Intelligence annotation request returns a response with a single name field:

{
  "name": "us-west1.16680573"
}

This name represents a long-running operation, which can be queried using the v1.operations API.

To retrieve your video annotation response, send a GET request to the v1.operations endpoint, passing the value of name in the URL. If the operation has completed, it returns your annotation results.
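
The polling step can be sketched as a small loop that repeats the GET request until the operation reports done. This is an illustration under stated assumptions: the URL layout in OPERATIONS_ENDPOINT is assumed rather than taken from this page, and operation_url, wait_for_operation, and the fetch callback are hypothetical names. The HTTP call is passed in as a function so the loop itself stays independent of any HTTP library.

```python
import time
from typing import Callable
from urllib.parse import urlencode

# Assumed URL layout for the operations endpoint; check the API reference.
OPERATIONS_ENDPOINT = "https://videointelligence.googleapis.com/v1/operations"


def operation_url(name: str, api_key: str) -> str:
    """Build the GET URL for one long-running operation."""
    return f"{OPERATIONS_ENDPOINT}/{name}?{urlencode({'key': api_key})}"


def wait_for_operation(fetch: Callable[[str], dict], url: str,
                       interval_s: float = 5.0, max_polls: int = 60) -> dict:
    """Call fetch(url) until the operation reports done, then return it."""
    for _ in range(max_polls):
        operation = fetch(url)
        # The "done" field is absent or false while the video is processing.
        if operation.get("done"):
            return operation
        time.sleep(interval_s)
    raise TimeoutError("operation did not finish within the polling budget")
```

One possible fetch is lambda url: json.load(urllib.request.urlopen(url)) from the standard library. Any function that performs the GET request and returns the parsed JSON works equally well.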

Label detection annotations are returned in annotationResults. For example:

{
  "name": "us-east1.7397809392042093732",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputContent": "/9j/7QBEUGhvdG9zaG9...base64-encoded-video-content...fXNWzvDEeYxxxzj/Coa6Bax//Z",
        "progressPercent": 100,
        "startTime": "2017-05-18T21:14:35.235527Z",
        "updateTime": "2017-05-18T21:14:42.665369Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
          {
            "inputUri": "/demomaker/cat.mp4",
            "segmentLabelAnnotations": [
              {
                "entity": {
                  "entityId": "/m/01yrx",
                  "description": "cat",
                  "languageCode": "en-US"
                },
                "categoryEntities": [
                  {
                    "entityId": "/m/068hy",
                    "description": "pet",
                    "languageCode": "en-US"
                  }
                ],
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.98509187
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/0jbk",
                  "description": "animal",
                  "languageCode": "en-US"
                },
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.9809588
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/068hy",
                  "description": "pet",
                  "languageCode": "en-US"
                },
                "categoryEntities": [
                  {
                    "entityId": "/m/0jbk",
                    "description": "animal",
                    "languageCode": "en-US"
                  }
                ],
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.9382622
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/05h0n",
                  "description": "nature",
                  "languageCode": "en-US"
                },
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.8411303
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/07k6w8",
                  "description": "small to medium sized cats",
                  "languageCode": "en-US"
                },
                "categoryEntities": [
                  {
                    "entityId": "/m/04rky",
                    "description": "mammal",
                    "languageCode": "en-US"
                  }
                ],
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.8077077
                  }
                ]
              },
             <snip>
            ]
          }
        ]
  }
}
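
Once the operation response has been parsed from JSON, the segment-level labels can be flattened into rows. The sketch below relies only on the field names visible in the example response; the helper names offset_to_seconds and summarize_segment_labels are hypothetical, and a missing offset is assumed to mean zero.

```python
def offset_to_seconds(offset: str) -> float:
    """Convert a JSON duration such as "14.833664s" to seconds."""
    return float(offset.rstrip("s"))


def summarize_segment_labels(operation: dict) -> list:
    """Return (description, start_s, end_s, confidence) tuples."""
    rows = []
    for result in operation.get("response", {}).get("annotationResults", []):
        for label in result.get("segmentLabelAnnotations", []):
            for item in label.get("segments", []):
                segment = item["segment"]
                rows.append((
                    label["entity"]["description"],
                    # Assumption: an omitted offset means the start of the video.
                    offset_to_seconds(segment.get("startTimeOffset", "0s")),
                    offset_to_seconds(segment.get("endTimeOffset", "0s")),
                    item["confidence"],
                ))
    return rows


if __name__ == "__main__":
    # A one-label excerpt of the example response shown above.
    sample = {"response": {"annotationResults": [{"segmentLabelAnnotations": [{
        "entity": {"description": "cat"},
        "segments": [{
            "segment": {"startTimeOffset": "0s", "endTimeOffset": "14.833664s"},
            "confidence": 0.98509187,
        }],
    }]}]}}
    for description, start, end, confidence in summarize_segment_labels(sample):
        print(f"{description}: {start}s to {end}s (confidence {confidence})")
```

An operation that has not finished yet has no response field, in which case the function returns an empty list.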

C#

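using Google.Cloud.VideoIntelligence.V1;
using System;
using System.Collections.Generic;
using System.IO;

public static object AnalyzeLabels(string path)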
{
    var client = VideoIntelligenceServiceClient.Create();
    var request = new AnnotateVideoRequest()
    {
        InputContent = Google.Protobuf.ByteString.CopyFrom(File.ReadAllBytes(path)),
        Features = { Feature.LabelDetection }
    };
    var op = client.AnnotateVideo(request).PollUntilCompleted();
    foreach (var result in op.Result.AnnotationResults)
    {
        PrintLabels("Video", result.SegmentLabelAnnotations);
        PrintLabels("Shot", result.ShotLabelAnnotations);
        PrintLabels("Frame", result.FrameLabelAnnotations);
    }
    return 0;
}

static void PrintLabels(string labelName,
    IEnumerable<LabelAnnotation> labelAnnotations)
{
    foreach (var annotation in labelAnnotations)
    {
        Console.WriteLine($"{labelName} label: {annotation.Entity.Description}");
        foreach (var entity in annotation.CategoryEntities)
        {
            Console.WriteLine($"{labelName} label category: {entity.Description}");
        }
        foreach (var segment in annotation.Segments)
        {
            Console.Write("Segment location: ");
            Console.Write(segment.Segment.StartTimeOffset);
            Console.Write(":");
            Console.WriteLine(segment.Segment.EndTimeOffset);
            System.Console.WriteLine($"Confidence: {segment.Confidence}");
        }
    }
}

Go

func label(w io.Writer, file string) error {
	ctx := context.Background()
	client, err := video.NewClient(ctx)
	if err != nil {
		return err
	}

	fileBytes, err := ioutil.ReadFile(file)
	if err != nil {
		return err
	}

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		Features: []videopb.Feature{
			videopb.Feature_LABEL_DETECTION,
		},
		InputContent: fileBytes,
	})
	if err != nil {
		return err
	}
	resp, err := op.Wait(ctx)
	if err != nil {
		return err
	}

	printLabels := func(labels []*videopb.LabelAnnotation) {
		for _, label := range labels {
			fmt.Fprintf(w, "\tDescription: %s\n", label.Entity.Description)
			for _, category := range label.CategoryEntities {
				fmt.Fprintf(w, "\t\tCategory: %s\n", category.Description)
			}
			for _, segment := range label.Segments {
				start, _ := ptypes.Duration(segment.Segment.StartTimeOffset)
				end, _ := ptypes.Duration(segment.Segment.EndTimeOffset)
				fmt.Fprintf(w, "\t\tSegment: %s to %s\n", start, end)
			}
		}
	}

	// A single video was processed. Get the first result.
	result := resp.AnnotationResults[0]

	fmt.Fprintln(w, "SegmentLabelAnnotations:")
	printLabels(result.SegmentLabelAnnotations)
	fmt.Fprintln(w, "ShotLabelAnnotations:")
	printLabels(result.ShotLabelAnnotations)
	fmt.Fprintln(w, "FrameLabelAnnotations:")
	printLabels(result.FrameLabelAnnotations)

	return nil
}

Java

// Instantiate a com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
  // Read the file's raw bytes; the bytes field needs no manual Base64 encoding
  Path path = Paths.get(filePath);
  byte[] data = Files.readAllBytes(path);

  AnnotateVideoRequest request = AnnotateVideoRequest.newBuilder()
      .setInputContent(ByteString.copyFrom(data))
      .addFeatures(Feature.LABEL_DETECTION)
      .build();

  // Create an operation that will contain the response when the operation completes.
  OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> response =
      client.annotateVideoAsync(request);

  System.out.println("Waiting for operation to complete...");
  for (VideoAnnotationResults results : response.get().getAnnotationResultsList()) {
    // process video / segment level label annotations
    System.out.println("Locations: ");
    for (LabelAnnotation labelAnnotation : results.getSegmentLabelAnnotationsList()) {
      System.out
          .println("Video label: " + labelAnnotation.getEntity().getDescription());
      // categories
      for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
        System.out.println("Video label category: " + categoryEntity.getDescription());
      }
      // segments
      for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
        double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
            + segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
        double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
            + segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
        System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
        System.out.println("Confidence: " + segment.getConfidence());
      }
    }

    // process shot label annotations
    for (LabelAnnotation labelAnnotation : results.getShotLabelAnnotationsList()) {
      System.out
          .println("Shot label: " + labelAnnotation.getEntity().getDescription());
      // categories
      for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
        System.out.println("Shot label category: " + categoryEntity.getDescription());
      }
      // segments
      for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
        double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
            + segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
        double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
            + segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
        System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
        System.out.println("Confidence: " + segment.getConfidence());
      }
    }

    // process frame label annotations
    for (LabelAnnotation labelAnnotation : results.getFrameLabelAnnotationsList()) {
      System.out
          .println("Frame label: " + labelAnnotation.getEntity().getDescription());
      // categories
      for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
        System.out.println("Frame label category: " + categoryEntity.getDescription());
      }
      // segments
      for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
        double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
            + segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
        double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
            + segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
        System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
        System.out.println("Confidence: " + segment.getConfidence());
      }
    }
  }
}

Node.js

// Imports the Google Cloud Video Intelligence library + Node's fs library
const video = require('@google-cloud/video-intelligence').v1;
const fs = require('fs');

// Creates a client
const client = new video.VideoIntelligenceServiceClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const path = 'Local file to analyze, e.g. ./my-file.mp4';

// Reads a local video file and converts it to base64
const file = fs.readFileSync(path);
const inputContent = file.toString('base64');

// Constructs request
const request = {
  inputContent: inputContent,
  features: ['LABEL_DETECTION'],
};

// Detects labels in a video
client
  .annotateVideo(request)
  .then(results => {
    const operation = results[0];
    console.log('Waiting for operation to complete...');
    return operation.promise();
  })
  .then(results => {
    // Gets annotations for video
    const annotations = results[0].annotationResults[0];

    const labels = annotations.segmentLabelAnnotations;
    labels.forEach(label => {
      console.log(`Label ${label.entity.description} occurs at:`);
      label.segments.forEach(segment => {
        let time = segment.segment;
        if (time.startTimeOffset.seconds === undefined) {
          time.startTimeOffset.seconds = 0;
        }
        if (time.startTimeOffset.nanos === undefined) {
          time.startTimeOffset.nanos = 0;
        }
        if (time.endTimeOffset.seconds === undefined) {
          time.endTimeOffset.seconds = 0;
        }
        if (time.endTimeOffset.nanos === undefined) {
          time.endTimeOffset.nanos = 0;
        }
        console.log(
          `\tStart: ${time.startTimeOffset.seconds}` +
            `.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s`
        );
        console.log(
          `\tEnd: ${time.endTimeOffset.seconds}.` +
            `${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
        );
        console.log(`\tConfidence: ${segment.confidence}`);
      });
    });
  })
  .catch(err => {
    console.error('ERROR:', err);
  });

Python

For more information about installing and using the Cloud Video Intelligence API client library for Python, see the Cloud Video Intelligence API client libraries.

import io

from google.cloud import videointelligence


def analyze_labels_file(path):
    """Detect labels given a file path."""
    video_client = videointelligence.VideoIntelligenceServiceClient()
    features = [videointelligence.enums.Feature.LABEL_DETECTION]

    with io.open(path, 'rb') as movie:
        input_content = movie.read()

    operation = video_client.annotate_video(
        features=features, input_content=input_content)
    print('\nProcessing video for label annotations:')

    result = operation.result(timeout=90)
    print('\nFinished processing.')

    # Process video/segment level label annotations
    segment_labels = result.annotation_results[0].segment_label_annotations
    for i, segment_label in enumerate(segment_labels):
        print('Video label description: {}'.format(
            segment_label.entity.description))
        for category_entity in segment_label.category_entities:
            print('\tLabel category description: {}'.format(
                category_entity.description))

        for i, segment in enumerate(segment_label.segments):
            start_time = (segment.segment.start_time_offset.seconds +
                          segment.segment.start_time_offset.nanos / 1e9)
            end_time = (segment.segment.end_time_offset.seconds +
                        segment.segment.end_time_offset.nanos / 1e9)
            positions = '{}s to {}s'.format(start_time, end_time)
            confidence = segment.confidence
            print('\tSegment {}: {}'.format(i, positions))
            print('\tConfidence: {}'.format(confidence))
        print('\n')

    # Process shot level label annotations
    shot_labels = result.annotation_results[0].shot_label_annotations
    for i, shot_label in enumerate(shot_labels):
        print('Shot label description: {}'.format(
            shot_label.entity.description))
        for category_entity in shot_label.category_entities:
            print('\tLabel category description: {}'.format(
                category_entity.description))

        for i, shot in enumerate(shot_label.segments):
            start_time = (shot.segment.start_time_offset.seconds +
                          shot.segment.start_time_offset.nanos / 1e9)
            end_time = (shot.segment.end_time_offset.seconds +
                        shot.segment.end_time_offset.nanos / 1e9)
            positions = '{}s to {}s'.format(start_time, end_time)
            confidence = shot.confidence
            print('\tSegment {}: {}'.format(i, positions))
            print('\tConfidence: {}'.format(confidence))
        print('\n')

    # Process frame level label annotations
    frame_labels = result.annotation_results[0].frame_label_annotations
    for i, frame_label in enumerate(frame_labels):
        print('Frame label description: {}'.format(
            frame_label.entity.description))
        for category_entity in frame_label.category_entities:
            print('\tLabel category description: {}'.format(
                category_entity.description))

        # Each frame_label_annotation has many frames,
        # here we print information only about the first frame.
        frame = frame_label.frames[0]
        time_offset = frame.time_offset.seconds + frame.time_offset.nanos / 1e9
        print('\tFirst frame time offset: {}s'.format(time_offset))
        print('\tFirst frame confidence: {}'.format(frame.confidence))
        print('\n')

PHP

use Google\Cloud\VideoIntelligence\V1\VideoIntelligenceServiceClient;
use Google\Cloud\VideoIntelligence\V1\Feature;

/**
 * Finds labels in the video.
 *
 * @param string $uri The cloud storage object to analyze. Must be formatted
 *                    like gs://bucketname/objectname
 */
function analyze_labels($uri)
{
    # Instantiate a client.
    $video = new VideoIntelligenceServiceClient();

    # Execute a request.
    $operation = $video->annotateVideo([
        'inputUri' => $uri,
        'features' => [Feature::LABEL_DETECTION]
    ]);

    # Wait for the request to complete.
    $operation->pollUntilComplete();

    # Print the results.
    if ($operation->operationSucceeded()) {
        $results = $operation->getResult()->getAnnotationResults()[0];

        # Process video/segment level label annotations
        foreach ($results->getSegmentLabelAnnotations() as $label) {
            printf('Video label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
            foreach ($label->getCategoryEntities() as $categoryEntity) {
                printf('  Category: %s' . PHP_EOL, $categoryEntity->getDescription());
            }
            foreach ($label->getSegments() as $segment) {
                $startTimeOffset = $segment->getSegment()->getStartTimeOffset();
                $startSeconds = $startTimeOffset->getSeconds();
                $startNanoseconds = floatval($startTimeOffset->getNanos())/1000000000.00;
                $startTime = $startSeconds + $startNanoseconds;
                $endTimeOffset = $segment->getSegment()->getEndTimeOffset();
                $endSeconds = $endTimeOffset->getSeconds();
                $endNanoseconds = floatval($endTimeOffset->getNanos())/1000000000.00;
                $endTime = $endSeconds + $endNanoseconds;
                printf('  Segment: %ss to %ss' . PHP_EOL, $startTime, $endTime);
                printf('  Confidence: %f' . PHP_EOL, $segment->getConfidence());
            }
        }
        print(PHP_EOL);

        # Process shot level label annotations
        foreach ($results->getShotLabelAnnotations() as $label) {
            printf('Shot label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
            foreach ($label->getCategoryEntities() as $categoryEntity) {
                printf('  Category: %s' . PHP_EOL, $categoryEntity->getDescription());
            }
            foreach ($label->getSegments() as $shot) {
                $startTimeOffset = $shot->getSegment()->getStartTimeOffset();
                $startSeconds = $startTimeOffset->getSeconds();
                $startNanoseconds = floatval($startTimeOffset->getNanos())/1000000000.00;
                $startTime = $startSeconds + $startNanoseconds;
                $endTimeOffset = $shot->getSegment()->getEndTimeOffset();
                $endSeconds = $endTimeOffset->getSeconds();
                $endNanoseconds = floatval($endTimeOffset->getNanos())/1000000000.00;
                $endTime = $endSeconds + $endNanoseconds;
                printf('  Shot: %ss to %ss' . PHP_EOL, $startTime, $endTime);
                printf('  Confidence: %f' . PHP_EOL, $shot->getConfidence());
            }
        }
        print(PHP_EOL);
    } else {
        print_r($operation->getError());
    }
}

Ruby

# path = "Path to a local video file: path/to/file.mp4"

require "google/cloud/video_intelligence"

video = Google::Cloud::VideoIntelligence.new

video_contents = File.binread path

# Register a callback during the method call
operation = video.annotate_video input_content: video_contents, features: [:LABEL_DETECTION] do |operation|
  raise operation.results.message if operation.error?
  puts "Finished Processing."

  labels = operation.results.annotation_results.first.segment_label_annotations

  labels.each do |label|
    puts "Label description: #{label.entity.description}"

    label.category_entities.each do |category_entity|
      puts "Label category description: #{category_entity.description}"
    end

    label.segments.each do |segment|
      start_time = ( segment.segment.start_time_offset.seconds +
                     segment.segment.start_time_offset.nanos / 1e9 )
      end_time =   ( segment.segment.end_time_offset.seconds +
                     segment.segment.end_time_offset.nanos / 1e9 )

      puts "Segment: #{start_time} to #{end_time}"
      puts "Confidence: #{segment.confidence}"
    end
  end
end

puts "Processing video for label annotations:"
operation.wait_until_done!

The following example analyzes labels in a video file located in Google Cloud Storage.

Protocol

See the videos:annotate API endpoint for full details.

To perform label detection, make a POST request and provide the appropriate request body:

POST https://videointelligence.googleapis.com/v1/videos:annotate?key=YOUR_API_KEY
{
  "inputUri": "gs://demomaker/cat.mp4",
  "features": ["LABEL_DETECTION"]
}
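
For a file in Cloud Storage, the request body carries a URI instead of the video bytes. The sketch below builds that body and rejects anything that does not start with gs://, since a bare path such as /demomaker/cat.mp4 is an easy mistake to make. The helper name build_gcs_annotate_request is hypothetical, and the prefix check is a client-side convenience of this sketch, not a documented API rule.

```python
import json


def build_gcs_annotate_request(gcs_uri: str) -> str:
    """Return the JSON body for annotating a video stored in Cloud Storage."""
    # Assumption of this sketch: Cloud Storage objects are addressed as gs://bucket/object.
    if not gcs_uri.startswith("gs://"):
        raise ValueError(f"expected a gs:// URI, got {gcs_uri!r}")
    return json.dumps({"inputUri": gcs_uri, "features": ["LABEL_DETECTION"]})


if __name__ == "__main__":
    print(build_gcs_annotate_request("gs://demomaker/cat.mp4"))
```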

A successful Video Intelligence annotation request returns a response with a single name field:

{
  "name": "us-west1.16680573"
}

This name represents a long-running operation, which can be queried using the v1.operations API.

To retrieve your video annotation response, send a GET request to the v1.operations endpoint, passing the value of name in the URL. If the operation has completed, it returns your annotation results.

Label detection annotations are returned in annotationResults. For example:

{
  "name": "us-east1.7397809392042093732",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "/demomaker/cat.mp4",
        "progressPercent": 100,
        "startTime": "2017-05-18T21:14:35.235527Z",
        "updateTime": "2017-05-18T21:14:42.665369Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
          {
            "inputUri": "/demomaker/cat.mp4",
            "segmentLabelAnnotations": [
              {
                "entity": {
                  "entityId": "/m/01yrx",
                  "description": "cat",
                  "languageCode": "en-US"
                },
                "categoryEntities": [
                  {
                    "entityId": "/m/068hy",
                    "description": "pet",
                    "languageCode": "en-US"
                  }
                ],
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.98509187
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/0jbk",
                  "description": "animal",
                  "languageCode": "en-US"
                },
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.9809588
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/068hy",
                  "description": "pet",
                  "languageCode": "en-US"
                },
                "categoryEntities": [
                  {
                    "entityId": "/m/0jbk",
                    "description": "animal",
                    "languageCode": "en-US"
                  }
                ],
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.9382622
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/05h0n",
                  "description": "nature",
                  "languageCode": "en-US"
                },
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.8411303
                  }
                ]
              },
              {
                "entity": {
                  "entityId": "/m/07k6w8",
                  "description": "small to medium sized cats",
                  "languageCode": "en-US"
                },
                "categoryEntities": [
                  {
                    "entityId": "/m/04rky",
                    "description": "mammal",
                    "languageCode": "en-US"
                  }
                ],
                "segments": [
                  {
                    "segment": {
                      "startTimeOffset": "0s",
                      "endTimeOffset": "14.833664s"
                    },
                    "confidence": 0.8077077
                  }
                ]
              },
             <snip>
            ]
          }
        ]
  }
}
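
The categoryEntities field in the response links each label to broader categories, for example cat to pet. A minimal sketch of grouping labels by category follows. It uses only field names visible in the example response; the function name labels_by_category and the "(none)" bucket for labels without categories are hypothetical choices made for this illustration.

```python
from collections import defaultdict


def labels_by_category(operation: dict) -> dict:
    """Map each category description to the labels filed under it."""
    grouped = defaultdict(list)
    for result in operation.get("response", {}).get("annotationResults", []):
        for label in result.get("segmentLabelAnnotations", []):
            name = label["entity"]["description"]
            # Labels without categoryEntities go under a hypothetical "(none)" key.
            categories = label.get("categoryEntities") or [{"description": "(none)"}]
            for category in categories:
                grouped[category["description"]].append(name)
    return dict(grouped)
```

The function takes the parsed operation response as a dictionary and returns plain lists, so the result can be printed or serialized directly.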

C#

using Google.Cloud.VideoIntelligence.V1;
using System;
using System.Collections.Generic;

public static object AnalyzeLabelsGcs(string uri)
{
    var client = VideoIntelligenceServiceClient.Create();
    var request = new AnnotateVideoRequest()
    {
        InputUri = uri,
        Features = { Feature.LabelDetection }
    };
    var op = client.AnnotateVideo(request).PollUntilCompleted();
    foreach (var result in op.Result.AnnotationResults)
    {
        PrintLabels("Video", result.SegmentLabelAnnotations);
        PrintLabels("Shot", result.ShotLabelAnnotations);
        PrintLabels("Frame", result.FrameLabelAnnotations);
    }
    return 0;
}

static void PrintLabels(string labelName,
    IEnumerable<LabelAnnotation> labelAnnotations)
{
    foreach (var annotation in labelAnnotations)
    {
        Console.WriteLine($"{labelName} label: {annotation.Entity.Description}");
        foreach (var entity in annotation.CategoryEntities)
        {
            Console.WriteLine($"{labelName} label category: {entity.Description}");
        }
        foreach (var segment in annotation.Segments)
        {
            Console.Write("Segment location: ");
            Console.Write(segment.Segment.StartTimeOffset);
            Console.Write(":");
            Console.WriteLine(segment.Segment.EndTimeOffset);
            System.Console.WriteLine($"Confidence: {segment.Confidence}");
        }
    }
}

Go

func labelURI(w io.Writer, file string) error {
	ctx := context.Background()
	client, err := video.NewClient(ctx)
	if err != nil {
		return err
	}

	op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
		Features: []videopb.Feature{
			videopb.Feature_LABEL_DETECTION,
		},
		InputUri: file,
	})
	if err != nil {
		return err
	}
	resp, err := op.Wait(ctx)
	if err != nil {
		return err
	}

	printLabels := func(labels []*videopb.LabelAnnotation) {
		for _, label := range labels {
			fmt.Fprintf(w, "\tDescription: %s\n", label.Entity.Description)
			for _, category := range label.CategoryEntities {
				fmt.Fprintf(w, "\t\tCategory: %s\n", category.Description)
			}
			for _, segment := range label.Segments {
				start, _ := ptypes.Duration(segment.Segment.StartTimeOffset)
				end, _ := ptypes.Duration(segment.Segment.EndTimeOffset)
				fmt.Fprintf(w, "\t\tSegment: %s to %s\n", start, end)
			}
		}
	}

	// A single video was processed. Get the first result.
	result := resp.AnnotationResults[0]

	fmt.Fprintln(w, "SegmentLabelAnnotations:")
	printLabels(result.SegmentLabelAnnotations)
	fmt.Fprintln(w, "ShotLabelAnnotations:")
	printLabels(result.ShotLabelAnnotations)
	fmt.Fprintln(w, "FrameLabelAnnotations:")
	printLabels(result.FrameLabelAnnotations)

	return nil
}

Java

// Instantiate a com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
  // Provide path to file hosted on GCS as "gs://bucket-name/..."
  AnnotateVideoRequest request = AnnotateVideoRequest.newBuilder()
      .setInputUri(gcsUri)
      .addFeatures(Feature.LABEL_DETECTION)
      .build();
  // Create an operation that will contain the response when the operation completes.
  OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> response =
      client.annotateVideoAsync(request);

  System.out.println("Waiting for operation to complete...");
  for (VideoAnnotationResults results : response.get().getAnnotationResultsList()) {
    // process video / segment level label annotations
    System.out.println("Locations: ");
    for (LabelAnnotation labelAnnotation : results.getSegmentLabelAnnotationsList()) {
      System.out
          .println("Video label: " + labelAnnotation.getEntity().getDescription());
      // categories
      for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
        System.out.println("Video label category: " + categoryEntity.getDescription());
      }
      // segments
      for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
        double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
            + segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
        double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
            + segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
        System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
        System.out.println("Confidence: " + segment.getConfidence());
      }
    }

    // process shot label annotations
    for (LabelAnnotation labelAnnotation : results.getShotLabelAnnotationsList()) {
      System.out
          .println("Shot label: " + labelAnnotation.getEntity().getDescription());
      // categories
      for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
        System.out.println("Shot label category: " + categoryEntity.getDescription());
      }
      // segments
      for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
        double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
            + segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
        double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
            + segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
        System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
        System.out.println("Confidence: " + segment.getConfidence());
      }
    }

    // process frame label annotations
    for (LabelAnnotation labelAnnotation : results.getFrameLabelAnnotationsList()) {
      System.out
          .println("Frame label: " + labelAnnotation.getEntity().getDescription());
      // categories
      for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
        System.out.println("Frame label category: " + categoryEntity.getDescription());
      }
      // segments
      for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
        double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
            + segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
        double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
            + segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
        System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
        System.out.println("Confidence: " + segment.getConfidence());
      }
    }
  }
}

Node.js

// Imports the Google Cloud Video Intelligence library
const video = require('@google-cloud/video-intelligence').v1;

// Creates a client
const client = new video.VideoIntelligenceServiceClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';

const request = {
  inputUri: gcsUri,
  features: ['LABEL_DETECTION'],
};

// Detects labels in a video
client
  .annotateVideo(request)
  .then(results => {
    const operation = results[0];
    console.log('Waiting for operation to complete...');
    return operation.promise();
  })
  .then(results => {
    // Gets annotations for video
    const annotations = results[0].annotationResults[0];

    const labels = annotations.segmentLabelAnnotations;
    labels.forEach(label => {
      console.log(`Label ${label.entity.description} occurs at:`);
      label.segments.forEach(segment => {
        const time = segment.segment;
        // Zero-valued fields may be omitted from the response, so default them to 0.
        if (time.startTimeOffset.seconds === undefined) {
          time.startTimeOffset.seconds = 0;
        }
        if (time.startTimeOffset.nanos === undefined) {
          time.startTimeOffset.nanos = 0;
        }
        if (time.endTimeOffset.seconds === undefined) {
          time.endTimeOffset.seconds = 0;
        }
        if (time.endTimeOffset.nanos === undefined) {
          time.endTimeOffset.nanos = 0;
        }
        // Pad milliseconds to three digits so that 5 ms prints as ".005", not ".5".
        console.log(
          `\tStart: ${time.startTimeOffset.seconds}.` +
            `${String(Math.floor(time.startTimeOffset.nanos / 1e6)).padStart(3, '0')}s`
        );
        console.log(
          `\tEnd: ${time.endTimeOffset.seconds}.` +
            `${String(Math.floor(time.endTimeOffset.nanos / 1e6)).padStart(3, '0')}s`
        );
        console.log(`\tConfidence: ${segment.confidence}`);
      });
    });
  })
  .catch(err => {
    console.error('ERROR:', err);
  });

Python

from google.cloud import videointelligence


def analyze_labels(path):
    """Detects labels given a GCS path."""
    video_client = videointelligence.VideoIntelligenceServiceClient()
    features = [videointelligence.enums.Feature.LABEL_DETECTION]

    mode = videointelligence.enums.LabelDetectionMode.SHOT_AND_FRAME_MODE
    config = videointelligence.types.LabelDetectionConfig(
        label_detection_mode=mode)
    context = videointelligence.types.VideoContext(
        label_detection_config=config)

    operation = video_client.annotate_video(
        path, features=features, video_context=context)
    print('\nProcessing video for label annotations:')

    result = operation.result(timeout=90)
    print('\nFinished processing.')

    # Process video/segment level label annotations
    segment_labels = result.annotation_results[0].segment_label_annotations
    for segment_label in segment_labels:
        print('Video label description: {}'.format(
            segment_label.entity.description))
        for category_entity in segment_label.category_entities:
            print('\tLabel category description: {}'.format(
                category_entity.description))

        for i, segment in enumerate(segment_label.segments):
            start_time = (segment.segment.start_time_offset.seconds +
                          segment.segment.start_time_offset.nanos / 1e9)
            end_time = (segment.segment.end_time_offset.seconds +
                        segment.segment.end_time_offset.nanos / 1e9)
            positions = '{}s to {}s'.format(start_time, end_time)
            confidence = segment.confidence
            print('\tSegment {}: {}'.format(i, positions))
            print('\tConfidence: {}'.format(confidence))
        print('\n')

    # Process shot level label annotations
    shot_labels = result.annotation_results[0].shot_label_annotations
    for shot_label in shot_labels:
        print('Shot label description: {}'.format(
            shot_label.entity.description))
        for category_entity in shot_label.category_entities:
            print('\tLabel category description: {}'.format(
                category_entity.description))

        for i, shot in enumerate(shot_label.segments):
            start_time = (shot.segment.start_time_offset.seconds +
                          shot.segment.start_time_offset.nanos / 1e9)
            end_time = (shot.segment.end_time_offset.seconds +
                        shot.segment.end_time_offset.nanos / 1e9)
            positions = '{}s to {}s'.format(start_time, end_time)
            confidence = shot.confidence
            print('\tSegment {}: {}'.format(i, positions))
            print('\tConfidence: {}'.format(confidence))
        print('\n')

    # Process frame level label annotations
    frame_labels = result.annotation_results[0].frame_label_annotations
    for frame_label in frame_labels:
        print('Frame label description: {}'.format(
            frame_label.entity.description))
        for category_entity in frame_label.category_entities:
            print('\tLabel category description: {}'.format(
                category_entity.description))

        # Each frame_label_annotation has many frames,
        # here we print information only about the first frame.
        frame = frame_label.frames[0]
        time_offset = (frame.time_offset.seconds +
                       frame.time_offset.nanos / 1e9)
        print('\tFirst frame time offset: {}s'.format(time_offset))
        print('\tFirst frame confidence: {}'.format(frame.confidence))
        print('\n')
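
The segment and shot loops above repeat the same seconds-plus-nanos arithmetic. As a minimal sketch, that conversion could be factored into a small helper. The names `offset_to_seconds` and `describe_segments` are hypothetical and not part of the client library, and the `SimpleNamespace` objects only stand in for the annotation objects the API returns, so the example runs without credentials.

```python
from types import SimpleNamespace


def offset_to_seconds(offset):
    """Convert a Duration-like object (seconds + nanos) to float seconds."""
    return offset.seconds + offset.nanos / 1e9


def describe_segments(label):
    """Return one formatted line per segment of a label annotation."""
    lines = []
    for i, seg in enumerate(label.segments):
        start = offset_to_seconds(seg.segment.start_time_offset)
        end = offset_to_seconds(seg.segment.end_time_offset)
        lines.append(
            'Segment {}: {:.3f}s to {:.3f}s (confidence {:.2f})'.format(
                i, start, end, seg.confidence))
    return lines


# Stand-in for one entry of segment_label_annotations (illustrative values only).
fake_label = SimpleNamespace(segments=[SimpleNamespace(
    segment=SimpleNamespace(
        start_time_offset=SimpleNamespace(seconds=1, nanos=500000000),
        end_time_offset=SimpleNamespace(seconds=14, nanos=833664000)),
    confidence=0.98)])

for line in describe_segments(fake_label):
    print(line)
```

The same helper would apply to the shot-level annotations, since they expose the same `segments` structure. Frame-level annotations carry a single `time_offset` per frame, so only `offset_to_seconds` would be reused there.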

PHP

use Google\Cloud\VideoIntelligence\V1\VideoIntelligenceServiceClient;
use Google\Cloud\VideoIntelligence\V1\Feature;

/**
 * Finds labels in the video.
 *
 * @param string $path File path to a video file to analyze.
 */
function analyze_labels_file($path)
{
    # Instantiate a client.
    $video = new VideoIntelligenceServiceClient();

    # Read the local video file
    $inputContent = file_get_contents($path);

    # Execute a request.
    $operation = $video->annotateVideo([
        'inputContent' => $inputContent,
        'features' => [Feature::LABEL_DETECTION]
    ]);

    # Wait for the request to complete.
    $operation->pollUntilComplete();

    # Print the results.
    if ($operation->operationSucceeded()) {
        $results = $operation->getResult()->getAnnotationResults()[0];

        # Process video/segment level label annotations
        foreach ($results->getSegmentLabelAnnotations() as $label) {
            printf('Video label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
            foreach ($label->getCategoryEntities() as $categoryEntity) {
                printf('  Category: %s' . PHP_EOL, $categoryEntity->getDescription());
            }
            foreach ($label->getSegments() as $segment) {
                $startTimeOffset = $segment->getSegment()->getStartTimeOffset();
                $startSeconds = $startTimeOffset->getSeconds();
                $startNanoseconds = floatval($startTimeOffset->getNanos())/1000000000.00;
                $startTime = $startSeconds + $startNanoseconds;
                $endTimeOffset = $segment->getSegment()->getEndTimeOffset();
                $endSeconds = $endTimeOffset->getSeconds();
                $endNanoseconds = floatval($endTimeOffset->getNanos())/1000000000.00;
                $endTime = $endSeconds + $endNanoseconds;
                printf('  Segment: %ss to %ss' . PHP_EOL, $startTime, $endTime);
                printf('  Confidence: %f' . PHP_EOL, $segment->getConfidence());
            }
        }
        print(PHP_EOL);

        # Process shot level label annotations
        foreach ($results->getShotLabelAnnotations() as $label) {
            printf('Shot label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
            foreach ($label->getCategoryEntities() as $categoryEntity) {
                printf('  Category: %s' . PHP_EOL, $categoryEntity->getDescription());
            }
            foreach ($label->getSegments() as $shot) {
                $startTimeOffset = $shot->getSegment()->getStartTimeOffset();
                $startSeconds = $startTimeOffset->getSeconds();
                $startNanoseconds = floatval($startTimeOffset->getNanos())/1000000000.00;
                $startTime = $startSeconds + $startNanoseconds;
                $endTimeOffset = $shot->getSegment()->getEndTimeOffset();
                $endSeconds = $endTimeOffset->getSeconds();
                $endNanoseconds = floatval($endTimeOffset->getNanos())/1000000000.00;
                $endTime = $endSeconds + $endNanoseconds;
                printf('  Shot: %ss to %ss' . PHP_EOL, $startTime, $endTime);
                printf('  Confidence: %f' . PHP_EOL, $shot->getConfidence());
            }
        }
        print(PHP_EOL);
    } else {
        print_r($operation->getError());
    }
}

Ruby

# path = "Path to a video file on Google Cloud Storage: gs://bucket/video.mp4"

require "google/cloud/video_intelligence"

video = Google::Cloud::VideoIntelligence.new

# Register a callback during the method call
operation = video.annotate_video input_uri: path, features: [:LABEL_DETECTION] do |operation|
  raise operation.results.message if operation.error?
  puts "Finished Processing."

  labels = operation.results.annotation_results.first.segment_label_annotations

  labels.each do |label|
    puts "Label description: #{label.entity.description}"

    label.category_entities.each do |category_entity|
      puts "Label category description: #{category_entity.description}"
    end

    label.segments.each do |segment|
      start_time = ( segment.segment.start_time_offset.seconds +
                     segment.segment.start_time_offset.nanos / 1e9 )
      end_time =   ( segment.segment.end_time_offset.seconds +
                     segment.segment.end_time_offset.nanos / 1e9 )

      puts "Segment: #{start_time} to #{end_time}"
      puts "Confidence: #{segment.confidence}"
    end
  end
end

puts "Processing video for label annotations:"
operation.wait_until_done!
