This page describes how to use the Video Intelligence API to label entities appearing in a video.
Video Intelligence can detect and extract information about entities shown in video footage. This label analysis feature identifies objects, locations, activities, animal species, products, and more.
Here is an example of performing video analysis for labels on a local file.
Looking for something more in-depth? Check out our detailed Python tutorial.
REST & CMD LINE
Send the process request
The following shows how to send a POST request to the videos:annotate method. You can configure LabelDetectionMode to return shot-level and/or frame-level annotations; we recommend using SHOT_AND_FRAME_MODE (an example request that sets it explicitly appears after the basic request body below). The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence quickstart.
Before using any of the request data below, make the following replacements:
- base64-encoded-content: your video as base64-encoded data. See the instructions on how to convert your data to base64, or the example command after this list.
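For example, on Linux you can produce the encoded payload with the base64 tool; my-video.mp4 and video-base64.txt below are placeholder names, and -w 0 (GNU base64) disables line wrapping:

base64 -w 0 my-video.mp4 > video-base64.txt

On macOS, the BSD tool takes different flags: base64 -i my-video.mp4 -o video-base64.txt.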
HTTP method and URL:
POST https://videointelligence.googleapis.com/v1/videos:annotate
Request JSON body:
{ "inputContent": "base64-encoded-content", "features": ["LABEL_DETECTION"], }
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://videointelligence.googleapis.com/v1/videos:annotate
PowerShell (Windows)
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/project-number/locations/location-id/operations/operation-id" }
If the request is successful, Video Intelligence returns the name of your operation.
Get the results
To get the results of your request, you must send a GET request to the projects.locations.operations resource. The following shows how to send such a request.
Before using any of the request data below, make the following replacements:
- operation-name: the name of the operation as returned by the Video Intelligence API. The operation name has the format projects/project-number/locations/location-id/operations/operation-id
HTTP method and URL:
GET https://videointelligence.googleapis.com/v1/operation-name
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Execute the following command:
curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
https://videointelligence.googleapis.com/v1/operation-name
PowerShell (Windows)
Execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/operation-name" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response

{
  "name": "projects/project-number/locations/location-id/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "progressPercent": 100,
        "startTime": "2019-03-12T19:36:09.110351Z",
        "updateTime": "2019-03-12T19:36:17.519069Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "segmentLabelAnnotations": [
          {
            "entity": {
              "entityId": "/m/01prls",
              "description": "land vehicle",
              "languageCode": "en-US"
            },
            "categoryEntities": [
              {
                "entityId": "/m/07yv9",
                "description": "vehicle",
                "languageCode": "en-US"
              }
            ],
            "segments": [
              {
                "segment": {
                  "startTimeOffset": "0s",
                  "endTimeOffset": "38.757872s"
                },
                "confidence": 0.6614419
              }
            ]
          },
          {
            "entity": {
              "entityId": "/m/039jbq",
              "description": "urban area",
              "languageCode": "en-US"
            },
            "categoryEntities": [
              {
                "entityId": "/m/01n32",
                "description": "city",
                "languageCode": "en-US"
              }
            ],
            "segments": [
              {
                "segment": {
                  "startTimeOffset": "0s",
                  "endTimeOffset": "38.757872s"
                },
                "confidence": 0.92337775
              }
            ]
          },
          ...
C#
public static object AnalyzeLabels(string path)
{
var client = VideoIntelligenceServiceClient.Create();
var request = new AnnotateVideoRequest()
{
InputContent = Google.Protobuf.ByteString.CopyFrom(File.ReadAllBytes(path)),
Features = { Feature.LabelDetection }
};
var op = client.AnnotateVideo(request).PollUntilCompleted();
foreach (var result in op.Result.AnnotationResults)
{
PrintLabels("Video", result.SegmentLabelAnnotations);
PrintLabels("Shot", result.ShotLabelAnnotations);
PrintLabels("Frame", result.FrameLabelAnnotations);
}
return 0;
}
static void PrintLabels(string labelName,
IEnumerable<LabelAnnotation> labelAnnotations)
{
foreach (var annotation in labelAnnotations)
{
Console.WriteLine($"{labelName} label: {annotation.Entity.Description}");
foreach (var entity in annotation.CategoryEntities)
{
Console.WriteLine($"{labelName} label category: {entity.Description}");
}
foreach (var segment in annotation.Segments)
{
Console.Write("Segment location: ");
Console.Write(segment.Segment.StartTimeOffset);
Console.Write(":");
Console.WriteLine(segment.Segment.EndTimeOffset);
System.Console.WriteLine($"Confidence: {segment.Confidence}");
}
}
}
Go
func label(w io.Writer, file string) error {
ctx := context.Background()
client, err := video.NewClient(ctx)
if err != nil {
return err
}
fileBytes, err := ioutil.ReadFile(file)
if err != nil {
return err
}
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
Features: []videopb.Feature{
videopb.Feature_LABEL_DETECTION,
},
InputContent: fileBytes,
})
if err != nil {
return err
}
resp, err := op.Wait(ctx)
if err != nil {
return err
}
printLabels := func(labels []*videopb.LabelAnnotation) {
for _, label := range labels {
fmt.Fprintf(w, "\tDescription: %s\n", label.Entity.Description)
for _, category := range label.CategoryEntities {
fmt.Fprintf(w, "\t\tCategory: %s\n", category.Description)
}
for _, segment := range label.Segments {
start, _ := ptypes.Duration(segment.Segment.StartTimeOffset)
end, _ := ptypes.Duration(segment.Segment.EndTimeOffset)
fmt.Fprintf(w, "\t\tSegment: %s to %s\n", start, end)
}
}
}
// A single video was processed. Get the first result.
result := resp.AnnotationResults[0]
fmt.Fprintln(w, "SegmentLabelAnnotations:")
printLabels(result.SegmentLabelAnnotations)
fmt.Fprintln(w, "ShotLabelAnnotations:")
printLabels(result.ShotLabelAnnotations)
fmt.Fprintln(w, "FrameLabelAnnotations:")
printLabels(result.FrameLabelAnnotations)
return nil
}
Java
// Instantiate a com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Read file and encode into Base64
Path path = Paths.get(filePath);
byte[] data = Files.readAllBytes(path);
AnnotateVideoRequest request = AnnotateVideoRequest.newBuilder()
.setInputContent(ByteString.copyFrom(data))
.addFeatures(Feature.LABEL_DETECTION)
.build();
// Create an operation that will contain the response when the operation completes.
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> response =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
for (VideoAnnotationResults results : response.get().getAnnotationResultsList()) {
// process video / segment level label annotations
System.out.println("Locations: ");
for (LabelAnnotation labelAnnotation : results.getSegmentLabelAnnotationsList()) {
System.out
.println("Video label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Video label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.2f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process shot label annotations
for (LabelAnnotation labelAnnotation : results.getShotLabelAnnotationsList()) {
System.out
.println("Shot label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Shot label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.2f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process frame label annotations
for (LabelAnnotation labelAnnotation : results.getFrameLabelAnnotationsList()) {
System.out
.println("Frame label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Frame label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.2f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
}
}
Node.js
// Imports the Google Cloud Video Intelligence library + Node's fs library
const video = require('@google-cloud/video-intelligence').v1;
const fs = require('fs');
const util = require('util');
// Creates a client
const client = new video.VideoIntelligenceServiceClient();
/**
* TODO(developer): Uncomment the following line before running the sample.
*/
// const path = 'Local file to analyze, e.g. ./my-file.mp4';
// Reads a local video file and converts it to base64
const readFile = util.promisify(fs.readFile);
const file = await readFile(path);
const inputContent = file.toString('base64');
// Constructs request
const request = {
inputContent: inputContent,
features: ['LABEL_DETECTION'],
};
// Detects labels in a video
const [operation] = await client.annotateVideo(request);
console.log('Waiting for operation to complete...');
const [operationResult] = await operation.promise();
// Gets annotations for video
const annotations = operationResult.annotationResults[0];
const labels = annotations.segmentLabelAnnotations;
labels.forEach(label => {
console.log(`Label ${label.entity.description} occurs at:`);
label.segments.forEach(segment => {
const time = segment.segment;
if (time.startTimeOffset.seconds === undefined) {
time.startTimeOffset.seconds = 0;
}
if (time.startTimeOffset.nanos === undefined) {
time.startTimeOffset.nanos = 0;
}
if (time.endTimeOffset.seconds === undefined) {
time.endTimeOffset.seconds = 0;
}
if (time.endTimeOffset.nanos === undefined) {
time.endTimeOffset.nanos = 0;
}
console.log(
`\tStart: ${time.startTimeOffset.seconds}` +
`.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(
`\tEnd: ${time.endTimeOffset.seconds}.` +
`${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(`\tConfidence: ${segment.confidence}`);
});
});
Python
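The sample below assumes the client library is already installed; it is typically installed with pip:

pip install google-cloud-videointelligence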
For more information on installing and using the Cloud Video Intelligence API Client Library for Python, refer to Cloud Video Intelligence API Client Libraries.

import io

from google.cloud import videointelligence

# Detect labels given a local file path.
video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.enums.Feature.LABEL_DETECTION]
with io.open(path, 'rb') as movie:
input_content = movie.read()
operation = video_client.annotate_video(
features=features, input_content=input_content)
print('\nProcessing video for label annotations:')
result = operation.result(timeout=90)
print('\nFinished processing.')
# Process video/segment level label annotations
segment_labels = result.annotation_results[0].segment_label_annotations
for i, segment_label in enumerate(segment_labels):
print('Video label description: {}'.format(
segment_label.entity.description))
for category_entity in segment_label.category_entities:
print('\tLabel category description: {}'.format(
category_entity.description))
for i, segment in enumerate(segment_label.segments):
start_time = (segment.segment.start_time_offset.seconds +
segment.segment.start_time_offset.nanos / 1e9)
end_time = (segment.segment.end_time_offset.seconds +
segment.segment.end_time_offset.nanos / 1e9)
positions = '{}s to {}s'.format(start_time, end_time)
confidence = segment.confidence
print('\tSegment {}: {}'.format(i, positions))
print('\tConfidence: {}'.format(confidence))
print('\n')
# Process shot level label annotations
shot_labels = result.annotation_results[0].shot_label_annotations
for i, shot_label in enumerate(shot_labels):
print('Shot label description: {}'.format(
shot_label.entity.description))
for category_entity in shot_label.category_entities:
print('\tLabel category description: {}'.format(
category_entity.description))
for i, shot in enumerate(shot_label.segments):
start_time = (shot.segment.start_time_offset.seconds +
shot.segment.start_time_offset.nanos / 1e9)
end_time = (shot.segment.end_time_offset.seconds +
shot.segment.end_time_offset.nanos / 1e9)
positions = '{}s to {}s'.format(start_time, end_time)
confidence = shot.confidence
print('\tSegment {}: {}'.format(i, positions))
print('\tConfidence: {}'.format(confidence))
print('\n')
# Process frame level label annotations
frame_labels = result.annotation_results[0].frame_label_annotations
for i, frame_label in enumerate(frame_labels):
print('Frame label description: {}'.format(
frame_label.entity.description))
for category_entity in frame_label.category_entities:
print('\tLabel category description: {}'.format(
category_entity.description))
# Each frame_label_annotation has many frames,
# here we print information only about the first frame.
frame = frame_label.frames[0]
time_offset = frame.time_offset.seconds + frame.time_offset.nanos / 1e9
print('\tFirst frame time offset: {}s'.format(time_offset))
print('\tFirst frame confidence: {}'.format(frame.confidence))
print('\n')
PHP
use Google\Cloud\VideoIntelligence\V1\VideoIntelligenceServiceClient;
use Google\Cloud\VideoIntelligence\V1\Feature;
/** Uncomment and populate these variables in your code */
// $path = 'File path to a video file to analyze';
// $options = [];
# Instantiate a client.
$video = new VideoIntelligenceServiceClient();
# Read the local video file.
$inputContent = file_get_contents($path);
# Execute a request.
$operation = $video->annotateVideo([
    'inputContent' => $inputContent,
    'features' => [Feature::LABEL_DETECTION]
]);
# Wait for the request to complete.
$operation->pollUntilComplete($options);
# Print the results.
if ($operation->operationSucceeded()) {
$results = $operation->getResult()->getAnnotationResults()[0];
# Process video/segment level label annotations
foreach ($results->getSegmentLabelAnnotations() as $label) {
printf('Video label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
foreach ($label->getCategoryEntities() as $categoryEntity) {
printf(' Category: %s' . PHP_EOL, $categoryEntity->getDescription());
}
foreach ($label->getSegments() as $segment) {
$start = $segment->getSegment()->getStartTimeOffset();
$end = $segment->getSegment()->getEndTimeOffset();
printf(' Segment: %ss to %ss' . PHP_EOL,
$start->getSeconds() + $start->getNanos()/1000000000.0,
$end->getSeconds() + $end->getNanos()/1000000000.0);
printf(' Confidence: %f' . PHP_EOL, $segment->getConfidence());
}
}
print(PHP_EOL);
# Process shot level label annotations
foreach ($results->getShotLabelAnnotations() as $label) {
printf('Shot label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
foreach ($label->getCategoryEntities() as $categoryEntity) {
printf(' Category: %s' . PHP_EOL, $categoryEntity->getDescription());
}
foreach ($label->getSegments() as $shot) {
$start = $shot->getSegment()->getStartTimeOffset();
$end = $shot->getSegment()->getEndTimeOffset();
printf(' Shot: %ss to %ss' . PHP_EOL,
$start->getSeconds() + $start->getNanos()/1000000000.0,
$end->getSeconds() + $end->getNanos()/1000000000.0);
printf(' Confidence: %f' . PHP_EOL, $shot->getConfidence());
}
}
print(PHP_EOL);
} else {
print_r($operation->getError());
}
Ruby
# path = "Path to a local video file: path/to/file.mp4"
require "google/cloud/video_intelligence"
video = Google::Cloud::VideoIntelligence.new
video_contents = File.binread path
# Register a callback during the method call
operation = video.annotate_video input_content: video_contents, features: [:LABEL_DETECTION] do |operation|
raise operation.error.message if operation.error?
puts "Finished Processing."
labels = operation.results.annotation_results.first.segment_label_annotations
labels.each do |label|
puts "Label description: #{label.entity.description}"
label.category_entities.each do |category_entity|
puts "Label category description: #{category_entity.description}"
end
label.segments.each do |segment|
start_time = (segment.segment.start_time_offset.seconds +
segment.segment.start_time_offset.nanos / 1e9)
end_time = (segment.segment.end_time_offset.seconds +
segment.segment.end_time_offset.nanos / 1e9)
puts "Segment: #{start_time} to #{end_time}"
puts "Confidence: #{segment.confidence}"
end
end
end
puts "Processing video for label annotations:"
operation.wait_until_done!
Annotating a file on Cloud Storage
Here is an example of performing video analysis for labels on a file located in Cloud Storage.
REST & CMD LINE
Send the process request
The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence quickstart.
Before using any of the request data below, make the following replacements:
- input-uri: a Cloud Storage bucket that contains the file you want to annotate, including the file name. Must start with gs://. If you need to upload the file first, see the example after this list.
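If the video isn't in Cloud Storage yet, you can copy it there with the Cloud SDK's gsutil tool; the bucket and object names below are placeholders:

gsutil cp my-video.mp4 gs://my-bucket/my-video.mp4

The input-uri for the request would then be gs://my-bucket/my-video.mp4.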
HTTP method and URL:
POST https://videointelligence.googleapis.com/v1/videos:annotate
Request JSON body:
{ "inputUri": "input-uri", "features": ["LABEL_DETECTION"], }
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://videointelligence.googleapis.com/v1/videos:annotate
PowerShell (Windows)
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/project-number/locations/location-id/operations/operation-id" }
If the request is successful, Video Intelligence returns the name of your operation.
Get the results
To get the results of your request, you must send a GET request to the projects.locations.operations resource. The following shows how to send such a request.
Before using any of the request data below, make the following replacements:
- operation-name: the name of the operation as returned by the Video Intelligence API. The operation name has the format projects/project-number/locations/location-id/operations/operation-id
HTTP method and URL:
GET https://videointelligence.googleapis.com/v1/operation-name
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Execute the following command:
curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
https://videointelligence.googleapis.com/v1/operation-name
PowerShell (Windows)
Execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/operation-name" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response

{
  "name": "projects/project-number/locations/location-id/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "input-uri",
        "progressPercent": 100,
        "startTime": "2019-11-13T19:25:56.206335Z",
        "updateTime": "2019-11-13T19:26:16.215615Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputUri": "input-uri",
        "segmentLabelAnnotations": [
          {
            "entity": {
              "entityId": "/m/01vkl",
              "description": "circle",
              "languageCode": "en-US"
            },
            "categoryEntities": [
              {
                "entityId": "/m/016nqd",
                "description": "shape",
                "languageCode": "en-US"
              }
            ],
            "segments": [
              {
                "segment": {
                  "startTimeOffset": "0s",
                  "endTimeOffset": "16.416666s"
                },
                "confidence": 0.36535457
              }
            ]
          },
          ...
        ]
      }
    ]
  }
}
C#
public static object AnalyzeLabelsGcs(string uri)
{
var client = VideoIntelligenceServiceClient.Create();
var request = new AnnotateVideoRequest()
{
InputUri = uri,
Features = { Feature.LabelDetection }
};
var op = client.AnnotateVideo(request).PollUntilCompleted();
foreach (var result in op.Result.AnnotationResults)
{
PrintLabels("Video", result.SegmentLabelAnnotations);
PrintLabels("Shot", result.ShotLabelAnnotations);
PrintLabels("Frame", result.FrameLabelAnnotations);
}
return 0;
}
static void PrintLabels(string labelName,
IEnumerable<LabelAnnotation> labelAnnotations)
{
foreach (var annotation in labelAnnotations)
{
Console.WriteLine($"{labelName} label: {annotation.Entity.Description}");
foreach (var entity in annotation.CategoryEntities)
{
Console.WriteLine($"{labelName} label category: {entity.Description}");
}
foreach (var segment in annotation.Segments)
{
Console.Write("Segment location: ");
Console.Write(segment.Segment.StartTimeOffset);
Console.Write(":");
Console.WriteLine(segment.Segment.EndTimeOffset);
System.Console.WriteLine($"Confidence: {segment.Confidence}");
}
}
}
Go
func labelURI(w io.Writer, file string) error {
ctx := context.Background()
client, err := video.NewClient(ctx)
if err != nil {
return err
}
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
Features: []videopb.Feature{
videopb.Feature_LABEL_DETECTION,
},
InputUri: file,
})
if err != nil {
return err
}
resp, err := op.Wait(ctx)
if err != nil {
return err
}
printLabels := func(labels []*videopb.LabelAnnotation) {
for _, label := range labels {
fmt.Fprintf(w, "\tDescription: %s\n", label.Entity.Description)
for _, category := range label.CategoryEntities {
fmt.Fprintf(w, "\t\tCategory: %s\n", category.Description)
}
for _, segment := range label.Segments {
start, _ := ptypes.Duration(segment.Segment.StartTimeOffset)
end, _ := ptypes.Duration(segment.Segment.EndTimeOffset)
fmt.Fprintf(w, "\t\tSegment: %s to %s\n", start, end)
}
}
}
// A single video was processed. Get the first result.
result := resp.AnnotationResults[0]
fmt.Fprintln(w, "SegmentLabelAnnotations:")
printLabels(result.SegmentLabelAnnotations)
fmt.Fprintln(w, "ShotLabelAnnotations:")
printLabels(result.ShotLabelAnnotations)
fmt.Fprintln(w, "FrameLabelAnnotations:")
printLabels(result.FrameLabelAnnotations)
return nil
}
Java
// Instantiate a com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Provide path to file hosted on GCS as "gs://bucket-name/..."
AnnotateVideoRequest request = AnnotateVideoRequest.newBuilder()
.setInputUri(gcsUri)
.addFeatures(Feature.LABEL_DETECTION)
.build();
// Create an operation that will contain the response when the operation completes.
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> response =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
for (VideoAnnotationResults results : response.get().getAnnotationResultsList()) {
// process video / segment level label annotations
System.out.println("Locations: ");
for (LabelAnnotation labelAnnotation : results.getSegmentLabelAnnotationsList()) {
System.out
.println("Video label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Video label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process shot label annotations
for (LabelAnnotation labelAnnotation : results.getShotLabelAnnotationsList()) {
System.out
.println("Shot label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Shot label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process frame label annotations
for (LabelAnnotation labelAnnotation : results.getFrameLabelAnnotationsList()) {
System.out
.println("Frame label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Frame label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime = segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime = segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
}
}
Node.js
// Imports the Google Cloud Video Intelligence library
const video = require('@google-cloud/video-intelligence').v1;
// Creates a client
const client = new video.VideoIntelligenceServiceClient();
/**
* TODO(developer): Uncomment the following line before running the sample.
*/
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';
const request = {
inputUri: gcsUri,
features: ['LABEL_DETECTION'],
};
// Detects labels in a video
const [operation] = await client.annotateVideo(request);
console.log('Waiting for operation to complete...');
const [operationResult] = await operation.promise();
// Gets annotations for video
const annotations = operationResult.annotationResults[0];
const labels = annotations.segmentLabelAnnotations;
labels.forEach(label => {
console.log(`Label ${label.entity.description} occurs at:`);
label.segments.forEach(segment => {
const time = segment.segment;
if (time.startTimeOffset.seconds === undefined) {
time.startTimeOffset.seconds = 0;
}
if (time.startTimeOffset.nanos === undefined) {
time.startTimeOffset.nanos = 0;
}
if (time.endTimeOffset.seconds === undefined) {
time.endTimeOffset.seconds = 0;
}
if (time.endTimeOffset.nanos === undefined) {
time.endTimeOffset.nanos = 0;
}
console.log(
`\tStart: ${time.startTimeOffset.seconds}` +
`.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(
`\tEnd: ${time.endTimeOffset.seconds}.` +
`${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(`\tConfidence: ${segment.confidence}`);
});
});
Python
""" Detects labels given a GCS path. """
video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.enums.Feature.LABEL_DETECTION]
mode = videointelligence.enums.LabelDetectionMode.SHOT_AND_FRAME_MODE
config = videointelligence.types.LabelDetectionConfig(
label_detection_mode=mode)
context = videointelligence.types.VideoContext(
label_detection_config=config)
operation = video_client.annotate_video(
path, features=features, video_context=context)
print('\nProcessing video for label annotations:')
result = operation.result(timeout=180)
print('\nFinished processing.')
# Process video/segment level label annotations
segment_labels = result.annotation_results[0].segment_label_annotations
for i, segment_label in enumerate(segment_labels):
print('Video label description: {}'.format(
segment_label.entity.description))
for category_entity in segment_label.category_entities:
print('\tLabel category description: {}'.format(
category_entity.description))
for i, segment in enumerate(segment_label.segments):
start_time = (segment.segment.start_time_offset.seconds +
segment.segment.start_time_offset.nanos / 1e9)
end_time = (segment.segment.end_time_offset.seconds +
segment.segment.end_time_offset.nanos / 1e9)
positions = '{}s to {}s'.format(start_time, end_time)
confidence = segment.confidence
print('\tSegment {}: {}'.format(i, positions))
print('\tConfidence: {}'.format(confidence))
print('\n')
# Process shot level label annotations
shot_labels = result.annotation_results[0].shot_label_annotations
for i, shot_label in enumerate(shot_labels):
print('Shot label description: {}'.format(
shot_label.entity.description))
for category_entity in shot_label.category_entities:
print('\tLabel category description: {}'.format(
category_entity.description))
for i, shot in enumerate(shot_label.segments):
start_time = (shot.segment.start_time_offset.seconds +
shot.segment.start_time_offset.nanos / 1e9)
end_time = (shot.segment.end_time_offset.seconds +
shot.segment.end_time_offset.nanos / 1e9)
positions = '{}s to {}s'.format(start_time, end_time)
confidence = shot.confidence
print('\tSegment {}: {}'.format(i, positions))
print('\tConfidence: {}'.format(confidence))
print('\n')
# Process frame level label annotations
frame_labels = result.annotation_results[0].frame_label_annotations
for i, frame_label in enumerate(frame_labels):
print('Frame label description: {}'.format(
frame_label.entity.description))
for category_entity in frame_label.category_entities:
print('\tLabel category description: {}'.format(
category_entity.description))
# Each frame_label_annotation has many frames,
# here we print information only about the first frame.
frame = frame_label.frames[0]
time_offset = (frame.time_offset.seconds +
frame.time_offset.nanos / 1e9)
print('\tFirst frame time offset: {}s'.format(time_offset))
print('\tFirst frame confidence: {}'.format(frame.confidence))
print('\n')
PHP
use Google\Cloud\VideoIntelligence\V1\VideoIntelligenceServiceClient;
use Google\Cloud\VideoIntelligence\V1\Feature;
/** Uncomment and populate these variables in your code */
// $uri = 'The cloud storage object to analyze (gs://your-bucket-name/your-object-name)';
// $options = [];
# Instantiate a client.
$video = new VideoIntelligenceServiceClient();
# Execute a request.
$operation = $video->annotateVideo([
    'inputUri' => $uri,
    'features' => [Feature::LABEL_DETECTION]
]);
# Wait for the request to complete.
$operation->pollUntilComplete($options);
# Print the results.
if ($operation->operationSucceeded()) {
$results = $operation->getResult()->getAnnotationResults()[0];
# Process video/segment level label annotations
foreach ($results->getSegmentLabelAnnotations() as $label) {
printf('Video label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
foreach ($label->getCategoryEntities() as $categoryEntity) {
printf(' Category: %s' . PHP_EOL, $categoryEntity->getDescription());
}
foreach ($label->getSegments() as $segment) {
$start = $segment->getSegment()->getStartTimeOffset();
$end = $segment->getSegment()->getEndTimeOffset();
printf(' Segment: %ss to %ss' . PHP_EOL,
$start->getSeconds() + $start->getNanos()/1000000000.0,
$end->getSeconds() + $end->getNanos()/1000000000.0);
printf(' Confidence: %f' . PHP_EOL, $segment->getConfidence());
}
}
print(PHP_EOL);
# Process shot level label annotations
foreach ($results->getShotLabelAnnotations() as $label) {
printf('Shot label description: %s' . PHP_EOL, $label->getEntity()->getDescription());
foreach ($label->getCategoryEntities() as $categoryEntity) {
printf(' Category: %s' . PHP_EOL, $categoryEntity->getDescription());
}
foreach ($label->getSegments() as $shot) {
$start = $shot->getSegment()->getStartTimeOffset();
$end = $shot->getSegment()->getEndTimeOffset();
printf(' Shot: %ss to %ss' . PHP_EOL,
$start->getSeconds() + $start->getNanos()/1000000000.0,
$end->getSeconds() + $end->getNanos()/1000000000.0);
printf(' Confidence: %f' . PHP_EOL, $shot->getConfidence());
}
}
print(PHP_EOL);
} else {
print_r($operation->getError());
}
Ruby
# path = "Path to a video file on Google Cloud Storage: gs://bucket/video.mp4"
require "google/cloud/video_intelligence"
video = Google::Cloud::VideoIntelligence.new
# Register a callback during the method call
operation = video.annotate_video input_uri: path, features: [:LABEL_DETECTION] do |operation|
raise operation.error.message if operation.error?
puts "Finished Processing."
labels = operation.results.annotation_results.first.segment_label_annotations
labels.each do |label|
puts "Label description: #{label.entity.description}"
label.category_entities.each do |category_entity|
puts "Label category description: #{category_entity.description}"
end
label.segments.each do |segment|
start_time = (segment.segment.start_time_offset.seconds +
segment.segment.start_time_offset.nanos / 1e9)
end_time = (segment.segment.end_time_offset.seconds +
segment.segment.end_time_offset.nanos / 1e9)
puts "Segment: #{start_time} to #{end_time}"
puts "Confidence: #{segment.confidence}"
end
end
end
puts "Processing video for label annotations:"
operation.wait_until_done!