The Video Intelligence API can detect, track, and recognize the presence of over 100,000 brands and logos in video content.
This page describes how to recognize a logo in a video using the Video Intelligence API.
Annotate a video in Cloud Storage
The following code sample demonstrates how to detect logos in a video in Cloud Storage.
REST & CMD LINE
Send the process request
To annotate a video file stored in Cloud Storage, specify the file's location in the inputUri field of the request. The URI must start with gs://.
The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence quickstart.
Before using any of the request data below, make the following replacements:
- input-uri: a Cloud Storage bucket that contains the file you want to annotate, including the file name. Must start with gs://. For example:
"inputUri": "gs://cloud-videointelligence-demo/assistant.mp4",
HTTP method and URL:
POST https://videointelligence.googleapis.com/v1/videos:annotate
Request JSON body:
{ "inputUri":"input-uri", "features": ["LOGO_RECOGNITION"] }
To send your request, choose one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://videointelligence.googleapis.com/v1/videos:annotate
PowerShell (Windows)
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/project-number/locations/location-id/operations/operation-id" }
If the response is successful, the Video Intelligence API returns the name for your operation. The above shows an example of such a response, where:
- project-number: the number of your project
- location-id: the Cloud region where annotation should take place. Supported cloud regions are: us-east1, us-west1, europe-west1, asia-east1. If no region is specified, a region is determined based on the location of the video file.
- operation-id: the ID of the long-running operation created for the request and provided in the response when you started the operation, for example 12345...
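Because the operation name always follows the fixed projects/…/locations/…/operations/… pattern, its pieces can be recovered with simple string splitting. The helper below is purely illustrative and not part of the API:

```python
def parse_operation_name(name):
    """Split an operation name of the form
    projects/<project-number>/locations/<location-id>/operations/<operation-id>
    into its three components."""
    parts = name.split("/")
    if len(parts) != 6 or parts[::2] != ["projects", "locations", "operations"]:
        raise ValueError("unexpected operation name: " + name)
    return {
        "project_number": parts[1],
        "location_id": parts[3],
        "operation_id": parts[5],
    }
```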
Get the results
To get the results of your request, send a GET request using the operation name returned from the call to videos:annotate, as shown in the following example.
Before using any of the request data below, make the following replacements:
- operation-name: the name of the operation as
returned by Video Intelligence API. The operation name has the format
projects/project-number/locations/location-id/operations/operation-id
HTTP method and URL:
GET https://videointelligence.googleapis.com/v1/operation-name
To send your request, choose one of these options:
curl (Linux, macOS, or Cloud Shell)
Execute the following command:
curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
https://videointelligence.googleapis.com/v1/operation-name
PowerShell (Windows)
Execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/operation-name" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response
{
  "name": "projects/project-number/locations/location-id/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "/cloud-samples-data/video/googlework_short.mp4",
        "progressPercent": 100,
        "startTime": "2020-02-31T16:27:44.889439Z",
        "updateTime": "2020-02-31T16:27:56.526050Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputUri": "/cloud-samples-data/video/googlework_short.mp4",
        "segment": {
          "startTimeOffset": "0s",
          "endTimeOffset": "34.234200s"
        },
        "logoRecognitionAnnotations": [
          {
            "entity": {
              "entityId": "/m/045c7b",
              "description": "Google",
              "languageCode": "en-US"
            },
            "tracks": [
              {
                "segment": {
                  "startTimeOffset": "10.543866s",
                  "endTimeOffset": "12.345666s"
                },
                "timestampedObjects": [
                  {
                    "normalizedBoundingBox": {
                      "left": 0.3912032,
                      "top": 0.26212785,
                      "right": 0.6469412,
                      "bottom": 0.4434373
                    },
                    "timeOffset": "10.543866s"
                  },
                  ...
                ],
                "confidence": 0.8588119
              },
              {
                "segment": {
                  "startTimeOffset": "15.348666s",
                  "endTimeOffset": "18.752066s"
                },
                "timestampedObjects": [
                  {
                    "normalizedBoundingBox": {
                      "left": 0.69989866,
                      "top": 0.79943377,
                      "right": 0.76465744,
                      "bottom": 0.9271479
                    },
                    "timeOffset": "15.348666s"
                  },
                  {
                    "normalizedBoundingBox": {
                      "left": 0.68997324,
                      "top": 0.78775305,
                      "right": 0.75723547,
                      "bottom": 0.91808647
                    },
                    "timeOffset": "15.448766s"
                  },
                  ...
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
Download annotation results
Copy the annotation results from the source to the destination bucket (see Copy files and objects):
gsutil cp gcs_uri gs://my-bucket
Note: If you provide an output Cloud Storage URI in the request, the annotation results are stored at that URI.
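The note above refers to the request's optional outputUri field. As an illustrative sketch (the bucket and file names here are hypothetical), a request body that asks the service to write the annotation JSON directly to your bucket can be built like this:

```python
def build_annotate_body(input_uri, output_uri=None):
    """Build a videos:annotate request body.

    When output_uri is set, the service stores the annotation results
    (in JSON format) at that Cloud Storage location instead of requiring
    a separate copy step.
    """
    body = {"inputUri": input_uri, "features": ["LOGO_RECOGNITION"]}
    if output_uri is not None:
        body["outputUri"] = output_uri
    return body

# Hypothetical bucket names, for illustration only.
body = build_annotate_body(
    "gs://cloud-videointelligence-demo/assistant.mp4",
    output_uri="gs://my-bucket/annotations.json",
)
```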
C#
public static object DetectLogoGcs(string gcsUri)
{
var client = VideoIntelligenceServiceClient.Create();
var request = new AnnotateVideoRequest()
{
InputUri = gcsUri,
Features = { Feature.LogoRecognition }
};
Console.WriteLine("\nWaiting for operation to complete...");
var op = client.AnnotateVideo(request).PollUntilCompleted();
// The first result is retrieved because a single video was processed.
var annotationResults = op.Result.AnnotationResults[0];
// Annotations for list of logos detected, tracked and recognized in video.
foreach (var logoRecognitionAnnotation in annotationResults.LogoRecognitionAnnotations)
{
var entity = logoRecognitionAnnotation.Entity;
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
Console.WriteLine($"Entity ID :{entity.EntityId}");
Console.WriteLine($"Description :{entity.Description}");
// All logo tracks where the recognized logo appears. Each track corresponds to one logo
// instance appearing in consecutive frames.
foreach (var track in logoRecognitionAnnotation.Tracks)
{
// Video segment of a track.
var startTimeOffset = track.Segment.StartTimeOffset;
Console.WriteLine(
$"Start Time Offset: {startTimeOffset.Seconds}.{startTimeOffset.Nanos}");
var endTimeOffset = track.Segment.EndTimeOffset;
Console.WriteLine(
$"End Time Offset: {endTimeOffset.Seconds}.{endTimeOffset.Nanos}");
Console.WriteLine($"\tConfidence: {track.Confidence}");
// The object with timestamp and attributes per frame in the track.
foreach (var timestampedObject in track.TimestampedObjects)
{
// Normalized Bounding box in a frame, where the object is located.
var normalizedBoundingBox = timestampedObject.NormalizedBoundingBox;
Console.WriteLine($"Left: {normalizedBoundingBox.Left}");
Console.WriteLine($"Top: {normalizedBoundingBox.Top}");
Console.WriteLine($"Right: {normalizedBoundingBox.Right}");
Console.WriteLine($"Bottom: {normalizedBoundingBox.Bottom}");
// Optional. The attributes of the object in the bounding box.
foreach (var attribute in timestampedObject.Attributes)
{
Console.WriteLine($"Name: {attribute.Name}");
Console.WriteLine($"Confidence: {attribute.Confidence}");
Console.WriteLine($"Value: {attribute.Value}");
}
// Optional. Attributes in the track level.
foreach (var trackAttribute in track.Attributes)
{
Console.WriteLine($"Name : {trackAttribute.Name}");
Console.WriteLine($"Confidence : {trackAttribute.Confidence}");
Console.WriteLine($"Value : {trackAttribute.Value}");
}
}
// All video segments where the recognized logo appears. There might be multiple instances
// of the same logo class appearing in one VideoSegment.
foreach (var segment in logoRecognitionAnnotation.Segments)
{
Console.WriteLine(
$"Start Time Offset : {segment.StartTimeOffset.Seconds}.{segment.StartTimeOffset.Nanos}");
Console.WriteLine(
$"End Time Offset : {segment.EndTimeOffset.Seconds}.{segment.EndTimeOffset.Nanos}");
}
}
}
return 0;
}
Go
import (
"context"
"fmt"
"io"
"time"
video "cloud.google.com/go/videointelligence/apiv1"
"github.com/golang/protobuf/ptypes"
videopb "google.golang.org/genproto/googleapis/cloud/videointelligence/v1"
)
// logoDetectionGCS analyzes a video and extracts logos with their bounding boxes.
func logoDetectionGCS(w io.Writer, gcsURI string) error {
// gcsURI := "gs://cloud-samples-data/video/googlework_tiny.mp4"
ctx := context.Background()
// Creates a client.
client, err := video.NewClient(ctx)
if err != nil {
return fmt.Errorf("video.NewClient: %v", err)
}
defer client.Close()
ctx, cancel := context.WithTimeout(ctx, time.Second*180)
defer cancel()
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
InputUri: gcsURI,
Features: []videopb.Feature{
videopb.Feature_LOGO_RECOGNITION,
},
})
if err != nil {
return fmt.Errorf("AnnotateVideo: %v", err)
}
resp, err := op.Wait(ctx)
if err != nil {
return fmt.Errorf("Wait: %v", err)
}
// Only one video was processed, so get the first result.
result := resp.GetAnnotationResults()[0]
// Annotations for list of logos detected, tracked and recognized in video.
for _, annotation := range result.LogoRecognitionAnnotations {
fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
// Opaque entity ID. Some IDs may be available in Google Knowledge
// Graph Search API (https://developers.google.com/knowledge-graph/).
if len(annotation.Entity.EntityId) > 0 {
fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
}
// All logo tracks where the recognized logo appears. Each track
// corresponds to one logo instance appearing in consecutive frames.
for _, track := range annotation.Tracks {
// Video segment of a track.
segment := track.GetSegment()
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
fmt.Fprintf(w, "\tConfidence: %f\n", track.GetConfidence())
// The object with timestamp and attributes per frame in the track.
for _, timestampedObject := range track.TimestampedObjects {
// Normalized Bounding box in a frame, where the object is
// located.
box := timestampedObject.GetNormalizedBoundingBox()
fmt.Fprintf(w, "\tBounding box position:\n")
fmt.Fprintf(w, "\t\tleft : %f\n", box.GetLeft())
fmt.Fprintf(w, "\t\ttop : %f\n", box.GetTop())
fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
// Optional. The attributes of the object in the bounding box.
for _, attribute := range timestampedObject.Attributes {
fmt.Fprintf(w, "\t\t\tName: %q\n", attribute.GetName())
fmt.Fprintf(w, "\t\t\tConfidence: %f\n", attribute.GetConfidence())
fmt.Fprintf(w, "\t\t\tValue: %q\n", attribute.GetValue())
}
}
// Optional. Attributes in the track level.
for _, trackAttribute := range track.Attributes {
fmt.Fprintf(w, "\t\tName: %q\n", trackAttribute.GetName())
fmt.Fprintf(w, "\t\tConfidence: %f\n", trackAttribute.GetConfidence())
fmt.Fprintf(w, "\t\tValue: %q\n", trackAttribute.GetValue())
}
}
// All video segments where the recognized logo appears. There might be
// multiple instances of the same logo class appearing in one VideoSegment.
for _, segment := range annotation.Segments {
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
}
}
return nil
}
Java
import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.videointelligence.v1.AnnotateVideoProgress;
import com.google.cloud.videointelligence.v1.AnnotateVideoRequest;
import com.google.cloud.videointelligence.v1.AnnotateVideoResponse;
import com.google.cloud.videointelligence.v1.DetectedAttribute;
import com.google.cloud.videointelligence.v1.Entity;
import com.google.cloud.videointelligence.v1.Feature;
import com.google.cloud.videointelligence.v1.LogoRecognitionAnnotation;
import com.google.cloud.videointelligence.v1.NormalizedBoundingBox;
import com.google.cloud.videointelligence.v1.TimestampedObject;
import com.google.cloud.videointelligence.v1.Track;
import com.google.cloud.videointelligence.v1.VideoAnnotationResults;
import com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient;
import com.google.cloud.videointelligence.v1.VideoSegment;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
public class LogoDetectionGcs {
public static void detectLogoGcs() throws Exception {
// TODO(developer): Replace these variables before running the sample.
String gcsUri = "gs://YOUR_BUCKET_ID/path/to/your/video.mp4";
detectLogoGcs(gcsUri);
}
public static void detectLogoGcs(String inputUri)
throws IOException, ExecutionException, InterruptedException, TimeoutException {
// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Create the request
AnnotateVideoRequest request =
AnnotateVideoRequest.newBuilder()
.setInputUri(inputUri)
.addFeatures(Feature.LOGO_RECOGNITION)
.build();
// asynchronously perform object tracking on videos
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
// The first result is retrieved because a single video was processed.
AnnotateVideoResponse response = future.get(300, TimeUnit.SECONDS);
VideoAnnotationResults annotationResult = response.getAnnotationResults(0);
// Annotations for list of logos detected, tracked and recognized in video.
for (LogoRecognitionAnnotation logoRecognitionAnnotation :
annotationResult.getLogoRecognitionAnnotationsList()) {
Entity entity = logoRecognitionAnnotation.getEntity();
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
System.out.printf("Entity Id : %s\n", entity.getEntityId());
System.out.printf("Description : %s\n", entity.getDescription());
// All logo tracks where the recognized logo appears. Each track corresponds to one logo
// instance appearing in consecutive frames.
for (Track track : logoRecognitionAnnotation.getTracksList()) {
// Video segment of a track.
Duration startTimeOffset = track.getSegment().getStartTimeOffset();
System.out.printf(
"\n\tStart Time Offset: %s.%s\n",
startTimeOffset.getSeconds(), startTimeOffset.getNanos());
Duration endTimeOffset = track.getSegment().getEndTimeOffset();
System.out.printf(
"\tEnd Time Offset: %s.%s\n", endTimeOffset.getSeconds(), endTimeOffset.getNanos());
System.out.printf("\tConfidence: %s\n", track.getConfidence());
// The object with timestamp and attributes per frame in the track.
for (TimestampedObject timestampedObject : track.getTimestampedObjectsList()) {
// Normalized Bounding box in a frame, where the object is located.
NormalizedBoundingBox normalizedBoundingBox =
timestampedObject.getNormalizedBoundingBox();
System.out.printf("\n\t\tLeft: %s\n", normalizedBoundingBox.getLeft());
System.out.printf("\t\tTop: %s\n", normalizedBoundingBox.getTop());
System.out.printf("\t\tRight: %s\n", normalizedBoundingBox.getRight());
System.out.printf("\t\tBottom: %s\n", normalizedBoundingBox.getBottom());
// Optional. The attributes of the object in the bounding box.
for (DetectedAttribute attribute : timestampedObject.getAttributesList()) {
System.out.printf("\n\t\t\tName: %s\n", attribute.getName());
System.out.printf("\t\t\tConfidence: %s\n", attribute.getConfidence());
System.out.printf("\t\t\tValue: %s\n", attribute.getValue());
}
}
// Optional. Attributes in the track level.
for (DetectedAttribute trackAttribute : track.getAttributesList()) {
System.out.printf("\n\t\tName : %s\n", trackAttribute.getName());
System.out.printf("\t\tConfidence : %s\n", trackAttribute.getConfidence());
System.out.printf("\t\tValue : %s\n", trackAttribute.getValue());
}
}
// All video segments where the recognized logo appears. There might be multiple instances
// of the same logo class appearing in one VideoSegment.
for (VideoSegment segment : logoRecognitionAnnotation.getSegmentsList()) {
System.out.printf(
"\n\tStart Time Offset : %s.%s\n",
segment.getStartTimeOffset().getSeconds(), segment.getStartTimeOffset().getNanos());
System.out.printf(
"\tEnd Time Offset : %s.%s\n",
segment.getEndTimeOffset().getSeconds(), segment.getEndTimeOffset().getNanos());
}
}
}
}
}
Node.js
/**
* TODO(developer): Uncomment these variables before running the sample.
*/
// const inputUri = 'gs://cloud-samples-data/video/googlework_short.mp4';
// Imports the Google Cloud client libraries
const Video = require('@google-cloud/video-intelligence');
// Instantiates a client
const client = new Video.VideoIntelligenceServiceClient();
// Performs asynchronous video annotation for logo recognition on a file hosted in GCS.
async function detectLogoGcs() {
// Build the request with the input uri and logo recognition feature.
const request = {
inputUri: inputUri,
features: ['LOGO_RECOGNITION'],
};
// Make the asynchronous request
const [operation] = await client.annotateVideo(request);
// Wait for the results
const [response] = await operation.promise();
// Get the first response, since we sent only one video.
const annotationResult = response.annotationResults[0];
for (const logoRecognitionAnnotation of annotationResult.logoRecognitionAnnotations) {
const entity = logoRecognitionAnnotation.entity;
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
console.log(`Entity Id: ${entity.entityId}`);
console.log(`Description: ${entity.description}`);
// All logo tracks where the recognized logo appears.
// Each track corresponds to one logo instance appearing in consecutive frames.
for (const track of logoRecognitionAnnotation.tracks) {
console.log(
`\n\tStart Time Offset: ${track.segment.startTimeOffset.seconds}.${track.segment.startTimeOffset.nanos}`
);
console.log(
`\tEnd Time Offset: ${track.segment.endTimeOffset.seconds}.${track.segment.endTimeOffset.nanos}`
);
console.log(`\tConfidence: ${track.confidence}`);
// The object with timestamp and attributes per frame in the track.
for (const timestampedObject of track.timestampedObjects) {
// Normalized Bounding box in a frame, where the object is located.
const normalizedBoundingBox = timestampedObject.normalizedBoundingBox;
console.log(`\n\t\tLeft: ${normalizedBoundingBox.left}`);
console.log(`\t\tTop: ${normalizedBoundingBox.top}`);
console.log(`\t\tRight: ${normalizedBoundingBox.right}`);
console.log(`\t\tBottom: ${normalizedBoundingBox.bottom}`);
// Optional. The attributes of the object in the bounding box.
for (const attribute of timestampedObject.attributes) {
console.log(`\n\t\t\tName: ${attribute.name}`);
console.log(`\t\t\tConfidence: ${attribute.confidence}`);
console.log(`\t\t\tValue: ${attribute.value}`);
}
}
// Optional. Attributes in the track level.
for (const trackAttribute of track.attributes) {
console.log(`\n\t\tName: ${trackAttribute.name}`);
console.log(`\t\tConfidence: ${trackAttribute.confidence}`);
console.log(`\t\tValue: ${trackAttribute.value}`);
}
}
// All video segments where the recognized logo appears.
// There might be multiple instances of the same logo class appearing in one VideoSegment.
for (const segment of logoRecognitionAnnotation.segments) {
console.log(
`\n\tStart Time Offset: ${segment.startTimeOffset.seconds}.${segment.startTimeOffset.nanos}`
);
console.log(
`\tEnd Time Offset: ${segment.endTimeOffset.seconds}.${segment.endTimeOffset.nanos}`
);
}
}
}
detectLogoGcs();
Python
from google.cloud import videointelligence
def detect_logo_gcs(input_uri="gs://YOUR_BUCKET_ID/path/to/your/file.mp4"):
client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.LOGO_RECOGNITION]
operation = client.annotate_video(
request={"features": features, "input_uri": input_uri}
)
print(u"Waiting for operation to complete...")
response = operation.result()
# Get the first response, since we sent only one video.
annotation_result = response.annotation_results[0]
# Annotations for list of logos detected, tracked and recognized in video.
for logo_recognition_annotation in annotation_result.logo_recognition_annotations:
entity = logo_recognition_annotation.entity
# Opaque entity ID. Some IDs may be available in [Google Knowledge Graph
# Search API](https://developers.google.com/knowledge-graph/).
print(u"Entity Id : {}".format(entity.entity_id))
print(u"Description : {}".format(entity.description))
# All logo tracks where the recognized logo appears. Each track corresponds
# to one logo instance appearing in consecutive frames.
for track in logo_recognition_annotation.tracks:
# Video segment of a track.
print(
u"\n\tStart Time Offset : {}.{}".format(
track.segment.start_time_offset.seconds,
track.segment.start_time_offset.microseconds * 1000,
)
)
print(
u"\tEnd Time Offset : {}.{}".format(
track.segment.end_time_offset.seconds,
track.segment.end_time_offset.microseconds * 1000,
)
)
print(u"\tConfidence : {}".format(track.confidence))
# The object with timestamp and attributes per frame in the track.
for timestamped_object in track.timestamped_objects:
# Normalized Bounding box in a frame, where the object is located.
normalized_bounding_box = timestamped_object.normalized_bounding_box
print(u"\n\t\tLeft : {}".format(normalized_bounding_box.left))
print(u"\t\tTop : {}".format(normalized_bounding_box.top))
print(u"\t\tRight : {}".format(normalized_bounding_box.right))
print(u"\t\tBottom : {}".format(normalized_bounding_box.bottom))
# Optional. The attributes of the object in the bounding box.
for attribute in timestamped_object.attributes:
print(u"\n\t\t\tName : {}".format(attribute.name))
print(u"\t\t\tConfidence : {}".format(attribute.confidence))
print(u"\t\t\tValue : {}".format(attribute.value))
# Optional. Attributes in the track level.
for track_attribute in track.attributes:
print(u"\n\t\tName : {}".format(track_attribute.name))
print(u"\t\tConfidence : {}".format(track_attribute.confidence))
print(u"\t\tValue : {}".format(track_attribute.value))
# All video segments where the recognized logo appears. There might be
# multiple instances of the same logo class appearing in one VideoSegment.
for segment in logo_recognition_annotation.segments:
print(
u"\n\tStart Time Offset : {}.{}".format(
segment.start_time_offset.seconds,
segment.start_time_offset.microseconds * 1000,
)
)
print(
u"\tEnd Time Offset : {}.{}".format(
segment.end_time_offset.seconds,
segment.end_time_offset.microseconds * 1000,
)
)
Annotate a local video
The following code sample demonstrates how to detect logos in a local video file.
REST & CMD LINE
Send video annotation request
To perform annotation on a local video file, base64-encode the contents of the video file. Include the base64-encoded contents in the inputContent field of the request. For information on how to base64-encode the contents of a video file, see Base64 Encoding.
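For example, the encoding step can be sketched in Python using the standard library (the helper name is illustrative, not part of the API):

```python
import base64

def build_local_annotate_body(path):
    """Read a local video file and base64-encode it for the inputContent field."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return {"inputContent": encoded, "features": ["LOGO_RECOGNITION"]}
```

The returned dictionary can be serialized with json.dumps and saved as request.json for the curl and PowerShell examples below.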
The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence API Quickstart.
Before using any of the request data below, make the following replacements:
- "inputContent": base-64-encoded-content. For example:
"UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA..."
- language-code: [Optional] See supported languages
HTTP method and URL:
POST https://videointelligence.googleapis.com/v1/videos:annotate
Request JSON body:
{ "inputContent": "base-64-encoded-content", "features": ["LOGO_RECOGNITION"], "videoContext": { } }
To send your request, choose one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://videointelligence.googleapis.com/v1/videos:annotate
PowerShell (Windows)
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/project-number/locations/location-id/operations/operation-id" }
If the response is successful, the Video Intelligence API returns the name for your operation. The above shows an example of such a response, where:
- project-number: the number of your project
- operation-id: the ID of the long-running operation created for the request and provided in the response when you started the operation, for example 12345...
Get annotation results
To retrieve the result of the operation, make a GET request, using the operation name returned from the call to videos:annotate, as shown in the following example.
HTTP method and URL:
GET https://videointelligence.googleapis.com/v1/operation-name
To send your request, choose one of these options:
curl (Linux, macOS, or Cloud Shell)
Execute the following command:
curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
https://videointelligence.googleapis.com/v1/operation-name
PowerShell (Windows)
Execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/operation-name" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response
{
  "name": "projects/512816187662/locations/us-east1/operations/8399514592783793684",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1p3beta1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "/videointelligence-prober-videos/face.mkv",
        "progressPercent": 100,
        "startTime": "2020-03-18T19:45:17.725359Z",
        "updateTime": "2020-03-18T19:45:26.532315Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1p3beta1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputUri": "/videointelligence-prober-videos/face.mkv",
        "segment": {
          "startTimeOffset": "0s",
          "endTimeOffset": "10.010s"
        },
        "logoRecognitionAnnotations": [
          {
            "entity": {
              "entityId": "/m/02z_b",
              "description": "Fox News",
              "languageCode": "en-US"
            },
            "tracks": [
              {
                "segment": {
                  "startTimeOffset": "0s",
                  "endTimeOffset": "1.901900s"
                },
                "timestampedObjects": [
                  {
                    "normalizedBoundingBox": {
                      "left": 0.032402553,
                      "top": 0.73683465,
                      "right": 0.16249886,
                      "bottom": 0.8664769
                    },
                    "timeOffset": "0s"
                  },
                  {
                    "normalizedBoundingBox": {
                      "left": 0.03267879,
                      "top": 0.73522913,
                      "right": 0.1627307,
                      "bottom": 0.86775583
                    },
                    "timeOffset": "0.100100s"
                  },
                  {
                    "normalizedBoundingBox": {
                      "left": 0.031819325,
                      "top": 0.73514116,
                      "right": 0.16305345,
                      "bottom": 0.8677738
                    },
                    "timeOffset": "0.200200s"
                  },
                  {
                    "normalizedBoundingBox": {
                      "left": 0.03155339,
                      "top": 0.7349258,
                      "right": 0.16275825,
                      "bottom": 0.86660737
                    },
                    "timeOffset": "0.300300s"
                  },
                  ...
                ]
              },
              ...
            ]
          }
        ]
      }
    ]
  }
}
Logo recognition annotations are returned as a logoRecognitionAnnotations list.
Note: The done field is returned only when its value is true. It's not included in responses for operations that have not completed.
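Because done is absent until the operation finishes, a client polling the GET endpoint can simply loop until the field appears. The sketch below illustrates that pattern; the fetch callable is an assumption standing in for whatever HTTP GET you use to retrieve the operation resource:

```python
import time

def wait_for_operation(fetch, poll_interval=1.0, max_polls=60):
    """Poll until the operation resource includes "done": true.

    fetch is any zero-argument callable returning the decoded JSON of
    GET https://videointelligence.googleapis.com/v1/operation-name.
    """
    for _ in range(max_polls):
        op = fetch()
        if op.get("done"):  # absent (falsy) until the operation completes
            return op
        time.sleep(poll_interval)
    raise TimeoutError("operation did not complete within the polling budget")
```

The official client libraries in the samples below perform this polling for you (for example, PollUntilCompleted in C# and operation.result() in Python).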
C#
public static object DetectLogo(string filePath)
{
var client = VideoIntelligenceServiceClient.Create();
var request = new AnnotateVideoRequest()
{
InputContent = Google.Protobuf.ByteString.CopyFrom(File.ReadAllBytes(filePath)),
Features = { Feature.LogoRecognition }
};
Console.WriteLine("\nWaiting for operation to complete...");
var op = client.AnnotateVideo(request).PollUntilCompleted();
// The first result is retrieved because a single video was processed.
var annotationResults = op.Result.AnnotationResults[0];
// Annotations for list of logos detected, tracked and recognized in video.
foreach (var logoRecognitionAnnotation in annotationResults.LogoRecognitionAnnotations)
{
var entity = logoRecognitionAnnotation.Entity;
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
Console.WriteLine($"Entity ID :{entity.EntityId}");
Console.WriteLine($"Description :{entity.Description}");
// All logo tracks where the recognized logo appears. Each track corresponds to one logo
// instance appearing in consecutive frames.
foreach (var track in logoRecognitionAnnotation.Tracks)
{
// Video segment of a track.
var startTimeOffset = track.Segment.StartTimeOffset;
Console.WriteLine(
$"Start Time Offset: {startTimeOffset.Seconds}.{startTimeOffset.Nanos}");
var endTimeOffset = track.Segment.EndTimeOffset;
Console.WriteLine(
$"End Time Offset: {endTimeOffset.Seconds}.{endTimeOffset.Nanos}");
Console.WriteLine($"Confidence: {track.Confidence}");
// The object with timestamp and attributes per frame in the track.
foreach (var timestampedObject in track.TimestampedObjects)
{
// Normalized Bounding box in a frame, where the object is located.
var normalizedBoundingBox = timestampedObject.NormalizedBoundingBox;
Console.WriteLine($"Left: {normalizedBoundingBox.Left}");
Console.WriteLine($"Top: {normalizedBoundingBox.Top}");
Console.WriteLine($"Right: {normalizedBoundingBox.Right}");
Console.WriteLine($"Bottom: {normalizedBoundingBox.Bottom}");
// Optional. The attributes of the object in the bounding box.
foreach (var attribute in timestampedObject.Attributes)
{
Console.WriteLine($"Name: {attribute.Name}");
Console.WriteLine($"Confidence: {attribute.Confidence}");
Console.WriteLine($"Value: {attribute.Value}");
}
// Optional. Attributes in the track level.
foreach (var trackAttribute in track.Attributes)
{
Console.WriteLine($"Name : {trackAttribute.Name}");
Console.WriteLine($"Confidence : {trackAttribute.Confidence}");
Console.WriteLine($"Value : {trackAttribute.Value}");
}
}
// All video segments where the recognized logo appears. There might be multiple instances
// of the same logo class appearing in one VideoSegment.
foreach (var segment in logoRecognitionAnnotation.Segments)
{
Console.WriteLine(
$"Start Time Offset : {segment.StartTimeOffset.Seconds}.{segment.StartTimeOffset.Nanos}");
Console.WriteLine(
$"End Time Offset : {segment.EndTimeOffset.Seconds}.{segment.EndTimeOffset.Nanos}");
}
}
}
return 0;
}
Go
import (
"context"
"fmt"
"io"
"io/ioutil"
"time"
video "cloud.google.com/go/videointelligence/apiv1"
"github.com/golang/protobuf/ptypes"
videopb "google.golang.org/genproto/googleapis/cloud/videointelligence/v1"
)
// logoDetection analyzes a video and extracts logos with their bounding boxes.
func logoDetection(w io.Writer, filename string) error {
// filename := "../testdata/googlework_short.mp4"
ctx := context.Background()
// Creates a client.
client, err := video.NewClient(ctx)
if err != nil {
return fmt.Errorf("video.NewClient: %v", err)
}
defer client.Close()
ctx, cancel := context.WithTimeout(ctx, time.Second*180)
defer cancel()
fileBytes, err := ioutil.ReadFile(filename)
if err != nil {
return fmt.Errorf("ioutil.ReadFile: %v", err)
}
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
InputContent: fileBytes,
Features: []videopb.Feature{
videopb.Feature_LOGO_RECOGNITION,
},
})
if err != nil {
return fmt.Errorf("AnnotateVideo: %v", err)
}
resp, err := op.Wait(ctx)
if err != nil {
return fmt.Errorf("Wait: %v", err)
}
// Only one video was processed, so get the first result.
result := resp.GetAnnotationResults()[0]
// Annotations for list of logos detected, tracked and recognized in video.
for _, annotation := range result.LogoRecognitionAnnotations {
fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
// Opaque entity ID. Some IDs may be available in Google Knowledge
// Graph Search API (https://developers.google.com/knowledge-graph/).
if len(annotation.Entity.EntityId) > 0 {
fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
}
// All logo tracks where the recognized logo appears. Each track
// corresponds to one logo instance appearing in consecutive frames.
for _, track := range annotation.Tracks {
// Video segment of a track.
segment := track.GetSegment()
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
fmt.Fprintf(w, "\tConfidence: %f\n", track.GetConfidence())
// The object with timestamp and attributes per frame in the track.
for _, timestampedObject := range track.TimestampedObjects {
// Normalized Bounding box in a frame, where the object is
// located.
box := timestampedObject.GetNormalizedBoundingBox()
fmt.Fprintf(w, "\tBounding box position:\n")
fmt.Fprintf(w, "\t\tleft : %f\n", box.GetLeft())
fmt.Fprintf(w, "\t\ttop : %f\n", box.GetTop())
fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
// Optional. The attributes of the object in the bounding box.
for _, attribute := range timestampedObject.Attributes {
fmt.Fprintf(w, "\t\t\tName: %q\n", attribute.GetName())
fmt.Fprintf(w, "\t\t\tConfidence: %f\n", attribute.GetConfidence())
fmt.Fprintf(w, "\t\t\tValue: %q\n", attribute.GetValue())
}
}
// Optional. Attributes in the track level.
for _, trackAttribute := range track.Attributes {
fmt.Fprintf(w, "\t\tName: %q\n", trackAttribute.GetName())
fmt.Fprintf(w, "\t\tConfidence: %f\n", trackAttribute.GetConfidence())
fmt.Fprintf(w, "\t\tValue: %q\n", trackAttribute.GetValue())
}
}
// All video segments where the recognized logo appears. There might be
// multiple instances of the same logo class appearing in one VideoSegment.
for _, segment := range annotation.Segments {
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
}
}
return nil
}
Java
import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.videointelligence.v1.AnnotateVideoProgress;
import com.google.cloud.videointelligence.v1.AnnotateVideoRequest;
import com.google.cloud.videointelligence.v1.AnnotateVideoResponse;
import com.google.cloud.videointelligence.v1.DetectedAttribute;
import com.google.cloud.videointelligence.v1.Entity;
import com.google.cloud.videointelligence.v1.Feature;
import com.google.cloud.videointelligence.v1.LogoRecognitionAnnotation;
import com.google.cloud.videointelligence.v1.NormalizedBoundingBox;
import com.google.cloud.videointelligence.v1.TimestampedObject;
import com.google.cloud.videointelligence.v1.Track;
import com.google.cloud.videointelligence.v1.VideoAnnotationResults;
import com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient;
import com.google.cloud.videointelligence.v1.VideoSegment;
import com.google.protobuf.ByteString;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
public class LogoDetection {
public static void detectLogo() throws Exception {
// TODO(developer): Replace these variables before running the sample.
String localFilePath = "path/to/your/video.mp4";
detectLogo(localFilePath);
}
public static void detectLogo(String filePath)
throws IOException, ExecutionException, InterruptedException, TimeoutException {
// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Read file
Path path = Paths.get(filePath);
byte[] data = Files.readAllBytes(path);
// Create the request
AnnotateVideoRequest request =
AnnotateVideoRequest.newBuilder()
.setInputContent(ByteString.copyFrom(data))
.addFeatures(Feature.LOGO_RECOGNITION)
.build();
// asynchronously perform object tracking on videos
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
// The first result is retrieved because a single video was processed.
AnnotateVideoResponse response = future.get(300, TimeUnit.SECONDS);
VideoAnnotationResults annotationResult = response.getAnnotationResults(0);
// Annotations for list of logos detected, tracked and recognized in video.
for (LogoRecognitionAnnotation logoRecognitionAnnotation :
annotationResult.getLogoRecognitionAnnotationsList()) {
Entity entity = logoRecognitionAnnotation.getEntity();
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
System.out.printf("Entity Id : %s\n", entity.getEntityId());
System.out.printf("Description : %s\n", entity.getDescription());
// All logo tracks where the recognized logo appears. Each track corresponds to one logo
// instance appearing in consecutive frames.
for (Track track : logoRecognitionAnnotation.getTracksList()) {
// Video segment of a track.
Duration startTimeOffset = track.getSegment().getStartTimeOffset();
System.out.printf(
"\n\tStart Time Offset: %s.%s\n",
startTimeOffset.getSeconds(), startTimeOffset.getNanos());
Duration endTimeOffset = track.getSegment().getEndTimeOffset();
System.out.printf(
"\tEnd Time Offset: %s.%s\n", endTimeOffset.getSeconds(), endTimeOffset.getNanos());
System.out.printf("\tConfidence: %s\n", track.getConfidence());
// The object with timestamp and attributes per frame in the track.
for (TimestampedObject timestampedObject : track.getTimestampedObjectsList()) {
// Normalized Bounding box in a frame, where the object is located.
NormalizedBoundingBox normalizedBoundingBox =
timestampedObject.getNormalizedBoundingBox();
System.out.printf("\n\t\tLeft: %s\n", normalizedBoundingBox.getLeft());
System.out.printf("\t\tTop: %s\n", normalizedBoundingBox.getTop());
System.out.printf("\t\tRight: %s\n", normalizedBoundingBox.getRight());
System.out.printf("\t\tBottom: %s\n", normalizedBoundingBox.getBottom());
// Optional. The attributes of the object in the bounding box.
for (DetectedAttribute attribute : timestampedObject.getAttributesList()) {
System.out.printf("\n\t\t\tName: %s\n", attribute.getName());
System.out.printf("\t\t\tConfidence: %s\n", attribute.getConfidence());
System.out.printf("\t\t\tValue: %s\n", attribute.getValue());
}
}
// Optional. Attributes in the track level.
for (DetectedAttribute trackAttribute : track.getAttributesList()) {
System.out.printf("\n\t\tName : %s\n", trackAttribute.getName());
System.out.printf("\t\tConfidence : %s\n", trackAttribute.getConfidence());
System.out.printf("\t\tValue : %s\n", trackAttribute.getValue());
}
}
// All video segments where the recognized logo appears. There might be multiple instances
// of the same logo class appearing in one VideoSegment.
for (VideoSegment segment : logoRecognitionAnnotation.getSegmentsList()) {
System.out.printf(
"\n\tStart Time Offset : %s.%s\n",
segment.getStartTimeOffset().getSeconds(), segment.getStartTimeOffset().getNanos());
System.out.printf(
"\tEnd Time Offset : %s.%s\n",
segment.getEndTimeOffset().getSeconds(), segment.getEndTimeOffset().getNanos());
}
}
}
}
}
Node.js
/**
* TODO(developer): Uncomment these variables before running the sample.
*/
// const localFilePath = 'path/to/your/video.mp4'
// Imports the Google Cloud client libraries
const Video = require('@google-cloud/video-intelligence');
const fs = require('fs');
// Instantiates a client
const client = new Video.VideoIntelligenceServiceClient();
// Performs asynchronous video annotation for logo recognition on a file.
async function detectLogo() {
const inputContent = fs.readFileSync(localFilePath).toString('base64');
// Build the request with the input content and logo recognition feature.
const request = {
inputContent: inputContent,
features: ['LOGO_RECOGNITION'],
};
// Make the asynchronous request
const [operation] = await client.annotateVideo(request);
// Wait for the results
const [response] = await operation.promise();
// Get the first response, since we sent only one video.
const annotationResult = response.annotationResults[0];
for (const logoRecognitionAnnotation of annotationResult.logoRecognitionAnnotations) {
const entity = logoRecognitionAnnotation.entity;
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
console.log(`Entity Id: ${entity.entityId}`);
console.log(`Description: ${entity.description}`);
// All logo tracks where the recognized logo appears.
// Each track corresponds to one logo instance appearing in consecutive frames.
for (const track of logoRecognitionAnnotation.tracks) {
console.log(
`\n\tStart Time Offset: ${track.segment.startTimeOffset.seconds}.${track.segment.startTimeOffset.nanos}`
);
console.log(
`\tEnd Time Offset: ${track.segment.endTimeOffset.seconds}.${track.segment.endTimeOffset.nanos}`
);
console.log(`\tConfidence: ${track.confidence}`);
// The object with timestamp and attributes per frame in the track.
for (const timestampedObject of track.timestampedObjects) {
// Normalized Bounding box in a frame, where the object is located.
const normalizedBoundingBox = timestampedObject.normalizedBoundingBox;
console.log(`\n\t\tLeft: ${normalizedBoundingBox.left}`);
console.log(`\t\tTop: ${normalizedBoundingBox.top}`);
console.log(`\t\tRight: ${normalizedBoundingBox.right}`);
console.log(`\t\tBottom: ${normalizedBoundingBox.bottom}`);
// Optional. The attributes of the object in the bounding box.
for (const attribute of timestampedObject.attributes) {
console.log(`\n\t\t\tName: ${attribute.name}`);
console.log(`\t\t\tConfidence: ${attribute.confidence}`);
console.log(`\t\t\tValue: ${attribute.value}`);
}
}
// Optional. Attributes in the track level.
for (const trackAttribute of track.attributes) {
console.log(`\n\t\tName: ${trackAttribute.name}`);
console.log(`\t\tConfidence: ${trackAttribute.confidence}`);
console.log(`\t\tValue: ${trackAttribute.value}`);
}
}
// All video segments where the recognized logo appears.
// There might be multiple instances of the same logo class appearing in one VideoSegment.
for (const segment of logoRecognitionAnnotation.segments) {
console.log(
`\n\tStart Time Offset: ${segment.startTimeOffset.seconds}.${segment.startTimeOffset.nanos}`
);
console.log(
`\tEnd Time Offset: ${segment.endTimeOffset.seconds}.${segment.endTimeOffset.nanos}`
);
}
}
}
detectLogo();
Python
import io

from google.cloud import videointelligence


def detect_logo(local_file_path="path/to/your/video.mp4"):
    """Performs asynchronous video annotation for logo recognition on a local file."""
    client = videointelligence.VideoIntelligenceServiceClient()

    with io.open(local_file_path, "rb") as f:
        input_content = f.read()

    features = [videointelligence.Feature.LOGO_RECOGNITION]
    operation = client.annotate_video(
        request={"features": features, "input_content": input_content}
    )

    print("Waiting for operation to complete...")
    response = operation.result()

    # Get the first response, since we sent only one video.
    annotation_result = response.annotation_results[0]

    # Annotations for list of logos detected, tracked and recognized in video.
    for logo_recognition_annotation in annotation_result.logo_recognition_annotations:
        entity = logo_recognition_annotation.entity

        # Opaque entity ID. Some IDs may be available in [Google Knowledge Graph
        # Search API](https://developers.google.com/knowledge-graph/).
        print("Entity Id : {}".format(entity.entity_id))
        print("Description : {}".format(entity.description))

        # All logo tracks where the recognized logo appears. Each track corresponds
        # to one logo instance appearing in consecutive frames.
        for track in logo_recognition_annotation.tracks:
            # Video segment of a track.
            print(
                "\n\tStart Time Offset : {}.{}".format(
                    track.segment.start_time_offset.seconds,
                    track.segment.start_time_offset.microseconds * 1000,
                )
            )
            print(
                "\tEnd Time Offset : {}.{}".format(
                    track.segment.end_time_offset.seconds,
                    track.segment.end_time_offset.microseconds * 1000,
                )
            )
            print("\tConfidence : {}".format(track.confidence))

            # The object with timestamp and attributes per frame in the track.
            for timestamped_object in track.timestamped_objects:
                # Normalized Bounding box in a frame, where the object is located.
                normalized_bounding_box = timestamped_object.normalized_bounding_box
                print("\n\t\tLeft : {}".format(normalized_bounding_box.left))
                print("\t\tTop : {}".format(normalized_bounding_box.top))
                print("\t\tRight : {}".format(normalized_bounding_box.right))
                print("\t\tBottom : {}".format(normalized_bounding_box.bottom))

                # Optional. The attributes of the object in the bounding box.
                for attribute in timestamped_object.attributes:
                    print("\n\t\t\tName : {}".format(attribute.name))
                    print("\t\t\tConfidence : {}".format(attribute.confidence))
                    print("\t\t\tValue : {}".format(attribute.value))

            # Optional. Attributes in the track level.
            for track_attribute in track.attributes:
                print("\n\t\tName : {}".format(track_attribute.name))
                print("\t\tConfidence : {}".format(track_attribute.confidence))
                print("\t\tValue : {}".format(track_attribute.value))

        # All video segments where the recognized logo appears. There might be
        # multiple instances of the same logo class appearing in one VideoSegment.
        for segment in logo_recognition_annotation.segments:
            print(
                "\n\tStart Time Offset : {}.{}".format(
                    segment.start_time_offset.seconds,
                    segment.start_time_offset.microseconds * 1000,
                )
            )
            print(
                "\tEnd Time Offset : {}.{}".format(
                    segment.end_time_offset.seconds,
                    segment.end_time_offset.microseconds * 1000,
                )
            )
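Note that the samples above print each time offset as `seconds.nanos` with plain string formatting, which can be misleading for small nanosecond values (1 second and 5 nanoseconds prints as `1.5` rather than `1.000000005`). If you need a true fractional-seconds value, you can combine the two fields yourself. The helper below is an illustrative sketch; the `offset_to_seconds` name is ours and is not part of the client library:

```python
def offset_to_seconds(seconds, nanos):
    """Combine an offset's integer seconds and nanoseconds into fractional seconds."""
    return seconds + nanos / 1e9


# 1 second and 5 nanoseconds is ~1.000000005 s, not "1.5"
print(offset_to_seconds(1, 5))
print(offset_to_seconds(2, 500000000))  # 2.5
```

The same caveat applies to the C#, Go, Java, and Node.js samples, which use the equivalent seconds/nanos fields of the returned durations.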