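The Video Intelligence API can detect, track, and recognize more than 100,000 brands and logos in video content.
This page describes how to use the Video Intelligence API to recognize logos in a video.
Annotate a video in Cloud Storage
The following code samples demonstrate how to detect logos in a video stored in Cloud Storage.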
REST
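Send the process request
To annotate a local video file instead of one in Cloud Storage, base64-encode the contents of the video file and include the encoded contents in the inputContent field of the request. For information on how to base64-encode the contents of a video file, see Base64 Encoding.
The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Google Cloud CLI. For instructions on installing the Google Cloud CLI, setting up a project with a service account, and obtaining an access token, see the Video Intelligence quickstart.
Before using any of the request data, make the following replacements:
- INPUT_URI: a Cloud Storage bucket that contains the file you want to annotate, including the file name. Must start with gs://.
  For example: "inputUri": "gs://cloud-videointelligence-demo/assistant.mp4",
- PROJECT_NUMBER: the numeric identifier for your Google Cloud project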
HTTP method and URL:
POST https://videointelligence.googleapis.com/v1/videos:annotate
Request JSON body:
{ "inputUri":"INPUT_URI", "features": ["LOGO_RECOGNITION"] }
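As a minimal sketch of how the request body above could be generated rather than typed by hand, the following Python snippet builds the same JSON and checks the gs:// prefix first. The function name `build_gcs_request` is hypothetical and not part of any Google library; the output file name request.json matches the name used by the commands on this page.

```python
import json


def build_gcs_request(input_uri: str) -> dict:
    """Return a videos:annotate request body for a video in Cloud Storage.

    Hypothetical helper for illustration only.
    """
    # The inputUri value must start with gs://, so reject anything else early.
    if not input_uri.startswith("gs://"):
        raise ValueError(f"inputUri must start with gs://, got {input_uri!r}")
    return {"inputUri": input_uri, "features": ["LOGO_RECOGNITION"]}


if __name__ == "__main__":
    body = build_gcs_request("gs://cloud-videointelligence-demo/assistant.mp4")
    # Save the body as request.json so it can be sent with `-d @request.json`.
    with open("request.json", "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
```

Validating the URI locally is a design choice of this sketch: it surfaces an obvious mistake before a request is sent.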
To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://videointelligence.googleapis.com/v1/videos:annotate"
PowerShell (Windows)
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID" }
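If the request is successful, the Video Intelligence API returns the name of your operation. The example above shows such a response, where:
- PROJECT_NUMBER: the number of your project
- LOCATION_ID: the Cloud region where annotation takes place. Supported cloud regions are us-east1, us-west1, europe-west1, and asia-east1. If no region is specified, a region is determined based on the video file location.
- OPERATION_ID: the ID of the long-running operation created for the request and provided in the response when you started the operation, for example 12345...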
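The operation name format described above can be split into its parts with a small parser. This is only a sketch: the function `parse_operation_name` and the regular expression are assumptions written for illustration, based solely on the projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID pattern shown on this page.

```python
import re

# Pattern derived from the operation name format shown in the response above.
_OPERATION_NAME = re.compile(
    r"^projects/(?P<project_number>[^/]+)"
    r"/locations/(?P<location_id>[^/]+)"
    r"/operations/(?P<operation_id>[^/]+)$"
)


def parse_operation_name(name: str) -> dict:
    """Split an operation name into project number, location, and operation ID."""
    match = _OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected operation name format: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    parts = parse_operation_name("projects/123/locations/us-east1/operations/456")
    print(parts["location_id"])
```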
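Get the results
To get the results of your request, send a GET request, using the operation name returned from the call to videos:annotate, as shown in the following example.
Before using any of the request data, make the following replacements:
- OPERATION_NAME: the name of the operation as returned by the Video Intelligence API. The operation name has the format projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
- PROJECT_NUMBER: the numeric identifier for your Google Cloud project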
HTTP method and URL:
GET https://videointelligence.googleapis.com/v1/OPERATION_NAME
To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
"https://videointelligence.googleapis.com/v1/OPERATION_NAME"
PowerShell (Windows)
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "/cloud-samples-data/video/googlework_short.mp4",
        "progressPercent": 100,
        "startTime": "2020-02-31T16:27:44.889439Z",
        "updateTime": "2020-02-31T16:27:56.526050Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputUri": "/cloud-samples-data/video/googlework_short.mp4",
        "segment": { "startTimeOffset": "0s", "endTimeOffset": "34.234200s" },
        "logoRecognitionAnnotations": [
          {
            "entity": {
              "entityId": "/m/045c7b",
              "description": "Google",
              "languageCode": "en-US"
            },
            "tracks": [
              {
                "segment": { "startTimeOffset": "10.543866s", "endTimeOffset": "12.345666s" },
                "timestampedObjects": [
                  {
                    "normalizedBoundingBox": { "left": 0.3912032, "top": 0.26212785, "right": 0.6469412, "bottom": 0.4434373 },
                    "timeOffset": "10.543866s"
                  },
                  ...
                ],
                "confidence": 0.8588119
              },
              {
                "segment": { "startTimeOffset": "15.348666s", "endTimeOffset": "18.752066s" },
                "timestampedObjects": [
                  {
                    "normalizedBoundingBox": { "left": 0.69989866, "top": 0.79943377, "right": 0.76465744, "bottom": 0.9271479 },
                    "timeOffset": "15.348666s"
                  },
                  {
                    "normalizedBoundingBox": { "left": 0.68997324, "top": 0.78775305, "right": 0.75723547, "bottom": 0.91808647 },
                    "timeOffset": "15.448766s"
                  },
                  ...
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
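To make the structure of this response easier to follow, here is a hedged Python sketch that walks a parsed response and lists each logo track with its start, end, and confidence. The helper names `parse_offset` and `summarize_logos` are hypothetical; the field names are taken from the sample response above, and the code assumes offsets are always strings ending in "s", as they are in that sample.

```python
def parse_offset(offset: str) -> float:
    """Convert an offset string such as '10.543866s' to seconds."""
    if not offset.endswith("s"):
        raise ValueError(f"unexpected offset format: {offset!r}")
    return float(offset[:-1])


def summarize_logos(operation: dict) -> list:
    """Return (description, start, end, confidence) for every logo track."""
    # An unfinished operation carries no annotation results yet.
    if not operation.get("done", False):
        return []
    results = operation.get("response", {}).get("annotationResults", [])
    summary = []
    for result in results:
        for annotation in result.get("logoRecognitionAnnotations", []):
            description = annotation.get("entity", {}).get("description", "")
            for track in annotation.get("tracks", []):
                segment = track.get("segment", {})
                summary.append(
                    (
                        description,
                        parse_offset(segment.get("startTimeOffset", "0s")),
                        parse_offset(segment.get("endTimeOffset", "0s")),
                        # confidence may be absent on some tracks.
                        track.get("confidence"),
                    )
                )
    return summary
```

Using `.get()` with defaults throughout is deliberate: the sample response elides parts of the payload, so the sketch avoids assuming every field is present.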
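Download annotation results
Copy the annotation from the source to the destination bucket (see Copy files and objects):
gsutil cp gcs_uri gs://my-bucket
Note: If you provide the output Cloud Storage URI, the annotation is stored at that URI.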
Go
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
import (
"context"
"fmt"
"io"
"time"
video "cloud.google.com/go/videointelligence/apiv1"
videopb "cloud.google.com/go/videointelligence/apiv1/videointelligencepb"
"github.com/golang/protobuf/ptypes"
)
// logoDetectionGCS analyzes a video and extracts logos with their bounding boxes.
func logoDetectionGCS(w io.Writer, gcsURI string) error {
// gcsURI := "gs://cloud-samples-data/video/googlework_tiny.mp4"
ctx := context.Background()
// Creates a client.
client, err := video.NewClient(ctx)
if err != nil {
return fmt.Errorf("video.NewClient: %w", err)
}
defer client.Close()
ctx, cancel := context.WithTimeout(ctx, time.Second*180)
defer cancel()
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
InputUri: gcsURI,
Features: []videopb.Feature{
videopb.Feature_LOGO_RECOGNITION,
},
})
if err != nil {
return fmt.Errorf("AnnotateVideo: %w", err)
}
resp, err := op.Wait(ctx)
if err != nil {
return fmt.Errorf("Wait: %w", err)
}
// Only one video was processed, so get the first result.
result := resp.GetAnnotationResults()[0]
// Annotations for list of logos detected, tracked and recognized in video.
for _, annotation := range result.LogoRecognitionAnnotations {
fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
// Opaque entity ID. Some IDs may be available in Google Knowledge
// Graph Search API (https://developers.google.com/knowledge-graph/).
if len(annotation.Entity.EntityId) > 0 {
fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
}
// All logo tracks where the recognized logo appears. Each track
// corresponds to one logo instance appearing in consecutive frames.
for _, track := range annotation.Tracks {
// Video segment of a track.
segment := track.GetSegment()
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
fmt.Fprintf(w, "\tConfidence: %f\n", track.GetConfidence())
// The object with timestamp and attributes per frame in the track.
for _, timestampedObject := range track.TimestampedObjects {
// Normalized Bounding box in a frame, where the object is
// located.
box := timestampedObject.GetNormalizedBoundingBox()
fmt.Fprintf(w, "\tBounding box position:\n")
fmt.Fprintf(w, "\t\tleft : %f\n", box.GetLeft())
fmt.Fprintf(w, "\t\ttop : %f\n", box.GetTop())
fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
// Optional. The attributes of the object in the bounding box.
for _, attribute := range timestampedObject.Attributes {
fmt.Fprintf(w, "\t\t\tName: %q\n", attribute.GetName())
fmt.Fprintf(w, "\t\t\tConfidence: %f\n", attribute.GetConfidence())
fmt.Fprintf(w, "\t\t\tValue: %q\n", attribute.GetValue())
}
}
// Optional. Attributes in the track level.
for _, trackAttribute := range track.Attributes {
fmt.Fprintf(w, "\t\tName: %q\n", trackAttribute.GetName())
fmt.Fprintf(w, "\t\tConfidence: %f\n", trackAttribute.GetConfidence())
fmt.Fprintf(w, "\t\tValue: %q\n", trackAttribute.GetValue())
}
}
// All video segments where the recognized logo appears. There might be
// multiple instances of the same logo class appearing in one VideoSegment.
for _, segment := range annotation.Segments {
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
}
}
return nil
}
Java
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.videointelligence.v1.AnnotateVideoProgress;
import com.google.cloud.videointelligence.v1.AnnotateVideoRequest;
import com.google.cloud.videointelligence.v1.AnnotateVideoResponse;
import com.google.cloud.videointelligence.v1.DetectedAttribute;
import com.google.cloud.videointelligence.v1.Entity;
import com.google.cloud.videointelligence.v1.Feature;
import com.google.cloud.videointelligence.v1.LogoRecognitionAnnotation;
import com.google.cloud.videointelligence.v1.NormalizedBoundingBox;
import com.google.cloud.videointelligence.v1.TimestampedObject;
import com.google.cloud.videointelligence.v1.Track;
import com.google.cloud.videointelligence.v1.VideoAnnotationResults;
import com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient;
import com.google.cloud.videointelligence.v1.VideoSegment;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
public class LogoDetectionGcs {
public static void detectLogoGcs() throws Exception {
// TODO(developer): Replace these variables before running the sample.
String gcsUri = "gs://YOUR_BUCKET_ID/path/to/your/video.mp4";
detectLogoGcs(gcsUri);
}
public static void detectLogoGcs(String inputUri)
throws IOException, ExecutionException, InterruptedException, TimeoutException {
// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Create the request
AnnotateVideoRequest request =
AnnotateVideoRequest.newBuilder()
.setInputUri(inputUri)
.addFeatures(Feature.LOGO_RECOGNITION)
.build();
// Asynchronously perform logo recognition on the video.
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
// The first result is retrieved because a single video was processed.
AnnotateVideoResponse response = future.get(600, TimeUnit.SECONDS);
VideoAnnotationResults annotationResult = response.getAnnotationResults(0);
// Annotations for list of logos detected, tracked and recognized in video.
for (LogoRecognitionAnnotation logoRecognitionAnnotation :
annotationResult.getLogoRecognitionAnnotationsList()) {
Entity entity = logoRecognitionAnnotation.getEntity();
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
System.out.printf("Entity Id : %s\n", entity.getEntityId());
System.out.printf("Description : %s\n", entity.getDescription());
// All logo tracks where the recognized logo appears. Each track corresponds to one logo
// instance appearing in consecutive frames.
for (Track track : logoRecognitionAnnotation.getTracksList()) {
// Video segment of a track.
Duration startTimeOffset = track.getSegment().getStartTimeOffset();
System.out.printf(
"\n\tStart Time Offset: %s.%s\n",
startTimeOffset.getSeconds(), startTimeOffset.getNanos());
Duration endTimeOffset = track.getSegment().getEndTimeOffset();
System.out.printf(
"\tEnd Time Offset: %s.%s\n", endTimeOffset.getSeconds(), endTimeOffset.getNanos());
System.out.printf("\tConfidence: %s\n", track.getConfidence());
// The object with timestamp and attributes per frame in the track.
for (TimestampedObject timestampedObject : track.getTimestampedObjectsList()) {
// Normalized Bounding box in a frame, where the object is located.
NormalizedBoundingBox normalizedBoundingBox =
timestampedObject.getNormalizedBoundingBox();
System.out.printf("\n\t\tLeft: %s\n", normalizedBoundingBox.getLeft());
System.out.printf("\t\tTop: %s\n", normalizedBoundingBox.getTop());
System.out.printf("\t\tRight: %s\n", normalizedBoundingBox.getRight());
System.out.printf("\t\tBottom: %s\n", normalizedBoundingBox.getBottom());
// Optional. The attributes of the object in the bounding box.
for (DetectedAttribute attribute : timestampedObject.getAttributesList()) {
System.out.printf("\n\t\t\tName: %s\n", attribute.getName());
System.out.printf("\t\t\tConfidence: %s\n", attribute.getConfidence());
System.out.printf("\t\t\tValue: %s\n", attribute.getValue());
}
}
// Optional. Attributes in the track level.
for (DetectedAttribute trackAttribute : track.getAttributesList()) {
System.out.printf("\n\t\tName : %s\n", trackAttribute.getName());
System.out.printf("\t\tConfidence : %s\n", trackAttribute.getConfidence());
System.out.printf("\t\tValue : %s\n", trackAttribute.getValue());
}
}
// All video segments where the recognized logo appears. There might be multiple instances
// of the same logo class appearing in one VideoSegment.
for (VideoSegment segment : logoRecognitionAnnotation.getSegmentsList()) {
System.out.printf(
"\n\tStart Time Offset : %s.%s\n",
segment.getStartTimeOffset().getSeconds(), segment.getStartTimeOffset().getNanos());
System.out.printf(
"\tEnd Time Offset : %s.%s\n",
segment.getEndTimeOffset().getSeconds(), segment.getEndTimeOffset().getNanos());
}
}
}
}
}
Node.js
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
/**
* TODO(developer): Uncomment these variables before running the sample.
*/
// const inputUri = 'gs://cloud-samples-data/video/googlework_short.mp4';
// Imports the Google Cloud client libraries
const Video = require('@google-cloud/video-intelligence');
// Instantiates a client
const client = new Video.VideoIntelligenceServiceClient();
// Performs asynchronous video annotation for logo recognition on a file hosted in GCS.
async function detectLogoGcs() {
// Build the request with the input uri and logo recognition feature.
const request = {
inputUri: inputUri,
features: ['LOGO_RECOGNITION'],
};
// Make the asynchronous request
const [operation] = await client.annotateVideo(request);
// Wait for the results
const [response] = await operation.promise();
// Get the first response, since we sent only one video.
const annotationResult = response.annotationResults[0];
for (const logoRecognitionAnnotation of annotationResult.logoRecognitionAnnotations) {
const entity = logoRecognitionAnnotation.entity;
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
console.log(`Entity Id: ${entity.entityId}`);
console.log(`Description: ${entity.description}`);
// All logo tracks where the recognized logo appears.
// Each track corresponds to one logo instance appearing in consecutive frames.
for (const track of logoRecognitionAnnotation.tracks) {
console.log(
`\n\tStart Time Offset: ${track.segment.startTimeOffset.seconds}.${track.segment.startTimeOffset.nanos}`
);
console.log(
`\tEnd Time Offset: ${track.segment.endTimeOffset.seconds}.${track.segment.endTimeOffset.nanos}`
);
console.log(`\tConfidence: ${track.confidence}`);
// The object with timestamp and attributes per frame in the track.
for (const timestampedObject of track.timestampedObjects) {
// Normalized Bounding box in a frame, where the object is located.
const normalizedBoundingBox = timestampedObject.normalizedBoundingBox;
console.log(`\n\t\tLeft: ${normalizedBoundingBox.left}`);
console.log(`\t\tTop: ${normalizedBoundingBox.top}`);
console.log(`\t\tRight: ${normalizedBoundingBox.right}`);
console.log(`\t\tBottom: ${normalizedBoundingBox.bottom}`);
// Optional. The attributes of the object in the bounding box.
for (const attribute of timestampedObject.attributes) {
console.log(`\n\t\t\tName: ${attribute.name}`);
console.log(`\t\t\tConfidence: ${attribute.confidence}`);
console.log(`\t\t\tValue: ${attribute.value}`);
}
}
// Optional. Attributes in the track level.
for (const trackAttribute of track.attributes) {
console.log(`\n\t\tName: ${trackAttribute.name}`);
console.log(`\t\tConfidence: ${trackAttribute.confidence}`);
console.log(`\t\tValue: ${trackAttribute.value}`);
}
}
// All video segments where the recognized logo appears.
// There might be multiple instances of the same logo class appearing in one VideoSegment.
for (const segment of logoRecognitionAnnotation.segments) {
console.log(
`\n\tStart Time Offset: ${segment.startTimeOffset.seconds}.${segment.startTimeOffset.nanos}`
);
console.log(
`\tEnd Time Offset: ${segment.endTimeOffset.seconds}.${segment.endTimeOffset.nanos}`
);
}
}
}
detectLogoGcs();
Python
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
from google.cloud import videointelligence


def detect_logo_gcs(input_uri="gs://YOUR_BUCKET_ID/path/to/your/file.mp4"):
    client = videointelligence.VideoIntelligenceServiceClient()

    features = [videointelligence.Feature.LOGO_RECOGNITION]

    operation = client.annotate_video(
        request={"features": features, "input_uri": input_uri}
    )

    print("Waiting for operation to complete...")
    response = operation.result()

    # Get the first response, since we sent only one video.
    annotation_result = response.annotation_results[0]

    # Annotations for list of logos detected, tracked and recognized in video.
    for logo_recognition_annotation in annotation_result.logo_recognition_annotations:
        entity = logo_recognition_annotation.entity

        # Opaque entity ID. Some IDs may be available in [Google Knowledge Graph
        # Search API](https://developers.google.com/knowledge-graph/).
        print("Entity Id : {}".format(entity.entity_id))

        print("Description : {}".format(entity.description))

        # All logo tracks where the recognized logo appears. Each track corresponds
        # to one logo instance appearing in consecutive frames.
        for track in logo_recognition_annotation.tracks:
            # Video segment of a track.
            print(
                "\n\tStart Time Offset : {}.{}".format(
                    track.segment.start_time_offset.seconds,
                    track.segment.start_time_offset.microseconds * 1000,
                )
            )
            print(
                "\tEnd Time Offset : {}.{}".format(
                    track.segment.end_time_offset.seconds,
                    track.segment.end_time_offset.microseconds * 1000,
                )
            )
            print("\tConfidence : {}".format(track.confidence))

            # The object with timestamp and attributes per frame in the track.
            for timestamped_object in track.timestamped_objects:
                # Normalized Bounding box in a frame, where the object is located.
                normalized_bounding_box = timestamped_object.normalized_bounding_box
                print("\n\t\tLeft : {}".format(normalized_bounding_box.left))
                print("\t\tTop : {}".format(normalized_bounding_box.top))
                print("\t\tRight : {}".format(normalized_bounding_box.right))
                print("\t\tBottom : {}".format(normalized_bounding_box.bottom))

                # Optional. The attributes of the object in the bounding box.
                for attribute in timestamped_object.attributes:
                    print("\n\t\t\tName : {}".format(attribute.name))
                    print("\t\t\tConfidence : {}".format(attribute.confidence))
                    print("\t\t\tValue : {}".format(attribute.value))

            # Optional. Attributes in the track level.
            for track_attribute in track.attributes:
                print("\n\t\tName : {}".format(track_attribute.name))
                print("\t\tConfidence : {}".format(track_attribute.confidence))
                print("\t\tValue : {}".format(track_attribute.value))

        # All video segments where the recognized logo appears. There might be
        # multiple instances of the same logo class appearing in one VideoSegment.
        for segment in logo_recognition_annotation.segments:
            print(
                "\n\tStart Time Offset : {}.{}".format(
                    segment.start_time_offset.seconds,
                    segment.start_time_offset.microseconds * 1000,
                )
            )
            print(
                "\tEnd Time Offset : {}.{}".format(
                    segment.end_time_offset.seconds,
                    segment.end_time_offset.microseconds * 1000,
                )
            )
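One detail of the Python sample above is worth a note: the "{}.{}" pattern does not zero-pad the fractional part, so an offset of 10.05 seconds would print as 10.50000000. The sketch below shows one possible alternative. It assumes, as the sample's use of `.seconds` and `.microseconds` suggests, that offsets are exposed as `datetime.timedelta` values; the helper name `format_offset` is hypothetical.

```python
from datetime import timedelta


def format_offset(offset: timedelta) -> str:
    """Format a time offset as seconds with six fixed decimal places."""
    # total_seconds() combines whole seconds and microseconds correctly,
    # and the fixed-width format keeps leading zeros in the fraction.
    return f"{offset.total_seconds():.6f}s"


if __name__ == "__main__":
    print(format_offset(timedelta(seconds=10, microseconds=50000)))
```

In the sample, a call such as `format_offset(track.segment.start_time_offset)` could replace the two-argument `format` call if exact fractional output matters.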
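Additional languages
C#: Follow the C# setup instructions on the client libraries page and then visit the Video Intelligence reference documentation for .NET.
PHP: Follow the PHP setup instructions on the client libraries page and then visit the Video Intelligence reference documentation for PHP.
Ruby: Follow the Ruby setup instructions on the client libraries page and then visit the Video Intelligence reference documentation for Ruby.
Annotate a local video
The following code samples demonstrate how to detect logos in a local video file.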
REST
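Send the video annotation request
To annotate a local video file, base64-encode the contents of the video file and include the encoded contents in the inputContent field of the request. For information on how to base64-encode the contents of a video file, see Base64 Encoding.
The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Google Cloud CLI.
For instructions on installing the Google Cloud CLI, setting up a project with a service account, and obtaining an access token, see the Video Intelligence API quickstart.
Before using any of the request data, make the following replacements:
- "inputContent": BASE64_ENCODED_CONTENT
  For example: "UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA..."
- LANGUAGE_CODE: [Optional] See supported languages
- PROJECT_NUMBER: the numeric identifier for your Google Cloud project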
HTTP method and URL:
POST https://videointelligence.googleapis.com/v1/videos:annotate
Request JSON body:
{ "inputContent": "BASE64_ENCODED_CONTENT", "features": ["LOGO_RECOGNITION"], "videoContext": { } }
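The base64 step described above can be sketched in Python using only the standard library. The function name `build_local_request` is hypothetical, and the sketch omits the empty videoContext object from the body shown above; the remaining field names follow that body.

```python
import base64
import json


def build_local_request(video_bytes: bytes) -> dict:
    """Return a videos:annotate request body carrying inline video content."""
    # JSON cannot hold raw bytes, so the content is sent as base64 text.
    encoded = base64.b64encode(video_bytes).decode("ascii")
    return {"inputContent": encoded, "features": ["LOGO_RECOGNITION"]}


def write_request_file(video_path: str, out_path: str = "request.json") -> None:
    """Read a local video and save the request body for use with curl."""
    with open(video_path, "rb") as f:
        body = build_local_request(f.read())
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(body, f)
```

For example, `write_request_file("path/to/your/video.mp4")` would produce a request.json file that the commands on this page can send.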
To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://videointelligence.googleapis.com/v1/videos:annotate"
PowerShell (Windows)
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID" }
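If the request is successful, the Video Intelligence API returns the name of your operation. The example above shows such a response, where PROJECT_NUMBER is the number of your project and OPERATION_ID is the ID of the long-running operation created for the request.
- OPERATION_ID: provided in the response when you started the operation, for example 12345...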
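Get annotation results
To retrieve the result of the operation, make a GET request, using the operation name returned from the call to videos:annotate, as shown in the following example.
Before using any of the request data, make the following replacements:
- OPERATION_NAME: the name of the operation as returned by the Video Intelligence API, in the format projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
- PROJECT_NUMBER: the numeric identifier for your Google Cloud project
HTTP method and URL: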
GET https://videointelligence.googleapis.com/v1/OPERATION_NAME
To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
"https://videointelligence.googleapis.com/v1/OPERATION_NAME"
PowerShell (Windows)
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response
{
  "name": "projects/512816187662/locations/us-east1/operations/8399514592783793684",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1p3beta1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "/videointelligence-prober-videos/face.mkv",
        "progressPercent": 100,
        "startTime": "2020-03-18T19:45:17.725359Z",
        "updateTime": "2020-03-18T19:45:26.532315Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1p3beta1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputUri": "/videointelligence-prober-videos/face.mkv",
        "segment": { "startTimeOffset": "0s", "endTimeOffset": "10.010s" },
        "logoRecognitionAnnotations": [
          {
            "entity": {
              "entityId": "/m/02z_b",
              "description": "Fox News",
              "languageCode": "en-US"
            },
            "tracks": [
              {
                "segment": { "startTimeOffset": "0s", "endTimeOffset": "1.901900s" },
                "timestampedObjects": [
                  {
                    "normalizedBoundingBox": { "left": 0.032402553, "top": 0.73683465, "right": 0.16249886, "bottom": 0.8664769 },
                    "timeOffset": "0s"
                  },
                  {
                    "normalizedBoundingBox": { "left": 0.03267879, "top": 0.73522913, "right": 0.1627307, "bottom": 0.86775583 },
                    "timeOffset": "0.100100s"
                  },
                  {
                    "normalizedBoundingBox": { "left": 0.031819325, "top": 0.73514116, "right": 0.16305345, "bottom": 0.8677738 },
                    "timeOffset": "0.200200s"
                  },
                  {
                    "normalizedBoundingBox": { "left": 0.03155339, "top": 0.7349258, "right": 0.16275825, "bottom": 0.86660737 },
                    "timeOffset": "0.300300s"
                  },
                  ....
                ]
              }
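Logo recognition annotations are returned as a logoRecognitionAnnotations list. Note: The done field is only returned when its value is True. It is not included in responses for which the operation has not completed.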
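Because the done field is absent until the operation completes, a client that polls the operation has to treat a missing field as "not done yet". The following Python sketch illustrates that loop. It is an illustration under assumptions, not the client library's behavior: `wait_for_operation`, its parameters, and the injected `fetch` callable (which would perform the GET request shown above and return the parsed JSON) are all hypothetical. The check for an "error" field follows the general shape of Google long-running operations.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the operation reports done, then return its JSON payload."""
    for _ in range(max_attempts):
        operation = fetch()
        # A missing "done" key means the operation is still running.
        if operation.get("done", False):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        sleep(interval_seconds)
    raise TimeoutError(f"operation not done after {max_attempts} attempts")
```

Passing `fetch` and `sleep` as arguments keeps the loop independent of any HTTP library and makes it easy to exercise without network access.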
Go
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
import (
"context"
"fmt"
"io"
"io/ioutil"
"time"
video "cloud.google.com/go/videointelligence/apiv1"
videopb "cloud.google.com/go/videointelligence/apiv1/videointelligencepb"
"github.com/golang/protobuf/ptypes"
)
// logoDetection analyzes a video and extracts logos with their bounding boxes.
func logoDetection(w io.Writer, filename string) error {
// filename := "../testdata/googlework_short.mp4"
ctx := context.Background()
// Creates a client.
client, err := video.NewClient(ctx)
if err != nil {
return fmt.Errorf("video.NewClient: %w", err)
}
defer client.Close()
ctx, cancel := context.WithTimeout(ctx, time.Second*180)
defer cancel()
fileBytes, err := ioutil.ReadFile(filename)
if err != nil {
return fmt.Errorf("ioutil.ReadFile: %w", err)
}
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
InputContent: fileBytes,
Features: []videopb.Feature{
videopb.Feature_LOGO_RECOGNITION,
},
})
if err != nil {
return fmt.Errorf("AnnotateVideo: %w", err)
}
resp, err := op.Wait(ctx)
if err != nil {
return fmt.Errorf("Wait: %w", err)
}
// Only one video was processed, so get the first result.
result := resp.GetAnnotationResults()[0]
// Annotations for list of logos detected, tracked and recognized in video.
for _, annotation := range result.LogoRecognitionAnnotations {
fmt.Fprintf(w, "Description: %q\n", annotation.Entity.GetDescription())
// Opaque entity ID. Some IDs may be available in Google Knowledge
// Graph Search API (https://developers.google.com/knowledge-graph/).
if len(annotation.Entity.EntityId) > 0 {
fmt.Fprintf(w, "\tEntity ID: %q\n", annotation.Entity.GetEntityId())
}
// All logo tracks where the recognized logo appears. Each track
// corresponds to one logo instance appearing in consecutive frames.
for _, track := range annotation.Tracks {
// Video segment of a track.
segment := track.GetSegment()
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
fmt.Fprintf(w, "\tConfidence: %f\n", track.GetConfidence())
// The object with timestamp and attributes per frame in the track.
for _, timestampedObject := range track.TimestampedObjects {
// Normalized Bounding box in a frame, where the object is
// located.
box := timestampedObject.GetNormalizedBoundingBox()
fmt.Fprintf(w, "\tBounding box position:\n")
fmt.Fprintf(w, "\t\tleft : %f\n", box.GetLeft())
fmt.Fprintf(w, "\t\ttop : %f\n", box.GetTop())
fmt.Fprintf(w, "\t\tright : %f\n", box.GetRight())
fmt.Fprintf(w, "\t\tbottom: %f\n", box.GetBottom())
// Optional. The attributes of the object in the bounding box.
for _, attribute := range timestampedObject.Attributes {
fmt.Fprintf(w, "\t\t\tName: %q\n", attribute.GetName())
fmt.Fprintf(w, "\t\t\tConfidence: %f\n", attribute.GetConfidence())
fmt.Fprintf(w, "\t\t\tValue: %q\n", attribute.GetValue())
}
}
// Optional. Attributes in the track level.
for _, trackAttribute := range track.Attributes {
fmt.Fprintf(w, "\t\tName: %q\n", trackAttribute.GetName())
fmt.Fprintf(w, "\t\tConfidence: %f\n", trackAttribute.GetConfidence())
fmt.Fprintf(w, "\t\tValue: %q\n", trackAttribute.GetValue())
}
}
// All video segments where the recognized logo appears. There might be
// multiple instances of the same logo class appearing in one VideoSegment.
for _, segment := range annotation.Segments {
start, _ := ptypes.Duration(segment.GetStartTimeOffset())
end, _ := ptypes.Duration(segment.GetEndTimeOffset())
fmt.Fprintf(w, "\tSegment: %v to %v\n", start, end)
}
}
return nil
}
Java
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.videointelligence.v1.AnnotateVideoProgress;
import com.google.cloud.videointelligence.v1.AnnotateVideoRequest;
import com.google.cloud.videointelligence.v1.AnnotateVideoResponse;
import com.google.cloud.videointelligence.v1.DetectedAttribute;
import com.google.cloud.videointelligence.v1.Entity;
import com.google.cloud.videointelligence.v1.Feature;
import com.google.cloud.videointelligence.v1.LogoRecognitionAnnotation;
import com.google.cloud.videointelligence.v1.NormalizedBoundingBox;
import com.google.cloud.videointelligence.v1.TimestampedObject;
import com.google.cloud.videointelligence.v1.Track;
import com.google.cloud.videointelligence.v1.VideoAnnotationResults;
import com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient;
import com.google.cloud.videointelligence.v1.VideoSegment;
import com.google.protobuf.ByteString;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
public class LogoDetection {
public static void detectLogo() throws Exception {
// TODO(developer): Replace these variables before running the sample.
String localFilePath = "path/to/your/video.mp4";
detectLogo(localFilePath);
}
public static void detectLogo(String filePath)
throws IOException, ExecutionException, InterruptedException, TimeoutException {
// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Read file
Path path = Paths.get(filePath);
byte[] data = Files.readAllBytes(path);
// Create the request
AnnotateVideoRequest request =
AnnotateVideoRequest.newBuilder()
.setInputContent(ByteString.copyFrom(data))
.addFeatures(Feature.LOGO_RECOGNITION)
.build();
// Asynchronously perform logo recognition on the video.
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
// The first result is retrieved because a single video was processed.
AnnotateVideoResponse response = future.get(300, TimeUnit.SECONDS);
VideoAnnotationResults annotationResult = response.getAnnotationResults(0);
// Annotations for list of logos detected, tracked and recognized in video.
for (LogoRecognitionAnnotation logoRecognitionAnnotation :
annotationResult.getLogoRecognitionAnnotationsList()) {
Entity entity = logoRecognitionAnnotation.getEntity();
// Opaque entity ID. Some IDs may be available in
// [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
System.out.printf("Entity Id : %s\n", entity.getEntityId());
System.out.printf("Description : %s\n", entity.getDescription());
// All logo tracks where the recognized logo appears. Each track corresponds to one logo
// instance appearing in consecutive frames.
for (Track track : logoRecognitionAnnotation.getTracksList()) {
// Video segment of a track.
Duration startTimeOffset = track.getSegment().getStartTimeOffset();
System.out.printf(
"\n\tStart Time Offset: %s.%s\n",
startTimeOffset.getSeconds(), startTimeOffset.getNanos());
Duration endTimeOffset = track.getSegment().getEndTimeOffset();
System.out.printf(
"\tEnd Time Offset: %s.%s\n", endTimeOffset.getSeconds(), endTimeOffset.getNanos());
System.out.printf("\tConfidence: %s\n", track.getConfidence());
// The object with timestamp and attributes per frame in the track.
for (TimestampedObject timestampedObject : track.getTimestampedObjectsList()) {
// Normalized Bounding box in a frame, where the object is located.
NormalizedBoundingBox normalizedBoundingBox =
timestampedObject.getNormalizedBoundingBox();
System.out.printf("\n\t\tLeft: %s\n", normalizedBoundingBox.getLeft());
System.out.printf("\t\tTop: %s\n", normalizedBoundingBox.getTop());
System.out.printf("\t\tRight: %s\n", normalizedBoundingBox.getRight());
System.out.printf("\t\tBottom: %s\n", normalizedBoundingBox.getBottom());
// Optional. The attributes of the object in the bounding box.
for (DetectedAttribute attribute : timestampedObject.getAttributesList()) {
System.out.printf("\n\t\t\tName: %s\n", attribute.getName());
System.out.printf("\t\t\tConfidence: %s\n", attribute.getConfidence());
System.out.printf("\t\t\tValue: %s\n", attribute.getValue());
}
}
// Optional. Attributes in the track level.
for (DetectedAttribute trackAttribute : track.getAttributesList()) {
System.out.printf("\n\t\tName : %s\n", trackAttribute.getName());
System.out.printf("\t\tConfidence : %s\n", trackAttribute.getConfidence());
System.out.printf("\t\tValue : %s\n", trackAttribute.getValue());
}
}
// All video segments where the recognized logo appears. There might be multiple instances
// of the same logo class appearing in one VideoSegment.
for (VideoSegment segment : logoRecognitionAnnotation.getSegmentsList()) {
System.out.printf(
"\n\tStart Time Offset : %s.%s\n",
segment.getStartTimeOffset().getSeconds(), segment.getStartTimeOffset().getNanos());
System.out.printf(
"\tEnd Time Offset : %s.%s\n",
segment.getEndTimeOffset().getSeconds(), segment.getEndTimeOffset().getNanos());
}
}
}
}
}
Node.js
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const localFilePath = 'path/to/your/video.mp4'

// Imports the Google Cloud client libraries
const Video = require('@google-cloud/video-intelligence');
const fs = require('fs');

// Instantiates a client
const client = new Video.VideoIntelligenceServiceClient();

// Performs asynchronous video annotation for logo recognition on a file.
async function detectLogo() {
  const inputContent = fs.readFileSync(localFilePath).toString('base64');

  // Build the request with the input content and logo recognition feature.
  const request = {
    inputContent: inputContent,
    features: ['LOGO_RECOGNITION'],
  };

  // Make the asynchronous request
  const [operation] = await client.annotateVideo(request);

  // Wait for the results
  const [response] = await operation.promise();

  // Get the first response, since we sent only one video.
  const annotationResult = response.annotationResults[0];
  for (const logoRecognitionAnnotation of annotationResult.logoRecognitionAnnotations) {
    const entity = logoRecognitionAnnotation.entity;
    // Opaque entity ID. Some IDs may be available in
    // [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph/).
    console.log(`Entity Id: ${entity.entityId}`);
    console.log(`Description: ${entity.description}`);

    // All logo tracks where the recognized logo appears.
    // Each track corresponds to one logo instance appearing in consecutive frames.
    for (const track of logoRecognitionAnnotation.tracks) {
      console.log(
        `\n\tStart Time Offset: ${track.segment.startTimeOffset.seconds}.${track.segment.startTimeOffset.nanos}`
      );
      console.log(
        `\tEnd Time Offset: ${track.segment.endTimeOffset.seconds}.${track.segment.endTimeOffset.nanos}`
      );
      console.log(`\tConfidence: ${track.confidence}`);

      // The object with timestamp and attributes per frame in the track.
      for (const timestampedObject of track.timestampedObjects) {
        // Normalized Bounding box in a frame, where the object is located.
        const normalizedBoundingBox = timestampedObject.normalizedBoundingBox;
        console.log(`\n\t\tLeft: ${normalizedBoundingBox.left}`);
        console.log(`\t\tTop: ${normalizedBoundingBox.top}`);
        console.log(`\t\tRight: ${normalizedBoundingBox.right}`);
        console.log(`\t\tBottom: ${normalizedBoundingBox.bottom}`);

        // Optional. The attributes of the object in the bounding box.
        for (const attribute of timestampedObject.attributes) {
          console.log(`\n\t\t\tName: ${attribute.name}`);
          console.log(`\t\t\tConfidence: ${attribute.confidence}`);
          console.log(`\t\t\tValue: ${attribute.value}`);
        }
      }

      // Optional. Attributes in the track level.
      for (const trackAttribute of track.attributes) {
        console.log(`\n\t\tName: ${trackAttribute.name}`);
        console.log(`\t\tConfidence: ${trackAttribute.confidence}`);
        console.log(`\t\tValue: ${trackAttribute.value}`);
      }
    }

    // All video segments where the recognized logo appears.
    // There might be multiple instances of the same logo class appearing in one VideoSegment.
    for (const segment of logoRecognitionAnnotation.segments) {
      console.log(
        `\n\tStart Time Offset: ${segment.startTimeOffset.seconds}.${segment.startTimeOffset.nanos}`
      );
      console.log(
        `\tEnd Time Offset: ${segment.endTimeOffset.seconds}.${segment.endTimeOffset.nanos}`
      );
    }
  }
}

detectLogo();
Python
To authenticate to Video Intelligence, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
import io

from google.cloud import videointelligence


def detect_logo(local_file_path="path/to/your/video.mp4"):
    """Performs asynchronous video annotation for logo recognition on a local file."""
    client = videointelligence.VideoIntelligenceServiceClient()

    with io.open(local_file_path, "rb") as f:
        input_content = f.read()
    features = [videointelligence.Feature.LOGO_RECOGNITION]

    operation = client.annotate_video(
        request={"features": features, "input_content": input_content}
    )

    print("Waiting for operation to complete...")
    response = operation.result()

    # Get the first response, since we sent only one video.
    annotation_result = response.annotation_results[0]

    # Annotations for list of logos detected, tracked and recognized in video.
    for logo_recognition_annotation in annotation_result.logo_recognition_annotations:
        entity = logo_recognition_annotation.entity
        # Opaque entity ID. Some IDs may be available in [Google Knowledge Graph
        # Search API](https://developers.google.com/knowledge-graph/).
        print("Entity Id : {}".format(entity.entity_id))
        print("Description : {}".format(entity.description))

        # All logo tracks where the recognized logo appears. Each track corresponds
        # to one logo instance appearing in consecutive frames.
        for track in logo_recognition_annotation.tracks:
            # Video segment of a track.
            print(
                "\n\tStart Time Offset : {}.{}".format(
                    track.segment.start_time_offset.seconds,
                    track.segment.start_time_offset.microseconds * 1000,
                )
            )
            print(
                "\tEnd Time Offset : {}.{}".format(
                    track.segment.end_time_offset.seconds,
                    track.segment.end_time_offset.microseconds * 1000,
                )
            )
            print("\tConfidence : {}".format(track.confidence))

            # The object with timestamp and attributes per frame in the track.
            for timestamped_object in track.timestamped_objects:
                # Normalized Bounding box in a frame, where the object is located.
                normalized_bounding_box = timestamped_object.normalized_bounding_box
                print("\n\t\tLeft : {}".format(normalized_bounding_box.left))
                print("\t\tTop : {}".format(normalized_bounding_box.top))
                print("\t\tRight : {}".format(normalized_bounding_box.right))
                print("\t\tBottom : {}".format(normalized_bounding_box.bottom))

                # Optional. The attributes of the object in the bounding box.
                for attribute in timestamped_object.attributes:
                    print("\n\t\t\tName : {}".format(attribute.name))
                    print("\t\t\tConfidence : {}".format(attribute.confidence))
                    print("\t\t\tValue : {}".format(attribute.value))

            # Optional. Attributes in the track level.
            for track_attribute in track.attributes:
                print("\n\t\tName : {}".format(track_attribute.name))
                print("\t\tConfidence : {}".format(track_attribute.confidence))
                print("\t\tValue : {}".format(track_attribute.value))

        # All video segments where the recognized logo appears. There might be
        # multiple instances of the same logo class appearing in one VideoSegment.
        for segment in logo_recognition_annotation.segments:
            print(
                "\n\tStart Time Offset : {}.{}".format(
                    segment.start_time_offset.seconds,
                    segment.start_time_offset.microseconds * 1000,
                )
            )
            print(
                "\tEnd Time Offset : {}.{}".format(
                    segment.end_time_offset.seconds,
                    segment.end_time_offset.microseconds * 1000,
                )
            )
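The sample prints each time offset as two unpadded numbers joined by a dot, so an offset of 1 second plus 50 milliseconds appears as 1.50000000, which is easy to misread as 1.5 seconds. The sketch below shows one possible way to post-process the values. It assumes, as the `.seconds` and `.microseconds` access in the sample suggests, that offsets arrive as `datetime.timedelta` values, and that bounding-box edges are fractions of the frame size in the range 0 to 1. The helper names `format_offset` and `to_pixel_box` and the `frame_width` and `frame_height` parameters are hypothetical and not part of the client library.

```python
from datetime import timedelta


def format_offset(offset: timedelta) -> str:
    """Render an offset as whole seconds plus zero-padded microseconds."""
    # Integer division by a one-microsecond timedelta avoids float rounding.
    total_micros = offset // timedelta(microseconds=1)
    seconds, micros = divmod(total_micros, 1_000_000)
    return f"{seconds}.{micros:06d}s"


def to_pixel_box(left, top, right, bottom, frame_width, frame_height):
    """Scale normalized box edges (assumed to be in [0, 1]) to pixel coordinates."""
    return (
        round(left * frame_width),
        round(top * frame_height),
        round(right * frame_width),
        round(bottom * frame_height),
    )


# Stand-in values; in practice these would come from a track segment and a
# timestamped object's normalized bounding box.
print(format_offset(timedelta(seconds=1, microseconds=50_000)))  # 1.050000s
print(to_pixel_box(0.25, 0.5, 0.75, 1.0, 1280, 720))  # (320, 360, 960, 720)
```

The frame dimensions are not part of the logo annotation fields printed by the sample, so this sketch expects the caller to supply them from the source video.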
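Additional languages
C#: Follow the C# setup instructions on the client libraries page, and then see the Video Intelligence reference documentation for .NET.
PHP: Follow the PHP setup instructions on the client libraries page, and then see the Video Intelligence reference documentation for PHP.
Ruby: Follow the Ruby setup instructions on the client libraries page, and then see the Video Intelligence reference documentation for Ruby.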