Video Intelligence API 可以使用 LABEL_DETECTION 功能来识别视频片段中出现的实体。此功能可识别对象、位置、活动、动物物种、产品等。
分析可以按如下方式划分:
- 帧级别:
实体在每一帧中被标识和标记(每秒采样一帧)。 - 镜头级别:
系统会在每个片段(或视频)中自动检测镜头。然后在每个镜头中标识实体并为其添加标签。 - 片段级别:
用户选择的视频片段可以通过为注解指定开始和结束的时间偏移值来指定分析(请参阅 VideoSegment)。然后在每个片段中标识实体并为其添加标签。如果未指定片段,则整个视频将被视为一个片段。
为本地文件添加注释
以下示例展示如何对本地文件执行视频分析以获取标签。
想要更深入了解其他内容?请查看我们的详细 Python 教程。
REST
发送处理请求
下面演示了如何向 videos:annotate
方法发送 POST
请求。您可以将 LabelDetectionMode
配置为镜头级和/或帧级注释。我们建议使用 SHOT_AND_FRAME_MODE
。该示例使用通过 Google Cloud CLI 为项目设置的服务帐号的访问令牌。如需了解有关安装 Google Cloud CLI、使用服务帐号设置项目以及获取访问令牌的说明,请参阅 Video Intelligence 快速入门。
在使用任何请求数据之前,请先进行以下替换:
- BASE64_ENCODED_CONTENT:您的视频以 base64 编码数据。请参阅有关如何将数据转换为 base64 的说明。
- PROJECT_NUMBER:您的 Google Cloud 项目的数字标识符
HTTP 方法和网址:
POST https://videointelligence.googleapis.com/v1/videos:annotate
请求 JSON 正文:
{ "inputContent": "BASE64_ENCODED_CONTENT", "features": ["LABEL_DETECTION"], }
如需发送您的请求,请展开以下选项之一:
curl(Linux、macOS 或 Cloud Shell)
将请求正文保存在名为 request.json
的文件中,然后执行以下命令:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://videointelligence.googleapis.com/v1/videos:annotate"
PowerShell (Windows)
将请求正文保存在名为 request.json
的文件中,然后执行以下命令:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
您应该收到类似以下内容的 JSON 响应:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID" }
如果请求成功,Video Intelligence 将返回操作的名称。
获取结果
如需获取请求的结果,您必须向 projects.locations.operations
资源发送 GET
请求。下面演示了如何发送此类请求。
在使用任何请求数据之前,请先进行以下替换:
- OPERATION_NAME:Video Intelligence API 返回的操作名称。操作名称采用
projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
格式 - PROJECT_NUMBER:您的 Google Cloud 项目的数字标识符
HTTP 方法和网址:
GET https://videointelligence.googleapis.com/v1/OPERATION_NAME
如需发送您的请求,请展开以下选项之一:
curl(Linux、macOS 或 Cloud Shell)
执行以下命令:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
"https://videointelligence.googleapis.com/v1/OPERATION_NAME"
PowerShell (Windows)
执行以下命令:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content
您应该收到类似以下内容的 JSON 响应:
响应
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress", "annotationProgress": [ { "progressPercent": 100, "startTime": "2019-03-12T19:36:09.110351Z", "updateTime": "2019-03-12T19:36:17.519069Z" } ] }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse", "annotationResults": [ { "segmentLabelAnnotations": [ { "entity": { "entityId": "/m/01prls", "description": "land vehicle", "languageCode": "en-US" }, "categoryEntities": [ { "entityId": "/m/07yv9", "description": "vehicle", "languageCode": "en-US" } ], "segments": [ { "segment": { "startTimeOffset": "0s", "endTimeOffset": "38.757872s" }, "confidence": 0.6614419 } ] }, { "entity": { "entityId": "/m/039jbq", "description": "urban area", "languageCode": "en-US" }, "categoryEntities": [ { "entityId": "/m/01n32", "description": "city", "languageCode": "en-US" } ], "segments": [ { "segment": { "startTimeOffset": "0s", "endTimeOffset": "38.757872s" }, "confidence": 0.92337775 } ] }, ...
Go
func label(w io.Writer, file string) error {
ctx := context.Background()
client, err := video.NewClient(ctx)
if err != nil {
return fmt.Errorf("video.NewClient: %w", err)
}
defer client.Close()
fileBytes, err := ioutil.ReadFile(file)
if err != nil {
return err
}
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
Features: []videopb.Feature{
videopb.Feature_LABEL_DETECTION,
},
InputContent: fileBytes,
})
if err != nil {
return fmt.Errorf("AnnotateVideo: %w", err)
}
resp, err := op.Wait(ctx)
if err != nil {
return fmt.Errorf("Wait: %w", err)
}
printLabels := func(labels []*videopb.LabelAnnotation) {
for _, label := range labels {
fmt.Fprintf(w, "\tDescription: %s\n", label.Entity.Description)
for _, category := range label.CategoryEntities {
fmt.Fprintf(w, "\t\tCategory: %s\n", category.Description)
}
for _, segment := range label.Segments {
start, _ := ptypes.Duration(segment.Segment.StartTimeOffset)
end, _ := ptypes.Duration(segment.Segment.EndTimeOffset)
fmt.Fprintf(w, "\t\tSegment: %s to %s\n", start, end)
}
}
}
// A single video was processed. Get the first result.
result := resp.AnnotationResults[0]
fmt.Fprintln(w, "SegmentLabelAnnotations:")
printLabels(result.SegmentLabelAnnotations)
fmt.Fprintln(w, "ShotLabelAnnotations:")
printLabels(result.ShotLabelAnnotations)
fmt.Fprintln(w, "FrameLabelAnnotations:")
printLabels(result.FrameLabelAnnotations)
return nil
}
Java
// Instantiate a com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Read file and encode into Base64
Path path = Paths.get(filePath);
byte[] data = Files.readAllBytes(path);
AnnotateVideoRequest request =
AnnotateVideoRequest.newBuilder()
.setInputContent(ByteString.copyFrom(data))
.addFeatures(Feature.LABEL_DETECTION)
.build();
// Create an operation that will contain the response when the operation completes.
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> response =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
for (VideoAnnotationResults results : response.get().getAnnotationResultsList()) {
// process video / segment level label annotations
System.out.println("Locations: ");
for (LabelAnnotation labelAnnotation : results.getSegmentLabelAnnotationsList()) {
System.out.println("Video label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Video label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime =
segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime =
segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.2f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process shot label annotations
for (LabelAnnotation labelAnnotation : results.getShotLabelAnnotationsList()) {
System.out.println("Shot label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Shot label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime =
segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime =
segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.2f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process frame label annotations
for (LabelAnnotation labelAnnotation : results.getFrameLabelAnnotationsList()) {
System.out.println("Frame label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Frame label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime =
segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime =
segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.2f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
}
}
Node.js
// Imports the Google Cloud Video Intelligence library + Node's fs library
const video = require('@google-cloud/video-intelligence').v1;
const fs = require('fs');
const util = require('util');
// Creates a client
const client = new video.VideoIntelligenceServiceClient();
/**
* TODO(developer): Uncomment the following line before running the sample.
*/
// const path = 'Local file to analyze, e.g. ./my-file.mp4';
// Reads a local video file and converts it to base64
const readFile = util.promisify(fs.readFile);
const file = await readFile(path);
const inputContent = file.toString('base64');
// Constructs request
const request = {
inputContent: inputContent,
features: ['LABEL_DETECTION'],
};
// Detects labels in a video
const [operation] = await client.annotateVideo(request);
console.log('Waiting for operation to complete...');
const [operationResult] = await operation.promise();
// Gets annotations for video
const annotations = operationResult.annotationResults[0];
const labels = annotations.segmentLabelAnnotations;
labels.forEach(label => {
console.log(`Label ${label.entity.description} occurs at:`);
label.segments.forEach(segment => {
const time = segment.segment;
if (time.startTimeOffset.seconds === undefined) {
time.startTimeOffset.seconds = 0;
}
if (time.startTimeOffset.nanos === undefined) {
time.startTimeOffset.nanos = 0;
}
if (time.endTimeOffset.seconds === undefined) {
time.endTimeOffset.seconds = 0;
}
if (time.endTimeOffset.nanos === undefined) {
time.endTimeOffset.nanos = 0;
}
console.log(
`\tStart: ${time.startTimeOffset.seconds}` +
`.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(
`\tEnd: ${time.endTimeOffset.seconds}.` +
`${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(`\tConfidence: ${segment.confidence}`);
});
});
Python
如需详细了解如何安装和使用 Python 版 Video Intelligence API 客户端库,请参阅 Video Intelligence API 客户端库。"""Detect labels given a file path."""
video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.LABEL_DETECTION]
with io.open(path, "rb") as movie:
input_content = movie.read()
operation = video_client.annotate_video(
request={"features": features, "input_content": input_content}
)
print("\nProcessing video for label annotations:")
result = operation.result(timeout=90)
print("\nFinished processing.")
# Process video/segment level label annotations
segment_labels = result.annotation_results[0].segment_label_annotations
for i, segment_label in enumerate(segment_labels):
print("Video label description: {}".format(segment_label.entity.description))
for category_entity in segment_label.category_entities:
print(
"\tLabel category description: {}".format(category_entity.description)
)
for i, segment in enumerate(segment_label.segments):
start_time = (
segment.segment.start_time_offset.seconds
+ segment.segment.start_time_offset.microseconds / 1e6
)
end_time = (
segment.segment.end_time_offset.seconds
+ segment.segment.end_time_offset.microseconds / 1e6
)
positions = "{}s to {}s".format(start_time, end_time)
confidence = segment.confidence
print("\tSegment {}: {}".format(i, positions))
print("\tConfidence: {}".format(confidence))
print("\n")
# Process shot level label annotations
shot_labels = result.annotation_results[0].shot_label_annotations
for i, shot_label in enumerate(shot_labels):
print("Shot label description: {}".format(shot_label.entity.description))
for category_entity in shot_label.category_entities:
print(
"\tLabel category description: {}".format(category_entity.description)
)
for i, shot in enumerate(shot_label.segments):
start_time = (
shot.segment.start_time_offset.seconds
+ shot.segment.start_time_offset.microseconds / 1e6
)
end_time = (
shot.segment.end_time_offset.seconds
+ shot.segment.end_time_offset.microseconds / 1e6
)
positions = "{}s to {}s".format(start_time, end_time)
confidence = shot.confidence
print("\tSegment {}: {}".format(i, positions))
print("\tConfidence: {}".format(confidence))
print("\n")
# Process frame level label annotations
frame_labels = result.annotation_results[0].frame_label_annotations
for i, frame_label in enumerate(frame_labels):
print("Frame label description: {}".format(frame_label.entity.description))
for category_entity in frame_label.category_entities:
print(
"\tLabel category description: {}".format(category_entity.description)
)
# Each frame_label_annotation has many frames,
# here we print information only about the first frame.
frame = frame_label.frames[0]
time_offset = frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
print("\tFirst frame time offset: {}s".format(time_offset))
print("\tFirst frame confidence: {}".format(frame.confidence))
print("\n")
其他语言
C#:请按照“客户端库”页面上的 C# 设置说明操作,然后访问 .NET 的 Video Intelligence 参考文档。
PHP:请按照客户端库页面上的 PHP 设置说明操作,然后访问 PHP 版 Video Intelligence 参考文档。
Ruby:请按照客户端库页面上的 Ruby 设置说明操作,然后访问 Ruby 版 Video Intelligence 参考文档。
为 Cloud Storage 上的文件添加注释
以下示例展示如何对 Cloud Storage 中的文件上的标签执行视频分析。
REST
如需详细了解如何安装和使用 Python 版 Video Intelligence API 客户端库,请参阅 Video Intelligence API 客户端库。发送处理请求
下面演示了如何向 annotate
方法发送 POST
请求。该示例使用通过 Google Cloud CLI 为项目设置的服务帐号的访问令牌。如需了解有关安装 Google Cloud CLI、使用服务帐号设置项目以及获取访问令牌的说明,请参阅 Video Intelligence 快速入门。
在使用任何请求数据之前,请先进行以下替换:
- INPUT_URI:包含要添加注释的文件的 Cloud Storage 存储桶(包括文件名)。必须以
gs://
开头。 - PROJECT_NUMBER:您的 Google Cloud 项目的数字标识符
HTTP 方法和网址:
POST https://videointelligence.googleapis.com/v1/videos:annotate
请求 JSON 正文:
{ "inputUri": "INPUT_URI", "features": ["LABEL_DETECTION"], }
如需发送您的请求,请展开以下选项之一:
curl(Linux、macOS 或 Cloud Shell)
将请求正文保存在名为 request.json
的文件中,然后执行以下命令:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://videointelligence.googleapis.com/v1/videos:annotate"
PowerShell (Windows)
将请求正文保存在名为 request.json
的文件中,然后执行以下命令:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content
您应该收到类似以下内容的 JSON 响应:
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID" }
如果请求成功,Video Intelligence 将返回操作的名称。
获取结果
如需获取请求的结果,您必须向 projects.locations.operations
资源发送 GET
请求。下面演示了如何发送此类请求。
在使用任何请求数据之前,请先进行以下替换:
- OPERATION_NAME:Video Intelligence API 返回的操作名称。操作名称采用
projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
格式 - PROJECT_NUMBER:您的 Google Cloud 项目的数字标识符
HTTP 方法和网址:
GET https://videointelligence.googleapis.com/v1/OPERATION_NAME
如需发送您的请求,请展开以下选项之一:
curl(Linux、macOS 或 Cloud Shell)
执行以下命令:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_NUMBER" \
"https://videointelligence.googleapis.com/v1/OPERATION_NAME"
PowerShell (Windows)
执行以下命令:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content
您应该收到类似以下内容的 JSON 响应:
响应
{ "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress", "annotationProgress": [ { "inputUri": "INPUT_URI", "progressPercent": 100, "startTime": "2019-11-13T19:25:56.206335Z", "updateTime": "2019-11-13T19:26:16.215615Z" } ] }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse", "annotationResults": [ { "inputUri": "INPUT_URI", "segmentLabelAnnotations": [ { "entity": { "entityId": "/m/01vkl", "description": "circle", "languageCode": "en-US" }, "categoryEntities": [ { "entityId": "/m/016nqd", "description": "shape", "languageCode": "en-US" } ], "segments": [ { "segment": { "startTimeOffset": "0s", "endTimeOffset": "16.416666s" }, "confidence": 0.36535457 } ] }, ... ] } ] } }
下载注解结果
将来源中的注解复制到目标存储桶(请参阅复制文件和对象):
gsutil cp gcs_uri gs://my-bucket
注意:如果输出 gcs uri 由用户提供,则注解存储在该 gcs uri 中。
Go
func labelURI(w io.Writer, file string) error {
ctx := context.Background()
client, err := video.NewClient(ctx)
if err != nil {
return fmt.Errorf("video.NewClient: %w", err)
}
defer client.Close()
op, err := client.AnnotateVideo(ctx, &videopb.AnnotateVideoRequest{
Features: []videopb.Feature{
videopb.Feature_LABEL_DETECTION,
},
InputUri: file,
})
if err != nil {
return fmt.Errorf("AnnotateVideo: %w", err)
}
resp, err := op.Wait(ctx)
if err != nil {
return fmt.Errorf("Wait: %w", err)
}
printLabels := func(labels []*videopb.LabelAnnotation) {
for _, label := range labels {
fmt.Fprintf(w, "\tDescription: %s\n", label.Entity.Description)
for _, category := range label.CategoryEntities {
fmt.Fprintf(w, "\t\tCategory: %s\n", category.Description)
}
for _, segment := range label.Segments {
start, _ := ptypes.Duration(segment.Segment.StartTimeOffset)
end, _ := ptypes.Duration(segment.Segment.EndTimeOffset)
fmt.Fprintf(w, "\t\tSegment: %s to %s\n", start, end)
}
}
}
// A single video was processed. Get the first result.
result := resp.AnnotationResults[0]
fmt.Fprintln(w, "SegmentLabelAnnotations:")
printLabels(result.SegmentLabelAnnotations)
fmt.Fprintln(w, "ShotLabelAnnotations:")
printLabels(result.ShotLabelAnnotations)
fmt.Fprintln(w, "FrameLabelAnnotations:")
printLabels(result.FrameLabelAnnotations)
return nil
}
Java
// Instantiate a com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient
try (VideoIntelligenceServiceClient client = VideoIntelligenceServiceClient.create()) {
// Provide path to file hosted on GCS as "gs://bucket-name/..."
AnnotateVideoRequest request =
AnnotateVideoRequest.newBuilder()
.setInputUri(gcsUri)
.addFeatures(Feature.LABEL_DETECTION)
.build();
// Create an operation that will contain the response when the operation completes.
OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> response =
client.annotateVideoAsync(request);
System.out.println("Waiting for operation to complete...");
for (VideoAnnotationResults results : response.get().getAnnotationResultsList()) {
// process video / segment level label annotations
System.out.println("Locations: ");
for (LabelAnnotation labelAnnotation : results.getSegmentLabelAnnotationsList()) {
System.out.println("Video label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Video label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime =
segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime =
segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process shot label annotations
for (LabelAnnotation labelAnnotation : results.getShotLabelAnnotationsList()) {
System.out.println("Shot label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Shot label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime =
segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime =
segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.3f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
// process frame label annotations
for (LabelAnnotation labelAnnotation : results.getFrameLabelAnnotationsList()) {
System.out.println("Frame label: " + labelAnnotation.getEntity().getDescription());
// categories
for (Entity categoryEntity : labelAnnotation.getCategoryEntitiesList()) {
System.out.println("Frame label category: " + categoryEntity.getDescription());
}
// segments
for (LabelSegment segment : labelAnnotation.getSegmentsList()) {
double startTime =
segment.getSegment().getStartTimeOffset().getSeconds()
+ segment.getSegment().getStartTimeOffset().getNanos() / 1e9;
double endTime =
segment.getSegment().getEndTimeOffset().getSeconds()
+ segment.getSegment().getEndTimeOffset().getNanos() / 1e9;
System.out.printf("Segment location: %.3f:%.2f\n", startTime, endTime);
System.out.println("Confidence: " + segment.getConfidence());
}
}
}
}
Node.js
// Imports the Google Cloud Video Intelligence library
const video = require('@google-cloud/video-intelligence').v1;
// Creates a client
const client = new video.VideoIntelligenceServiceClient();
/**
* TODO(developer): Uncomment the following line before running the sample.
*/
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';
const request = {
inputUri: gcsUri,
features: ['LABEL_DETECTION'],
};
// Detects labels in a video
const [operation] = await client.annotateVideo(request);
console.log('Waiting for operation to complete...');
const [operationResult] = await operation.promise();
// Gets annotations for video
const annotations = operationResult.annotationResults[0];
const labels = annotations.segmentLabelAnnotations;
labels.forEach(label => {
console.log(`Label ${label.entity.description} occurs at:`);
label.segments.forEach(segment => {
const time = segment.segment;
if (time.startTimeOffset.seconds === undefined) {
time.startTimeOffset.seconds = 0;
}
if (time.startTimeOffset.nanos === undefined) {
time.startTimeOffset.nanos = 0;
}
if (time.endTimeOffset.seconds === undefined) {
time.endTimeOffset.seconds = 0;
}
if (time.endTimeOffset.nanos === undefined) {
time.endTimeOffset.nanos = 0;
}
console.log(
`\tStart: ${time.startTimeOffset.seconds}` +
`.${(time.startTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(
`\tEnd: ${time.endTimeOffset.seconds}.` +
`${(time.endTimeOffset.nanos / 1e6).toFixed(0)}s`
);
console.log(`\tConfidence: ${segment.confidence}`);
});
});
Python
"""Detects labels given a GCS path."""
video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.LABEL_DETECTION]
mode = videointelligence.LabelDetectionMode.SHOT_AND_FRAME_MODE
config = videointelligence.LabelDetectionConfig(label_detection_mode=mode)
context = videointelligence.VideoContext(label_detection_config=config)
operation = video_client.annotate_video(
request={"features": features, "input_uri": path, "video_context": context}
)
print("\nProcessing video for label annotations:")
result = operation.result(timeout=180)
print("\nFinished processing.")
# Process video/segment level label annotations
segment_labels = result.annotation_results[0].segment_label_annotations
for i, segment_label in enumerate(segment_labels):
print("Video label description: {}".format(segment_label.entity.description))
for category_entity in segment_label.category_entities:
print(
"\tLabel category description: {}".format(category_entity.description)
)
for i, segment in enumerate(segment_label.segments):
start_time = (
segment.segment.start_time_offset.seconds
+ segment.segment.start_time_offset.microseconds / 1e6
)
end_time = (
segment.segment.end_time_offset.seconds
+ segment.segment.end_time_offset.microseconds / 1e6
)
positions = "{}s to {}s".format(start_time, end_time)
confidence = segment.confidence
print("\tSegment {}: {}".format(i, positions))
print("\tConfidence: {}".format(confidence))
print("\n")
# Process shot level label annotations
shot_labels = result.annotation_results[0].shot_label_annotations
for i, shot_label in enumerate(shot_labels):
print("Shot label description: {}".format(shot_label.entity.description))
for category_entity in shot_label.category_entities:
print(
"\tLabel category description: {}".format(category_entity.description)
)
for i, shot in enumerate(shot_label.segments):
start_time = (
shot.segment.start_time_offset.seconds
+ shot.segment.start_time_offset.microseconds / 1e6
)
end_time = (
shot.segment.end_time_offset.seconds
+ shot.segment.end_time_offset.microseconds / 1e6
)
positions = "{}s to {}s".format(start_time, end_time)
confidence = shot.confidence
print("\tSegment {}: {}".format(i, positions))
print("\tConfidence: {}".format(confidence))
print("\n")
# Process frame level label annotations
frame_labels = result.annotation_results[0].frame_label_annotations
for i, frame_label in enumerate(frame_labels):
print("Frame label description: {}".format(frame_label.entity.description))
for category_entity in frame_label.category_entities:
print(
"\tLabel category description: {}".format(category_entity.description)
)
# Each frame_label_annotation has many frames,
# here we print information only about the first frame.
frame = frame_label.frames[0]
time_offset = frame.time_offset.seconds + frame.time_offset.microseconds / 1e6
print("\tFirst frame time offset: {}s".format(time_offset))
print("\tFirst frame confidence: {}".format(frame.confidence))
print("\n")
其他语言
C#:请按照“客户端库”页面上的 C# 设置说明操作,然后访问 .NET 的 Video Intelligence 参考文档。
PHP:请按照客户端库页面上的 PHP 设置说明操作,然后访问 PHP 版 Video Intelligence 参考文档。
Ruby:请按照客户端库页面上的 Ruby 设置说明操作,然后访问 Ruby 版 Video Intelligence 参考文档。