Object tracking tracks multiple objects detected in an input video. To make an object tracking request, call the annotate method and specify OBJECT_TRACKING in the features field.

An object tracking request annotates a video with labels (tags) for entities that are detected in the video or video segments provided. For example, a video of vehicles crossing a traffic signal might produce labels such as "car", "truck", "bike", "tires", "lights", "window", and so on. Each label can include a series of bounding boxes, with each bounding box having an associated time segment containing a time offset (timestamp) that indicates the duration offset from the beginning of the video. The annotation also contains additional entity information, including an entity ID that you can use to find more information about the entity in the Google Knowledge Graph Search API.
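To illustrate the shape of these annotations, the following is a minimal sketch (not part of the API or its client libraries) of how you might summarize parsed objectAnnotations entries: it converts each annotation's segment offsets (strings such as "2.311402s") to seconds and reports the entity, confidence, and tracked duration.

```python
def summarize_annotations(object_annotations):
    """Summarize a parsed objectAnnotations list: entity, confidence, duration."""
    def seconds(offset):
        # Time offsets are strings like "2.311402s"; strip the unit suffix.
        return float(offset.rstrip("s"))

    summaries = []
    for ann in object_annotations:
        start = seconds(ann["segment"]["startTimeOffset"])
        end = seconds(ann["segment"]["endTimeOffset"])
        summaries.append({
            "entity": ann["entity"]["description"],
            "entityId": ann["entity"]["entityId"],
            "confidence": ann["confidence"],
            "duration": end - start,       # seconds the object was tracked
            "frameCount": len(ann["frames"]),
        })
    return summaries
```

For the "car" annotation shown later in this page, this would report a duration of about 2.31 seconds.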
Object Tracking vs. Label Detection
Object tracking differs from label detection in that label detection provides labels without bounding boxes, while object tracking detects the presence of individual boxable objects in a given video along with the bounding box for each.
Request Object Tracking for a Video on Google Cloud Storage
The following samples demonstrate object tracking on a file located in Google Cloud Storage.
REST API
The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Google Cloud Platform Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence API Quickstart.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data "{
    'inputUri': 'gs://cloud-ml-sandbox/video/chicago.mp4',
    'features': ['OBJECT_TRACKING'],
  }" "https://videointelligence.googleapis.com/v1/videos:annotate"
If the request is successful, the Video Intelligence API returns the name of your operation. The following shows an example of such a response, where project-name is the name of your project and operation-id is the ID of the long-running operation created for the request.

{
  "name": "projects/project-name/locations/us-west1/operations/operation-id"
}
To retrieve the result of the operation, make a GET request using the operation name returned from the call to videos:annotate, as shown in the following example.

curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  https://videointelligence.googleapis.com/v1/operation-name
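Because the operation is long-running, you typically repeat this GET request until the response's done field is true. A minimal polling sketch, assuming a fetch_operation callable that performs the GET above and returns the parsed JSON (stubbed here; the real call would use your HTTP client of choice):

```python
import time

def wait_for_operation(fetch_operation, poll_interval=5.0, timeout=600.0):
    """Poll a long-running operation until its 'done' field is true.

    fetch_operation: a zero-argument callable that returns the operation
    resource as a dict (e.g. the parsed body of the GET request above).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        op = fetch_operation()
        if op.get("done"):
            return op
        time.sleep(poll_interval)
    raise TimeoutError("operation did not complete within the timeout")
```

You can also track progress between polls via metadata.annotationProgress[].progressPercent in each intermediate response.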
Object tracking annotations are returned as an objectAnnotations list.
{ "name": "projects/PROJECT_NAME/locations/us-west1/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress", "annotationProgress": [ { "inputUri": "/cloud-ml-sandbox/video/chicago.mp4", "progressPercent": 100, "startTime": "2018-06-21T16:56:46.755199Z", "updateTime": "2018-06-21T16:59:17.911197Z" } ] }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse", "annotationResults": [ { "inputUri": "/cloud-ml-sandbox/video/chicago.mp4", "objectAnnotations": [ { "entity": { "entityId": "/m/0k4j", "description": "car", "languageCode": "en-US" }, "frames": [ { "normalizedBoundingBox": { "left": 0.2672763, "top": 0.5677657, "right": 0.4388713, "bottom": 0.7623171 }, "timeOffset": "0s" }, { "normalizedBoundingBox": { "left": 0.26920167, "top": 0.5659805, "right": 0.44331276, "bottom": 0.76780635 }, "timeOffset": "0.100495s" }, ... { "normalizedBoundingBox": { "left": 0.83573246, "top": 0.6645812, "right": 1, "bottom": 0.99865407 }, "timeOffset": "2.311402s" } ], "segment": { "startTimeOffset": "0s", "endTimeOffset": "2.311402s" }, "confidence": 0.99488896 }, ... { "entity": { "entityId": "/m/0cgh4", "description": "building", "languageCode": "en-US" }, "frames": [ { "normalizedBoundingBox": { "left": 0.12340179, "top": 0.010383379, "right": 0.21914443, "bottom": 0.5591795 }, "timeOffset": "0s" }, { "normalizedBoundingBox": { "left": 0.12340179, "top": 0.009684974, "right": 0.22915152, "bottom": 0.56070584 }, "timeOffset": "0.100495s" }, ... { "normalizedBoundingBox": { "left": 0.12340179, "top": 0.008624528, "right": 0.22723165, "bottom": 0.56158626 }, "timeOffset": "0.401983s" } ], "segment": { "startTimeOffset": "0s", "endTimeOffset": "0.401983s" }, "confidence": 0.33914912 }, ... 
{ "entity": { "entityId": "/m/0cgh4", "description": "building", "languageCode": "en-US" }, "frames": [ { "normalizedBoundingBox": { "left": 0.79324204, "top": 0.0006896425, "right": 0.99659824, "bottom": 0.5324423 }, "timeOffset": "37.585421s" }, { "normalizedBoundingBox": { "left": 0.78935236, "top": 0.0011992548, "right": 0.99659824, "bottom": 0.5374946 }, "timeOffset": "37.685917s" }, ... { "normalizedBoundingBox": { "left": 0.79404694, "right": 0.99659824, "bottom": 0.5280966 }, "timeOffset": "38.590379s" } ], "segment": { "startTimeOffset": "37.585421s", "endTimeOffset": "38.590379s" }, "confidence": 0.3415429 } ] } ] } }
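The normalizedBoundingBox coordinates in the response are fractions of the frame dimensions, in the range [0, 1]. To draw the boxes, you convert them to pixels using the video's frame size, which the response does not include; the width and height below are assumptions you must supply. A small sketch:

```python
def to_pixels(box, frame_width, frame_height):
    """Convert a normalizedBoundingBox (fractions of the frame size,
    each in [0, 1]) to integer pixel coordinates."""
    return {
        "left": round(box["left"] * frame_width),
        "top": round(box["top"] * frame_height),
        "right": round(box["right"] * frame_width),
        "bottom": round(box["bottom"] * frame_height),
    }
```

For example, for a 1280x720 frame, a box with left 0.25, top 0.5, right 0.5, bottom 1.0 maps to pixel coordinates (320, 360) to (640, 720).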
Request Object Tracking for Video from a Local File
The following samples demonstrate object tracking on a file stored locally.
REST API
To perform annotation on a local video file, base64-encode the contents of the video file and include the base64-encoded contents in the inputContent field of the request. For information on how to base64-encode the contents of a video file, see Base64 Encoding.
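As one way to produce that value, the following sketch reads a local video file and returns its base64-encoded contents as text suitable for the inputContent field (the file path is an assumption; any readable video file works):

```python
import base64

def encode_video(path):
    """Read a file's bytes and return its base64-encoded contents as ASCII text."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

Note that base64 encoding grows the payload by roughly a third, so this approach suits short clips; for larger videos, upload to Google Cloud Storage and use inputUri instead.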
The following shows how to send a POST request to the videos:annotate method. The example uses the access token for a service account set up for the project using the Google Cloud Platform Cloud SDK. For instructions on installing the Cloud SDK, setting up a project with a service account, and obtaining an access token, see the Video Intelligence API Quickstart.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data '{
    "inputContent": "UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA...",
    "features": ["OBJECT_TRACKING"],
  }' "https://videointelligence.googleapis.com/v1/videos:annotate"
If the request is successful, the Video Intelligence API returns the name of your operation. The following shows an example of such a response, where project-name is the name of your project and operation-id is the ID of the long-running operation created for the request.

{
  "name": "projects/project-name/locations/us-west1/operations/operation-id"
}
To retrieve the result of the operation, make a GET request using the operation name returned from the call to videos:annotate, as shown in the following example.

curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  https://videointelligence.googleapis.com/v1/operation-name
Object tracking annotations are returned as an objectAnnotations list.
{ "name": "projects/PROJECT_NAME/locations/us-west1/operations/OPERATION_ID", "metadata": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress", "annotationProgress": [ { "inputContent": "UklGRg41AwBBVkkgTElTVAwBAABoZHJsYXZpaDgAAAA1ggAAxPMBAAAAAAAQCAA...", "progressPercent": 100, "startTime": "2018-06-21T16:56:46.755199Z", "updateTime": "2018-06-21T16:59:17.911197Z" } ] }, "done": true, "response": { "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse", "annotationResults": [ { "inputUri": "/cloud-ml-sandbox/video/chicago.mp4", "objectAnnotations": [ { "entity": { "entityId": "/m/0k4j", "description": "car", "languageCode": "en-US" }, "frames": [ { "normalizedBoundingBox": { "left": 0.2672763, "top": 0.5677657, "right": 0.4388713, "bottom": 0.7623171 }, "timeOffset": "0s" }, { "normalizedBoundingBox": { "left": 0.26920167, "top": 0.5659805, "right": 0.44331276, "bottom": 0.76780635 }, "timeOffset": "0.100495s" }, ... { "normalizedBoundingBox": { "left": 0.83573246, "top": 0.6645812, "right": 1, "bottom": 0.99865407 }, "timeOffset": "2.311402s" } ], "segment": { "startTimeOffset": "0s", "endTimeOffset": "2.311402s" }, "confidence": 0.99488896 }, ... { "entity": { "entityId": "/m/0cgh4", "description": "building", "languageCode": "en-US" }, "frames": [ { "normalizedBoundingBox": { "left": 0.12340179, "top": 0.010383379, "right": 0.21914443, "bottom": 0.5591795 }, "timeOffset": "0s" }, { "normalizedBoundingBox": { "left": 0.12340179, "top": 0.009684974, "right": 0.22915152, "bottom": 0.56070584 }, "timeOffset": "0.100495s" }, ... { "normalizedBoundingBox": { "left": 0.12340179, "top": 0.008624528, "right": 0.22723165, "bottom": 0.56158626 }, "timeOffset": "0.401983s" } ], "segment": { "startTimeOffset": "0s", "endTimeOffset": "0.401983s" }, "confidence": 0.33914912 }, ... 
{ "entity": { "entityId": "/m/0cgh4", "description": "building", "languageCode": "en-US" }, "frames": [ { "normalizedBoundingBox": { "left": 0.79324204, "top": 0.0006896425, "right": 0.99659824, "bottom": 0.5324423 }, "timeOffset": "37.585421s" }, { "normalizedBoundingBox": { "left": 0.78935236, "top": 0.0011992548, "right": 0.99659824, "bottom": 0.5374946 }, "timeOffset": "37.685917s" }, ... { "normalizedBoundingBox": { "left": 0.79404694, "right": 0.99659824, "bottom": 0.5280966 }, "timeOffset": "38.590379s" } ], "segment": { "startTimeOffset": "37.585421s", "endTimeOffset": "38.590379s" }, "confidence": 0.3415429 } ] } ] } }