The Vision API can detect and extract multiple objects in an image with Object Localization. Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each one. Each LocalizedObjectAnnotation contains information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object.
Object localization identifies both significant and less-prominent objects in an image.
Object information is returned in English only. The Cloud Translation API can translate English labels into a number of other languages.

For example, the API might return the following information and bounding location data for the objects in the image above:
Name | mid | Score | Bounds |
---|---|---|---|
Bicycle wheel | /m/01bqk0 | 0.89648587 | (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065) |
Bicycle | /m/0199g | 0.886761 | (0.312, 0.6616471), (0.638353, 0.6616471), (0.638353, 0.9705882), (0.312, 0.9705882) |
Bicycle wheel | /m/01bqk0 | 0.6345275 | (0.5125398, 0.760708), (0.6256646, 0.760708), (0.6256646, 0.94601655), (0.5125398, 0.94601655) |
Picture frame | /m/06z37_ | 0.6207608 | (0.79177403, 0.16160682), (0.97047985, 0.16160682), (0.97047985, 0.31348917), (0.79177403, 0.31348917) |
Tire | /m/0h9mv | 0.55886006 | (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065) |
Door | /m/02dgv | 0.5160098 | (0.77569866, 0.37104446), (0.9412425, 0.37104446), (0.9412425, 0.81507325), (0.77569866, 0.81507325) |
mid contains a machine-generated identifier (MID) corresponding to a label's Google Knowledge Graph entry. For information on inspecting mid values, see the Google Knowledge Graph Search API documentation.
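The bounds in the table above are expressed as fractions of the image width and height (values between 0 and 1), so they usually need to be scaled before drawing or cropping. The following is a minimal Python sketch of that conversion. The function name to_pixel_box and the 1000 x 500 image size are hypothetical and chosen only for illustration; the vertices are the ones listed for the Bicycle row.

```python
def to_pixel_box(normalized_vertices, width, height):
    """Convert normalized (x, y) vertices to a pixel box (left, top, right, bottom)."""
    xs = [x * width for x, _ in normalized_vertices]
    ys = [y * height for _, y in normalized_vertices]
    # Take the extremes so the result is a box even if the vertex order varies.
    return (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))


# Vertices copied from the "Bicycle" row of the table.
bicycle = [
    (0.312, 0.6616471),
    (0.638353, 0.6616471),
    (0.638353, 0.9705882),
    (0.312, 0.9705882),
]

# Hypothetical image size (assumption): 1000 pixels wide, 500 pixels high.
box = to_pixel_box(bicycle, width=1000, height=500)
print(box)
```

Taking the minimum and maximum of each coordinate, rather than relying on the first and third vertex, keeps the helper independent of the order in which vertices are returned.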
Object Localization requests
Set up your GCP project and authentication
Detect objects in a local image
The Vision API can perform feature detection on a local image file by sending the contents of the image file as a base64 encoded string in the body of your request.
REST & CMD LINE
Before using any of the request data below, make the following replacements:
- base64-encoded-image: The base64 representation (ASCII string) of your binary image data. This string should look similar to the following string:
/9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
HTTP method and URL:
POST https://vision.googleapis.com/v1/images:annotate
Request JSON body:
{
  "requests": [
    {
      "image": {
        "content": "base64-encoded-image"
      },
      "features": [
        {
          "maxResults": 10,
          "type": "OBJECT_LOCALIZATION"
        }
      ]
    }
  ]
}
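As a sketch of how this request body might be produced from a local file, the following Python snippet base64-encodes the image bytes and fills in the same JSON structure. The helper names build_local_request and write_request_file are hypothetical, not part of any Google library, and only the standard library is used.

```python
import base64
import json


def build_local_request(image_bytes, max_results=10):
    """Return the request JSON (as text) for raw image bytes."""
    # The API expects the image content as a base64-encoded ASCII string.
    content = base64.b64encode(image_bytes).decode("ascii")
    body = {
        "requests": [
            {
                "image": {"content": content},
                "features": [
                    {"maxResults": max_results, "type": "OBJECT_LOCALIZATION"}
                ],
            }
        ]
    }
    return json.dumps(body, indent=2)


def write_request_file(image_path, out_path="request.json"):
    """Read a local image and save the request body to out_path."""
    with open(image_path, "rb") as image_file:
        request_text = build_local_request(image_file.read())
    with open(out_path, "w", encoding="utf-8") as out_file:
        out_file.write(request_text)
```

Calling write_request_file with the path to a local image (for example, a hypothetical bicycle.jpg) would produce a request.json file suitable for the commands that follow.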
To send your request, choose one of these options:
curl
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://vision.googleapis.com/v1/images:annotate
PowerShell
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.
Response:
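The response carries one LocalizedObjectAnnotation per detected object. The following Python sketch shows one way such a response might be read. It assumes the JSON uses the field names localizedObjectAnnotations, name, mid, score, and boundingPoly.normalizedVertices, and that a coordinate equal to zero may be omitted from a vertex. The sample payload is constructed from two rows of the table earlier on this page and is not captured server output; summarize_objects is a hypothetical helper.

```python
import json


def summarize_objects(response_text, min_score=0.0):
    """Return (name, mid, score, vertices) tuples for objects at or above min_score."""
    data = json.loads(response_text)
    results = []
    for response in data.get("responses", []):
        for obj in response.get("localizedObjectAnnotations", []):
            if obj.get("score", 0.0) < min_score:
                continue
            poly = obj.get("boundingPoly", {})
            # Assumption: a missing x or y means the coordinate is 0.
            vertices = [
                (v.get("x", 0.0), v.get("y", 0.0))
                for v in poly.get("normalizedVertices", [])
            ]
            results.append((obj.get("name"), obj.get("mid"), obj.get("score"), vertices))
    return results


# Illustrative payload assembled from the "Bicycle" and "Door" table rows.
SAMPLE = json.dumps({
    "responses": [{
        "localizedObjectAnnotations": [
            {"mid": "/m/0199g", "name": "Bicycle", "score": 0.886761,
             "boundingPoly": {"normalizedVertices": [
                 {"x": 0.312, "y": 0.6616471}, {"x": 0.638353, "y": 0.6616471},
                 {"x": 0.638353, "y": 0.9705882}, {"x": 0.312, "y": 0.9705882}]}},
            {"mid": "/m/02dgv", "name": "Door", "score": 0.5160098,
             "boundingPoly": {"normalizedVertices": [
                 {"x": 0.77569866, "y": 0.37104446}, {"x": 0.9412425, "y": 0.37104446},
                 {"x": 0.9412425, "y": 0.81507325}, {"x": 0.77569866, "y": 0.81507325}]}},
        ]
    }]
})

for name, mid, score, vertices in summarize_objects(SAMPLE, min_score=0.6):
    print(name, mid, score, vertices[0])
```

The min_score filter is optional; it is included because lower-scoring results in the table, such as Tire, can share bounds with a higher-scoring object.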
C#
Before trying this sample, follow the C# setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision C# API reference documentation.
Go
Before trying this sample, follow the Go setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Go API reference documentation.
Java
Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Node.js API reference documentation.
PHP
Before trying this sample, follow the PHP setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision PHP API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Python API reference documentation.
Ruby
Before trying this sample, follow the Ruby setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Ruby API reference documentation.
Detect objects in a remote image
For your convenience, the Vision API can perform feature detection directly on an image file located in Google Cloud Storage or on the Web without the need to send the contents of the image file in the body of your request.
REST & CMD LINE
Before using any of the request data below, make the following replacements:
- cloud-storage-image-uri: the path to a valid image file in a Cloud Storage bucket. You must have at least read privileges to the file. Example:
https://cloud.google.com/vision/docs/images/bicycle_example.png
HTTP method and URL:
POST https://vision.googleapis.com/v1/images:annotate
Request JSON body:
{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "cloud-storage-image-uri"
        }
      },
      "features": [
        {
          "maxResults": 10,
          "type": "OBJECT_LOCALIZATION"
        }
      ]
    }
  ]
}
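The same body can be generated programmatically. The following Python sketch builds it for an image URI and rejects anything that is neither a Cloud Storage (gs://) nor a web (http:// or https://) location. The helper name build_remote_request is hypothetical, and the scheme check is an illustrative safeguard rather than documented API behavior.

```python
import json
from urllib.parse import urlparse


def build_remote_request(image_uri, max_results=10):
    """Return the request JSON (as text) for an image referenced by URI."""
    scheme = urlparse(image_uri).scheme
    if scheme not in ("gs", "http", "https"):
        raise ValueError(f"unsupported image URI scheme: {scheme!r}")
    body = {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"maxResults": max_results, "type": "OBJECT_LOCALIZATION"}
                ],
            }
        ]
    }
    return json.dumps(body, indent=2)


# The example image referenced on this page.
print(build_remote_request(
    "https://cloud.google.com/vision/docs/images/bicycle_example.png"
))
```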
To send your request, choose one of these options:
curl
Save the request body in a file called request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://vision.googleapis.com/v1/images:annotate
PowerShell
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content
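The request can also be sent from a script. The following is a minimal Python sketch that mirrors the curl command using only the standard library; it is not an official client sample. It assumes the gcloud CLI is installed and configured so that gcloud auth application-default print-access-token returns a token. The function names get_access_token, make_request, and annotate are hypothetical.

```python
import json
import subprocess
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def get_access_token():
    """Ask the gcloud CLI for an application-default access token."""
    result = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def make_request(body_text, token):
    """Build the POST request with the same headers as the curl example."""
    return urllib.request.Request(
        ENDPOINT,
        data=body_text.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def annotate(body_text):
    """Send the request and return the parsed JSON response."""
    request = make_request(body_text, get_access_token())
    # urlopen raises urllib.error.HTTPError for non-2xx status codes.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, annotate(open("request.json", encoding="utf-8").read()) would send the saved request body. Keeping make_request separate from the network call makes the headers easy to inspect without contacting the service.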
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.
Response:
C#
Before trying this sample, follow the C# setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision C# API reference documentation.
Go
Before trying this sample, follow the Go setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Go API reference documentation.
Java
Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java API reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Node.js API reference documentation.
PHP
Before trying this sample, follow the PHP setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision PHP API reference documentation.
Python
Before trying this sample, follow the Python setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Python API reference documentation.
Ruby
Before trying this sample, follow the Ruby setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Ruby API reference documentation.
gcloud
To detect objects in an image, use the gcloud ml vision detect-objects command as shown in the following example:
gcloud ml vision detect-objects https://cloud.google.com/vision/docs/images/bicycle_example.png
