The Vision API can detect and extract multiple objects in an image with Object Localization.
Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each object in the image. Each LocalizedObjectAnnotation identifies information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object.
Object localization identifies both significant and less-prominent objects in an image.
Object information is returned in English only. Cloud Translation can translate English labels into various other languages.
For example, the API returns the following information and bounding location data for the objects in the preceding image:
Name | mid | Score | Bounds |
---|---|---|---|
Bicycle wheel | /m/01bqk0 | 0.89648587 | (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065) |
Bicycle | /m/0199g | 0.886761 | (0.312, 0.6616471), (0.638353, 0.6616471), (0.638353, 0.9705882), (0.312, 0.9705882) |
Bicycle wheel | /m/01bqk0 | 0.6345275 | (0.5125398, 0.760708), (0.6256646, 0.760708), (0.6256646, 0.94601655), (0.5125398, 0.94601655) |
Picture frame | /m/06z37_ | 0.6207608 | (0.79177403, 0.16160682), (0.97047985, 0.16160682), (0.97047985, 0.31348917), (0.79177403, 0.31348917) |
Tire | /m/0h9mv | 0.55886006 | (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065) |
Door | /m/02dgv | 0.5160098 | (0.77569866, 0.37104446), (0.9412425, 0.37104446), (0.9412425, 0.81507325), (0.77569866, 0.81507325) |
mid contains a machine-generated identifier (MID) corresponding to a label's Google Knowledge Graph entry. For information on inspecting mid values, see the Google Knowledge Graph Search API documentation.
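The values in the Bounds column all fall between 0 and 1, which suggests they are normalized to the image dimensions. As a minimal sketch under that assumption, the following converts one row's vertices into a pixel box. The helper name `to_pixel_box` and the 1000x800 image size are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vertex = Tuple[float, float]


def to_pixel_box(vertices: List[Vertex], width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert normalized (x, y) vertices to a (left, top, right, bottom) pixel box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Scale the extreme coordinates by the image size and round to whole pixels.
    left = round(min(xs) * width)
    top = round(min(ys) * height)
    right = round(max(xs) * width)
    bottom = round(max(ys) * height)
    return left, top, right, bottom


# "Bicycle" row from the table above; the image size is an assumption.
bicycle = [
    (0.312, 0.6616471),
    (0.638353, 0.6616471),
    (0.638353, 0.9705882),
    (0.312, 0.9705882),
]
print(to_pixel_box(bicycle, width=1000, height=800))
```

A box in this form can be passed to most image libraries for cropping or drawing, provided the width and height match the image that was actually sent to the API.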
Object Localization requests
Set up your Google Cloud project and authentication
Detect objects in a local image
You can use the Vision API to perform feature detection on a local image file.
For REST requests, send the contents of the image file as a base64 encoded string in the body of your request.
For gcloud and client library requests, specify the path to a local image in your request.
REST
Before using any of the request data, make the following replacements:
- BASE64_ENCODED_IMAGE: The base64 representation (ASCII string) of your binary image data. This string should look similar to the following string:
  /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
- RESULTS_INT: (Optional) An integer value of results to return. If you omit the "maxResults" field and its value, the API returns the default value of 10 results. This field does not apply to the following feature types: TEXT_DETECTION, DOCUMENT_TEXT_DETECTION, or CROP_HINTS.
- PROJECT_ID: Your Google Cloud project ID.
HTTP method and URL:
POST https://vision.googleapis.com/v1/images:annotate
Request JSON body:
{ "requests": [ { "image": { "content": "BASE64_ENCODED_IMAGE" }, "features": [ { "maxResults": RESULTS_INT, "type": "OBJECT_LOCALIZATION" } ] } ] }
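The request body above could be generated from a local file rather than assembled by hand. The following is a minimal sketch using only the Python standard library. The function names `build_local_request` and `write_request_file` are hypothetical, and `maxResults` is left out when no value is given so that the API default applies.

```python
import base64
import json
from typing import Optional


def build_local_request(image_bytes: bytes, max_results: Optional[int] = None) -> dict:
    """Build an images:annotate request body for OBJECT_LOCALIZATION."""
    feature = {"type": "OBJECT_LOCALIZATION"}
    if max_results is not None:
        # Optional field; omitting it lets the API use its default.
        feature["maxResults"] = max_results
    # The REST API expects the image bytes as a base64-encoded ASCII string.
    content = base64.b64encode(image_bytes).decode("ascii")
    return {"requests": [{"image": {"content": content}, "features": [feature]}]}


def write_request_file(image_path: str, out_path: str = "request.json",
                       max_results: Optional[int] = None) -> None:
    """Read an image from disk and save the request body as JSON."""
    with open(image_path, "rb") as image_file:
        body = build_local_request(image_file.read(), max_results)
    with open(out_path, "w", encoding="utf-8") as out_file:
        json.dump(body, out_file)
```

Calling `write_request_file("bicycle.jpg")` (a hypothetical path) would produce a `request.json` file suitable for the curl and PowerShell commands in this section.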
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/images:annotate"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.
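The curl and PowerShell commands above can be mirrored with the Python standard library. This is a sketch, not an official client: `build_annotate_request` and `send` are hypothetical names, and the access token is assumed to come from `gcloud auth print-access-token` as in the commands above.

```python
import json
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(body: dict, access_token: str, project_id: str) -> urllib.request.Request:
    """Prepare a POST request with the same headers as the curl example."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the headers and body easy to inspect before any network call is made. A non-2xx status raises `urllib.error.HTTPError` from `urlopen`.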
Response:
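The response body is not reproduced here. As a rough sketch, assuming the response follows the `localizedObjectAnnotations` shape with `name`, `mid`, `score`, and `boundingPoly.normalizedVertices` fields, the annotations could be read as follows. `SAMPLE_RESPONSE` is hand-assembled from two rows of the table earlier on this page and is not actual API output. The function name `summarize` is hypothetical.

```python
import json

# Hand-written stand-in built from the first two table rows, not real output.
SAMPLE_RESPONSE = """
{"responses": [{"localizedObjectAnnotations": [
  {"mid": "/m/01bqk0", "name": "Bicycle wheel", "score": 0.89648587,
   "boundingPoly": {"normalizedVertices": [
     {"x": 0.32076266, "y": 0.78941387}, {"x": 0.43812272, "y": 0.78941387},
     {"x": 0.43812272, "y": 0.97331065}, {"x": 0.32076266, "y": 0.97331065}]}},
  {"mid": "/m/0199g", "name": "Bicycle", "score": 0.886761,
   "boundingPoly": {"normalizedVertices": [
     {"x": 0.312, "y": 0.6616471}, {"x": 0.638353, "y": 0.6616471},
     {"x": 0.638353, "y": 0.9705882}, {"x": 0.312, "y": 0.9705882}]}}
]}]}
"""


def summarize(response: dict, min_score: float = 0.0) -> list:
    """Return (name, mid, score, vertices) tuples at or above min_score."""
    rows = []
    for result in response.get("responses", []):
        for obj in result.get("localizedObjectAnnotations", []):
            if obj.get("score", 0.0) < min_score:
                continue
            # A coordinate may be absent from the JSON; treat it as 0.0.
            vertices = [
                (v.get("x", 0.0), v.get("y", 0.0))
                for v in obj["boundingPoly"]["normalizedVertices"]
            ]
            rows.append((obj["name"], obj["mid"], obj["score"], vertices))
    return rows


for name, mid, score, _ in summarize(json.loads(SAMPLE_RESPONSE)):
    print(f"{name} {mid} {score}")
```

The `min_score` parameter is one way to drop less-prominent objects on the client side, since the request itself only limits the number of results.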
Go
Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
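No Python sample appears at this point, so the following is a minimal sketch of a local-file request. It assumes the `google-cloud-vision` package is installed and Application Default Credentials are configured. The `describe` helper is a hypothetical addition for formatting results and works on any objects with the same attributes.

```python
def localize_objects(path):
    """Return localized object annotations for a local image file."""
    # Imported lazily so describe() below is usable without the library.
    from google.cloud import vision  # pip install google-cloud-vision

    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as image_file:
        image = vision.Image(content=image_file.read())
    return client.object_localization(image=image).localized_object_annotations


def describe(objects):
    """Format annotations as text lines: name, score, then each vertex."""
    lines = []
    for obj in objects:
        lines.append(f"{obj.name} (confidence: {obj.score:.2f})")
        for vertex in obj.bounding_poly.normalized_vertices:
            lines.append(f"  ({vertex.x:.3f}, {vertex.y:.3f})")
    return lines
```

A typical call would be `print("\n".join(describe(localize_objects("bicycle.jpg"))))`, where the file path is a placeholder.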
Additional languages
C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.
PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.
Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.
Detect objects in a remote image
You can use the Vision API to perform feature detection on a remote image file that is located in Cloud Storage or on the Web. To send a remote file request, specify the file's Web URL or Cloud Storage URI in the request body.
REST
Before using any of the request data, make the following replacements:
- CLOUD_STORAGE_IMAGE_URI: the path to a valid image file in a Cloud Storage bucket. You must at least have read privileges to the file. Example:
  https://cloud.google.com/vision/docs/images/bicycle_example.png
- RESULTS_INT: (Optional) An integer value of results to return. If you omit the "maxResults" field and its value, the API returns the default value of 10 results. This field does not apply to the following feature types: TEXT_DETECTION, DOCUMENT_TEXT_DETECTION, or CROP_HINTS.
- PROJECT_ID: Your Google Cloud project ID.
HTTP method and URL:
POST https://vision.googleapis.com/v1/images:annotate
Request JSON body:
{ "requests": [ { "image": { "source": { "imageUri": "CLOUD_STORAGE_IMAGE_URI" } }, "features": [ { "maxResults": RESULTS_INT, "type": "OBJECT_LOCALIZATION" } ] } ] }
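For a remote image, the body carries a URI instead of base64 content. The sketch below builds that body and rejects values that are neither a Cloud Storage URI nor a web URL before anything is sent. The function name `build_remote_request` and the scheme check are illustrative assumptions, not part of the API.

```python
from typing import Optional

ALLOWED_SCHEMES = ("gs://", "http://", "https://")


def build_remote_request(image_uri: str, max_results: Optional[int] = None) -> dict:
    """Build an images:annotate body that points at a remote image."""
    if not image_uri.startswith(ALLOWED_SCHEMES):
        raise ValueError(f"expected a gs:// or http(s):// URI, got {image_uri!r}")
    feature = {"type": "OBJECT_LOCALIZATION"}
    if max_results is not None:
        # Optional field; omitting it lets the API use its default.
        feature["maxResults"] = max_results
    return {
        "requests": [
            {"image": {"source": {"imageUri": image_uri}}, "features": [feature]}
        ]
    }
```

Serializing the returned dictionary with `json.dump` yields a `request.json` file that can be used with the commands that follow.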
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/images:annotate"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.
Response:
Go
Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
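As a sketch of the remote-image case with the Python client library, the image source can be set to a URI instead of file contents. This assumes `google-cloud-vision` is installed and Application Default Credentials are configured. The `names_above` helper and its 0.6 default threshold are hypothetical and only illustrate filtering by score.

```python
def localize_objects_uri(uri):
    """Return localized object annotations for an image at a web URL or gs:// URI."""
    # Imported lazily so names_above() below is usable without the library.
    from google.cloud import vision  # pip install google-cloud-vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image()
    image.source.image_uri = uri
    return client.object_localization(image=image).localized_object_annotations


def names_above(objects, threshold=0.6):
    """Return the names of annotations whose score is at least threshold."""
    return [obj.name for obj in objects if obj.score >= threshold]
```

For example, `names_above(localize_objects_uri("https://cloud.google.com/vision/docs/images/bicycle_example.png"))` would keep only the higher-scoring objects for that image.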
gcloud
To detect objects in an image, use the gcloud ml vision detect-objects command as shown in the following example:
gcloud ml vision detect-objects https://cloud.google.com/vision/docs/images/bicycle_example.png
Additional languages
C#: Please follow the C# setup instructions on the client libraries page and then visit the Vision reference documentation for .NET.
PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Vision reference documentation for PHP.
Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Vision reference documentation for Ruby.
Try it
Try object detection and localization with the following tool. You can use the image specified already (https://cloud.google.com/vision/docs/images/bicycle_example.png) or specify your own image in its place. Send the request by selecting Execute.
Request body:
{ "requests": [ { "features": [ { "maxResults": 10, "type": "OBJECT_LOCALIZATION" } ], "image": { "source": { "imageUri": "https://cloud.google.com/vision/docs/images/bicycle_example.png" } } } ] }