The Vision API can run offline (asynchronous) detection and annotation on a large batch of image files using any Vision feature type. For example, you can specify one or more Vision API features (such as `TEXT_DETECTION`, `LABEL_DETECTION`, and `LANDMARK_DETECTION`) for a single batch of images.

Output from an offline batch request is written to a JSON file created in the specified Cloud Storage bucket.
Limitations
The Vision API accepts up to 2,000 image files in a batch; a larger batch of image files returns an error.
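If you have more images than this, one option is to split the list of image URIs into chunks of at most 2,000 and submit one asynchronous request per chunk. The following minimal Python sketch is illustrative only; `submit_batch` is a hypothetical placeholder for the request shown under Sample code below.

```python
# Illustrative chunking helper for staying under the per-request image limit.
# `submit_batch` is a hypothetical placeholder for the asynchronous batch
# request shown under "Sample code" below.
MAX_IMAGES_PER_REQUEST = 2000

def chunked(uris, size=MAX_IMAGES_PER_REQUEST):
    """Yield successive slices of at most `size` image URIs."""
    for start in range(0, len(uris), size):
        yield uris[start:start + size]

# for batch in chunked(all_image_uris):
#     submit_batch(batch)
```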
Currently supported feature types
Feature type | Description
---|---
`CROP_HINTS` | Determine suggested vertices for a crop region on an image.
`DOCUMENT_TEXT_DETECTION` | Perform OCR on dense text images, such as documents (PDF/TIFF), and images with handwriting. `TEXT_DETECTION` can be used for sparse text images. Takes precedence when both `DOCUMENT_TEXT_DETECTION` and `TEXT_DETECTION` are present.
`FACE_DETECTION` | Detect faces within the image.
`IMAGE_PROPERTIES` | Compute a set of image properties, such as the image's dominant colors.
`LABEL_DETECTION` | Add labels based on image content.
`LANDMARK_DETECTION` | Detect geographic landmarks within the image.
`LOGO_DETECTION` | Detect company logos within the image.
`OBJECT_LOCALIZATION` | Detect and extract multiple objects in an image.
`SAFE_SEARCH_DETECTION` | Run SafeSearch to detect potentially unsafe or undesirable content.
`TEXT_DETECTION` | Perform Optical Character Recognition (OCR) on text within the image. Text detection is optimized for areas of sparse text within a larger image. If the image is a document (PDF/TIFF), has dense text, or contains handwriting, use `DOCUMENT_TEXT_DETECTION` instead.
`WEB_DETECTION` | Detect topical entities such as news, events, or celebrities within the image, and find similar images on the web using the power of Google Image Search.
Sample code
Use the following code samples to run offline annotation services on a batch of image files in Cloud Storage.
Java
Before trying this sample, follow the Java setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Java API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.
To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
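The full, maintained samples live in the client library repositories linked from the quickstarts. As an illustrative sketch only, the following Python example shows the general shape of an asynchronous batch annotation request with the google-cloud-vision client library; the input image URI, output bucket URI, feature list, and `batch_size` value are placeholders to replace with your own.

```python
from google.cloud import vision_v1

def sample_async_batch_annotate_images(
    # Placeholder URIs; point these at your own image and output bucket.
    input_image_uri: str = "gs://cloud-samples-data/vision/label/wakeupcat.jpg",
    output_uri: str = "gs://YOUR_BUCKET_ID/path/to/save/results/",
):
    """Submit an offline (asynchronous) batch annotation request (sketch)."""
    client = vision_v1.ImageAnnotatorClient()

    # One request per image; each request lists the features to run on it.
    features = [
        {"type_": vision_v1.Feature.Type.LABEL_DETECTION},
        {"type_": vision_v1.Feature.Type.TEXT_DETECTION},
    ]
    requests = [
        {"image": {"source": {"image_uri": input_image_uri}}, "features": features},
    ]

    # batch_size controls how many responses are written per output JSON file.
    output_config = {
        "gcs_destination": {"uri": output_uri},
        "batch_size": 2,
    }

    operation = client.async_batch_annotate_images(
        requests=requests, output_config=output_config
    )

    print("Waiting for operation to complete...")
    response = operation.result(timeout=300)

    # The operation result echoes the Cloud Storage prefix holding the output files.
    print("Output written to:", response.output_config.gcs_destination.uri)
```

The request returns immediately as a long-running operation; `operation.result()` blocks until the output JSON files have been written to the Cloud Storage prefix you specified.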
Response
A successful request returns response JSON files in the Cloud Storage bucket you indicated in the code sample. The number of responses per JSON file is dictated by `batch_size` in the code sample.
The returned responses are similar to regular Vision API feature responses, depending on which features you request for an image. The following responses show `LABEL_DETECTION` and `TEXT_DETECTION` annotations for `image1.png`, `IMAGE_PROPERTIES` annotations for `image2.jpg`, and `OBJECT_LOCALIZATION` annotations for `image3.jpg`. Each response also contains a `context` field showing the file's URI.
offline_batch_output/output-1-to-2.json
{ "responses": [ { "labelAnnotations": [ { "mid": "/m/07s6nbt", "description": "Text", "score": 0.93413997, "topicality": 0.93413997 }, { "mid": "/m/0dwx7", "description": "Logo", "score": 0.8733531, "topicality": 0.8733531 }, ... { "mid": "/m/03bxgrp", "description": "Company", "score": 0.5682425, "topicality": 0.5682425 } ], "textAnnotations": [ { "locale": "en", "description": "Google\n", "boundingPoly": { "vertices": [ { "x": 72, "y": 40 }, { "x": 613, "y": 40 }, { "x": 613, "y": 233 }, { "x": 72, "y": 233 } ] } }, ... ], "blockType": "TEXT" } ] } ], "text": "Google\n" }, "context": { "uri": "gs://cloud-samples-data/vision/document_understanding/image1.png" } }, { "imagePropertiesAnnotation": { "dominantColors": { "colors": [ { "color": { "red": 229, "green": 230, "blue": 238 }, "score": 0.2744754, "pixelFraction": 0.075339235 }, ... { "color": { "red": 86, "green": 87, "blue": 95 }, "score": 0.025770646, "pixelFraction": 0.13109145 } ] } }, "cropHintsAnnotation": { "cropHints": [ { "boundingPoly": { "vertices": [ {}, { "x": 1599 }, { "x": 1599, "y": 1199 }, { "y": 1199 } ] }, "confidence": 0.79999995, "importanceFraction": 1 } ] }, "context": { "uri": "gs://cloud-samples-data/vision/document_understanding/image2.jpg" } } ] }
offline_batch_output/output-3-to-3.json
{ "responses": [ { "context": { "uri": "gs://cloud-samples-data/vision/document_understanding/image3.jpg" }, "localizedObjectAnnotations": [ { "mid": "/m/0bt9lr", "name": "Dog", "score": 0.9669734, "boundingPoly": { "normalizedVertices": [ { "x": 0.6035543, "y": 0.1357359 }, { "x": 0.98546547, "y": 0.1357359 }, { "x": 0.98546547, "y": 0.98426414 }, { "x": 0.6035543, "y": 0.98426414 } ] } }, ... { "mid": "/m/0jbk", "name": "Animal", "score": 0.58003056, "boundingPoly": { "normalizedVertices": [ { "x": 0.014534635, "y": 0.1357359 }, { "x": 0.37197515, "y": 0.1357359 }, { "x": 0.37197515, "y": 0.98426414 }, { "x": 0.014534635, "y": 0.98426414 } ] } } ] } ] }