The Google Cloud Vision API supports the following image types:
- Animated GIF (first frame only)
Note that some of these image formats are "lossy" (for example, JPEG). Reducing file sizes for such lossy formats may result in a degradation of image quality, and hence, Vision API accuracy.
To enable accurate image detection within the Google Cloud Vision API, images should generally be a minimum of 640 x 480 pixels (about 300k pixels). Full details for different types of Vision API Feature requests are shown below:
|Vision API Feature|Recommended Size (pixels) *|Notes|
|---|---|---|
|FACE_DETECTION|1600 x 1200|Distance between eyes is most important|
|LANDMARK_DETECTION|640 x 480| |
|LOGO_DETECTION|640 x 480| |
|LABEL_DETECTION|640 x 480| |
|TEXT_DETECTION and DOCUMENT_TEXT_DETECTION|1024 x 768|OCR requires more resolution to detect characters|
|SAFE_SEARCH_DETECTION|640 x 480| |
* Note: generally, the Vision API requires images to be of a sufficient size that important features within the request can be easily distinguished. Sizes smaller or larger than these recommended sizes may work. However, smaller sizes may result in lower accuracy, while larger sizes may increase processing time and bandwidth usage without providing comparable benefits in accuracy.
These recommended sizes differ based on the feature being detected. For example, FACE_DETECTION requests generally require larger image sizes because the features being detected (faces) are smaller than the image itself. LABEL_DETECTION requests, on the other hand, generally evaluate an entire image.
In practice, a standard size of 640 x 480 pixels works well in most cases; sizes larger than this may not gain much in accuracy, while greatly diminishing throughput. When at all possible, pre-process your images to reduce their size to these minimum standards.
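As a rough illustration of that pre-processing step, the sketch below computes the dimensions an image could be downscaled to for a given feature, using the recommended sizes from the table above. The function name `target_size` and the scaling rule (keep the aspect ratio, match long side to long side so portrait images are handled, never upscale) are assumptions made for this example, not behavior defined by the Vision API. The actual resizing would be done with an image library of your choice.

```python
from fractions import Fraction

# Recommended sizes (width, height) copied from the table above.
RECOMMENDED_SIZE = {
    "FACE_DETECTION": (1600, 1200),
    "LANDMARK_DETECTION": (640, 480),
    "LOGO_DETECTION": (640, 480),
    "LABEL_DETECTION": (640, 480),
    "TEXT_DETECTION": (1024, 768),
    "DOCUMENT_TEXT_DETECTION": (1024, 768),
    "SAFE_SEARCH_DETECTION": (640, 480),
}


def target_size(width, height, feature):
    """Return (new_width, new_height) for downscaling to the recommended size.

    Hypothetical helper: the smallest aspect-preserving size whose long and
    short sides both still meet the recommendation. Images that are already
    at or below the recommendation are returned unchanged (no upscaling).
    """
    rec_long, rec_short = sorted(RECOMMENDED_SIZE[feature], reverse=True)
    long_side, short_side = max(width, height), min(width, height)

    # Exact rational arithmetic avoids floating-point rounding surprises.
    scale = max(Fraction(rec_long, long_side), Fraction(rec_short, short_side))
    if scale >= 1:
        return width, height

    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(target_size(4000, 3000, "LABEL_DETECTION"))
    print(target_size(3000, 4000, "FACE_DETECTION"))
```

Choosing the larger of the two scale factors means a wide or tall image keeps both sides at or above the recommendation, at the cost of one side being somewhat bigger than strictly needed.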
Image files sent to the Google Cloud Vision API should not exceed 4 MB. Reducing your file size can significantly improve throughput; however, be careful not to reduce image quality in the process. If you are batching images and sending them in one request, also note that the Vision API imposes an 8 MB per request limit.
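One way to respect both limits when batching is to group files greedily by size before building requests. The sketch below is a minimal example under stated assumptions: `batch_by_size` is a hypothetical helper, "4 MB" and "8 MB" are read as 4 and 8 MiB, and only raw file bytes are counted. If images are base64-encoded into a JSON body, the encoded payload is roughly a third larger than the raw bytes, so a lower budget may be appropriate.

```python
# Limits quoted in the text above; interpreting "MB" as MiB is an assumption.
MAX_IMAGE_BYTES = 4 * 1024 * 1024
MAX_REQUEST_BYTES = 8 * 1024 * 1024


def batch_by_size(image_sizes, max_image=MAX_IMAGE_BYTES,
                  max_request=MAX_REQUEST_BYTES):
    """Group (name, size_in_bytes) pairs into request-sized batches.

    Returns (batches, rejected): `batches` is a list of lists of names whose
    combined size fits within `max_request`; `rejected` lists names whose
    individual size exceeds `max_image` and which need further compression.
    """
    batches, rejected = [], []
    current, current_bytes = [], 0

    for name, size in image_sizes:
        if size > max_image:
            rejected.append(name)
            continue
        # Start a new batch when adding this image would exceed the limit.
        if current and current_bytes + size > max_request:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size

    if current:
        batches.append(current)
    return batches, rejected


if __name__ == "__main__":
    MIB = 1024 * 1024
    files = [("a.jpg", 3 * MIB), ("b.jpg", 3 * MIB), ("c.jpg", 3 * MIB),
             ("d.jpg", 5 * MIB), ("e.jpg", 1 * MIB)]
    print(batch_by_size(files))
```

The greedy grouping preserves input order and is not guaranteed to produce the fewest possible batches; it is meant only to show how the two limits interact.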
Most photos taken with digital cameras currently default to "raw" file sizes determined by the camera's megapixel count, so uncompressed images are often in excess of 4 MB. Be sure to preprocess such images appropriately: downsample them to a reasonable pixel size and compress them to a reasonable file size.