List of all Cloud Vision API features

Cloud Vision API currently allows you to use the following features:

All feature types

Face detection

image with 2 faces with and without annotations
  • Locates faces with bounding polygons and identifies specific facial "landmarks" (eyes, ears, nose, mouth, and so on), along with their corresponding confidence values.
  • Returns likelihood ratings for emotion (joy, sorrow, anger, surprise) and general image properties (underexposed, blurred, headwear present).
  • Likelihood ratings are expressed as one of 6 values: UNKNOWN, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, or VERY_LIKELY.
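
Because likelihoods are named values rather than numbers, comparing them needs an explicit ordering. The following is a minimal sketch, not an official client: the field names (joyLikelihood and so on) are assumed from the JSON form of a face annotation, the sample face is made up, and ranking UNKNOWN lowest is a design choice of this example.

```python
from typing import Dict, List

# The six likelihood values, ordered from least to most likely.
# Placing UNKNOWN first is an assumption of this sketch, so that an
# unknown rating never passes a threshold check.
LIKELIHOOD_ORDER = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]

# Assumed JSON field names for the four emotion ratings on a face annotation.
EMOTION_FIELDS = {
    "joy": "joyLikelihood",
    "sorrow": "sorrowLikelihood",
    "anger": "angerLikelihood",
    "surprise": "surpriseLikelihood",
}


def at_least(rating: str, threshold: str) -> bool:
    """Return True if `rating` is at or above `threshold` in the ordering."""
    return LIKELIHOOD_ORDER.index(rating) >= LIKELIHOOD_ORDER.index(threshold)


def likely_emotions(face: Dict[str, str], threshold: str = "LIKELY") -> List[str]:
    """List the emotions whose rating meets the threshold for one face."""
    return [
        name
        for name, field in EMOTION_FIELDS.items()
        if at_least(face.get(field, "UNKNOWN"), threshold)
    ]


# Made-up face annotation, for illustration only.
sample_face = {
    "joyLikelihood": "VERY_LIKELY",
    "sorrowLikelihood": "VERY_UNLIKELY",
    "angerLikelihood": "UNLIKELY",
    "surpriseLikelihood": "POSSIBLE",
}

print(likely_emotions(sample_face))  # ['joy']
```

Lowering the threshold to "POSSIBLE" would also include surprise for this sample.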

Landmark detection

St Basil's Cathedral image
  • Provides the name of the landmark, a confidence score and a bounding box in the image for the landmark.
  • Gives geographic coordinates (latitude and longitude) for the detected entity.

Logo detection

annotated logo
  • Provides a textual description of the entity identified, a confidence score, and a bounding polygon for the logo in the file.

Label detection

Shanghai street image
  • Provides generalized labels for an image.
  • For each label, returns a textual description, a confidence score, and a topicality rating.
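
A common follow-up is to keep only the labels above some confidence score. The sketch below assumes each label is a dictionary with description, score, and topicality keys, mirroring the elements listed above; the sample values and the 0.8 cutoff are made up for illustration.

```python
from typing import Any, Dict, List


def top_labels(labels: List[Dict[str, Any]], min_score: float = 0.8) -> List[Dict[str, Any]]:
    """Keep labels scoring at least `min_score`, highest score first."""
    kept = [label for label in labels if label.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda label: label["score"], reverse=True)


# Made-up label annotations, for illustration only.
sample_labels = [
    {"description": "Street", "score": 0.87, "topicality": 0.87},
    {"description": "Metropolis", "score": 0.93, "topicality": 0.93},
    {"description": "Pedestrian", "score": 0.62, "topicality": 0.62},
]

for label in top_labels(sample_labels):
    print(f'{label["description"]}: {label["score"]:.2f}')
```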

Text detection

Road sign image
  • Optical character recognition (OCR) for an image; text recognition and conversion to machine-coded text.
  • Identifies and extracts UTF-8 text in an image.
  • Images: Optimized for sparse areas of text within a larger image.
  • Returns a list of identified words with their text, bounding boxes, and confidence scores (textAnnotations), as well as the structural hierarchy of the OCR-detected text (fullTextAnnotation).
  • Hierarchy of extracted text structure:
    • TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol.
    • Each structural component from Page on may also have its own properties, such as detected languages and breaks.
  • Works with currently supported, mapped, and experimental languages.
  • Feature enum value: TEXT_DETECTION.
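
The hierarchy above can be walked with nested loops to rebuild words from their symbols. This is a sketch under assumptions: the key names (pages, blocks, paragraphs, words, symbols, text) are assumed from the JSON form of fullTextAnnotation, and the sample is a hand-written stand-in for a real response.

```python
from typing import Any, Dict, Iterator


def iter_words(full_text_annotation: Dict[str, Any]) -> Iterator[str]:
    """Yield each word by descending Page -> Block -> Paragraph -> Word -> Symbol."""
    for page in full_text_annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word is the concatenation of its symbols' text.
                    yield "".join(s.get("text", "") for s in word.get("symbols", []))


# Hand-written sample with one page, one block, one paragraph, two words.
sample_annotation = {
    "pages": [{
        "blocks": [{
            "paragraphs": [{
                "words": [
                    {"symbols": [{"text": c} for c in "STOP"]},
                    {"symbols": [{"text": c} for c in "AHEAD"]},
                ],
            }],
        }],
    }],
}

print(list(iter_words(sample_annotation)))  # ['STOP', 'AHEAD']
```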

Document text detection (dense text / handwriting)

Dense image with annotations
handwriting image
  • Optical character recognition (OCR) for a file (PDF/TIFF) or dense text image; dense text recognition and conversion to machine-coded text.
  • Files: Optimized for document files (PDF/TIFF).
  • Images: Optimized for dense areas of text in an image (images that are documents), and images that contain handwriting.
  • Returns the structural hierarchy for the OCR detected text (fullTextAnnotation).
  • Hierarchy of extracted text structure:
    • TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol.
    • Each structural component from Page on may also have its own properties, such as detected languages and breaks.
  • Works with currently supported, mapped, and experimental languages.
  • Feature enum value: DOCUMENT_TEXT_DETECTION.
  • Takes precedence when both DOCUMENT_TEXT_DETECTION and TEXT_DETECTION are requested.
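
The precedence rule can be made explicit on the client side before a request is sent. The sketch below is illustrative only: effective_ocr_feature and build_request are hypothetical helper names, and the request body layout (requests, image.content, features[].type) is an assumption about the JSON form of an image annotation request that should be checked against the API reference.

```python
import base64
from typing import Any, Dict, List, Optional

# OCR features in precedence order: document text detection wins if both are requested.
OCR_PRECEDENCE = ["DOCUMENT_TEXT_DETECTION", "TEXT_DETECTION"]


def effective_ocr_feature(requested: List[str]) -> Optional[str]:
    """Return the OCR feature that applies for the requested feature types, if any."""
    for feature in OCR_PRECEDENCE:
        if feature in requested:
            return feature
    return None


def build_request(image_bytes: bytes, feature_types: List[str]) -> Dict[str, Any]:
    """Build an assumed JSON request body for a single base64-encoded image."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": t} for t in feature_types],
        }],
    }


requested = ["TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION"]
print(effective_ocr_feature(requested))  # DOCUMENT_TEXT_DETECTION
```

Requesting only the feature you need avoids relying on the precedence rule at all.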

Image properties

Bali image with properties
  • Returns dominant colors in an image.
  • Each color is represented in the RGBA color space, has a confidence score, and reports the fraction of pixels the color occupies, in the range [0, 1].
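
Dominant colors are often easier to use as hex strings ranked by how much of the image they cover. In this sketch, the key names (color, red, green, blue, score, pixelFraction) and the 0 to 255 channel range are assumptions about the JSON response, alpha is ignored, and the sample values are made up.

```python
from typing import Any, Dict, List, Tuple


def to_hex(color: Dict[str, float]) -> str:
    """Convert assumed 0-255 red/green/blue channels to a #rrggbb string."""
    r, g, b = (int(round(color.get(ch, 0))) for ch in ("red", "green", "blue"))
    return f"#{r:02x}{g:02x}{b:02x}"


def dominant_colors(colors: List[Dict[str, Any]]) -> List[Tuple[str, float]]:
    """Return (hex, pixel fraction) pairs, largest pixel fraction first."""
    ranked = sorted(colors, key=lambda c: c["pixelFraction"], reverse=True)
    return [(to_hex(c["color"]), c["pixelFraction"]) for c in ranked]


# Made-up color entries, for illustration only.
sample_colors = [
    {"color": {"red": 243, "green": 177, "blue": 133}, "score": 0.12, "pixelFraction": 0.03},
    {"color": {"red": 69, "green": 42, "blue": 27}, "score": 0.15, "pixelFraction": 0.14},
]

print(dominant_colors(sample_colors))  # [('#452a1b', 0.14), ('#f3b185', 0.03)]
```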

Object localization

image with bounding boxes
  • Provides general label and bounding box annotations for multiple objects recognized in a single image.
  • For each object detected the following elements are returned: a textual description, a confidence score, and normalized vertices [0,1] for the bounding polygon around the object.
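
Normalized vertices have to be scaled by the image dimensions before they can be drawn. A minimal sketch, assuming each vertex is a dictionary with x and y in [0, 1] and that a missing key means 0; the image size and polygon below are made up.

```python
from typing import Dict, List, Tuple


def to_pixels(
    normalized_vertices: List[Dict[str, float]], width: int, height: int
) -> List[Tuple[int, int]]:
    """Scale normalized [0, 1] vertices to integer pixel coordinates."""
    return [
        (round(v.get("x", 0.0) * width), round(v.get("y", 0.0) * height))
        for v in normalized_vertices
    ]


# Made-up bounding polygon covering the lower middle of a 640x480 image.
sample_vertices = [
    {"x": 0.25, "y": 0.5},
    {"x": 0.75, "y": 0.5},
    {"x": 0.75, "y": 1.0},
    {"x": 0.25, "y": 1.0},
]

print(to_pixels(sample_vertices, 640, 480))
# [(160, 240), (480, 240), (480, 480), (160, 480)]
```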

Crop hint detection

image with cropped version
  • Provides a bounding polygon for the cropped image, a confidence score, and an importance fraction of this salient region with respect to the original image for each request.
  • You can provide up to 16 image ratio values (width:height) for a single image.
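
Width:height ratios can be converted to decimal values and checked against the 16-value limit before a request is built. This is a sketch under assumptions: the helper names are hypothetical, and the cropHintsParams / aspectRatios layout is an assumed shape for the request's image context.

```python
from typing import Any, Dict, List, Tuple

MAX_ASPECT_RATIOS = 16  # limit stated above for a single image


def aspect_ratios(pairs: List[Tuple[float, float]]) -> List[float]:
    """Convert (width, height) pairs to width/height ratios, enforcing the limit."""
    if len(pairs) > MAX_ASPECT_RATIOS:
        raise ValueError(f"at most {MAX_ASPECT_RATIOS} aspect ratios are allowed")
    ratios = []
    for width, height in pairs:
        if width <= 0 or height <= 0:
            raise ValueError("width and height must be positive")
        ratios.append(width / height)
    return ratios


def crop_hints_context(pairs: List[Tuple[float, float]]) -> Dict[str, Any]:
    """Build an assumed image-context fragment carrying the aspect ratios."""
    return {"cropHintsParams": {"aspectRatios": aspect_ratios(pairs)}}


print(crop_hints_context([(4, 2), (1, 1)]))
# {'cropHintsParams': {'aspectRatios': [2.0, 1.0]}}
```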

Web entities and pages

image with web entities table
  • Provides a series of Web content related to an image.
  • Returns the following information:
    • Web entities: Inferred entities (labels/descriptions) from similar images on the Web.
    • Full matching images: A list of URLs for fully matching images of any size on the Internet.
    • Partial matching images: A list of URLs for images that share key-point features, such as a cropped version of the original image.
    • Pages with matching images: A list of Webpages (identified by page URL, page title, matching image URL) with an image that satisfies the conditions described above.
    • Visually similar images: A list of URLs for images that share some features with the original image.
    • Best guess label: A best guess as to the topic of the requested image inferred from similar images on the Internet.
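
The categories above can be flattened into a compact summary for logging or display. The key names used here (webEntities, fullMatchingImages, partialMatchingImages, pagesWithMatchingImages, visuallySimilarImages, bestGuessLabels, and their url, pageTitle, description, label fields) are assumptions about the JSON response, and the sample uses placeholder example.com URLs.

```python
from typing import Any, Dict, List


def urls(items: List[Dict[str, Any]]) -> List[str]:
    """Collect the url field from a list of image or page entries."""
    return [item["url"] for item in items if "url" in item]


def summarize_web_detection(web: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten an assumed web detection result into simple lists."""
    return {
        "best_guess": [g["label"] for g in web.get("bestGuessLabels", [])],
        "entities": [e["description"] for e in web.get("webEntities", []) if "description" in e],
        "full_matches": urls(web.get("fullMatchingImages", [])),
        "partial_matches": urls(web.get("partialMatchingImages", [])),
        "similar": urls(web.get("visuallySimilarImages", [])),
        "pages": [
            (p.get("pageTitle", ""), p["url"])
            for p in web.get("pagesWithMatchingImages", [])
            if "url" in p
        ],
    }


# Placeholder result, for illustration only.
sample_web = {
    "bestGuessLabels": [{"label": "carnival"}],
    "webEntities": [{"description": "Carnival", "score": 0.9}, {"score": 0.2}],
    "fullMatchingImages": [{"url": "https://example.com/full.jpg"}],
    "pagesWithMatchingImages": [
        {"url": "https://example.com/page", "pageTitle": "Example page"},
    ],
}

summary = summarize_web_detection(sample_web)
print(summary["best_guess"], summary["entities"])  # ['carnival'] ['Carnival']
```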

Explicit content detection (Safe Search)

  • Provides likelihood ratings for the following explicit content categories: adult, spoof, medical, violence, and racy.
  • Likelihood ratings are expressed as one of 6 values: UNKNOWN, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, or VERY_LIKELY.
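
One way to act on these ratings is to flag any category at or above a chosen likelihood. A minimal sketch: the annotation is assumed to be a dictionary keyed by the five category names, the sample is made up, and treating UNKNOWN as the lowest rank is a choice of this example rather than documented behavior.

```python
from typing import Dict, List

# Likelihood values ranked from least to most likely (UNKNOWN lowest by assumption).
RANK = {
    name: i
    for i, name in enumerate(
        ["UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
    )
}

CATEGORIES = ("adult", "spoof", "medical", "violence", "racy")


def flagged_categories(annotation: Dict[str, str], threshold: str = "LIKELY") -> List[str]:
    """Return the categories rated at or above `threshold`."""
    return [
        category
        for category in CATEGORIES
        if RANK[annotation.get(category, "UNKNOWN")] >= RANK[threshold]
    ]


# Made-up Safe Search result, for illustration only.
sample_safe_search = {
    "adult": "VERY_UNLIKELY",
    "spoof": "POSSIBLE",
    "medical": "UNLIKELY",
    "violence": "LIKELY",
    "racy": "VERY_LIKELY",
}

print(flagged_categories(sample_safe_search))  # ['violence', 'racy']
```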