Response to an image annotation request.
JSON representation |
---|
{ "textAnnotations": [ { object ( |
Fields | |
---|---|
textAnnotations[] |
If present, text (OCR) detection has completed successfully. |
fullTextAnnotation |
If present, text (OCR) detection or document (OCR) text detection has completed successfully. This annotation provides the structural hierarchy for the OCR detected text. |
error |
If set, represents the error message for the operation. Note that filled-in image annotations are guaranteed to be correct, even when |
context |
If present, contextual information is needed to understand where this image comes from. |
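As a rough illustration of how a client might read these fields, the sketch below treats a response as a plain dictionary parsed from JSON. The helper name `summarize_response` is hypothetical, and the `message` key inside `error` is an assumption about the status object's shape, which this page does not spell out.

```python
from typing import Any, Dict


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the text and error information from one parsed response dict."""
    # Assumption: `error`, when set, is an object with a "message" string.
    error = response.get("error") or {}
    full = response.get("fullTextAnnotation") or {}
    return {
        # Full OCR text, or None when no fullTextAnnotation is present.
        "text": full.get("text"),
        # Error message, or None when the operation reported no error.
        "error_message": error.get("message"),
        # True when at least one textAnnotations entry came back.
        "has_text_annotations": bool(response.get("textAnnotations")),
    }
```

The sketch reports the error alongside any annotations rather than raising, so a caller can decide how to treat a response that carries both.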
EntityAnnotation
Set of detected entity features.
JSON representation |
---|
{ "mid": string, "locale": string, "description": string, "score": number, "confidence": number, "topicality": number, "boundingPoly": { object ( |
Fields | |
---|---|
mid |
Opaque entity ID. Some IDs may be available in Google Knowledge Graph Search API. |
locale |
The language code for the locale in which the entity textual |
description |
Entity textual description, expressed in its |
score |
Overall score of the result. Range [0, 1]. |
confidence |
Deprecated. Use |
topicality |
The relevancy of the ICA (Image Content Annotation) label to the image. For example, the relevancy of "tower" is likely higher to an image containing the detected "Eiffel Tower" than to an image containing a detected distant towering building, even though the confidence that there is a tower in each image may be the same. Range [0, 1]. |
boundingPoly |
Image region to which this entity belongs. Not produced for |
properties[] |
Some entities may have optional user-supplied |
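A minimal sketch of filtering entity annotations by `score`, assuming the annotations have been parsed from JSON into dictionaries. The function name `filter_by_score` and the choice to treat a missing `score` as 0.0 are assumptions made for illustration.

```python
from typing import Any, Dict, List


def filter_by_score(
    annotations: List[Dict[str, Any]], min_score: float
) -> List[Dict[str, Any]]:
    """Keep annotations whose score is at least min_score, best first."""
    # Assumption: an absent "score" key is treated as 0.0.
    kept = [a for a in annotations if a.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda a: a.get("score", 0.0), reverse=True)
```

Since `score` is documented as lying in [0, 1], a threshold such as 0.5 is a plausible starting point, but the right cutoff depends on the application.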
BoundingPoly
A bounding polygon for the detected image annotation.
JSON representation |
---|
{ "vertices": [ { object ( |
Fields | |
---|---|
vertices[] |
The bounding polygon vertices. |
normalizedVertices[] |
The bounding polygon normalized vertices. |
Vertex
A vertex represents a 2D point in the image. NOTE: the vertex coordinates are in the same scale as the original image.
JSON representation |
---|
{ "x": integer, "y": integer } |
Fields | |
---|---|
x |
X coordinate. |
y |
Y coordinate. |
NormalizedVertex
A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
JSON representation |
---|
{ "x": number, "y": number } |
Fields | |
---|---|
x |
X coordinate. |
y |
Y coordinate. |
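Because `vertices` use the original image's scale while `normalizedVertices` range from 0 to 1, a caller that wants pixel coordinates has to scale the normalized form by the image size. The sketch below shows one way to do that; `to_pixel_vertices` is a hypothetical helper, and defaulting a missing coordinate to 0 is an assumption about how zero values appear in the JSON.

```python
from typing import Any, Dict, List, Tuple


def to_pixel_vertices(
    poly: Dict[str, Any], width: int, height: int
) -> List[Tuple[int, int]]:
    """Return a BoundingPoly's corners as (x, y) pixel pairs."""
    # Prefer absolute vertices when the polygon provides them.
    if poly.get("vertices"):
        # Assumption: a missing "x" or "y" means the coordinate is 0.
        return [(v.get("x", 0), v.get("y", 0)) for v in poly["vertices"]]
    # Otherwise scale the normalized [0, 1] coordinates by the image size.
    return [
        (round(v.get("x", 0.0) * width), round(v.get("y", 0.0) * height))
        for v in poly.get("normalizedVertices", [])
    ]
```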
Property
A Property consists of a user-supplied name/value pair.
JSON representation |
---|
{ "name": string, "value": string, "uint64Value": string } |
Fields | |
---|---|
name |
Name of the property. |
value |
Value of the property. |
uint64Value |
Value of numeric properties. |
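The JSON representation above lists `uint64Value` as a string, so a client that needs a number has to convert it. The sketch below is one possible reading; `property_value` is a hypothetical helper, and treating the presence of `uint64Value` as the signal for a numeric property is an assumption.

```python
from typing import Any, Dict, Optional, Union


def property_value(prop: Dict[str, Any]) -> Optional[Union[int, str]]:
    """Return the numeric value if present, otherwise the string value."""
    # Assumption: a numeric property is one that carries "uint64Value".
    if "uint64Value" in prop:
        return int(prop["uint64Value"])
    return prop.get("value")
```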
TextAnnotation
TextAnnotation contains a structured representation of OCR-extracted text. The hierarchy of an OCR-extracted text structure is:
TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol
Each structural component may carry additional properties; see the TextAnnotation.TextProperty message definition that follows.
JSON representation |
---|
{
"pages": [
{
object ( |
Fields | |
---|---|
pages[] |
List of pages detected by OCR. |
text |
UTF-8 text detected on the pages. |
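The Page -> Block -> Paragraph -> Word -> Symbol hierarchy can be walked with nested loops. The sketch below assumes a `TextAnnotation` parsed from JSON into dictionaries and yields each word as the concatenation of its symbols' `text`; `iter_words` is a hypothetical name, not part of the API.

```python
from typing import Any, Dict, Iterator


def iter_words(annotation: Dict[str, Any]) -> Iterator[str]:
    """Yield every word in a TextAnnotation dict, in document order."""
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            # Non-text blocks may have no paragraphs; .get covers that case.
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    symbols = word.get("symbols", [])
                    yield "".join(s.get("text", "") for s in symbols)
```

For example, `list(iter_words(annotation))` gives a flat word list, which is convenient when the top-level `text` field alone does not give enough structure.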
Page
Detected page from OCR.
JSON representation |
---|
{ "property": { object ( |
Fields | |
---|---|
property |
Additional information detected on the page. |
width |
Page width. For PDFs the unit is points. For images (including TIFFs) the unit is pixels. |
height |
Page height. For PDFs the unit is points. For images (including TIFFs) the unit is pixels. |
blocks[] |
List of blocks of text, images, etc. on this page. |
confidence |
Confidence of the OCR results on the page. Range [0, 1]. |
TextProperty
Additional information detected on the structural component.
JSON representation |
---|
{ "detectedLanguages": [ { object ( |
Fields | |
---|---|
detectedLanguages[] |
A list of detected languages together with confidence. |
detectedBreak |
Detected start or end of a text segment. |
DetectedLanguage
Detected language for a structural component.
JSON representation |
---|
{ "languageCode": string, "confidence": number } |
Fields | |
---|---|
languageCode |
The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see https://www.unicode.org/reports/tr35/#Unicode_locale_identifier. |
confidence |
Confidence of detected language. Range [0, 1]. |
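Since each detected language carries a confidence, a caller often wants only the most likely one. The following is a small sketch under the assumption that a `TextProperty` has been parsed into a dictionary; `top_language` is a hypothetical helper, and a missing `confidence` is treated as 0.0.

```python
from typing import Any, Dict, Optional


def top_language(text_property: Dict[str, Any]) -> Optional[str]:
    """Return the languageCode with the highest confidence, or None."""
    languages = text_property.get("detectedLanguages", [])
    if not languages:
        return None
    # Assumption: an absent "confidence" key counts as 0.0.
    best = max(languages, key=lambda lang: lang.get("confidence", 0.0))
    return best.get("languageCode")
```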
DetectedBreak
Detected start or end of a structural component.
JSON representation |
---|
{
"type": enum ( |
Fields | |
---|---|
type |
Detected break type. |
isPrefix |
True if break prepends the element. |
BreakType
Enum to denote the type of break found: new line, space, etc.
Enums | |
---|---|
UNKNOWN |
Unknown break label type. |
SPACE |
Regular space. |
SURE_SPACE |
Sure space (very wide). |
EOL_SURE_SPACE |
Line-wrapping break. |
HYPHEN |
End-line hyphen that is not present in text; does not co-occur with SPACE, LEADER_SPACE, or LINE_BREAK. |
LINE_BREAK |
Line break that ends a paragraph. |
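One way to use `detectedBreak` is to rebuild readable text from a sequence of symbols. The sketch below is illustrative only: the mapping from break type to inserted character (for example, rendering `HYPHEN` as nothing so a wrapped word is rejoined) is an assumption, and `render_symbols` is a hypothetical helper operating on symbols parsed into dictionaries.

```python
from typing import Any, Dict, List

# Assumed rendering for each break type; UNKNOWN and unlisted types add nothing.
BREAK_TEXT = {
    "SPACE": " ",
    "SURE_SPACE": " ",
    "EOL_SURE_SPACE": "\n",
    "LINE_BREAK": "\n",
    "HYPHEN": "",
}


def render_symbols(symbols: List[Dict[str, Any]]) -> str:
    """Join symbol texts, inserting whitespace where a break was detected."""
    parts = []
    for symbol in symbols:
        text = symbol.get("text", "")
        brk = symbol.get("property", {}).get("detectedBreak")
        if not brk:
            parts.append(text)
            continue
        sep = BREAK_TEXT.get(brk.get("type", "UNKNOWN"), "")
        # isPrefix means the break comes before the symbol rather than after it.
        parts.append(sep + text if brk.get("isPrefix") else text + sep)
    return "".join(parts)
```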
Block
Logical element on the page.
JSON representation |
---|
{ "property": { object ( |
Fields | |
---|---|
property |
Additional information detected for the block. |
boundingBox |
The bounding box for the block. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example: * when the text is horizontal it might look like: 0----1 | | 3----2 * when it's rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). |
paragraphs[] |
List of paragraphs in this block (if this block is of type text). |
blockType |
Detected block type (text, image, etc.) for this block. |
confidence |
Confidence of the OCR results on the block. Range [0, 1]. |
Paragraph
Structural unit of text representing a number of words in a certain order.
JSON representation |
---|
{ "property": { object ( |
Fields | |
---|---|
property |
Additional information detected for the paragraph. |
boundingBox |
The bounding box for the paragraph. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example: * when the text is horizontal it might look like: 0----1 | | 3----2 * when it's rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). |
words[] |
List of all words in this paragraph. |
confidence |
Confidence of the OCR results for the paragraph. Range [0, 1]. |
Word
A word representation.
JSON representation |
---|
{ "property": { object ( |
Fields | |
---|---|
property |
Additional information detected for the word. |
boundingBox |
The bounding box for the word. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example: * when the text is horizontal it might look like: 0----1 | | 3----2 * when it's rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). |
symbols[] |
List of symbols in the word. The order of the symbols follows the natural reading order. |
confidence |
Confidence of the OCR results for the word. Range [0, 1]. |
Symbol
A single symbol representation.
JSON representation |
---|
{ "property": { object ( |
Fields | |
---|---|
property |
Additional information detected for the symbol. |
boundingBox |
The bounding box for the symbol. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example: * when the text is horizontal it might look like: 0----1 | | 3----2 * when it's rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). |
text |
The actual UTF-8 representation of the symbol. |
confidence |
Confidence of the OCR results for the symbol. Range [0, 1]. |
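Because the vertex order stays (0, 1, 2, 3) under rotation, the direction from vertex 0 to vertex 1 indicates how the text is oriented. The sketch below estimates that angle for any of the bounding boxes described above; `box_rotation_degrees` is a hypothetical helper, and it assumes image coordinates with y increasing downward and missing coordinates equal to 0.

```python
import math
from typing import Any, Dict, Optional


def box_rotation_degrees(bounding_box: Dict[str, Any]) -> Optional[float]:
    """Angle in degrees, in [0, 360), of the edge from vertex 0 to vertex 1."""
    vertices = bounding_box.get("vertices", [])
    if len(vertices) < 2:
        return None
    # Assumption: a missing "x" or "y" means the coordinate is 0.
    x0, y0 = vertices[0].get("x", 0), vertices[0].get("y", 0)
    x1, y1 = vertices[1].get("x", 0), vertices[1].get("y", 0)
    # With y pointing down, a positive angle is a clockwise rotation on screen.
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360
```

For horizontal text the top edge runs left to right and the result is 0; for the 180-degree case, vertex 0 sits at the bottom-right of the drawn box and the result is 180.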
BlockType
Type of a block (text, image, etc.) as identified by OCR.
Enums | |
---|---|
UNKNOWN |
Unknown block type. |
TEXT |
Regular text block. |
TABLE |
Table block. |
PICTURE |
Image block. |
RULER |
Horizontal/vertical line box. |
BARCODE |
Barcode block. |
ImageAnnotationContext
If an image was produced from a file (e.g. a PDF), this message gives information about the source of that image.
JSON representation |
---|
{ "uri": string, "pageNumber": integer } |
Fields | |
---|---|
uri |
The URI of the file used to produce the image. |
pageNumber |
If the file was a PDF or TIFF, this field gives the page number within the file used to produce the image. |
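When several responses come from one multi-page file, `uri` and `pageNumber` can be used to put them back in order. The following is a sketch under the assumption that each response is a parsed dictionary; `group_by_source` is a hypothetical helper, and responses without a `context` are grouped under `None`.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def group_by_source(
    responses: List[Dict[str, Any]]
) -> Dict[Optional[str], List[Dict[str, Any]]]:
    """Group responses by context.uri and sort each group by pageNumber."""
    grouped = defaultdict(list)
    for response in responses:
        context = response.get("context") or {}
        grouped[context.get("uri")].append(response)
    for items in grouped.values():
        # Assumption: a missing pageNumber sorts first, as page 0.
        items.sort(key=lambda r: (r.get("context") or {}).get("pageNumber", 0))
    return dict(grouped)
```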