Class TextAnnotation (2.3.2)

TextAnnotation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

TextAnnotation contains a structured representation of OCR-extracted text. The hierarchy of an OCR-extracted text structure is:

TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol

Each structural component, starting from Page, may have its own properties. Properties describe detected languages, breaks, etc. Refer to the TextAnnotation.TextProperty message definition below for more detail.
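The hierarchy above can be walked with nested loops. The following is a minimal sketch, not the library's own helper: `iter_words` is a hypothetical function, and the child field names (`blocks`, `paragraphs`, `words`, `symbols`, and `text` on each symbol) are assumptions taken from the `vision_v1` message types rather than from this section. To stay runnable without credentials, the demo uses `SimpleNamespace` stand-ins in place of a real API response.

```python
from types import SimpleNamespace


def iter_words(annotation):
    """Yield each word's text by walking Page -> Block -> Paragraph -> Word -> Symbol.

    Assumes the child fields are named blocks, paragraphs, words and symbols,
    and that each symbol exposes its character(s) as `text`.
    """
    for page in annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    yield "".join(symbol.text for symbol in word.symbols)


def _word(text):
    """Build a stand-in Word whose symbols are the characters of `text`."""
    return SimpleNamespace(symbols=[SimpleNamespace(text=ch) for ch in text])


# Stand-in for a TextAnnotation; a real one would come from an OCR response.
annotation = SimpleNamespace(
    text="Hello world\n",
    pages=[
        SimpleNamespace(
            blocks=[
                SimpleNamespace(
                    paragraphs=[
                        SimpleNamespace(words=[_word("Hello"), _word("world")])
                    ]
                )
            ]
        )
    ],
)

words = list(iter_words(annotation))
print(words)
```

Because the function only relies on attribute access, the same loop should apply to a real TextAnnotation, provided the assumed field names hold for the installed library version.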

Attributes

.. attribute:: pages

List of pages detected by OCR.

:type: Sequence[google.cloud.vision_v1.types.Page]

.. attribute:: text

UTF-8 text detected on the pages.

:type: str

Classes

DetectedBreak

DetectedBreak(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Detected start or end of a structural component.

.. attribute:: type_

Detected break type.

:type: google.cloud.vision_v1.types.TextAnnotation.DetectedBreak.BreakType

DetectedLanguage

DetectedLanguage(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Detected language for a structural component.

.. attribute:: language_code

The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see http://www.unicode.org/reports/tr35/#Unicode_locale_identifier.

:type: str

TextProperty

TextProperty(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Additional information detected on the structural component.

.. attribute:: detected_languages

A list of detected languages together with confidence.

:type: Sequence[google.cloud.vision_v1.types.TextAnnotation.DetectedLanguage]
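Since `detected_languages` pairs each language with a confidence, a caller may want the most likely language for a component. The following is a minimal sketch under stated assumptions: `dominant_language` and its `min_confidence` parameter are hypothetical, and the `confidence` field name on each DetectedLanguage is assumed, since this section documents only `language_code`. The demo uses `SimpleNamespace` stand-ins rather than a real response.

```python
from types import SimpleNamespace


def dominant_language(text_property, min_confidence=0.0):
    """Return the language_code with the highest confidence, or None.

    Assumes each entry in detected_languages has a numeric `confidence`.
    Entries below `min_confidence` are ignored.
    """
    candidates = [
        language
        for language in text_property.detected_languages
        if language.confidence >= min_confidence
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda language: language.confidence)
    return best.language_code


# Stand-in for a TextProperty attached to some structural component.
text_property = SimpleNamespace(
    detected_languages=[
        SimpleNamespace(language_code="sr-Latn", confidence=0.1),
        SimpleNamespace(language_code="en-US", confidence=0.9),
    ]
)

top = dominant_language(text_property)
print(top)
```

Returning None when nothing passes the threshold lets callers distinguish "no confident language" from a real code without raising an exception.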