Class Page (0.11.1a0)

Page(
    documentai_object: google.cloud.documentai_v1.types.document.Document.Page,
    document_text: str,
)

Represents a wrapped documentai.Document.Page.

Attributes

Name Description
documentai_object google.cloud.documentai.Document.Page
Required. The original google.cloud.documentai.Document.Page object.
document_text str
Required. The full text of the Document containing the Page.
text str
Required. UTF-8 encoded text of the page.
page_number int
Required. The page number of the Page.
form_fields List[FormField]
Required. A list of visually detected form fields on the page.
symbols List[Symbol]
Required. A list of visually detected text symbols (characters/letters) on the page.
tokens List[Token]
Required. A list of visually detected text tokens (words) on the page.
lines List[Line]
Required. A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line.
paragraphs List[Paragraph]
Required. A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph.
blocks List[Block]
Required. A list of visually detected text blocks on the page. A collection of lines that a human would perceive as a block.
tables List[Table]
Required. A list of visually detected tables on the page.
math_formulas List[MathFormula]
Optional. A list of visually detected math formulas on the page.
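The text attribute relates to document_text through text anchors: in Document AI, a layout carries text segments whose start and end indexes point into the full document text. The following is a minimal sketch of that lookup using plain stand-in types. Segment and text_from_segments are hypothetical names used for illustration only; they are not part of the toolbox API, and the toolbox's own helper may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a text-anchor segment (start/end offsets)."""

    start_index: int
    end_index: int


def text_from_segments(document_text: str, segments: List[Segment]) -> str:
    """Join the slices of document_text that the segments point at."""
    return "".join(document_text[s.start_index:s.end_index] for s in segments)


document_text = "Invoice 42\nTotal: 10 USD\n"
# Assumption for this example: the element covers the first line only.
page_text = text_from_segments(document_text, [Segment(0, 11)])
print(repr(page_text))
```

The same slicing idea applies to smaller elements such as tokens and lines, each with its own segments into the shared document text.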

Properties

hocr_bounding_box

hOCR bounding box of the page element.
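In the hOCR format, a bounding box is written as "bbox x0 y0 x1 y1" in pixel coordinates. The sketch below shows one way such a string could be built from normalized vertices; hocr_bbox and the page dimensions are hypothetical, and the exact value returned by the property is not specified here.

```python
from typing import List, Tuple


def hocr_bbox(
    normalized_vertices: List[Tuple[float, float]],
    page_width: float,
    page_height: float,
) -> str:
    """Format normalized (0..1) vertices as an hOCR 'bbox x0 y0 x1 y1' string."""
    # Scale each normalized coordinate to pixels, then take the extremes.
    xs = [round(x * page_width) for x, _ in normalized_vertices]
    ys = [round(y * page_height) for _, y in normalized_vertices]
    return f"bbox {min(xs)} {min(ys)} {max(xs)} {max(ys)}"


# A full-page box on a hypothetical 1000 x 1400 pixel page.
full_page = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(hocr_bbox(full_page, 1000, 1400))
```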

Methods

__post_init__

__post_init__() -> None

Initialization order: Symbol, Token, Line, Paragraph, Block.
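__post_init__ is the hook that Python dataclasses call right after the generated __init__, which makes it a natural place to fill in derived attributes. The sketch below illustrates the mechanism with the order listed above, smallest element first. PageSketch and init_order are hypothetical stand-ins and do not reproduce the toolbox implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageSketch:
    """Hypothetical stand-in; not the toolbox Page class."""

    document_text: str
    # Derived field: excluded from __init__ and filled in __post_init__.
    init_order: List[str] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        # Runs automatically after the generated __init__.
        # Record the order in which derived attributes would be populated.
        for name in ("symbols", "tokens", "lines", "paragraphs", "blocks"):
            self.init_order.append(name)


page = PageSketch(document_text="Hello world\n")
print(page.init_order)
```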