Page(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page,
document_text: str,
)
Represents a wrapped documentai.Document.Page.
Attributes

| Name | Type | Description |
|---|---|---|
| documentai_object | google.cloud.documentai.Document.Page | Required. The original google.cloud.documentai.Document.Page object. |
| document_text | str | Required. The full text of the Document containing the Page. |
| text | str | Required. UTF-8 encoded text of the page. |
| page_number | int | Required. The page number of the Page. |
| form_fields | List[FormField] | Required. A list of visually detected form fields on the page. |
| symbols | List[Symbol] | Required. A list of visually detected text symbols (characters/letters) on the page. |
| tokens | List[Token] | Required. A list of visually detected text tokens (words) on the page. |
| lines | List[Line] | Required. A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line. |
| paragraphs | List[Paragraph] | Required. A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph. |
| blocks | List[Block] | Required. A list of visually detected text blocks on the page. A collection of lines that a human would perceive as a block. |
| tables | List[Table] | Required. A list of visually detected tables on the page. |
| math_formulas | List[MathFormula] | Optional. A list of visually detected math formulas on the page. |
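The document_text argument is needed because Document AI page elements generally do not carry their own text; each one holds a text anchor whose segments are offsets into the full document text. The following is a minimal sketch of that lookup. TextSegment and text_from_segments are hypothetical stand-ins written for illustration, not names from the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextSegment:
    """Hypothetical stand-in for one text anchor segment (start/end offsets)."""
    start_index: int
    end_index: int


def text_from_segments(segments: List[TextSegment], document_text: str) -> str:
    """Join the slices of document_text that the segments point to."""
    return "".join(document_text[s.start_index:s.end_index] for s in segments)


# The full document text, as it would be passed in document_text.
document_text = "Invoice 42\nTotal: 10 USD\n"

# A page element whose anchor covers the first line of the document.
page_text = text_from_segments([TextSegment(0, 11)], document_text)
```

An element may have several segments, in which case the slices are concatenated in order.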
Properties
hocr_bounding_box
hOCR bounding box of the page element.
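In hOCR, a bounding box is written as a `bbox x0 y0 x1 y1` property giving the top-left and bottom-right corners in pixels. The sketch below shows one way such a string could be derived from normalized vertices (coordinates between 0 and 1) and the page dimensions. The function name, its signature, and the rounding choice are assumptions made for illustration; this is not the library's implementation.

```python
from typing import List, Tuple


def hocr_bbox(
    normalized_vertices: List[Tuple[float, float]], width: int, height: int
) -> str:
    """Build an hOCR-style bbox string from normalized (x, y) vertices.

    Hypothetical helper: scales each vertex to pixels, then takes the
    extreme coordinates as the box corners.
    """
    xs = [round(x * width) for x, _ in normalized_vertices]
    ys = [round(y * height) for _, y in normalized_vertices]
    return f"bbox {min(xs)} {min(ys)} {max(xs)} {max(ys)}"


# Four corners of a rectangle on an 800 x 1000 pixel page.
box = hocr_bbox(
    [(0.25, 0.125), (0.75, 0.125), (0.75, 0.5), (0.25, 0.5)],
    width=800,
    height=1000,
)
```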
Methods
__post_init__
__post_init__() -> None
Initializes the page elements in this order: Symbol, Token, Line, Paragraph, Block.
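Building the smallest elements first lets each larger container be assembled from elements that already exist. The sketch below illustrates that pattern with a dataclass `__post_init__`. PageSketch is a hypothetical stand-in that works on plain strings; it only mirrors the ordering idea and does not reproduce how the library builds its Symbol, Token, or Line objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageSketch:
    """Hypothetical stand-in showing ordered initialization in __post_init__."""
    text: str
    symbols: List[str] = field(init=False)
    tokens: List[str] = field(init=False)
    lines: List[List[str]] = field(init=False)

    def __post_init__(self) -> None:
        # 1. Symbols: the smallest unit (non-whitespace characters here).
        self.symbols = [ch for ch in self.text if not ch.isspace()]
        # 2. Tokens: words.
        self.tokens = self.text.split()
        # 3. Lines: built last, by grouping the tokens created in step 2.
        self.lines = []
        position = 0
        for raw_line in self.text.splitlines():
            count = len(raw_line.split())
            if count:
                self.lines.append(self.tokens[position:position + count])
                position += count


page = PageSketch(text="Total due\n10 USD\n")
```

Because the lines are sliced out of the token list, swapping steps 2 and 3 would fail, which is why the order matters.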