Wrappers for Document AI Page type.
Classes
Block
Block(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.Block,
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Block.
FormField
FormField(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.FormField,
document_text: dataclasses.InitVar[str],
)
Represents a wrapped documentai.Document.Page.FormField.
Line
Line(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.Line,
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Line.
Page
Page(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page,
document_text: str,
)
Represents a wrapped documentai.Document.Page.
Paragraph
Paragraph(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Paragraph.
Table
Table(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.Table,
document_text: dataclasses.InitVar[str],
)
Represents a wrapped documentai.Document.Page.Table.
Token
Token(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.Token,
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Token.
Functions
_get_children_of_element
_get_children_of_element(
element: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
children: typing.Union[
typing.List[google.cloud.documentai_toolbox.wrappers.page.Paragraph],
typing.List[google.cloud.documentai_toolbox.wrappers.page.Block],
typing.List[google.cloud.documentai_toolbox.wrappers.page.Token],
typing.List[google.cloud.documentai_toolbox.wrappers.page.Line],
],
) -> typing.List[
typing.Union[
google.cloud.documentai_toolbox.wrappers.page.Block,
google.cloud.documentai_toolbox.wrappers.page.Paragraph,
google.cloud.documentai_toolbox.wrappers.page.Line,
google.cloud.documentai_toolbox.wrappers.page.Token,
]
]
Returns a list of children inside element.
Parameters | |
---|---|
Name | Description |
element | ElementWithLayout. Required. An element on a page. |
children | ChildrenElements. Required. List of wrapped children. |
Returns | |
---|---|
Type | Description |
List[Union[Block, Paragraph, Line, Token]] | A list of wrapped children that are inside an element. |
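The containment check can be illustrated with a minimal sketch. It assumes, rather than takes from the library source, that a child counts as "inside" an element when the child's text range falls within the element's text range. `Span` and `children_inside` are hypothetical stand-ins and are not part of documentai_toolbox.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical stand-in for an element and its text offsets."""

    name: str
    start_index: int
    end_index: int


def children_inside(element: Span, children: List[Span]) -> List[Span]:
    """Keep the children whose text range lies within the element's range."""
    return [
        child
        for child in children
        if child.start_index >= element.start_index
        and child.end_index <= element.end_index
    ]


# A paragraph covering offsets 0-11 and three candidate tokens.
paragraph = Span("paragraph", 0, 11)
tokens = [Span("Hello", 0, 6), Span("world", 6, 11), Span("Next", 11, 16)]

inside = children_inside(paragraph, tokens)
print([token.name for token in inside])  # ['Hello', 'world']
```

The third token starts where the paragraph ends, so it is excluded under this assumed rule.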
_get_hocr_bounding_box
_get_hocr_bounding_box(
element_with_layout: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
page_dimension: google.cloud.documentai_v1.types.document.Document.Page.Dimension,
) -> str
Returns an hOCR bounding box string.
Parameters | |
---|---|
Name | Description |
element_with_layout | ElementWithLayout. Required. An element with layout fields. |
page_dimension | documentai.Document.Page.Dimension. Required. Page dimension. |
Returns | |
---|---|
Type | Description |
str | hOCR bounding box string. |
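As an illustration, an hOCR `bbox` property has the form `bbox x0 y0 x1 y1` in pixel coordinates. The sketch below assumes the element's bounding polygon is given as normalized vertices (values between 0 and 1) that are scaled by the page dimension; whether the library computes the box exactly this way is an assumption. `NormalizedVertex`, `Dimension`, and `hocr_bbox` are hypothetical stand-ins, not library types.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NormalizedVertex:
    """Hypothetical stand-in: a vertex with coordinates in the range 0..1."""

    x: float
    y: float


@dataclass
class Dimension:
    """Hypothetical stand-in for the page width and height in pixels."""

    width: float
    height: float


def hocr_bbox(vertices: List[NormalizedVertex], dim: Dimension) -> str:
    """Scale normalized vertices to pixels and format an hOCR bbox string."""
    xs = [round(v.x * dim.width) for v in vertices]
    ys = [round(v.y * dim.height) for v in vertices]
    return f"bbox {min(xs)} {min(ys)} {max(xs)} {max(ys)}"


page = Dimension(width=1000, height=2000)
corners = [
    NormalizedVertex(0.1, 0.2),
    NormalizedVertex(0.5, 0.2),
    NormalizedVertex(0.5, 0.25),
    NormalizedVertex(0.1, 0.25),
]
bbox = hocr_bbox(corners, page)
print(bbox)  # bbox 100 400 500 500
```

Taking the minimum and maximum over all vertices keeps the sketch independent of vertex order.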
_table_rows_from_documentai_table_rows
_table_rows_from_documentai_table_rows(
table_rows: typing.List[
google.cloud.documentai_v1.types.document.Document.Page.Table.TableRow
],
text: str,
) -> typing.List[typing.List[str]]
Returns a list of rows from table_rows.
Parameters | |
---|---|
Name | Description |
table_rows | List[documentai.Document.Page.Table.TableRow]. Required. A list of documentai.Document.Page.Table.TableRow objects. |
text | str. Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
List[List[str]] | A list of table rows. |
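The shape of the return value can be shown with a minimal sketch. It assumes each cell is reduced to a pair of offsets into the document text and that cell text is stripped of surrounding whitespace; both are simplifications for illustration. `Cell` and `rows_to_text` are hypothetical names, not part of the library.

```python
from typing import List, Tuple

# Hypothetical stand-in: a cell is a (start, end) offset pair into the text,
# and a row is a list of such cells.
Cell = Tuple[int, int]


def rows_to_text(rows: List[List[Cell]], text: str) -> List[List[str]]:
    """Resolve every cell's offsets against the document text, row by row."""
    return [[text[start:end].strip() for start, end in row] for row in rows]


document_text = "Item\nPrice\nTea\n$3\n"
rows = [
    [(0, 5), (5, 11)],    # header row: "Item", "Price"
    [(11, 15), (15, 18)],  # body row: "Tea", "$3"
]
table = rows_to_text(rows, document_text)
print(table)  # [['Item', 'Price'], ['Tea', '$3']]
```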
_text_from_layout
_text_from_layout(
layout: google.cloud.documentai_v1.types.document.Document.Page.Layout, text: str
) -> str
Returns the text from a single layout element.
Parameters | |
---|---|
Name | Description |
layout | documentai.Document.Page.Layout. Required. An element with layout fields. |
text | str. Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
str | Text from a single element. |
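A Document AI layout refers to its text through a text anchor whose segments hold `start_index` and `end_index` offsets into the document text. The sketch below assumes the element's text is the concatenation of those slices. `TextSegment` and `TextAnchor` here are plain stand-in dataclasses that only mirror the field names, and `text_from_anchor` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextSegment:
    """Stand-in for one offset range into the document text."""

    start_index: int
    end_index: int


@dataclass
class TextAnchor:
    """Stand-in for a text anchor holding one or more segments."""

    text_segments: List[TextSegment] = field(default_factory=list)


def text_from_anchor(anchor: TextAnchor, text: str) -> str:
    """Concatenate the slices of the document text named by each segment."""
    return "".join(
        text[segment.start_index:segment.end_index]
        for segment in anchor.text_segments
    )


document_text = "Invoice 42\nTotal due\n"
anchor = TextAnchor([TextSegment(0, 8), TextSegment(11, 16)])
extracted = text_from_anchor(anchor, document_text)
print(repr(extracted))  # 'Invoice Total'
```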
_trim_text
_trim_text(text: str) -> str
Removes extra whitespace characters from text (blank, newline, tab, etc.).
Parameter | |
---|---|
Name | Description |
text | str. Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
str | Text without trailing spaces/newlines. |
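The description leaves the exact trimming rule open. One plausible reading, sketched below as an assumption and not as the library's implementation, strips leading and trailing whitespace and replaces interior newlines with spaces. `trim_text` is a hypothetical name.

```python
def trim_text(text: str) -> str:
    """Strip surrounding whitespace and replace interior newlines with spaces."""
    return text.strip().replace("\n", " ")


trimmed = trim_text("  Total\ndue \n")
print(repr(trimmed))  # 'Total due'
```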