Wrappers for Document AI Page type.
Classes
Block
Block(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Block.
FormField
FormField(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.FormField,
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.FormField.
Line
Line(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Line.
MathFormula
MathFormula(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.VisualElement with type math_formula.
https://cloud.google.com/document-ai/docs/process-documents-ocr#math_ocr
Page
Page(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page,
_document_text: str,
)
Represents a wrapped documentai.Document.Page.
Paragraph
Paragraph(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Paragraph.
Symbol
Symbol(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Symbol.
https://cloud.google.com/document-ai/docs/process-documents-ocr#enable_symbols
Table
Table(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page.Table,
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Table.
Token
Token(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Represents a wrapped documentai.Document.Page.Token.
_BasePageElement
_BasePageElement(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Base class for representing a wrapped Document AI Page element (Symbol, Token, Line, Paragraph, Block).
Functions
_get_children_of_element
_get_children_of_element(
element: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
children: typing.List[
typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
]
],
) -> typing.List[
typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
]
]
Returns a list of children inside element.
Parameters | |
---|---|
Name | Description |
element | ElementWithLayout. Required. An element in a page. |
children | List[ElementWithLayout]. Required. List of wrapped children. |
Returns | |
---|---|
Type | Description |
List[ElementWithLayout] | A list of wrapped children that are inside an element. |
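The reference does not say how "inside" is decided. The following is a minimal sketch under the assumption that a child is inside an element when its text range falls within the element's text range. `Span` and `children_inside` are hypothetical stand-ins for illustration, not names from the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical stand-in for the text range covered by a page element."""

    start_index: int
    end_index: int


def children_inside(element: Span, children: List[Span]) -> List[Span]:
    """Keep the children whose text range lies within the element's range.

    Assumption: containment is judged on text offsets, not on geometry.
    """
    return [
        child
        for child in children
        if child.start_index >= element.start_index
        and child.end_index <= element.end_index
    ]


paragraph = Span(0, 11)
tokens = [Span(0, 5), Span(6, 11), Span(12, 20)]
inside = children_inside(paragraph, tokens)  # the third token starts past the paragraph
```

Comparing text offsets rather than bounding boxes keeps the check independent of page rotation or skew, which is one reason a text-based test may be preferable here.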
_get_hocr_bounding_box
_get_hocr_bounding_box(
element_with_layout: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
page_dimension: google.cloud.documentai_v1.types.document.Document.Page.Dimension,
) -> typing.Optional[str]
Returns an hOCR bounding box string.
Parameters | |
---|---|
Name | Description |
element_with_layout | ElementWithLayout. Required. An element with layout fields. |
page_dimension | documentai.Document.Page.Dimension. Required. Page dimension. |
Returns | |
---|---|
Type | Description |
Optional[str] | hOCR bounding box string. |
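In hOCR, a bounding box is written as the property `bbox x0 y0 x1 y1` in pixel coordinates. The sketch below shows one way such a string could be produced from normalized vertices and a page size. It is an illustration only: `hocr_bbox_sketch` is a hypothetical helper, it takes plain tuples instead of Document AI objects, and the rounding rule and the `None` return for a missing polygon are assumptions.

```python
from typing import List, Optional, Tuple


def hocr_bbox_sketch(
    normalized_vertices: List[Tuple[float, float]],
    page_width: float,
    page_height: float,
) -> Optional[str]:
    """Build an hOCR ``bbox`` property from vertices in the 0..1 range."""
    if not normalized_vertices:
        # Assumption: no bounding polygon means no bounding box string.
        return None
    # Scale to pixels and round half up.
    xs = [int(x * page_width + 0.5) for x, _ in normalized_vertices]
    ys = [int(y * page_height + 0.5) for _, y in normalized_vertices]
    return f"bbox {min(xs)} {min(ys)} {max(xs)} {max(ys)}"


corners = [(0.1, 0.2), (0.5, 0.2), (0.5, 0.4), (0.1, 0.4)]
box = hocr_bbox_sketch(corners, page_width=1000, page_height=500)
print(box)  # bbox 100 100 500 200
```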
_table_rows_from_documentai_table_rows
_table_rows_from_documentai_table_rows(
table_rows: typing.List[
google.cloud.documentai_v1.types.document.Document.Page.Table.TableRow
],
text: str,
) -> typing.List[typing.List[str]]
Returns a list of rows from table_rows.
Parameters | |
---|---|
Name | Description |
table_rows | List[documentai.Document.Page.Table.TableRow]. Required. A list of documentai.Document.Page.Table.TableRow objects. |
text | str. Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
List[List[str]] | A list of table rows. |
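The shape of the return value, a list of rows where each row is a list of cell strings, can be illustrated with a small sketch. It assumes each cell is described by a (start, end) offset pair into the document text, which simplifies the real TableRow structure. `table_rows_sketch` and the sample data are hypothetical.

```python
from typing import List, Tuple

# Hypothetical simplification: one (start, end) text offset pair per cell.
CellSpan = Tuple[int, int]


def table_rows_sketch(rows: List[List[CellSpan]], text: str) -> List[List[str]]:
    """Resolve every cell's offsets against the document text."""
    return [
        # Assumption: cell text is stripped and inner newlines become spaces.
        [text[start:end].strip().replace("\n", " ") for start, end in row]
        for row in rows
    ]


document_text = "Item\nPrice\nTea\n$3\n"
rows = [[(0, 5), (5, 11)], [(11, 15), (15, 18)]]
table = table_rows_sketch(rows, document_text)
print(table)  # [['Item', 'Price'], ['Tea', '$3']]
```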
_text_from_layout
_text_from_layout(
layout: google.cloud.documentai_v1.types.document.Document.Page.Layout, text: str
) -> str
Returns the text of a single layout element.
Parameters | |
---|---|
Name | Description |
layout | documentai.Document.Page.Layout. Required. An element with layout fields. |
text | str. Required. UTF-8 encoded text in reading order of the document. |
Returns | |
---|---|
Type | Description |
str | Text from a single element. |
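A Document AI layout does not carry its own text. It points into the document text through text segments with start and end indices. The sketch below shows that lookup with hypothetical stand-in classes (`TextSegment`, `Layout`) that flatten the real `layout.text_anchor.text_segments` nesting. It is not the library's implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextSegment:
    """Hypothetical stand-in for one text segment of a layout."""

    start_index: int
    end_index: int


@dataclass
class Layout:
    """Hypothetical stand-in that holds the segments directly."""

    text_segments: List[TextSegment] = field(default_factory=list)


def text_from_layout_sketch(layout: Layout, text: str) -> str:
    """Concatenate the slices of the document text named by each segment."""
    return "".join(
        text[segment.start_index : segment.end_index]
        for segment in layout.text_segments
    )


document_text = "Invoice 42\nTotal due\n"
layout = Layout([TextSegment(11, 20)])
print(text_from_layout_sketch(layout, document_text))  # Total due
```

A layout with no segments yields an empty string in this sketch, which keeps callers free of special cases for elements without text.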
_trim_text
_trim_text(text: str) -> str
Removes extra space characters (blank, newline, tab, etc.) from text.
Parameters | |
---|---|
Name | Description |
text | str. Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
str | Text without trailing spaces or newlines. |
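The description leaves some room for interpretation. One plausible reading, shown here as a sketch, strips whitespace from both ends and replaces inner newlines with spaces. `trim_text_sketch` is a hypothetical name, and the exact behavior of the library function may differ.

```python
def trim_text_sketch(text: str) -> str:
    """Strip surrounding whitespace and turn inner newlines into spaces.

    Assumption: inner runs of spaces and tabs are left untouched.
    """
    return text.strip().replace("\n", " ")


cleaned = trim_text_sketch("  Hello\nworld \n")
print(cleaned)  # Hello world
```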