Class _BasePageElement (0.13.5a0)

_BasePageElement(
    documentai_object: typing.Union[
        google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
        google.cloud.documentai_v1.types.document.Document.Page,
        google.cloud.documentai_v1.types.document.Document.Page.Token,
        google.cloud.documentai_v1.types.document.Document.Page.Block,
        google.cloud.documentai_v1.types.document.Document.Page.Symbol,
    ],
    _page: google.cloud.documentai_toolbox.wrappers.page.Page,
)

Base class for representing a wrapped Document AI Page element (Symbol, Token, Line, Paragraph, Block).

Properties

_text_segment

Page element text segment.

hocr_bounding_box

hOCR bounding box of the page element.

text

Text of the page element.
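
For context on hocr_bounding_box: the hOCR format writes a bounding box as `bbox x0 y0 x1 y1` in pixel coordinates. The following is a minimal sketch of producing such a string, assuming the element's bounds are available as normalized (0 to 1) vertices and the page size is known in pixels. The function name `hocr_bbox` and its parameters are hypothetical, and this page does not state how the library itself computes the value.

```python
from typing import List, Tuple


def hocr_bbox(
    normalized_vertices: List[Tuple[float, float]],
    page_width: int,
    page_height: int,
) -> str:
    """Hypothetical helper: build an hOCR-style bbox string.

    Assumes each vertex is an (x, y) pair normalized to the 0..1 range.
    """
    xs = [x for x, _ in normalized_vertices]
    ys = [y for _, y in normalized_vertices]
    # Scale the extreme coordinates to pixels and truncate to integers.
    x0 = int(min(xs) * page_width)
    y0 = int(min(ys) * page_height)
    x1 = int(max(xs) * page_width)
    y1 = int(max(ys) * page_height)
    return f"bbox {x0} {y0} {x1} {y1}"


box = hocr_bbox(
    [(0.25, 0.125), (0.75, 0.125), (0.75, 0.5), (0.25, 0.5)],
    page_width=1000,
    page_height=2000,
)
print(box)  # bbox 250 250 750 1000
```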

Methods

_get_children_of_element

_get_children_of_element(
    potential_children: typing.List[
        google.cloud.documentai_toolbox.wrappers.page._BasePageElement
    ],
) -> typing.List[google.cloud.documentai_toolbox.wrappers.page._BasePageElement]

Filters potential child elements to identify only those fully contained within this element.

This method iterates through a list of potential child elements, checking if their start and end indices fall completely within the start and end indices of this element. Elements that are only partially contained or entirely outside this element's range are excluded.

Parameter
Name Description
potential_children List[_BasePageElement] Required. A list of wrapped page elements (e.g., words, lines, paragraphs) that could potentially be children of this element.

Returns
Type Description
List[_BasePageElement] A new list containing only the wrapped page elements that are fully contained within this element, maintaining their original order.
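
The containment check described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `SegmentedElement` is a hypothetical stand-in for a wrapped page element, its `start_index` and `end_index` fields are assumed to be offsets into the document text, and a child whose bounds coincide with the parent's is assumed to count as contained.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentedElement:
    """Hypothetical stand-in for a wrapped page element (not the library class)."""

    name: str
    start_index: int
    end_index: int

    def children_of(
        self, potential_children: List["SegmentedElement"]
    ) -> List["SegmentedElement"]:
        # Keep only candidates whose span lies entirely inside this
        # element's span; the list comprehension preserves input order.
        return [
            child
            for child in potential_children
            if child.start_index >= self.start_index
            and child.end_index <= self.end_index
        ]


paragraph = SegmentedElement("paragraph", 10, 30)
candidates = [
    SegmentedElement("before", 0, 9),       # entirely outside
    SegmentedElement("inside_a", 10, 15),   # fully contained
    SegmentedElement("straddles", 25, 35),  # only partially contained
    SegmentedElement("inside_b", 16, 30),   # fully contained
]
children = paragraph.children_of(candidates)
print([c.name for c in children])  # ['inside_a', 'inside_b']
```

The result is a new list, so the `candidates` list passed in is left unchanged.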