Class Page (1.0.0)

Page(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A page in a Document.

.. attribute:: page_number

   1-based index for current Page in a parent Document. Useful when a page is taken out of a Document for individual processing.

   :type: int
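
Because `page_number` is 1-based and survives extraction, it is safer to look a page up by that value than by its list position. The following is a minimal sketch under that assumption; `find_page` is a hypothetical helper and the `SimpleNamespace` objects are stand-ins for real `Page` messages, not part of the library.

```python
from types import SimpleNamespace


def find_page(pages, page_number):
    """Return the page whose 1-based page_number matches, or None."""
    for page in pages:
        if page.page_number == page_number:
            return page
    return None


# Stand-in pages (hypothetical); real ones come from a processed Document.
pages = [SimpleNamespace(page_number=n) for n in (1, 2, 3)]

# After taking pages 2 and 3 out for individual processing, list positions
# no longer line up with page numbers, but page_number still identifies them.
subset = pages[1:]
third = find_page(subset, 3)
print(third.page_number, subset.index(third))
```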

Attributes
Name	Description
`image`	`google.cloud.documentai_v1beta3.types.Document.Page.Image` Rendered image for this page. This image is preprocessed to remove any skew, rotation, and distortions such that the annotation bounding boxes can be upright and axis-aligned.
`transforms`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.Matrix]` Transformation matrices that were applied to the original document image to produce Page.image.
`dimension`	`google.cloud.documentai_v1beta3.types.Document.Page.Dimension` Physical dimension of the page.
`layout`	`google.cloud.documentai_v1beta3.types.Document.Page.Layout` Layout for the page.
`detected_languages`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.DetectedLanguage]` A list of detected languages together with confidence.
`blocks`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.Block]` A list of visually detected text blocks on the page. A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
`paragraphs`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.Paragraph]` A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph.
`lines`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.Line]` A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line.
`tokens`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.Token]` A list of visually detected tokens on the page.
`visual_elements`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.VisualElement]` A list of detected non-text visual elements on the page, e.g. checkbox, signature.
`tables`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.Table]` A list of visually detected tables on the page.
`form_fields`	`Sequence[google.cloud.documentai_v1beta3.types.Document.Page.FormField]` A list of visually detected form fields on the page.
`provenance`	`google.cloud.documentai_v1beta3.types.Document.Provenance` The history of this page.
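
As a rough illustration of how the repeated attributes above might be inspected, the sketch below counts the elements in each sequence-valued field. `summarize_page` is a hypothetical helper, and the page object is a `SimpleNamespace` stand-in that only mimics the attribute names from the table; it is not a real `Page` message.

```python
from types import SimpleNamespace

# Sequence-valued layout attributes listed in the table above.
REPEATED_FIELDS = (
    "blocks",
    "paragraphs",
    "lines",
    "tokens",
    "visual_elements",
    "tables",
    "form_fields",
)


def summarize_page(page):
    """Map each repeated attribute name to the number of elements it holds."""
    return {name: len(getattr(page, name)) for name in REPEATED_FIELDS}


# Stand-in page (hypothetical); a real one comes from a processed Document.
fake_page = SimpleNamespace(
    blocks=[object()],
    paragraphs=[object(), object()],
    lines=[object()] * 3,
    tokens=[object()] * 12,
    visual_elements=[],
    tables=[object()],
    form_fields=[],
)

summary = summarize_page(fake_page)
print(summary)
```

Since the helper relies only on attribute access and `len()`, it should work the same way on any object exposing these attributes as sequences.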

Classes

Block

Block(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.

DetectedLanguage

DetectedLanguage(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Detected language for a structural component.

.. attribute:: language_code

   The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see http://www.unicode.org/reports/tr35/#Unicode_locale_identifier.

   :type: str
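
The `detected_languages` attribute pairs each language with a confidence, so a common need is to pick the most likely one. The sketch below assumes each entry exposes a numeric `confidence` attribute next to `language_code`; that field name is not shown in the listing above. `dominant_language` is a hypothetical helper and the entries are stand-in objects.

```python
from types import SimpleNamespace


def dominant_language(detected_languages, min_confidence=0.0):
    """Return the language_code with the highest confidence, or None."""
    candidates = [
        lang for lang in detected_languages if lang.confidence >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda lang: lang.confidence).language_code


# Stand-in entries (hypothetical); `confidence` is an assumed field name.
langs = [
    SimpleNamespace(language_code="en-US", confidence=0.91),
    SimpleNamespace(language_code="sr-Latn", confidence=0.07),
]

best = dominant_language(langs)
print(best)
```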

Dimension

Dimension(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Dimension for the page.

.. attribute:: width

   Page width.

   :type: float

FormField

FormField(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A form field detected on the page.

.. attribute:: field_name

   Layout for the FormField name, e.g. Address, Email, Grand total, Phone number.

   :type: google.cloud.documentai_v1beta3.types.Document.Page.Layout

Image

Image(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Rendered image contents for this page.

.. attribute:: content

   Raw byte content of the image.

   :type: bytes

Layout

Layout(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Visual element describing a layout unit on a page.

.. attribute:: text_anchor

   Text anchor indexing into the Document.text.

   :type: google.cloud.documentai_v1beta3.types.Document.TextAnchor
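
Since a `Layout` carries no text of its own, its text has to be recovered by slicing `Document.text` with the anchor. The sketch below assumes the anchor exposes `text_segments`, each with `start_index` and `end_index` offsets; those names are not shown in the listing above. `layout_text` is a hypothetical helper and the objects are stand-ins.

```python
from types import SimpleNamespace


def layout_text(layout, document_text):
    """Join the slices of document_text that the layout's text anchor covers."""
    parts = []
    for segment in layout.text_anchor.text_segments:
        # Offsets are treated as half-open [start_index, end_index) ranges.
        start = int(segment.start_index)
        end = int(segment.end_index)
        parts.append(document_text[start:end])
    return "".join(parts)


# Stand-in document text and layout (hypothetical values).
document_text = "Invoice 42\nTotal: 10 EUR\n"
layout = SimpleNamespace(
    text_anchor=SimpleNamespace(
        text_segments=[
            SimpleNamespace(start_index=0, end_index=7),
            SimpleNamespace(start_index=11, end_index=16),
        ]
    )
)

text = layout_text(layout, document_text)
print(text)
```

The same helper could be applied to the `layout` of a token, line, paragraph, block, or table, since each of them refers back to the document text through a `Layout`.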

Line

Line(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A collection of tokens that a human would perceive as a line. A line does not cross column boundaries and can be horizontal, vertical, etc.

Matrix

Matrix(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Representation of a transformation matrix, intended to be compatible with the OpenCV format for image manipulation.
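
To make the OpenCV compatibility concrete, the sketch below decodes a matrix payload into rows of floats using only the standard library. It assumes the message exposes `rows`, `cols`, `type_` (an OpenCV type code), and `data` (raw bytes), none of which appear in the listing above, and it further assumes single-channel, little-endian, row-major data. `matrix_rows` is a hypothetical helper.

```python
import struct
from types import SimpleNamespace

# OpenCV type codes for single-channel float matrices (CV_32FC1, CV_64FC1).
CV_32F, CV_64F = 5, 6
_FORMATS = {CV_32F: "f", CV_64F: "d"}


def matrix_rows(matrix):
    """Decode matrix.data into a list of rows of Python floats."""
    fmt = _FORMATS.get(matrix.type_)
    if fmt is None:
        raise ValueError(f"unsupported OpenCV type code: {matrix.type_}")
    count = matrix.rows * matrix.cols
    # Little-endian, row-major layout is an assumption of this sketch.
    values = struct.unpack(f"<{count}{fmt}", matrix.data)
    return [
        list(values[r * matrix.cols:(r + 1) * matrix.cols])
        for r in range(matrix.rows)
    ]


# Stand-in 2x3 affine identity transform (hypothetical values).
identity = SimpleNamespace(
    rows=2,
    cols=3,
    type_=CV_64F,
    data=struct.pack("<6d", 1.0, 0.0, 0.0, 0.0, 1.0, 0.0),
)

rows = matrix_rows(identity)
print(rows)
```

With NumPy available, the same payload could instead be wrapped with `numpy.frombuffer` and reshaped to `(rows, cols)` before being handed to OpenCV.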

Paragraph

Paragraph(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A collection of lines that a human would perceive as a paragraph.

Table

Table(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A table representation similar to HTML table structure.

.. attribute:: layout

   Layout for Table.

   :type: google.cloud.documentai_v1beta3.types.Document.Page.Layout

Token

Token(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A detected token.

.. attribute:: layout

   Layout for Token.

   :type: google.cloud.documentai_v1beta3.types.Document.Page.Layout

VisualElement

VisualElement(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Detected non-text visual elements on the page, e.g. checkbox, signature.