Page(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A page in a Document.
Attributes | |
---|---|
Name | Description |
page_number | int. 1-based index for current Page in a parent Document. Useful when a page is taken out of a Document for individual processing. |
dimension | Physical dimension of the page. |
layout | Layout for the page. |
detected_languages | Sequence[…]. A list of detected languages together with confidence. |
blocks | Sequence[…]. A list of visually detected text blocks on the page. A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation. |
paragraphs | Sequence[…]. A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph. |
lines | Sequence[…]. A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line. |
tokens | Sequence[…]. A list of visually detected tokens on the page. |
visual_elements | Sequence[…]. A list of detected non-text visual elements e.g. checkbox, signature etc. on the page. |
tables | Sequence[…]. A list of visually detected tables on the page. |
form_fields | Sequence[…]. A list of visually detected form fields on the page. |
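As a rough illustration of how these attributes might be consumed, the sketch below counts the elements in each repeated field and looks a page up by its 1-based `page_number`. The helper names `summarize_page` and `find_page` are hypothetical and not part of the library, and `demo_page` is a plain stand-in object, not a real Page; only the attribute names come from the table above.

```python
from types import SimpleNamespace

# Attribute names taken from the Attributes table; each holds a sequence.
REPEATED_FIELDS = (
    "detected_languages",
    "blocks",
    "paragraphs",
    "lines",
    "tokens",
    "visual_elements",
    "tables",
    "form_fields",
)


def summarize_page(page):
    """Return {field_name: element_count} for a page-like object.

    Hypothetical helper: a missing attribute is counted as empty.
    """
    return {name: len(getattr(page, name, ())) for name in REPEATED_FIELDS}


def find_page(pages, page_number):
    """Return the first page whose 1-based page_number matches, else None.

    Hypothetical helper: matches on the attribute, not on list position,
    so it still works when pages were taken out of their Document.
    """
    for page in pages:
        if page.page_number == page_number:
            return page
    return None


# Stand-in for illustration only; a real Page would come from a Document.
demo_page = SimpleNamespace(
    page_number=2,
    detected_languages=[],
    blocks=["b1"],
    paragraphs=["p1", "p2"],
    lines=["l1", "l2", "l3"],
    tokens=list("abcdef"),
    visual_elements=[],
    tables=[],
    form_fields=[],
)
```

Matching on `page_number` instead of list position is deliberate: the description notes that a page may be processed apart from its Document, in which case its position in a list no longer says which page it is.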
Classes
Block
Block(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
DetectedLanguage
DetectedLanguage(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Detected language for a structural component.
Dimension
Dimension(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Dimension for the page.
FormField
FormField(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A form field detected on the page.
Layout
Layout(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Visual element describing a layout unit on a page.
Line
Line(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A collection of tokens that a human would perceive as a line. Does not cross column boundaries, can be horizontal, vertical, etc.
Paragraph
Paragraph(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A collection of lines that a human would perceive as a paragraph.
Table
Table(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A table representation similar to HTML table structure.
Token
Token(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A detected token.
VisualElement
VisualElement(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Detected non-text visual elements e.g. checkbox, signature etc. on the page.
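A minimal sketch of working with the DetectedLanguage entries follows. This page only says that languages are reported "together with confidence", so the field names `language_code` and `confidence` used here are assumptions, and `dominant_language` is a hypothetical helper, not a library function.

```python
from types import SimpleNamespace


def dominant_language(detected_languages, min_confidence=0.0):
    """Return the language code with the highest confidence, or None.

    Assumes each entry exposes `language_code` and `confidence`
    (assumed field names; check the DetectedLanguage reference).
    """
    candidates = [d for d in detected_languages if d.confidence >= min_confidence]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d.confidence).language_code


# Stand-in entries for illustration only; not real API output.
demo_languages = [
    SimpleNamespace(language_code="en", confidence=0.91),
    SimpleNamespace(language_code="fr", confidence=0.07),
]
```

The `min_confidence` threshold is optional; raising it makes the helper return None instead of a weak guess.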