Class Document (2.0.0)

Document(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A structured text document, e.g. a PDF.

Attributes

Name    Description
input_config .io.DocumentInputConfig
An input config specifying the content of the document.
document_text .data_items.TextSnippet
The plain text version of this document.
layout Sequence[.data_items.Document.Layout]
Describes the layout of the document. Sorted by page_number.
document_dimensions .data_items.DocumentDimensions
The dimensions of the page in the document.
page_count int
Number of pages in the document.
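
The ordering of layout by page_number can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins written with plain dataclasses, not the library's own message types. They reuse only names that appear on this page (layout, page_count, text_segment, page_number). The string-valued text_segment, the helper functions, and the sample page numbers are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Layout:
    # Hypothetical stand-in for Document.Layout.
    # Assumption: text_segment is simplified to a plain string here.
    text_segment: str
    page_number: int


@dataclass
class Document:
    # Hypothetical stand-in mirroring two of the documented attributes.
    layout: List[Layout] = field(default_factory=list)
    page_count: int = 0


def is_sorted_by_page(doc: Document) -> bool:
    """Check that layout entries appear in non-decreasing page_number order."""
    pages = [item.page_number for item in doc.layout]
    return pages == sorted(pages)


def layouts_on_page(doc: Document, page_number: int) -> List[Layout]:
    """Return the layout entries for one page, keeping their original order."""
    return [item for item in doc.layout if item.page_number == page_number]


# Sample data; the page numbers are illustrative values only.
doc = Document(
    layout=[Layout("Title", 1), Layout("Intro", 1), Layout("Table", 2)],
    page_count=2,
)

print(is_sorted_by_page(doc))
print([item.text_segment for item in layouts_on_page(doc, 1)])
```

Because the entries are already grouped by page, a consumer can walk layout once and process the document page by page without re-sorting.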

Classes

Layout

Layout(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Describes the layout information of a text_segment in the document.