Classes
Barcode
Encodes the detailed information of a barcode.
BatchDocumentsInputConfig
The common config to specify a set of documents used as input.
BatchProcessMetadata
The long-running operation metadata for [BatchProcessDocuments][google.cloud.documentai.v1.DocumentProcessorService.BatchProcessDocuments].
BatchProcessMetadata.Types
Container for nested types declared in the BatchProcessMetadata message type.
BatchProcessMetadata.Types.IndividualProcessStatus
The status of each individual document in the batch process.
BatchProcessRequest
Request message for [BatchProcessDocuments][google.cloud.documentai.v1.DocumentProcessorService.BatchProcessDocuments].
BatchProcessResponse
Response message for [BatchProcessDocuments][google.cloud.documentai.v1.DocumentProcessorService.BatchProcessDocuments].
BoundingPoly
A bounding polygon for the detected image annotation.
CommonOperationMetadata
The common metadata for long-running operations.
CommonOperationMetadata.Types
Container for nested types declared in the CommonOperationMetadata message type.
CreateProcessorRequest
Request message for the [CreateProcessor][google.cloud.documentai.v1.DocumentProcessorService.CreateProcessor] method. Note that this request is sent to a regionalized backend service. If the [ProcessorType][google.cloud.documentai.v1.ProcessorType] isn't available in that region, the creation fails.
DeleteProcessorMetadata
The long-running operation metadata for the [DeleteProcessor][google.cloud.documentai.v1.DocumentProcessorService.DeleteProcessor] method.
DeleteProcessorRequest
Request message for the [DeleteProcessor][google.cloud.documentai.v1.DocumentProcessorService.DeleteProcessor] method.
DeleteProcessorVersionMetadata
The long-running operation metadata for the [DeleteProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.DeleteProcessorVersion] method.
DeleteProcessorVersionRequest
Request message for the [DeleteProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.DeleteProcessorVersion] method.
DeployProcessorVersionMetadata
The long-running operation metadata for the [DeployProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.DeployProcessorVersion] method.
DeployProcessorVersionRequest
Request message for the [DeployProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.DeployProcessorVersion] method.
DeployProcessorVersionResponse
Response message for the [DeployProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.DeployProcessorVersion] method.
DisableProcessorMetadata
The long-running operation metadata for the [DisableProcessor][google.cloud.documentai.v1.DocumentProcessorService.DisableProcessor] method.
DisableProcessorRequest
Request message for the [DisableProcessor][google.cloud.documentai.v1.DocumentProcessorService.DisableProcessor] method.
DisableProcessorResponse
Response message for the [DisableProcessor][google.cloud.documentai.v1.DocumentProcessorService.DisableProcessor] method. Intentionally empty proto for adding fields in the future.
Document
Document represents the canonical document resource in Document AI. It is an interchange format that provides insights into documents and allows for collaboration between users and Document AI to iterate and optimize for quality.
Document.Types
Container for nested types declared in the Document message type.
Document.Types.Entity
An entity that could be a phrase in the text or a property that belongs to the document. It is a known entity type, such as a person, an organization, or location.
Document.Types.Entity.Types
Container for nested types declared in the Entity message type.
Document.Types.Entity.Types.NormalizedValue
Parsed and normalized entity value.
Document.Types.EntityRelation
Relationship between [Entities][google.cloud.documentai.v1.Document.Entity].
Document.Types.Page
A page in a [Document][google.cloud.documentai.v1.Document].
Document.Types.Page.Types
Container for nested types declared in the Page message type.
Document.Types.Page.Types.Block
A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
Document.Types.Page.Types.DetectedBarcode
A detected barcode.
Document.Types.Page.Types.DetectedLanguage
Detected language for a structural component.
Document.Types.Page.Types.Dimension
Dimension for the page.
Document.Types.Page.Types.FormField
A form field detected on the page.
Document.Types.Page.Types.Image
Rendered image contents for this page.
Document.Types.Page.Types.ImageQualityScores
Image quality scores for the page image.
Document.Types.Page.Types.ImageQualityScores.Types
Container for nested types declared in the ImageQualityScores message type.
Document.Types.Page.Types.ImageQualityScores.Types.DetectedDefect
Image Quality Defects
Document.Types.Page.Types.Layout
Visual element describing a layout unit on a page.
Document.Types.Page.Types.Layout.Types
Container for nested types declared in the Layout message type.
Document.Types.Page.Types.Line
A collection of tokens that a human would perceive as a line. Does not cross column boundaries, can be horizontal, vertical, etc.
Document.Types.Page.Types.Matrix
Representation for transformation matrix, intended to be compatible and used with OpenCV format for image manipulation.
Document.Types.Page.Types.Paragraph
A collection of lines that a human would perceive as a paragraph.
Document.Types.Page.Types.Symbol
A detected symbol.
Document.Types.Page.Types.Table
A table representation similar to HTML table structure.
Document.Types.Page.Types.Table.Types
Container for nested types declared in the Table message type.
Document.Types.Page.Types.Table.Types.TableCell
A cell representation inside the table.
Document.Types.Page.Types.Table.Types.TableRow
A row of table cells.
Document.Types.Page.Types.Token
A detected token.
Document.Types.Page.Types.Token.Types
Container for nested types declared in the Token message type.
Document.Types.Page.Types.Token.Types.DetectedBreak
Detected break at the end of a [Token][google.cloud.documentai.v1.Document.Page.Token].
Document.Types.Page.Types.Token.Types.DetectedBreak.Types
Container for nested types declared in the DetectedBreak message type.
Document.Types.Page.Types.Token.Types.StyleInfo
Font and other text style attributes.
Document.Types.Page.Types.VisualElement
Detected non-text visual elements e.g. checkbox, signature etc. on the page.
Document.Types.PageAnchor
Referencing the visual context of the entity in the [Document.pages][google.cloud.documentai.v1.Document.pages]. Page anchors can be cross-page, consist of multiple bounding polygons and optionally reference specific layout element types.
Document.Types.PageAnchor.Types
Container for nested types declared in the PageAnchor message type.
Document.Types.PageAnchor.Types.PageRef
Represents a weak reference to a page element within a document.
Document.Types.PageAnchor.Types.PageRef.Types
Container for nested types declared in the PageRef message type.
Document.Types.Provenance
Structure to identify provenance relationships between annotations in different revisions.
Document.Types.Provenance.Types
Container for nested types declared in the Provenance message type.
Document.Types.Provenance.Types.Parent
The parent element the current element is based on. Used for referencing/aligning, removal and replacement operations.
Document.Types.Revision
Contains past or forward revisions of this document.
Document.Types.Revision.Types
Container for nested types declared in the Revision message type.
Document.Types.Revision.Types.HumanReview
Human Review information of the document.
Document.Types.ShardInfo
For a large document, sharding may be performed to produce several document shards. Each document shard contains this field to detail which shard it is.
Document.Types.Style
Annotation for common text style attributes. This adheres to CSS conventions as much as possible.
Document.Types.Style.Types
Container for nested types declared in the Style message type.
Document.Types.Style.Types.FontSize
Font size with unit.
Document.Types.TextAnchor
Text reference indexing into the [Document.text][google.cloud.documentai.v1.Document.text].
Document.Types.TextAnchor.Types
Container for nested types declared in the TextAnchor message type.
Document.Types.TextAnchor.Types.TextSegment
A text segment in the [Document.text][google.cloud.documentai.v1.Document.text]. The indices may be out of bounds, which indicates that the text extends into another document shard for large sharded documents. See [ShardInfo.text_offset][google.cloud.documentai.v1.Document.ShardInfo.text_offset].
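The indexing described above can be illustrated with a minimal, library-independent sketch. It assumes a segment carries a start index and an exclusive end index into the document text; the `TextSegment` dataclass and the `anchor_text` helper are hypothetical stand-ins written for this illustration, not part of the client library. Out-of-bounds indices are clamped to the text available in the current shard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextSegment:
    """Hypothetical stand-in for a text segment (not the library type)."""
    start_index: int  # inclusive
    end_index: int    # exclusive (assumed half-open range)


def anchor_text(document_text: str, segments: List[TextSegment]) -> str:
    """Concatenate the parts of document_text covered by the segments.

    Indices beyond the end of document_text are clamped, since a segment
    may extend into another shard of a large document.
    """
    parts = []
    for seg in segments:
        start = max(0, min(seg.start_index, len(document_text)))
        end = max(start, min(seg.end_index, len(document_text)))
        parts.append(document_text[start:end])
    return "".join(parts)


text = "Invoice 42\nTotal: 10 USD\n"
print(anchor_text(text, [TextSegment(0, 7)]))
```

Clamping is only one possible policy; a caller that has all shards available could instead use the shard's text offset to stitch the remaining text together.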
Document.Types.TextChange
This message is used for text changes, also known as OCR corrections.
DocumentOutputConfig
Config that controls the output of documents. All documents will be written as a JSON file.
DocumentOutputConfig.Types
Container for nested types declared in the DocumentOutputConfig message type.
DocumentOutputConfig.Types.GcsOutputConfig
The configuration used when outputting documents.
DocumentOutputConfig.Types.GcsOutputConfig.Types
Container for nested types declared in the GcsOutputConfig message type.
DocumentOutputConfig.Types.GcsOutputConfig.Types.ShardingConfig
The sharding config for the output document.
DocumentProcessorService
Service to call Document AI to process documents according to the processor's definition. Processors are built using state-of-the-art Google AI such as natural language, computer vision, and translation to extract structured information from unstructured or semi-structured documents.
DocumentProcessorService.DocumentProcessorServiceBase
Base class for server-side implementations of DocumentProcessorService
DocumentProcessorService.DocumentProcessorServiceClient
Client for DocumentProcessorService
DocumentProcessorServiceClient
DocumentProcessorService client wrapper, for convenient use.
DocumentProcessorServiceClientBuilder
Builder class for DocumentProcessorServiceClient to provide simple configuration of credentials, endpoint etc.
DocumentProcessorServiceClientImpl
DocumentProcessorService client wrapper implementation, for convenient use.
DocumentProcessorServiceSettings
Settings for DocumentProcessorServiceClient instances.
DocumentSchema
The schema defines the output of the processed document by a processor.
DocumentSchema.Types
Container for nested types declared in the DocumentSchema message type.
DocumentSchema.Types.EntityType
EntityType is the wrapper of a label of the corresponding model with detailed attributes and limitations for entity-based processors. Multiple types can also compose a dependency tree to represent nested types.
DocumentSchema.Types.EntityType.Types
Container for nested types declared in the EntityType message type.
DocumentSchema.Types.EntityType.Types.EnumValues
Defines a list of enum values.
DocumentSchema.Types.EntityType.Types.Property
Defines properties that can be part of the entity type.
DocumentSchema.Types.EntityType.Types.Property.Types
Container for nested types declared in the Property message type.
DocumentSchema.Types.Metadata
Metadata for global schema behavior.
EnableProcessorMetadata
The long-running operation metadata for the [EnableProcessor][google.cloud.documentai.v1.DocumentProcessorService.EnableProcessor] method.
EnableProcessorRequest
Request message for the [EnableProcessor][google.cloud.documentai.v1.DocumentProcessorService.EnableProcessor] method.
EnableProcessorResponse
Response message for the [EnableProcessor][google.cloud.documentai.v1.DocumentProcessorService.EnableProcessor] method. Intentionally empty proto for adding fields in the future.
EvaluateProcessorVersionMetadata
Metadata of the [EvaluateProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.EvaluateProcessorVersion] method.
EvaluateProcessorVersionRequest
Evaluates the given [ProcessorVersion][google.cloud.documentai.v1.ProcessorVersion] against the supplied documents.
EvaluateProcessorVersionResponse
Response of the [EvaluateProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.EvaluateProcessorVersion] method.
Evaluation
An evaluation of a ProcessorVersion's performance.
Evaluation.Types
Container for nested types declared in the Evaluation message type.
Evaluation.Types.ConfidenceLevelMetrics
Evaluation metrics at a specific confidence level.
Evaluation.Types.Counters
Evaluation counters for the documents that were used.
Evaluation.Types.Metrics
Evaluation metrics, either in aggregate or about a specific entity.
Evaluation.Types.MultiConfidenceMetrics
Metrics across multiple confidence levels.
Evaluation.Types.MultiConfidenceMetrics.Types
Container for nested types declared in the MultiConfidenceMetrics message type.
EvaluationName
Resource name for the Evaluation resource.
EvaluationReference
Gives a short summary of an evaluation, and links to the evaluation itself.
FetchProcessorTypesRequest
Request message for the [FetchProcessorTypes][google.cloud.documentai.v1.DocumentProcessorService.FetchProcessorTypes] method. Some processor types may require the project be added to an allowlist.
FetchProcessorTypesResponse
Response message for the [FetchProcessorTypes][google.cloud.documentai.v1.DocumentProcessorService.FetchProcessorTypes] method.
GcsDocument
Specifies a document stored on Cloud Storage.
GcsDocuments
Specifies a set of documents on Cloud Storage.
GcsPrefix
Specifies all documents on Cloud Storage with a common prefix.
GetEvaluationRequest
Retrieves a specific Evaluation.
GetProcessorRequest
Request message for the [GetProcessor][google.cloud.documentai.v1.DocumentProcessorService.GetProcessor] method.
GetProcessorTypeRequest
Request message for the [GetProcessorType][google.cloud.documentai.v1.DocumentProcessorService.GetProcessorType] method.
GetProcessorVersionRequest
Request message for the [GetProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.GetProcessorVersion] method.
HumanReviewConfigName
Resource name for the HumanReviewConfig resource.
HumanReviewStatus
The status of human review on a processed document.
HumanReviewStatus.Types
Container for nested types declared in the HumanReviewStatus message type.
ListEvaluationsRequest
Retrieves a list of evaluations for a given [ProcessorVersion][google.cloud.documentai.v1.ProcessorVersion].
ListEvaluationsResponse
The response from ListEvaluations.
ListProcessorTypesRequest
Request message for the [ListProcessorTypes][google.cloud.documentai.v1.DocumentProcessorService.ListProcessorTypes] method. Some processor types may require the project be added to an allowlist.
ListProcessorTypesResponse
Response message for the [ListProcessorTypes][google.cloud.documentai.v1.DocumentProcessorService.ListProcessorTypes] method.
ListProcessorVersionsRequest
Request message for listing all processor versions that belong to a processor.
ListProcessorVersionsResponse
Response message for the [ListProcessorVersions][google.cloud.documentai.v1.DocumentProcessorService.ListProcessorVersions] method.
ListProcessorsRequest
Request message for listing all processors that belong to a project.
ListProcessorsResponse
Response message for the [ListProcessors][google.cloud.documentai.v1.DocumentProcessorService.ListProcessors] method.
NormalizedVertex
A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
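The relationship between normalized and absolute coordinates can be sketched as a simple scaling by the image size. This is an illustration only: the `NormalizedVertex` and `Vertex` dataclasses and the `to_pixel_vertex` helper below are hypothetical stand-ins, and the image width and height are assumed to be known to the caller (for example, from the page dimensions).

```python
from dataclasses import dataclass


@dataclass
class NormalizedVertex:
    """Hypothetical stand-in: coordinates relative to the image, 0 to 1."""
    x: float
    y: float


@dataclass
class Vertex:
    """Hypothetical stand-in: coordinates in the scale of the original image."""
    x: int
    y: int


def to_pixel_vertex(nv: NormalizedVertex, image_width: int, image_height: int) -> Vertex:
    """Scale a normalized vertex to pixel coordinates of the original image."""
    return Vertex(x=round(nv.x * image_width), y=round(nv.y * image_height))


corner = to_pixel_vertex(NormalizedVertex(0.5, 0.25), image_width=1000, image_height=800)
print(corner)
```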
OcrConfig
Config for Document OCR.
OcrConfig.Types
Container for nested types declared in the OcrConfig message type.
OcrConfig.Types.Hints
Hints for OCR Engine
OcrConfig.Types.PremiumFeatures
Configurations for premium OCR features.
ProcessOptions
Options for Process API
ProcessOptions.Types
Container for nested types declared in the ProcessOptions message type.
ProcessOptions.Types.IndividualPageSelector
A list of individual page numbers.
ProcessRequest
Request message for the [ProcessDocument][google.cloud.documentai.v1.DocumentProcessorService.ProcessDocument] method.
ProcessResponse
Response message for the [ProcessDocument][google.cloud.documentai.v1.DocumentProcessorService.ProcessDocument] method.
Processor
The first-class citizen for Document AI. Each processor defines how to extract structural information from a document.
Processor.Types
Container for nested types declared in the Processor message type.
ProcessorName
Resource name for the Processor resource.
ProcessorType
A processor type is responsible for performing a certain document understanding task on a certain type of document.
ProcessorType.Types
Container for nested types declared in the ProcessorType message type.
ProcessorType.Types.LocationInfo
The location information about where the processor is available.
ProcessorTypeName
Resource name for the ProcessorType resource.
ProcessorVersion
A processor version is an implementation of a processor. Each processor can have multiple versions, pretrained by Google internally or uptrained by the customer. A processor can only have one default version at a time. Its document-processing behavior is defined by that version.
ProcessorVersion.Types
Container for nested types declared in the ProcessorVersion message type.
ProcessorVersion.Types.DeprecationInfo
Information about the upcoming deprecation of this processor version.
ProcessorVersionAlias
Contains the alias and the aliased resource name of a processor version.
ProcessorVersionName
Resource name for the ProcessorVersion resource.
RawDocument
Payload message of raw document content (bytes).
ReviewDocumentOperationMetadata
The long-running operation metadata for the [ReviewDocument][google.cloud.documentai.v1.DocumentProcessorService.ReviewDocument] method.
ReviewDocumentRequest
Request message for the [ReviewDocument][google.cloud.documentai.v1.DocumentProcessorService.ReviewDocument] method.
ReviewDocumentRequest.Types
Container for nested types declared in the ReviewDocumentRequest message type.
ReviewDocumentResponse
Response message for the [ReviewDocument][google.cloud.documentai.v1.DocumentProcessorService.ReviewDocument] method.
ReviewDocumentResponse.Types
Container for nested types declared in the ReviewDocumentResponse message type.
SetDefaultProcessorVersionMetadata
The long-running operation metadata for the [SetDefaultProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.SetDefaultProcessorVersion] method.
SetDefaultProcessorVersionRequest
Request message for the [SetDefaultProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.SetDefaultProcessorVersion] method.
SetDefaultProcessorVersionResponse
Response message for the [SetDefaultProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.SetDefaultProcessorVersion] method.
TrainProcessorVersionMetadata
The metadata that represents a processor version being created.
TrainProcessorVersionMetadata.Types
Container for nested types declared in the TrainProcessorVersionMetadata message type.
TrainProcessorVersionMetadata.Types.DatasetValidation
The dataset validation information. This includes any and all errors with documents and the dataset.
TrainProcessorVersionRequest
Request message for the [TrainProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.TrainProcessorVersion] method.
TrainProcessorVersionRequest.Types
Container for nested types declared in the TrainProcessorVersionRequest message type.
TrainProcessorVersionRequest.Types.CustomDocumentExtractionOptions
Options to control the training of the Custom Document Extraction (CDE) Processor.
TrainProcessorVersionRequest.Types.CustomDocumentExtractionOptions.Types
Container for nested types declared in the CustomDocumentExtractionOptions message type.
TrainProcessorVersionRequest.Types.FoundationModelTuningOptions
Options to control foundation model tuning of the processor.
TrainProcessorVersionRequest.Types.InputData
The input data used to train a new [ProcessorVersion][google.cloud.documentai.v1.ProcessorVersion].
TrainProcessorVersionResponse
The response for [TrainProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.TrainProcessorVersion].
UndeployProcessorVersionMetadata
The long-running operation metadata for the [UndeployProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.UndeployProcessorVersion] method.
UndeployProcessorVersionRequest
Request message for the [UndeployProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.UndeployProcessorVersion] method.
UndeployProcessorVersionResponse
Response message for the [UndeployProcessorVersion][google.cloud.documentai.v1.DocumentProcessorService.UndeployProcessorVersion] method.
Vertex
A vertex represents a 2D point in the image. NOTE: the vertex coordinates are in the same scale as the original image.
Enums
BatchDocumentsInputConfig.SourceOneofCase
Enum of possible cases for the "source" oneof.
BatchProcessMetadata.Types.State
Possible states of the batch processing operation.
CommonOperationMetadata.Types.State
State of the long-running operation.
Document.SourceOneofCase
Enum of possible cases for the "source" oneof.
Document.Types.Entity.Types.NormalizedValue.StructuredValueOneofCase
Enum of possible cases for the "structured_value" oneof.
Document.Types.Page.Types.Layout.Types.Orientation
Detected human reading orientation.
Document.Types.Page.Types.Token.Types.DetectedBreak.Types.Type
Enum to denote the type of break found.
Document.Types.PageAnchor.Types.PageRef.Types.LayoutType
The type of layout that is being referenced.
Document.Types.Provenance.Types.OperationType
If a processor or agent does an explicit operation on existing elements.
Document.Types.Revision.SourceOneofCase
Enum of possible cases for the "source" oneof.
DocumentOutputConfig.DestinationOneofCase
Enum of possible cases for the "destination" oneof.
DocumentSchema.Types.EntityType.Types.Property.Types.OccurrenceType
Types of occurrences of the entity type in the document. This represents the number of instances, not mentions, of an entity. For example, a bank statement might only have one account_number, but this account number can be mentioned in several places on the document. In this case, the account_number is considered a REQUIRED_ONCE entity type. If, on the other hand, we expect a bank statement to contain the status of multiple different accounts for the customers, the occurrence type is set to REQUIRED_MULTIPLE.
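The instance-versus-mention distinction above can be sketched as a small check on the number of distinct instances of an entity type. This is a hedged illustration, not the library's validation logic: the `OccurrenceType` enum below is a local stand-in, only REQUIRED_ONCE and REQUIRED_MULTIPLE are named in the description, and the OPTIONAL_ONCE and OPTIONAL_MULTIPLE members are assumed here to complete the picture. The `occurrence_ok` helper is hypothetical.

```python
from enum import Enum


class OccurrenceType(Enum):
    """Local stand-in enum; the OPTIONAL_* members are an assumption."""
    OPTIONAL_ONCE = 1
    OPTIONAL_MULTIPLE = 2
    REQUIRED_ONCE = 3
    REQUIRED_MULTIPLE = 4


def occurrence_ok(occurrence: OccurrenceType, instance_count: int) -> bool:
    """Check a count of distinct instances (not mentions) against a type."""
    required = occurrence in (OccurrenceType.REQUIRED_ONCE,
                              OccurrenceType.REQUIRED_MULTIPLE)
    single = occurrence in (OccurrenceType.OPTIONAL_ONCE,
                            OccurrenceType.REQUIRED_ONCE)
    if required and instance_count < 1:
        return False  # a required entity type must appear at least once
    if single and instance_count > 1:
        return False  # a "once" entity type allows at most one instance
    return True


# One account number mentioned in several places is still one instance.
mentions = ["12345", "12345", "12345"]
print(occurrence_ok(OccurrenceType.REQUIRED_ONCE, len(set(mentions))))
```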
DocumentSchema.Types.EntityType.ValueSourceOneofCase
Enum of possible cases for the "value_source" oneof.
Evaluation.Types.MultiConfidenceMetrics.Types.MetricsType
A type that determines how metrics should be interpreted.
EvaluationName.ResourceNameType
The possible contents of EvaluationName.
HumanReviewConfigName.ResourceNameType
The possible contents of HumanReviewConfigName.
HumanReviewStatus.Types.State
The final state of human review on a processed document.
ProcessOptions.PageRangeOneofCase
Enum of possible cases for the "page_range" oneof.
ProcessRequest.SourceOneofCase
Enum of possible cases for the "source" oneof.
Processor.Types.State
The possible states of the processor.
ProcessorName.ResourceNameType
The possible contents of ProcessorName.
ProcessorTypeName.ResourceNameType
The possible contents of ProcessorTypeName.
ProcessorVersion.Types.ModelType
The possible model types of the processor version.
ProcessorVersion.Types.State
The possible states of the processor version.
ProcessorVersionName.ResourceNameType
The possible contents of ProcessorVersionName.
ReviewDocumentRequest.SourceOneofCase
Enum of possible cases for the "source" oneof.
ReviewDocumentRequest.Types.Priority
The priority level of the human review task.
ReviewDocumentResponse.Types.State
Possible states of the review operation.
TrainProcessorVersionRequest.ProcessorFlagsOneofCase
Enum of possible cases for the "processor_flags" oneof.
TrainProcessorVersionRequest.Types.CustomDocumentExtractionOptions.Types.TrainingMethod
Training method for CDE. TRAINING_METHOD_UNSPECIFIED will fall back to MODEL_BASED.