Types overview

GoogleCloudDocumentaiUiv1beta3BatchDeleteDocumentsMetadata

(No description provided)
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

errorDocumentCount

integer (int32 format)

Total number of documents that failed to be deleted in storage.

individualBatchDeleteStatuses[]

object (GoogleCloudDocumentaiUiv1beta3BatchDeleteDocumentsMetadataIndividualBatchDeleteStatus)

The list of response details of each document.

totalDocumentCount

integer (int32 format)

Total number of documents being deleted from the dataset.

GoogleCloudDocumentaiUiv1beta3BatchDeleteDocumentsMetadataIndividualBatchDeleteStatus

The status of each individual document in the batch delete process.
Fields
documentId

object (GoogleCloudDocumentaiUiv1beta3DocumentId)

The document id of the document.

status

object (GoogleRpcStatus)

The status of deleting the document in storage.
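
As an illustration, the two types above could be combined to list the documents that failed to be deleted. This is a minimal sketch, assuming the metadata has already been parsed from JSON into a Python dict. The helper name `failed_deletions` is hypothetical, and the sketch assumes that a GoogleRpcStatus with a missing or zero `code` means success.

```python
def failed_deletions(metadata: dict) -> list:
    """Return (gcsUri, error message) pairs for documents that were not deleted."""
    failures = []
    for item in metadata.get("individualBatchDeleteStatuses", []):
        status = item.get("status", {})
        # Assumption: google.rpc.Status code 0 (or an absent code) means OK.
        if status.get("code", 0) != 0:
            doc_id = item.get("documentId", {})
            uri = doc_id.get("gcsManagedDocId", {}).get("gcsUri", "")
            failures.append((uri, status.get("message", "")))
    return failures
```

The length of the returned list can then be compared with errorDocumentCount as a sanity check.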

GoogleCloudDocumentaiUiv1beta3BatchMoveDocumentsMetadata

(No description provided)
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

destDatasetType

enum

The destination dataset split type.

Enum type. Can be one of the following:
DATASET_SPLIT_TYPE_UNSPECIFIED Default value if the enum is not set.
DATASET_SPLIT_TRAIN Identifies the train documents.
DATASET_SPLIT_TEST Identifies the test documents.
DATASET_SPLIT_UNASSIGNED Identifies the unassigned documents.
destSplitType

enum

The destination dataset split type.

Enum type. Can be one of the following:
DATASET_SPLIT_TYPE_UNSPECIFIED Default value if the enum is not set.
DATASET_SPLIT_TRAIN Identifies the train documents.
DATASET_SPLIT_TEST Identifies the test documents.
DATASET_SPLIT_UNASSIGNED Identifies the unassigned documents.
individualBatchMoveStatuses[]

object (GoogleCloudDocumentaiUiv1beta3BatchMoveDocumentsMetadataIndividualBatchMoveStatus)

The list of response details of each document.

GoogleCloudDocumentaiUiv1beta3BatchMoveDocumentsMetadataIndividualBatchMoveStatus

The status of each individual document in the batch move process.
Fields
documentId

object (GoogleCloudDocumentaiUiv1beta3DocumentId)

The document id of the document.

status

object (GoogleRpcStatus)

The status of moving the document.

GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata

The common metadata for long running operations.
Fields
createTime

string (Timestamp format)

The creation time of the operation.

resource

string

A related resource to this operation.

state

enum

The state of the operation.

Enum type. Can be one of the following:
STATE_UNSPECIFIED Unspecified state.
RUNNING Operation is still running.
CANCELLING Operation is being cancelled.
SUCCEEDED Operation succeeded.
FAILED Operation failed.
CANCELLED Operation is cancelled.
stateMessage

string

A message providing more details about the current state of processing.

updateTime

string (Timestamp format)

The last update time of the operation.
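
The state enum above lends itself to a simple "is the operation finished?" check. The following is a sketch only: the names `TERMINAL_STATES` and `is_finished` are hypothetical, and treating SUCCEEDED, FAILED, and CANCELLED as the terminal states is an assumption based on the value descriptions, not something this reference states explicitly.

```python
# Assumption: these three states are final; RUNNING and CANCELLING are not.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def is_finished(common_metadata: dict) -> bool:
    """Return True if the operation metadata reports a terminal state."""
    state = common_metadata.get("state", "STATE_UNSPECIFIED")
    return state in TERMINAL_STATES


def describe(common_metadata: dict) -> str:
    """Build a one-line summary from state and stateMessage."""
    state = common_metadata.get("state", "STATE_UNSPECIFIED")
    message = common_metadata.get("stateMessage", "")
    return f"{state}: {message}" if message else state
```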

GoogleCloudDocumentaiUiv1beta3CreateLabelerPoolOperationMetadata

The long running operation metadata for CreateLabelerPool.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3DeleteDataLabelingJobOperationMetadata

The long running operation metadata for DeleteDataLabelingJob.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3DeleteLabelerPoolOperationMetadata

The long running operation metadata for DeleteLabelerPool.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3DeleteProcessorMetadata

The long running operation metadata for delete processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3DeleteProcessorVersionMetadata

The long running operation metadata for delete processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3DeployProcessorVersionMetadata

The long running operation metadata for deploy processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3DisableProcessorMetadata

The long running operation metadata for disable processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3DocumentId

Document Identifier.
Fields
gcsManagedDocId

object (GoogleCloudDocumentaiUiv1beta3DocumentIdGCSManagedDocumentId)

(No description provided)

revisionReference

object (GoogleCloudDocumentaiUiv1beta3RevisionReference)

Points to a specific revision of the document if set.

GoogleCloudDocumentaiUiv1beta3DocumentIdGCSManagedDocumentId

Identifies a document uniquely within the scope of a dataset in the Cloud Storage option.
Fields
cwDocId

string

Id of the document (indexed) managed by Content Warehouse.

gcsUri

string

Required. The Cloud Storage URI where the actual document is stored.

GoogleCloudDocumentaiUiv1beta3EnableProcessorMetadata

The long running operation metadata for enable processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3EvaluateProcessorVersionMetadata

Metadata of the EvaluateProcessorVersion method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3EvaluateProcessorVersionResponse

Response of the EvaluateProcessorVersion method.
Fields
evaluation

string

The resource name of the created evaluation.

GoogleCloudDocumentaiUiv1beta3ExportDocumentsMetadata

Metadata of the batch export documents operation.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

individualExportStatuses[]

object (GoogleCloudDocumentaiUiv1beta3ExportDocumentsMetadataIndividualExportStatus)

The list of response details of each document.

splitExportStats[]

object (GoogleCloudDocumentaiUiv1beta3ExportDocumentsMetadataSplitExportStat)

The list of statistics for each dataset split type.

GoogleCloudDocumentaiUiv1beta3ExportDocumentsMetadataIndividualExportStatus

The status of each individual document in the export process.
Fields
documentId

object (GoogleCloudDocumentaiUiv1beta3DocumentId)

The path to source docproto of the document.

outputGcsDestination

string

The output_gcs_destination of the exported document if it was successful, otherwise empty.

status

object (GoogleRpcStatus)

The status of the exporting of the document.

GoogleCloudDocumentaiUiv1beta3ExportDocumentsMetadataSplitExportStat

The statistic representing a dataset split type for this export.
Fields
splitType

enum

The dataset split type.

Enum type. Can be one of the following:
DATASET_SPLIT_TYPE_UNSPECIFIED Default value if the enum is not set.
DATASET_SPLIT_TRAIN Identifies the train documents.
DATASET_SPLIT_TEST Identifies the test documents.
DATASET_SPLIT_UNASSIGNED Identifies the unassigned documents.
totalDocumentCount

integer (int32 format)

Total number of documents with the given dataset split type to be exported.
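
For example, the per-split statistics could be folded into a single mapping from split type to document count. This is a minimal sketch over the parsed JSON metadata; the function name `documents_per_split` is hypothetical.

```python
def documents_per_split(export_metadata: dict) -> dict:
    """Map each dataset split type to its total number of documents to export."""
    counts = {}
    for stat in export_metadata.get("splitExportStats", []):
        split = stat.get("splitType", "DATASET_SPLIT_TYPE_UNSPECIFIED")
        counts[split] = counts.get(split, 0) + stat.get("totalDocumentCount", 0)
    return counts
```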

GoogleCloudDocumentaiUiv1beta3ExportProcessorVersionMetadata

Metadata message associated with the ExportProcessorVersion operation.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The common metadata about the operation.

GoogleCloudDocumentaiUiv1beta3ExportProcessorVersionResponse

Response message associated with the ExportProcessorVersion operation.
Fields
gcsUri

string

The Cloud Storage URI containing the output artifacts.

GoogleCloudDocumentaiUiv1beta3ImportDocumentsMetadata

Metadata of the import document operation.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

importConfigValidationResults[]

object (GoogleCloudDocumentaiUiv1beta3ImportDocumentsMetadataImportConfigValidationResult)

Validation statuses of the batch documents import config.

individualImportStatuses[]

object (GoogleCloudDocumentaiUiv1beta3ImportDocumentsMetadataIndividualImportStatus)

The list of response details of each document.

totalDocumentCount

integer (int32 format)

Total number of the documents that are qualified for importing.

GoogleCloudDocumentaiUiv1beta3ImportDocumentsMetadataImportConfigValidationResult

The validation status of each import config. Status is set to an error if there are no documents to import in the import_config, or OK if the operation will try to process at least one document.
Fields
inputGcsSource

string

The source Cloud Storage URI specified in the import config.

status

object (GoogleRpcStatus)

The validation status of import config.

GoogleCloudDocumentaiUiv1beta3ImportDocumentsMetadataIndividualImportStatus

The status of each individual document in the import process.
Fields
inputGcsSource

string

The source Cloud Storage URI of the document.

outputGcsDestination

string

The output_gcs_destination of the processed document if it was successful, otherwise empty.

status

object (GoogleRpcStatus)

The status of the importing of the document.
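
Because outputGcsDestination is documented as empty unless the import succeeded, it can serve as a simple success marker. The sketch below is illustrative only; `import_summary` is a hypothetical helper operating on the parsed JSON metadata.

```python
def import_summary(metadata: dict) -> dict:
    """Split individual import statuses into imported and failed sources."""
    imported, failed = [], []
    for item in metadata.get("individualImportStatuses", []):
        source = item.get("inputGcsSource", "")
        # A non-empty outputGcsDestination indicates a successful import.
        if item.get("outputGcsDestination"):
            imported.append(source)
        else:
            failed.append(source)
    return {
        "qualified": metadata.get("totalDocumentCount", 0),
        "imported": imported,
        "failed": failed,
    }
```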

GoogleCloudDocumentaiUiv1beta3ResyncDatasetMetadata

The metadata proto of ResyncDataset method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

datasetResyncStatuses[]

object (GoogleCloudDocumentaiUiv1beta3ResyncDatasetMetadataDatasetResyncStatus)

The list of dataset resync statuses. Not checked when dataset_documents is specified in ResyncRequest.

individualDocumentResyncStatuses[]

object (GoogleCloudDocumentaiUiv1beta3ResyncDatasetMetadataIndividualDocumentResyncStatus)

The list of document resync statuses. The same document could have multiple individual_document_resync_statuses if it has multiple inconsistencies.

GoogleCloudDocumentaiUiv1beta3ResyncDatasetMetadataDatasetResyncStatus

Resync status against inconsistency types on the dataset level.
Fields
datasetInconsistencyType

enum

The type of the inconsistency of the dataset.

Enum type. Can be one of the following:
DATASET_INCONSISTENCY_TYPE_UNSPECIFIED Default value.
DATASET_INCONSISTENCY_TYPE_NO_STORAGE_MARKER The marker file under the dataset folder is not found.
status

object (GoogleRpcStatus)

The status of resyncing the dataset with regards to the detected inconsistency. Empty if validate_only is true in the request.

GoogleCloudDocumentaiUiv1beta3ResyncDatasetMetadataIndividualDocumentResyncStatus

Resync status for each document per inconsistency type.
Fields
documentId

object (GoogleCloudDocumentaiUiv1beta3DocumentId)

The document identifier.

documentInconsistencyType

enum

The type of document inconsistency.

Enum type. Can be one of the following:
DOCUMENT_INCONSISTENCY_TYPE_UNSPECIFIED Default value.
DOCUMENT_INCONSISTENCY_TYPE_INVALID_DOCPROTO The document proto is invalid.
DOCUMENT_INCONSISTENCY_TYPE_MISMATCHED_METADATA Indexed docproto metadata is mismatched.
DOCUMENT_INCONSISTENCY_TYPE_NO_PAGE_IMAGE The page image or thumbnails are missing.
status

object (GoogleRpcStatus)

The status of resyncing the document with regards to the detected inconsistency. Empty if validate_only is true in the request.
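
Since one document can appear in several individual_document_resync_statuses entries, grouping the entries by document can make the result easier to read. This is a sketch under the assumption that documents are identified by their gcsManagedDocId.gcsUri; the function name `inconsistencies_by_document` is hypothetical.

```python
from collections import defaultdict


def inconsistencies_by_document(metadata: dict) -> dict:
    """Group document inconsistency types by the document's Cloud Storage URI."""
    grouped = defaultdict(list)
    for item in metadata.get("individualDocumentResyncStatuses", []):
        doc_id = item.get("documentId", {})
        uri = doc_id.get("gcsManagedDocId", {}).get("gcsUri", "")
        kind = item.get(
            "documentInconsistencyType", "DOCUMENT_INCONSISTENCY_TYPE_UNSPECIFIED"
        )
        grouped[uri].append(kind)
    return dict(grouped)
```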

GoogleCloudDocumentaiUiv1beta3RevisionReference

The revision reference specifies which revision on the document to read.
Fields
latestProcessorVersion

string

Reads the revision generated by the processor version.

revisionCase

enum

Reads the revision by the predefined case.

Enum type. Can be one of the following:
REVISION_CASE_UNSPECIFIED Unspecified case, fallback to read the first (OCR) revision.
LATEST_HUMAN_REVIEW The latest revision made by a human.
LATEST_TIMESTAMP The latest revision based on timestamp.
revisionId

string

Reads the revision given by the id.

GoogleCloudDocumentaiUiv1beta3SetDefaultProcessorVersionMetadata

The long running operation metadata for set default processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3TrainProcessorVersionMetadata

The metadata that represents a processor version being created.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

testDatasetValidation

object (GoogleCloudDocumentaiUiv1beta3TrainProcessorVersionMetadataDatasetValidation)

The test dataset validation information.

trainingDatasetValidation

object (GoogleCloudDocumentaiUiv1beta3TrainProcessorVersionMetadataDatasetValidation)

The training dataset validation information.

GoogleCloudDocumentaiUiv1beta3TrainProcessorVersionMetadataDatasetValidation

The dataset validation information. This includes any and all errors with documents and the dataset.
Fields
datasetErrorCount

integer (int32 format)

The total number of dataset errors.

datasetErrors[]

object (GoogleRpcStatus)

Error information for the dataset as a whole. A maximum of 10 dataset errors will be returned. A single dataset error is terminal for training.

documentErrorCount

integer (int32 format)

The total number of document errors.

documentErrors[]

object (GoogleRpcStatus)

Error information pertaining to specific documents. A maximum of 10 document errors will be returned. Any document with errors will not be used throughout training.
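
The counts and the capped error lists can be read together: any dataset error is terminal for training, and a count larger than the returned list suggests the list was truncated at the 10-error maximum. The following sketch illustrates that reading; `summarize_validation` and the keys of its result are hypothetical names.

```python
def summarize_validation(validation: dict) -> dict:
    """Summarize a DatasetValidation message parsed from JSON."""
    dataset_errors = validation.get("datasetErrors", [])
    document_errors = validation.get("documentErrors", [])
    dataset_count = validation.get("datasetErrorCount", 0)
    document_count = validation.get("documentErrorCount", 0)
    return {
        # A single dataset error is terminal for training.
        "training_blocked": dataset_count > 0,
        # At most 10 errors of each kind are returned, so counts may exceed list sizes.
        "dataset_errors_truncated": dataset_count > len(dataset_errors),
        "document_errors_truncated": document_count > len(document_errors),
    }
```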

GoogleCloudDocumentaiUiv1beta3TrainProcessorVersionResponse

The response for the TrainProcessorVersion method.
Fields
processorVersion

string

The resource name of the processor version produced by training.

GoogleCloudDocumentaiUiv1beta3UndeployProcessorVersionMetadata

The long running operation metadata for the undeploy processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3UpdateDatasetOperationMetadata

(No description provided)
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3UpdateHumanReviewConfigMetadata

The long running operation metadata for updating the human review configuration.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiUiv1beta3UpdateLabelerPoolOperationMetadata

The long running operation metadata for UpdateLabelerPool.
Fields
commonMetadata

object (GoogleCloudDocumentaiUiv1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1Barcode

Encodes the detailed information of a barcode.
Fields
format

string

Format of a barcode. The supported formats are:
- CODE_128: Code 128 type.
- CODE_39: Code 39 type.
- CODE_93: Code 93 type.
- CODABAR: Codabar type.
- DATA_MATRIX: 2D Data Matrix type.
- ITF: ITF type.
- EAN_13: EAN-13 type.
- EAN_8: EAN-8 type.
- QR_CODE: 2D QR code type.
- UPC_A: UPC-A type.
- UPC_E: UPC-E type.
- PDF417: PDF417 type.
- AZTEC: 2D Aztec code type.
- DATABAR: GS1 DataBar code type.

rawValue

string

Raw value encoded in the barcode. For example: 'MEBKM:TITLE:Google;URL:https://www.google.com;;'.

valueFormat

string

Value format describes the format of the value that a barcode encodes. The supported formats are:
- CONTACT_INFO: Contact information.
- EMAIL: Email address.
- ISBN: ISBN identifier.
- PHONE: Phone number.
- PRODUCT: Product.
- SMS: SMS message.
- TEXT: Text string.
- URL: URL address.
- WIFI: Wifi information.
- GEO: Geo-localization.
- CALENDAR_EVENT: Calendar event.
- DRIVER_LICENSE: Driver's license.

GoogleCloudDocumentaiV1BatchDocumentsInputConfig

The common config to specify a set of documents used as input.
Fields
gcsDocuments

object (GoogleCloudDocumentaiV1GcsDocuments)

The set of documents individually specified on Cloud Storage.

gcsPrefix

object (GoogleCloudDocumentaiV1GcsPrefix)

The set of documents that match the specified Cloud Storage gcs_prefix.

GoogleCloudDocumentaiV1BatchProcessMetadata

The long running operation metadata for batch process method.
Fields
createTime

string (Timestamp format)

The creation time of the operation.

individualProcessStatuses[]

object (GoogleCloudDocumentaiV1BatchProcessMetadataIndividualProcessStatus)

The list of response details of each document.

state

enum

The state of the current batch processing.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The default value. This value is used if the state is omitted.
WAITING Request operation is waiting for scheduling.
RUNNING Request is being processed.
SUCCEEDED The batch processing completed successfully.
CANCELLING The batch processing is being cancelled.
CANCELLED The batch processing was cancelled.
FAILED The batch processing has failed.
stateMessage

string

A message providing more details about the current state of processing. For example, the error message if the operation failed.

updateTime

string (Timestamp format)

The last update time of the operation.

GoogleCloudDocumentaiV1BatchProcessMetadataIndividualProcessStatus

The status of each individual document in the batch process.
Fields
humanReviewStatus

object (GoogleCloudDocumentaiV1HumanReviewStatus)

The status of human review on the processed document.

inputGcsSource

string

The source of the document, same as the input_gcs_source field in the request when the batch process started. The batch process starts by taking a snapshot of that document, since a user can move or change the document during the process.

outputGcsDestination

string

The output_gcs_destination (in the request as output_gcs_destination) of the processed document if it was successful, otherwise empty.

status

object (GoogleRpcStatus)

The status of processing the document.

GoogleCloudDocumentaiV1BatchProcessRequest

Request message for batch process document method.
Fields
documentOutputConfig

object (GoogleCloudDocumentaiV1DocumentOutputConfig)

The overall output config for batch process.

inputDocuments

object (GoogleCloudDocumentaiV1BatchDocumentsInputConfig)

The input documents for batch process.

skipHumanReview

boolean

Whether the Human Review feature should be skipped for this request. Defaults to false.
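
To show how the three fields fit together, the sketch below assembles a request body as a plain dict ready for JSON serialization. The helper name `build_batch_process_request` and its parameters are hypothetical. The `gcsUriPrefix` field belongs to GoogleCloudDocumentaiV1GcsPrefix, which is defined outside this section, and the nested `gcsOutputConfig` fields are those of GoogleCloudDocumentaiV1DocumentOutputConfigGcsOutputConfig.

```python
from typing import Optional


def build_batch_process_request(
    input_prefix: str,
    output_uri: str,
    field_mask: Optional[str] = None,
    skip_human_review: bool = False,
) -> dict:
    """Assemble a BatchProcessRequest body that reads documents by Cloud Storage prefix."""
    gcs_output = {"gcsUri": output_uri}
    if field_mask:
        # FieldMask is serialized in JSON as a comma-separated string of paths.
        gcs_output["fieldMask"] = field_mask
    return {
        "inputDocuments": {"gcsPrefix": {"gcsUriPrefix": input_prefix}},
        "documentOutputConfig": {"gcsOutputConfig": gcs_output},
        "skipHumanReview": skip_human_review,
    }
```

Passing something like `field_mask="text,entities"` would limit the output to those top-level document fields.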

GoogleCloudDocumentaiV1BoundingPoly

A bounding polygon for the detected image annotation.
Fields
normalizedVertices[]

object (GoogleCloudDocumentaiV1NormalizedVertex)

The bounding polygon normalized vertices.

vertices[]

object (GoogleCloudDocumentaiV1Vertex)

The bounding polygon vertices.

GoogleCloudDocumentaiV1CommonOperationMetadata

The common metadata for long running operations.
Fields
createTime

string (Timestamp format)

The creation time of the operation.

resource

string

A related resource to this operation.

state

enum

The state of the operation.

Enum type. Can be one of the following:
STATE_UNSPECIFIED Unspecified state.
RUNNING Operation is still running.
CANCELLING Operation is being cancelled.
SUCCEEDED Operation succeeded.
FAILED Operation failed.
CANCELLED Operation is cancelled.
stateMessage

string

A message providing more details about the current state of processing.

updateTime

string (Timestamp format)

The last update time of the operation.

GoogleCloudDocumentaiV1DeleteProcessorMetadata

The long running operation metadata for delete processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1DeleteProcessorVersionMetadata

The long running operation metadata for delete processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1DeployProcessorVersionMetadata

The long running operation metadata for deploy processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1DisableProcessorMetadata

The long running operation metadata for disable processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1Document

Document represents the canonical document resource in Document AI. It is an interchange format that provides insights into documents and allows for collaboration between users and Document AI to iterate and optimize for quality.
Fields
content

string (bytes format)

Optional. Inline document content, represented as a stream of bytes. Note: As with all bytes fields, protobuffers use a pure binary representation, whereas JSON representations use base64.

entities[]

object (GoogleCloudDocumentaiV1DocumentEntity)

A list of entities detected on Document.text. For document shards, entities in this list may cross shard boundaries.

entityRelations[]

object (GoogleCloudDocumentaiV1DocumentEntityRelation)

Placeholder. Relationship among Document.entities.

error

object (GoogleRpcStatus)

Any error that occurred while processing this document.

mimeType

string

An IANA published MIME type (also referred to as media type). For more information, see https://www.iana.org/assignments/media-types/media-types.xhtml.

pages[]

object (GoogleCloudDocumentaiV1DocumentPage)

Visual page layout for the Document.

revisions[]

object (GoogleCloudDocumentaiV1DocumentRevision)

Placeholder. Revision history of this document.

shardInfo

object (GoogleCloudDocumentaiV1DocumentShardInfo)

Information about the sharding if this document is a shard of a larger document. If the document is not sharded, this message is not specified.

text

string

Optional. UTF-8 encoded text in reading order from the document.

textChanges[]

object (GoogleCloudDocumentaiV1DocumentTextChange)

Placeholder. A list of text corrections made to Document.text. This is usually used for annotating corrections to OCR mistakes. Text changes for a given revision may not overlap with each other.

textStyles[]

object (GoogleCloudDocumentaiV1DocumentStyle)

Styles for the Document.text.

uri

string

Optional. Currently supports Google Cloud Storage URI of the form gs://bucket_name/object_name. Object versioning is not supported. See Google Cloud Storage Request URIs for more info.
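
As noted for the content field, the JSON representation carries bytes as base64. A minimal sketch of recovering the raw bytes from a Document parsed from JSON follows; the function name `document_bytes` is hypothetical.

```python
import base64
from typing import Optional


def document_bytes(document: dict) -> Optional[bytes]:
    """Decode the inline base64 content of a JSON Document, if present."""
    content = document.get("content")
    if content is None:
        return None
    return base64.b64decode(content)
```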

GoogleCloudDocumentaiV1DocumentEntity

An entity that could be a phrase in the text or a property that belongs to the document. It is a known entity type, such as a person, an organization, or location.
Fields
confidence

number (float format)

Optional. Confidence of detected Schema entity. Range [0, 1].

id

string

Optional. Canonical id. This will be a unique value in the entity list for this document.

mentionId

string

Optional. Deprecated. Use id field instead.

mentionText

string

Optional. Text value of the entity e.g. 1600 Amphitheatre Pkwy.

normalizedValue

object (GoogleCloudDocumentaiV1DocumentEntityNormalizedValue)

Optional. Normalized entity value. Absent if the extracted value could not be converted or the type (e.g. address) is not supported for certain parsers. This field is also only populated for certain supported document types.

pageAnchor

object (GoogleCloudDocumentaiV1DocumentPageAnchor)

Optional. Represents the provenance of this entity with respect to the location on the page where it was found.

properties[]

object (GoogleCloudDocumentaiV1DocumentEntity)

Optional. Entities can be nested to form a hierarchical data structure representing the content in the document.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

Optional. The history of this annotation.

redacted

boolean

Optional. Whether the entity will be redacted for de-identification purposes.

textAnchor

object (GoogleCloudDocumentaiV1DocumentTextAnchor)

Optional. Provenance of the entity. Text anchor indexing into the Document.text.

type

string

Required. Entity type from a schema e.g. Address.
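
Because properties[] holds entities of the same type, the entity list forms a tree. One way to visit every entity together with its nesting depth is sketched below; the generator name `iter_entities` is hypothetical and the input is assumed to be the parsed JSON entities list.

```python
def iter_entities(entities: list, depth: int = 0):
    """Yield (depth, entity) pairs for entities and their nested properties."""
    for entity in entities:
        yield depth, entity
        # Nested entities live under "properties" and share the same shape.
        yield from iter_entities(entity.get("properties", []), depth + 1)
```

For instance, `[(d, e["type"]) for d, e in iter_entities(document.get("entities", []))]` gives a flat view of the hierarchy.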

GoogleCloudDocumentaiV1DocumentEntityNormalizedValue

Parsed and normalized entity value.
Fields
addressValue

object (GoogleTypePostalAddress)

Postal address. See also: https://github.com/googleapis/googleapis/blob/master/google/type/postal_address.proto

booleanValue

boolean

Boolean value. Can be used for entities with binary values, or for checkboxes.

dateValue

object (GoogleTypeDate)

Date value. Includes year, month, day. See also: https://github.com/googleapis/googleapis/blob/master/google/type/date.proto

datetimeValue

object (GoogleTypeDateTime)

DateTime value. Includes date, time, and timezone. See also: https://github.com/googleapis/googleapis/blob/master/google/type/datetime.proto

floatValue

number (float format)

Float value.

integerValue

integer (int32 format)

Integer value.

moneyValue

object (GoogleTypeMoney)

Money value. See also: https://github.com/googleapis/googleapis/blob/master/google/type/money.proto

text

string

Optional. An optional field to store a normalized string. For some entity types, one of the respective structured_value fields may also be populated. Not all types of structured_value are normalized; for example, some processors may not generate float or integer normalized text by default. Below are sample formats mapped to structured values:
- Money/Currency type (money_value) is in the ISO 4217 text format.
- Date type (date_value) is in the ISO 8601 text format.
- Datetime type (datetime_value) is in the ISO 8601 text format.
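
A consumer typically wants the structured value when one is present and the normalized text otherwise. The sketch below shows one way to do that on the parsed JSON; `STRUCTURED_FIELDS` and `structured_value` are hypothetical names, and the lookup order is arbitrary. Note the assumption that a present key means a populated value: JSON serializers may omit default values such as false or 0, in which case this helper falls back to text.

```python
STRUCTURED_FIELDS = (
    "moneyValue",
    "dateValue",
    "datetimeValue",
    "addressValue",
    "booleanValue",
    "integerValue",
    "floatValue",
)


def structured_value(normalized_value: dict) -> tuple:
    """Return (field name, value) for the first structured field that is present."""
    for name in STRUCTURED_FIELDS:
        if name in normalized_value:
            return name, normalized_value[name]
    # No structured field present: fall back to the normalized text (may be None).
    return "text", normalized_value.get("text")
```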

GoogleCloudDocumentaiV1DocumentEntityRelation

Relationship between Entities.
Fields
objectId

string

Object entity id.

relation

string

Relationship description.

subjectId

string

Subject entity id.

GoogleCloudDocumentaiV1DocumentOutputConfig

Config that controls the output of documents. All documents will be written as a JSON file.
Fields
gcsOutputConfig

object (GoogleCloudDocumentaiV1DocumentOutputConfigGcsOutputConfig)

Output config to write the results to Cloud Storage.

GoogleCloudDocumentaiV1DocumentOutputConfigGcsOutputConfig

The configuration used when outputting documents.
Fields
fieldMask

string (FieldMask format)

Specifies which fields to include in the output documents. Only supports top level document and pages field so it must be in the form of {document_field_name} or pages.{page_field_name}.

gcsUri

string

The Cloud Storage URI (a directory) of the output.

shardingConfig

object (GoogleCloudDocumentaiV1DocumentOutputConfigGcsOutputConfigShardingConfig)

Specifies the sharding config for the output document.

GoogleCloudDocumentaiV1DocumentOutputConfigGcsOutputConfigShardingConfig

The sharding config for the output document.
Fields
pagesOverlap

integer (int32 format)

The number of overlapping pages between consecutive shards.

pagesPerShard

integer (int32 format)

The number of pages per shard.
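
To make the interaction of the two fields concrete, the sketch below computes page ranges under one plausible reading: each shard holds up to pagesPerShard pages and begins pagesOverlap pages before the previous shard ends. This is an assumption for illustration only; the exact shard boundaries chosen by the service are not specified in this section, and `shard_ranges` is a hypothetical helper.

```python
def shard_ranges(total_pages: int, pages_per_shard: int, pages_overlap: int = 0) -> list:
    """Return half-open (start, end) page index ranges, one per shard."""
    if pages_per_shard <= 0:
        raise ValueError("pages_per_shard must be positive")
    if not 0 <= pages_overlap < pages_per_shard:
        raise ValueError("pages_overlap must be in [0, pages_per_shard)")
    step = pages_per_shard - pages_overlap  # how far each new shard advances
    ranges = []
    start = 0
    while start < total_pages:
        end = min(start + pages_per_shard, total_pages)
        ranges.append((start, end))
        if end == total_pages:
            break
        start += step
    return ranges
```

Under this reading, a 10-page document with pagesPerShard 4 and pagesOverlap 1 yields three shards that share one page with each neighbor.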

GoogleCloudDocumentaiV1DocumentPage

A page in a Document.
Fields
blocks[]

object (GoogleCloudDocumentaiV1DocumentPageBlock)

A list of visually detected text blocks on the page. A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.

detectedBarcodes[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedBarcode)

A list of detected barcodes.

detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

dimension

object (GoogleCloudDocumentaiV1DocumentPageDimension)

Physical dimension of the page.

formFields[]

object (GoogleCloudDocumentaiV1DocumentPageFormField)

A list of visually detected form fields on the page.

image

object (GoogleCloudDocumentaiV1DocumentPageImage)

Rendered image for this page. This image is preprocessed to remove any skew, rotation, and distortions such that the annotation bounding boxes can be upright and axis-aligned.

imageQualityScores

object (GoogleCloudDocumentaiV1DocumentPageImageQualityScores)

Image Quality Scores.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for the page.

lines[]

object (GoogleCloudDocumentaiV1DocumentPageLine)

A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line.

pageNumber

integer (int32 format)

1-based index for current Page in a parent Document. Useful when a page is taken out of a Document for individual processing.

paragraphs[]

object (GoogleCloudDocumentaiV1DocumentPageParagraph)

A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this page.

symbols[]

object (GoogleCloudDocumentaiV1DocumentPageSymbol)

A list of visually detected symbols on the page.

tables[]

object (GoogleCloudDocumentaiV1DocumentPageTable)

A list of visually detected tables on the page.

tokens[]

object (GoogleCloudDocumentaiV1DocumentPageToken)

A list of visually detected tokens on the page.

transforms[]

object (GoogleCloudDocumentaiV1DocumentPageMatrix)

Transformation matrices that were applied to the original document image to produce Page.image.

visualElements[]

object (GoogleCloudDocumentaiV1DocumentPageVisualElement)

A list of detected non-text visual elements e.g. checkbox, signature etc. on the page.

GoogleCloudDocumentaiV1DocumentPageAnchor

Referencing the visual context of the entity in the Document.pages. Page anchors can be cross-page, consist of multiple bounding polygons and optionally reference specific layout element types.
Fields
pageRefs[]

object (GoogleCloudDocumentaiV1DocumentPageAnchorPageRef)

One or more references to visual page elements.

GoogleCloudDocumentaiV1DocumentPageAnchorPageRef

Represents a weak reference to a page element within a document.
Fields
boundingPoly

object (GoogleCloudDocumentaiV1BoundingPoly)

Optional. Identifies the bounding polygon of a layout element on the page.

confidence

number (float format)

Optional. Confidence of detected page element, if applicable. Range [0, 1].

layoutId

string

Optional. Deprecated. Use PageRef.bounding_poly instead.

layoutType

enum

Optional. The type of the layout element that is being referenced if any.

Enum type. Can be one of the following:
LAYOUT_TYPE_UNSPECIFIED Layout Unspecified.
BLOCK References a Page.blocks element.
PARAGRAPH References a Page.paragraphs element.
LINE References a Page.lines element.
TOKEN References a Page.tokens element.
VISUAL_ELEMENT References a Page.visual_elements element.
TABLE References a Page.tables element.
FORM_FIELD References a Page.form_fields element.
page

string (int64 format)

Required. Index into the Document.pages element, for example using Document.pages to locate the related page element. This field is skipped when its value is the default 0. See https://developers.google.com/protocol-buffers/docs/proto3#json.
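
Two details of the page field matter when reading JSON: int64 values are serialized as strings, and the field is omitted when it is 0. A minimal sketch that accounts for both follows; `resolve_page` is a hypothetical helper over the parsed JSON Document and PageRef.

```python
from typing import Optional


def resolve_page(document: dict, page_ref: dict) -> Optional[dict]:
    """Return the Document.pages element that a PageRef points to, if any."""
    # "page" is an int64 encoded as a string and is absent when its value is 0.
    index = int(page_ref.get("page", 0))
    pages = document.get("pages", [])
    if 0 <= index < len(pages):
        return pages[index]
    return None
```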

GoogleCloudDocumentaiV1DocumentPageBlock

A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for Block.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1DocumentPageDetectedBarcode

A detected barcode.
Fields
barcode

object (GoogleCloudDocumentaiV1Barcode)

Detailed barcode information of the DetectedBarcode.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for DetectedBarcode.

GoogleCloudDocumentaiV1DocumentPageDetectedLanguage

Detected language for a structural component.
Fields
confidence

number (float format)

Confidence of detected language. Range [0, 1].

languageCode

string

The BCP-47 language code, such as en-US or sr-Latn. For more information, see https://www.unicode.org/reports/tr35/#Unicode_locale_identifier.

GoogleCloudDocumentaiV1DocumentPageDimension

Dimension for the page.
Fields
height

number (float format)

Page height.

unit

string

Dimension unit.

width

number (float format)

Page width.

GoogleCloudDocumentaiV1DocumentPageFormField

A form field detected on the page.
Fields
correctedKeyText

string

Created for Labeling UI to export key text. If corrections were made to the text identified by the field_name.text_anchor, this field will contain the correction.

correctedValueText

string

Created for Labeling UI to export value text. If corrections were made to the text identified by the field_value.text_anchor, this field will contain the correction.

fieldName

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for the FormField name. e.g. Address, Email, Grand total, Phone number, etc.

fieldValue

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for the FormField value.

nameDetectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages for name together with confidence.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this annotation.

valueDetectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages for value together with confidence.

valueType

string

If the value is non-textual, this field represents the type. Current valid values are:
- blank (this indicates the field_value is normal text)
- unfilled_checkbox
- filled_checkbox

GoogleCloudDocumentaiV1DocumentPageImage

Rendered image contents for this page.
Fields
content

string (bytes format)

Raw byte content of the image.

height

integer (int32 format)

Height of the image in pixels.

mimeType

string

Encoding mime type for the image.

width

integer (int32 format)

Width of the image in pixels.

GoogleCloudDocumentaiV1DocumentPageImageQualityScores

Image quality scores for the page image.
Fields
detectedDefects[]

object (GoogleCloudDocumentaiV1DocumentPageImageQualityScoresDetectedDefect)

A list of detected defects.

qualityScore

number (float format)

The overall quality score. Range [0, 1] where 1 is perfect quality.

GoogleCloudDocumentaiV1DocumentPageImageQualityScoresDetectedDefect

Image Quality Defects
Fields
confidence

number (float format)

Confidence of detected defect. Range [0, 1] where 1 indicates strong confidence that the defect exists.

type

string

Name of the defect type. Supported values are:
- quality/defect_blurry
- quality/defect_noisy
- quality/defect_dark
- quality/defect_faint
- quality/defect_text_too_small
- quality/defect_document_cutoff
- quality/defect_text_cutoff
- quality/defect_glare

GoogleCloudDocumentaiV1DocumentPageLayout

Visual element describing a layout unit on a page.
Fields
boundingPoly

object (GoogleCloudDocumentaiV1BoundingPoly)

The bounding polygon for the Layout.

confidence

number (float format)

Confidence of the current Layout within context of the object this layout is for. e.g. confidence can be for a single token, a table, a visual element, etc. depending on context. Range [0, 1].

orientation

enum

Detected orientation for the Layout.

Enum type. Can be one of the following:
ORIENTATION_UNSPECIFIED Unspecified orientation.
PAGE_UP Orientation is aligned with page up.
PAGE_RIGHT Orientation is aligned with page right. Turn the head 90 degrees clockwise from upright to read.
PAGE_DOWN Orientation is aligned with page down. Turn the head 180 degrees from upright to read.
PAGE_LEFT Orientation is aligned with page left. Turn the head 90 degrees counterclockwise from upright to read.
textAnchor

object (GoogleCloudDocumentaiV1DocumentTextAnchor)

Text anchor indexing into the Document.text.
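
The orientation values of a Layout describe how far a reader turns their head to read the text. As a rough sketch, the enum names can be mapped to a clockwise angle taken directly from the descriptions in this type; the function name reading_rotation and the dict-based Layout are assumptions made for illustration:

```python
from typing import Optional

# Clockwise head-turn angle in degrees, derived from the enum descriptions.
CLOCKWISE_DEGREES = {
    "PAGE_UP": 0,
    "PAGE_RIGHT": 90,
    "PAGE_DOWN": 180,
    "PAGE_LEFT": 270,  # 90 degrees counterclockwise
}


def reading_rotation(layout: dict) -> Optional[int]:
    """Return the clockwise angle for a Layout, or None if unspecified."""
    orientation = layout.get("orientation", "ORIENTATION_UNSPECIFIED")
    return CLOCKWISE_DEGREES.get(orientation)


print(reading_rotation({"orientation": "PAGE_LEFT"}))
```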

GoogleCloudDocumentaiV1DocumentPageLine

A collection of tokens that a human would perceive as a line. Does not cross column boundaries, can be horizontal, vertical, etc.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for Line.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1DocumentPageMatrix

Representation for transformation matrix, intended to be compatible and used with OpenCV format for image manipulation.
Fields
cols

integer (int32 format)

Number of columns in the matrix.

data

string (bytes format)

The matrix data.

rows

integer (int32 format)

Number of rows in the matrix.

type

integer (int32 format)

This encodes information about what data type the matrix uses. For example, 0 (CV_8U) is an unsigned 8-bit image. For the full list of OpenCV primitive data types, please refer to https://docs.opencv.org/4.3.0/d1/d1b/group__core__hal__interface.html
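
To illustrate how rows, cols, type and data fit together, the sketch below decodes a PageMatrix dict into nested Python lists using only the standard library. It relies on the OpenCV convention that the low three bits of type give the element depth and the remaining bits give the channel count. The little-endian, row-major byte layout and the helper name decode_matrix are assumptions, not something this reference states:

```python
import base64
import struct

# OpenCV depth codes (type & 7) mapped to struct format characters.
_DEPTH_FORMATS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "i", 5: "f", 6: "d"}


def decode_matrix(matrix: dict) -> list:
    """Decode a PageMatrix into a list of rows (assumed little-endian, row-major)."""
    rows = matrix.get("rows", 0)
    cols = matrix.get("cols", 0)
    cv_type = matrix.get("type", 0)
    depth, channels = cv_type & 7, (cv_type >> 3) + 1
    if depth not in _DEPTH_FORMATS:
        raise ValueError(f"unsupported OpenCV depth code: {depth}")
    # Fields in bytes format are base64-encoded in JSON.
    raw = base64.b64decode(matrix.get("data", ""))
    count = rows * cols * channels
    values = struct.unpack(f"<{count}{_DEPTH_FORMATS[depth]}", raw)
    width = cols * channels
    return [list(values[r * width:(r + 1) * width]) for r in range(rows)]


# A 2x3 matrix of 64-bit floats (type 6, CV_64F) holding an identity transform.
payload = base64.b64encode(struct.pack("<6d", 1, 0, 0, 0, 1, 0)).decode("ascii")
print(decode_matrix({"rows": 2, "cols": 3, "type": 6, "data": payload}))
```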

GoogleCloudDocumentaiV1DocumentPageParagraph

A collection of lines that a human would perceive as a paragraph.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for Paragraph.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1DocumentPageSymbol

A detected symbol.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for Symbol.

GoogleCloudDocumentaiV1DocumentPageTable

A table representation similar to HTML table structure.
Fields
bodyRows[]

object (GoogleCloudDocumentaiV1DocumentPageTableTableRow)

Body rows of the table.

detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

headerRows[]

object (GoogleCloudDocumentaiV1DocumentPageTableTableRow)

Header rows of the table.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for Table.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this table.

GoogleCloudDocumentaiV1DocumentPageTableTableCell

A cell representation inside the table.
Fields
colSpan

integer (int32 format)

How many columns this cell spans.

detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for TableCell.

rowSpan

integer (int32 format)

How many rows this cell spans.

GoogleCloudDocumentaiV1DocumentPageTableTableRow

A row of table cells.
Fields
cells[]

object (GoogleCloudDocumentaiV1DocumentPageTableTableCell)

Cells that make up this row.
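
Since the table representation is described as similar to an HTML table, rowSpan and colSpan can be resolved into grid coordinates the way an HTML renderer would. The sketch below assumes that a row lists only the cells that start in it, so slots covered by a span from an earlier row are skipped, and that a missing or zero span means 1. Both are assumptions about the payload, and cell_positions is a hypothetical helper name:

```python
def cell_positions(rows: list) -> list:
    """Return (row, col, cell) for every cell, resolving spans HTML-style."""
    occupied = set()
    placed = []
    for r, row in enumerate(rows):
        c = 0
        for cell in row.get("cells", []):
            # Skip grid slots already covered by a span from an earlier cell.
            while (r, c) in occupied:
                c += 1
            row_span = max(1, cell.get("rowSpan", 1))
            col_span = max(1, cell.get("colSpan", 1))
            for dr in range(row_span):
                for dc in range(col_span):
                    occupied.add((r + dr, c + dc))
            placed.append((r, c, cell))
            c += col_span
    return placed


body_rows = [
    {"cells": [{"rowSpan": 2}, {}]},  # first cell covers two rows
    {"cells": [{}]},                  # lands in column 1, not column 0
]
print([(r, c) for r, c, _ in cell_positions(body_rows)])
```

Header rows and body rows can be passed as one concatenated list if a single coordinate space for the whole table is wanted.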

GoogleCloudDocumentaiV1DocumentPageToken

A detected token.
Fields
detectedBreak

object (GoogleCloudDocumentaiV1DocumentPageTokenDetectedBreak)

Detected break at the end of a Token.

detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for Token.

provenance

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1DocumentPageTokenDetectedBreak

Detected break at the end of a Token.
Fields
type

enum

Detected break type.

Enum type. Can be one of the following:
TYPE_UNSPECIFIED Unspecified break type.
SPACE A single whitespace.
WIDE_SPACE A wider whitespace.
HYPHEN A hyphen that indicates that a token has been split across lines.

GoogleCloudDocumentaiV1DocumentPageVisualElement

Detected non-text visual elements (e.g. checkbox, signature, etc.) on the page.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1DocumentPageLayout)

Layout for VisualElement.

type

string

Type of the VisualElement.

GoogleCloudDocumentaiV1DocumentProvenance

Structure to identify provenance relationships between annotations in different revisions.
Fields
id

integer (int32 format)

The Id of this operation. Needs to be unique within the scope of the revision.

parents[]

object (GoogleCloudDocumentaiV1DocumentProvenanceParent)

References to the original elements that are replaced.

revision

integer (int32 format)

The index of the revision that produced this element.

type

enum

The type of provenance operation.

Enum type. Can be one of the following:
OPERATION_TYPE_UNSPECIFIED Operation type unspecified. If no operation is specified a provenance entry is simply used to match against a parent.
ADD Add an element.
REMOVE Remove an element identified by parent.
REPLACE Replace an element identified by parent.
EVAL_REQUESTED Request human review for the element identified by parent.
EVAL_APPROVED Element is reviewed and approved at human review, confidence will be set to 1.0.
EVAL_SKIPPED Element is skipped in the validation process.

GoogleCloudDocumentaiV1DocumentProvenanceParent

The parent element the current element is based on. Used for referencing/aligning, removal and replacement operations.
Fields
id

integer (int32 format)

The id of the parent provenance.

index

integer (int32 format)

The index of the parent item in the corresponding item list (e.g. list of entities, properties within entities, etc.) in the parent revision.

revision

integer (int32 format)

The index into the current revision's parent_ids list.

GoogleCloudDocumentaiV1DocumentRevision

Contains past or forward revisions of this document.
Fields
agent

string

If the change was made by a person, specify the name or id of that person.

createTime

string (Timestamp format)

The time that the revision was created.

humanReview

object (GoogleCloudDocumentaiV1DocumentRevisionHumanReview)

Human Review information of this revision.

id

string

Id of the revision. Unique within the context of the document.

parent[]

integer (int32 format)

The revisions that this revision is based on. This can include one or more parents (when documents are merged). This field represents the index into the revisions field.

parentIds[]

string

The revisions that this revision is based on. Must include all the ids that have anything to do with this revision, e.g. there are provenance.parent.revision fields that index into this field.

processor

string

If the annotation was made by a processor, identify the processor by its resource name.

GoogleCloudDocumentaiV1DocumentRevisionHumanReview

Human Review information of the document.
Fields
state

string

Human review state. e.g. requested, succeeded, rejected.

stateMessage

string

A message providing more details about the current state of processing. For example, the rejection reason when the state is rejected.

GoogleCloudDocumentaiV1DocumentSchema

The schema defines the output of the processed document by a processor.
Fields
description

string

Description of the schema.

displayName

string

Display name to show to users.

entityTypes[]

object (GoogleCloudDocumentaiV1DocumentSchemaEntityType)

Entity types of the schema.

metadata

object (GoogleCloudDocumentaiV1DocumentSchemaMetadata)

Metadata of the schema.

GoogleCloudDocumentaiV1DocumentSchemaEntityType

EntityType is the wrapper of a label of the corresponding model with detailed attributes and limitations for entity-based processors. Multiple types can also compose a dependency tree to represent nested types.
Fields
baseTypes[]

string

The entity type that this type is derived from. For now, one and only one should be set.

displayName

string

User defined name for the type.

enumValues

object (GoogleCloudDocumentaiV1DocumentSchemaEntityTypeEnumValues)

If specified, lists all the possible values for this entity. This should not be more than a handful of values. If the number of values is >10 or could change frequently use the EntityType.value_ontology field and specify a list of all possible values in a value ontology file.

name

string

Name of the type. It must be unique within the schema file and cannot be a 'Common Type'. Besides that we use the following naming conventions:
- Use snake_casing.
- Name matching is case-sensitive.
- Maximum 64 characters.
- Must start with a letter.
- Allowed characters: ASCII letters [a-z0-9_-]. (For backward compatibility, internal infrastructure and tooling can handle any ASCII character.)
- The / is sometimes used to denote a property of a type, for example line_item/amount. This convention is deprecated, but will still be honored for backward compatibility.

properties[]

object (GoogleCloudDocumentaiV1DocumentSchemaEntityTypeProperty)

Describing the nested structure, or composition of an entity.
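
The naming conventions listed for the name field can be approximated with a single regular expression. This is only a sketch: it checks length, the leading letter and the allowed character set, but it does not check uniqueness within the schema or the 'Common Type' restriction, and it rejects the deprecated / form even though that form is still honored. The function name is hypothetical:

```python
import re

# One lowercase letter followed by up to 63 characters from [a-z0-9_-].
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_-]{0,63}")


def is_valid_entity_type_name(name: str) -> bool:
    """Check an EntityType name against the documented naming conventions."""
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ("line_item", "1st_item", "LineItem", "line_item/amount"):
    print(candidate, is_valid_entity_type_name(candidate))
```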

GoogleCloudDocumentaiV1DocumentSchemaEntityTypeEnumValues

Defines a list of enum values.
Fields
values[]

string

The individual values that this enum values type can include.

GoogleCloudDocumentaiV1DocumentSchemaEntityTypeProperty

Defines properties that can be part of the entity type.
Fields
name

string

The name of the property. Follows the same guidelines as the EntityType name.

occurrenceType

enum

Occurrence type limits the number of instances an entity type appears in the document.

Enum type. Can be one of the following:
OCCURRENCE_TYPE_UNSPECIFIED Unspecified occurrence type.
OPTIONAL_ONCE There will be zero or one instance of this entity type.
OPTIONAL_MULTIPLE The entity type will appear zero or multiple times.
REQUIRED_ONCE The entity type will only appear exactly once.
REQUIRED_MULTIPLE The entity type will appear once or more times.
valueType

string

A reference to the value type of the property. This type is subject to the same conventions as the Entity.base_types field.

GoogleCloudDocumentaiV1DocumentSchemaMetadata

Metadata for global schema behavior.
Fields
documentAllowMultipleLabels

boolean

If true, on a given page, there can be multiple document annotations covering it.

documentSplitter

boolean

If true, a document entity type can be applied to a subdocument (splitting). Otherwise, it can only be applied to the entire document (classification).

prefixedNamingOnProperties

boolean

If set, all the nested entities must be prefixed with the parents.

skipNamingValidation

boolean

If set, we will skip the naming format validation in the schema. So the string values in DocumentSchema.EntityType.name and DocumentSchema.EntityType.Property.name will not be checked.

GoogleCloudDocumentaiV1DocumentShardInfo

For a large document, sharding may be performed to produce several document shards. Each document shard contains this field to detail which shard it is.
Fields
shardCount

string (int64 format)

Total number of shards.

shardIndex

string (int64 format)

The 0-based index of this shard.

textOffset

string (int64 format)

The index of the first character in Document.text in the overall document global text.
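
Given the description of textOffset, an index into one shard's Document.text can be translated into a position in the overall document text by adding the shard's offset. A minimal sketch, assuming ShardInfo is available as a parsed dict; to_global_index is a hypothetical helper:

```python
def to_global_index(shard_info: dict, local_index: int) -> int:
    """Map an index in this shard's Document.text to the global document text."""
    # int64 fields are JSON strings; a missing textOffset means offset 0.
    return int(shard_info.get("textOffset", 0)) + local_index


second_shard = {"shardIndex": "1", "shardCount": "3", "textOffset": "12000"}
print(to_global_index(second_shard, 25))
```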

GoogleCloudDocumentaiV1DocumentStyle

Annotation for common text style attributes. This adheres to CSS conventions as much as possible.
Fields
backgroundColor

object (GoogleTypeColor)

Text background color.

color

object (GoogleTypeColor)

Text color.

fontFamily

string

Font family such as Arial, Times New Roman. https://www.w3schools.com/cssref/pr_font_font-family.asp

fontSize

object (GoogleCloudDocumentaiV1DocumentStyleFontSize)

Font size.

fontWeight

string

Font weight. Possible values are normal, bold, bolder, and lighter. https://www.w3schools.com/cssref/pr_font_weight.asp

textAnchor

object (GoogleCloudDocumentaiV1DocumentTextAnchor)

Text anchor indexing into the Document.text.

textDecoration

string

Text decoration. Follows CSS standard. https://www.w3schools.com/cssref/pr_text_text-decoration.asp

textStyle

string

Text style. Possible values are normal, italic, and oblique. https://www.w3schools.com/cssref/pr_font_font-style.asp

GoogleCloudDocumentaiV1DocumentStyleFontSize

Font size with unit.
Fields
size

number (float format)

Font size for the text.

unit

string

Unit for the font size. Follows CSS naming (in, px, pt, etc.).

GoogleCloudDocumentaiV1DocumentTextAnchor

Text reference indexing into the Document.text.
Fields
content

string

Contains the content of the text span so that users do not have to look it up in the text_segments. It is always populated for formFields.

textSegments[]

object (GoogleCloudDocumentaiV1DocumentTextAnchorTextSegment)

The text segments from the Document.text.

GoogleCloudDocumentaiV1DocumentTextAnchorTextSegment

A text segment in the Document.text. The indices may be out of bounds, which indicates that the text extends into another document shard for large sharded documents. See ShardInfo.text_offset.
Fields
endIndex

string (int64 format)

TextSegment half open end UTF-8 char index in the Document.text.

startIndex

string (int64 format)

TextSegment start UTF-8 char index in the Document.text.
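
A TextAnchor is resolved by slicing Document.text with each segment's half-open [startIndex, endIndex) range and joining the pieces. The sketch below assumes the anchor and the text are already parsed from JSON, that a missing startIndex means 0, and that the indices can be applied directly to a Python str; that last point is an assumption worth verifying for text outside ASCII. The helper name anchor_text is hypothetical:

```python
def anchor_text(text_anchor: dict, document_text: str) -> str:
    """Return the text referenced by a TextAnchor's segments."""
    parts = []
    for segment in text_anchor.get("textSegments", []):
        # int64 fields are JSON strings; a missing startIndex is read as 0.
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        parts.append(document_text[start:end])
    return "".join(parts)


text = "Invoice 42\nTotal due"
anchor = {"textSegments": [{"endIndex": "7"}, {"startIndex": "7", "endIndex": "10"}]}
print(repr(anchor_text(anchor, text)))
```

Slicing clamps at the end of the string, so a segment that runs past this shard's text yields only the part that is present.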

GoogleCloudDocumentaiV1DocumentTextChange

This message is used for text changes aka. OCR corrections.
Fields
changedText

string

The text that replaces the text identified in the text_anchor.

provenance[]

object (GoogleCloudDocumentaiV1DocumentProvenance)

The history of this annotation.

textAnchor

object (GoogleCloudDocumentaiV1DocumentTextAnchor)

Text anchor indexing into the Document.text. There can only be a single TextAnchor.text_segments element. If the start and end index of the text segment are the same, the text change is inserted before that index.

GoogleCloudDocumentaiV1EnableProcessorMetadata

The long running operation metadata for enable processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1FetchProcessorTypesResponse

Response message for fetch processor types.
Fields
processorTypes[]

object (GoogleCloudDocumentaiV1ProcessorType)

The list of processor types.

GoogleCloudDocumentaiV1GcsDocument

Specifies a document stored on Cloud Storage.
Fields
gcsUri

string

The Cloud Storage object URI.

mimeType

string

An IANA MIME type (RFC6838) of the content.

GoogleCloudDocumentaiV1GcsDocuments

Specifies a set of documents on Cloud Storage.
Fields
documents[]

object (GoogleCloudDocumentaiV1GcsDocument)

The list of documents.

GoogleCloudDocumentaiV1GcsPrefix

Specifies all documents on Cloud Storage with a common prefix.
Fields
gcsUriPrefix

string

The URI prefix.

GoogleCloudDocumentaiV1HumanReviewStatus

The status of human review on a processed document.
Fields
humanReviewOperation

string

The name of the operation triggered by the processed document. This field is populated only when the [state] is [HUMAN_REVIEW_IN_PROGRESS]. It has the same response type and metadata as the long running operation returned by [ReviewDocument] method.

state

enum

The state of human review on the processing request.

Enum type. Can be one of the following:
STATE_UNSPECIFIED Human review state is unspecified. Most likely due to an internal error.
SKIPPED Human review is skipped for the document. This can happen because human review is not enabled on the processor or the processing request has been set to skip this document.
VALIDATION_PASSED Human review validation is triggered and passed, so no review is needed.
IN_PROGRESS Human review validation is triggered and the document is under review.
ERROR An error occurred while triggering human review; see the [state_message] for details.
stateMessage

string

A message providing more details about the human review state.

GoogleCloudDocumentaiV1ListProcessorTypesResponse

Response message for list processor types.
Fields
nextPageToken

string

Points to the next page, otherwise empty.

processorTypes[]

object (GoogleCloudDocumentaiV1ProcessorType)

The processor types.
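
The nextPageToken field implies the usual pagination loop: request a page, consume processorTypes, and repeat with the returned token until it is empty. The sketch below shows the loop only; fetch_page is a hypothetical callable standing in for whatever performs the actual list call and returns the parsed response:

```python
from typing import Callable, Iterator, Optional


def iter_processor_types(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every processor type across all pages of a list response."""
    token = None
    while True:
        response = fetch_page(token)
        yield from response.get("processorTypes", [])
        token = response.get("nextPageToken")
        if not token:  # empty or missing token means this was the last page
            break


# Stand-in for the real call: two canned pages keyed by page token.
_pages = {
    None: {"processorTypes": [{"type": "OCR_PROCESSOR"}], "nextPageToken": "p2"},
    "p2": {"processorTypes": [{"type": "INVOICE_PROCESSOR"}]},
}
print([p["type"] for p in iter_processor_types(lambda token: _pages[token])])
```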

GoogleCloudDocumentaiV1ListProcessorVersionsResponse

Response message for list processor versions.
Fields
nextPageToken

string

Points to the next page, otherwise empty.

processorVersions[]

object (GoogleCloudDocumentaiV1ProcessorVersion)

The list of processor versions.

GoogleCloudDocumentaiV1ListProcessorsResponse

Response message for list processors.
Fields
nextPageToken

string

Points to the next page, otherwise empty.

processors[]

object (GoogleCloudDocumentaiV1Processor)

The list of processors.

GoogleCloudDocumentaiV1NormalizedVertex

A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
Fields
x

number (float format)

X coordinate.

y

number (float format)

Y coordinate (starts from the top of the image).
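
Because normalized coordinates range from 0 to 1 relative to the original image, they convert to pixel coordinates by multiplying with the image width and height. A small sketch, assuming the vertex is a parsed dict and that an omitted coordinate means 0; to_pixels is a hypothetical helper, and the width and height would come from the page image:

```python
def to_pixels(vertex: dict, image_width: int, image_height: int) -> tuple:
    """Convert a NormalizedVertex to integer pixel coordinates."""
    x = vertex.get("x", 0.0) * image_width
    y = vertex.get("y", 0.0) * image_height  # y grows downward from the top
    return (round(x), round(y))


print(to_pixels({"x": 0.5, "y": 0.25}, 1000, 800))
```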

GoogleCloudDocumentaiV1ProcessRequest

Request message for the process document method.
Fields
fieldMask

string (FieldMask format)

Specifies which fields to include in ProcessResponse's document. Only supports top level document and pages field so it must be in the form of {document_field_name} or pages.{page_field_name}.

inlineDocument

object (GoogleCloudDocumentaiV1Document)

An inline document proto.

rawDocument

object (GoogleCloudDocumentaiV1RawDocument)

A raw document content (bytes).

skipHumanReview

boolean

Whether the Human Review feature should be skipped for this request. Defaults to false.
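
As an illustration of how these fields combine, the sketch below assembles a ProcessRequest body as a plain dict. Fields in bytes format are base64-encoded in JSON, and a field mask is written as a comma-separated list of paths. The helper name and the example mask value text,pages.tables are illustrative choices, not values taken from this reference:

```python
import base64
import json


def build_process_request(content: bytes, mime_type: str, field_mask: str = "") -> dict:
    """Build a ProcessRequest body that sends the document as a rawDocument."""
    body = {
        "rawDocument": {
            # bytes fields travel as base64 text in JSON
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        },
        "skipHumanReview": True,
    }
    if field_mask:
        # Top-level document fields or pages.{page_field_name} only.
        body["fieldMask"] = field_mask
    return body


request = build_process_request(b"%PDF-1.4 ...", "application/pdf", "text,pages.tables")
print(json.dumps(request, indent=2))
```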

GoogleCloudDocumentaiV1ProcessResponse

Response message for the process document method.
Fields
document

object (GoogleCloudDocumentaiV1Document)

The document payload, will populate fields based on the processor's behavior.

humanReviewStatus

object (GoogleCloudDocumentaiV1HumanReviewStatus)

The status of human review on the processed document.

GoogleCloudDocumentaiV1Processor

The first-class citizen for Document AI. Each processor defines how to extract structural information from a document.
Fields
createTime

string (Timestamp format)

The time the processor was created.

defaultProcessorVersion

string

The default processor version.

displayName

string

The display name of the processor.

kmsKeyName

string

The KMS key used for encryption/decryption in CMEK scenarios. See https://cloud.google.com/security-key-management.

name

string

Output only. Immutable. The resource name of the processor. Format: projects/{project}/locations/{location}/processors/{processor}

processEndpoint

string

Output only. Immutable. The HTTP endpoint that can be called to invoke processing.

state

enum

Output only. The state of the processor.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The processor is in an unspecified state.
ENABLED The processor is enabled, i.e., has an enabled version which can currently serve processing requests and all the feature dependencies have been successfully initialized.
DISABLED The processor is disabled.
ENABLING The processor is being enabled, will become ENABLED if successful.
DISABLING The processor is being disabled, will become DISABLED if successful.
CREATING The processor is being created, will become either ENABLED (for successful creation) or FAILED (for failed ones). Once a processor is in this state, it can then be used for document processing, but the feature dependencies of the processor might not be fully created yet.
FAILED The processor failed during creation or initialization of feature dependencies. The user should delete the processor and recreate one as all the functionalities of the processor are disabled.
DELETING The processor is being deleted, will be removed if successful.
type

string

The processor type, e.g., OCR_PROCESSOR, INVOICE_PROCESSOR, etc. To get a list of processor types, see FetchProcessorTypes.

GoogleCloudDocumentaiV1ProcessorType

A processor type is responsible for performing a certain document understanding task on a certain type of document.
Fields
allowCreation

boolean

Whether the processor type allows creation. If true, users can create a processor of this processor type. Otherwise, users need to request access.

availableLocations[]

object (GoogleCloudDocumentaiV1ProcessorTypeLocationInfo)

The locations in which this processor is available.

category

string

The processor category, used by UI to group processor types.

launchStage

enum

Launch stage of the processor type.

Enum type. Can be one of the following:
LAUNCH_STAGE_UNSPECIFIED Do not use this default value.
UNIMPLEMENTED The feature is not yet implemented. Users can not use it.
PRELAUNCH Prelaunch features are hidden from users and are only visible internally.
EARLY_ACCESS Early Access features are limited to a closed group of testers. To use these features, you must sign up in advance and sign a Trusted Tester agreement (which includes confidentiality provisions). These features may be unstable, changed in backward-incompatible ways, and are not guaranteed to be released.
ALPHA Alpha is a limited availability test for releases before they are cleared for widespread use. By Alpha, all significant design issues are resolved and we are in the process of verifying functionality. Alpha customers need to apply for access, agree to applicable terms, and have their projects allowlisted. Alpha releases don't have to be feature complete, no SLAs are provided, and there are no technical support obligations, but they will be far enough along that customers can actually use them in test environments or for limited-use tests -- just like they would in normal production cases.
BETA Beta is the point at which we are ready to open a release for any customer to use. There are no SLA or technical support obligations in a Beta release. Products will be complete from a feature perspective, but may have some open outstanding issues. Beta releases are suitable for limited production use cases.
GA GA features are open to all developers and are considered stable and fully qualified for production use.
DEPRECATED Deprecated features are scheduled to be shut down and removed. For more information, see the "Deprecation Policy" section of our Terms of Service and the Google Cloud Platform Subject to the Deprecation Policy documentation.
name

string

The resource name of the processor type. Format: projects/{project}/processorTypes/{processor_type}

sampleDocumentUris[]

string

A set of Cloud Storage URIs of sample documents for this processor.

type

string

The processor type, e.g., OCR_PROCESSOR, INVOICE_PROCESSOR, etc.

GoogleCloudDocumentaiV1ProcessorTypeLocationInfo

The location information about where the processor is available.
Fields
locationId

string

The location id, currently must be one of [us, eu].

GoogleCloudDocumentaiV1ProcessorVersion

A processor version is an implementation of a processor. Each processor can have multiple versions, pre-trained by Google internally or up-trained by the customer. A processor can have only one default version at a time, so the processor's behavior (when processing documents) is defined by its default version.
Fields
createTime

string (Timestamp format)

The time the processor version was created.

deprecationInfo

object (GoogleCloudDocumentaiV1ProcessorVersionDeprecationInfo)

If set, information about the eventual deprecation of this version.

displayName

string

The display name of the processor version.

documentSchema

object (GoogleCloudDocumentaiV1DocumentSchema)

The schema of the processor version. Describes the output.

googleManaged

boolean

Denotes that this ProcessorVersion is managed by Google.

kmsKeyName

string

The KMS key name used for encryption.

kmsKeyVersionName

string

The KMS key version with which data is encrypted.

name

string

The resource name of the processor version. Format: projects/{project}/locations/{location}/processors/{processor}/processorVersions/{processor_version}

state

enum

The state of the processor version.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The processor version is in an unspecified state.
DEPLOYED The processor version is deployed and can be used for processing.
DEPLOYING The processor version is being deployed.
UNDEPLOYED The processor version is not deployed and cannot be used for processing.
UNDEPLOYING The processor version is being undeployed.
CREATING The processor version is being created.
DELETING The processor version is being deleted.
FAILED The processor version failed and is in an indeterminate state.

GoogleCloudDocumentaiV1ProcessorVersionDeprecationInfo

Information about the upcoming deprecation of this processor version.
Fields
deprecationTime

string (Timestamp format)

The time at which this processor version will be deprecated.

replacementProcessorVersion

string

If set, the processor version that will be used as a replacement.

GoogleCloudDocumentaiV1RawDocument

Payload message of raw document content (bytes).
Fields
content

string (bytes format)

Inline document content.

mimeType

string

An IANA MIME type (RFC6838) indicating the nature and format of the content.

GoogleCloudDocumentaiV1ReviewDocumentOperationMetadata

The long running operation metadata for review document method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

questionId

string

The Crowd Compute question ID.

GoogleCloudDocumentaiV1ReviewDocumentRequest

Request message for review document method.
Fields
documentSchema

object (GoogleCloudDocumentaiV1DocumentSchema)

The document schema of the human review task.

enableSchemaValidation

boolean

Whether the validation should be performed on the ad-hoc review request.

inlineDocument

object (GoogleCloudDocumentaiV1Document)

An inline document proto.

priority

enum

The priority of the human review task.

Enum type. Can be one of the following:
DEFAULT The default priority level.
URGENT The urgent priority level. The labeling manager should allocate labeler resource to the urgent task queue to respect this priority level.

GoogleCloudDocumentaiV1ReviewDocumentResponse

Response message for review document method.
Fields
gcsDestination

string

The Cloud Storage URI for the human reviewed document if the review succeeded.

rejectionReason

string

The reason why the review was rejected by the reviewer.

state

enum

The state of the review operation.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The default value. This value is used if the state is omitted.
REJECTED The review operation is rejected by the reviewer.
SUCCEEDED The review operation succeeded.

GoogleCloudDocumentaiV1SetDefaultProcessorVersionMetadata

The long running operation metadata for set default processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1SetDefaultProcessorVersionRequest

Request message for the set default processor version method.
Fields
defaultProcessorVersion

string

Required. The resource name of child ProcessorVersion to use as default. Format: projects/{project}/locations/{location}/processors/{processor}/processorVersions/{version}
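
The resource name format above can be filled in programmatically before the request is sent. A minimal sketch; the helper name and the example identifiers are made up for illustration:

```python
def set_default_version_request(project: str, location: str, processor: str, version: str) -> dict:
    """Build a SetDefaultProcessorVersionRequest body from its name components."""
    name = (
        f"projects/{project}/locations/{location}"
        f"/processors/{processor}/processorVersions/{version}"
    )
    return {"defaultProcessorVersion": name}


# All four identifiers are placeholder values.
print(set_default_version_request("my-project", "us", "abc123", "v1"))
```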

GoogleCloudDocumentaiV1UndeployProcessorVersionMetadata

The long running operation metadata for the undeploy processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1Vertex

A vertex represents a 2D point in the image. NOTE: the vertex coordinates are in the same scale as the original image.
Fields
x

integer (int32 format)

X coordinate.

y

integer (int32 format)

Y coordinate (starts from the top of the image).

GoogleCloudDocumentaiV1beta1Barcode

Encodes the detailed information of a barcode.
Fields
format

string

Format of a barcode. The supported formats are: - CODE_128: Code 128 type. - CODE_39: Code 39 type. - CODE_93: Code 93 type. - CODABAR: Codabar type. - DATA_MATRIX: 2D Data Matrix type. - ITF: ITF type. - EAN_13: EAN-13 type. - EAN_8: EAN-8 type. - QR_CODE: 2D QR code type. - UPC_A: UPC-A type. - UPC_E: UPC-E type. - PDF417: PDF417 type. - AZTEC: 2D Aztec code type. - DATABAR: GS1 DataBar code type.

rawValue

string

Raw value encoded in the barcode. For example: 'MEBKM:TITLE:Google;URL:https://www.google.com;;'.

valueFormat

string

Value format describes the format of the value that a barcode encodes. The supported formats are: - CONTACT_INFO: Contact information. - EMAIL: Email address. - ISBN: ISBN identifier. - PHONE: Phone number. - PRODUCT: Product. - SMS: SMS message. - TEXT: Text string. - URL: URL address. - WIFI: Wifi information. - GEO: Geo-localization. - CALENDAR_EVENT: Calendar event. - DRIVER_LICENSE: Driver's license.

GoogleCloudDocumentaiV1beta1BatchProcessDocumentsResponse

Response to a batch document processing request. This is returned in the LRO Operation after the operation is complete.
Fields
responses[]

object (GoogleCloudDocumentaiV1beta1ProcessDocumentResponse)

Responses for each individual document.

GoogleCloudDocumentaiV1beta1BoundingPoly

A bounding polygon for the detected image annotation.
Fields
normalizedVertices[]

object (GoogleCloudDocumentaiV1beta1NormalizedVertex)

The bounding polygon normalized vertices.

vertices[]

object (GoogleCloudDocumentaiV1beta1Vertex)

The bounding polygon vertices.

GoogleCloudDocumentaiV1beta1Document

Document represents the canonical document resource in Document AI. It is an interchange format that provides insights into documents and allows for collaboration between users and Document AI to iterate and optimize for quality.
Fields
content

string (bytes format)

Optional. Inline document content, represented as a stream of bytes. Note: As with all bytes fields, protobuffers use a pure binary representation, whereas JSON representations use base64.
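
The base64 note above can be illustrated with a short sketch that builds a JSON Document body and reads the bytes back. The sample bytes are a placeholder, not a real PDF, and the dictionary only shows the two fields involved.

```python
import base64
import json

# Placeholder bytes standing in for real file content.
raw = b"%PDF-1.7 example bytes"

# In JSON, a bytes field carries base64 text rather than raw binary.
document = {
    "content": base64.b64encode(raw).decode("ascii"),
    "mimeType": "application/pdf",
}
payload = json.dumps(document)

# Reading the field back: decode the base64 text to recover the bytes.
decoded = base64.b64decode(json.loads(payload)["content"])
```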

entities[]

object (GoogleCloudDocumentaiV1beta1DocumentEntity)

A list of entities detected on Document.text. For document shards, entities in this list may cross shard boundaries.

entityRelations[]

object (GoogleCloudDocumentaiV1beta1DocumentEntityRelation)

Placeholder. Relationship among Document.entities.

error

object (GoogleRpcStatus)

Any error that occurred while processing this document.

mimeType

string

An IANA published MIME type (also referred to as media type). For more information, see https://www.iana.org/assignments/media-types/media-types.xhtml.

pages[]

object (GoogleCloudDocumentaiV1beta1DocumentPage)

Visual page layout for the Document.

revisions[]

object (GoogleCloudDocumentaiV1beta1DocumentRevision)

Placeholder. Revision history of this document.

shardInfo

object (GoogleCloudDocumentaiV1beta1DocumentShardInfo)

Information about the sharding if this document is a sharded part of a larger document. If the document is not sharded, this message is not specified.

text

string

Optional. UTF-8 encoded text in reading order from the document.

textChanges[]

object (GoogleCloudDocumentaiV1beta1DocumentTextChange)

Placeholder. A list of text corrections made to Document.text. This is usually used for annotating corrections to OCR mistakes. Text changes for a given revision may not overlap with each other.

textStyles[]

object (GoogleCloudDocumentaiV1beta1DocumentStyle)

Styles for the Document.text.

uri

string

Optional. Currently supports Google Cloud Storage URI of the form gs://bucket_name/object_name. Object versioning is not supported. See Google Cloud Storage Request URIs for more info.

GoogleCloudDocumentaiV1beta1DocumentEntity

An entity that could be a phrase in the text or a property that belongs to the document. It is a known entity type, such as a person, an organization, or location.
Fields
confidence

number (float format)

Optional. Confidence of detected Schema entity. Range [0, 1].

id

string

Optional. Canonical id. This will be a unique value in the entity list for this document.

mentionId

string

Optional. Deprecated. Use id field instead.

mentionText

string

Optional. Text value of the entity e.g. 1600 Amphitheatre Pkwy.

normalizedValue

object (GoogleCloudDocumentaiV1beta1DocumentEntityNormalizedValue)

Optional. Normalized entity value. Absent if the extracted value could not be converted or the type (e.g. address) is not supported for certain parsers. This field is also only populated for certain supported document types.

pageAnchor

object (GoogleCloudDocumentaiV1beta1DocumentPageAnchor)

Optional. Represents the provenance of this entity with respect to the location on the page where it was found.

properties[]

object (GoogleCloudDocumentaiV1beta1DocumentEntity)

Optional. Entities can be nested to form a hierarchical data structure representing the content in the document.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

Optional. The history of this annotation.

redacted

boolean

Optional. Whether the entity will be redacted for de-identification purposes.

textAnchor

object (GoogleCloudDocumentaiV1beta1DocumentTextAnchor)

Optional. Provenance of the entity. Text anchor indexing into the Document.text.

type

string

Required. Entity type from a schema e.g. Address.

GoogleCloudDocumentaiV1beta1DocumentEntityNormalizedValue

Parsed and normalized entity value.
Fields
addressValue

object (GoogleTypePostalAddress)

Postal address. See also: https://github.com/googleapis/googleapis/blob/master/google/type/postal_address.proto

booleanValue

boolean

Boolean value. Can be used for entities with binary values, or for checkboxes.

dateValue

object (GoogleTypeDate)

Date value. Includes year, month, day. See also: https://github.com/googleapis/googleapis/blob/master/google/type/date.proto

datetimeValue

object (GoogleTypeDateTime)

DateTime value. Includes date, time, and timezone. See also: https://github.com/googleapis/googleapis/blob/master/google/type/datetime.proto

floatValue

number (float format)

Float value.

integerValue

integer (int32 format)

Integer value.

moneyValue

object (GoogleTypeMoney)

Money value. See also: https://github.com/googleapis/googleapis/blob/master/google/type/money.proto

text

string

Optional. An optional field to store a normalized string. For some entity types, one of the respective structured_value fields may also be populated. Also, not all types of structured_value will be normalized. For example, some processors may not generate float or integer normalized text by default. Below are sample formats mapped to structured values. - Money/Currency type (money_value) is in the ISO 4217 text format. - Date type (date_value) is in the ISO 8601 text format. - Datetime type (datetime_value) is in the ISO 8601 text format.

GoogleCloudDocumentaiV1beta1DocumentEntityRelation

Relationship between Entities.
Fields
objectId

string

Object entity id.

relation

string

Relationship description.

subjectId

string

Subject entity id.

GoogleCloudDocumentaiV1beta1DocumentPage

A page in a Document.
Fields
blocks[]

object (GoogleCloudDocumentaiV1beta1DocumentPageBlock)

A list of visually detected text blocks on the page. A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.

detectedBarcodes[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedBarcode)

A list of detected barcodes.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

dimension

object (GoogleCloudDocumentaiV1beta1DocumentPageDimension)

Physical dimension of the page.

formFields[]

object (GoogleCloudDocumentaiV1beta1DocumentPageFormField)

A list of visually detected form fields on the page.

image

object (GoogleCloudDocumentaiV1beta1DocumentPageImage)

Rendered image for this page. This image is preprocessed to remove any skew, rotation, and distortions such that the annotation bounding boxes can be upright and axis-aligned.

imageQualityScores

object (GoogleCloudDocumentaiV1beta1DocumentPageImageQualityScores)

Image Quality Scores.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for the page.

lines[]

object (GoogleCloudDocumentaiV1beta1DocumentPageLine)

A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line.

pageNumber

integer (int32 format)

1-based index for current Page in a parent Document. Useful when a page is taken out of a Document for individual processing.

paragraphs[]

object (GoogleCloudDocumentaiV1beta1DocumentPageParagraph)

A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this page.

symbols[]

object (GoogleCloudDocumentaiV1beta1DocumentPageSymbol)

A list of visually detected symbols on the page.

tables[]

object (GoogleCloudDocumentaiV1beta1DocumentPageTable)

A list of visually detected tables on the page.

tokens[]

object (GoogleCloudDocumentaiV1beta1DocumentPageToken)

A list of visually detected tokens on the page.

transforms[]

object (GoogleCloudDocumentaiV1beta1DocumentPageMatrix)

Transformation matrices that were applied to the original document image to produce Page.image.

visualElements[]

object (GoogleCloudDocumentaiV1beta1DocumentPageVisualElement)

A list of detected non-text visual elements e.g. checkbox, signature etc. on the page.

GoogleCloudDocumentaiV1beta1DocumentPageAnchor

Referencing the visual context of the entity in the Document.pages. Page anchors can be cross-page, consist of multiple bounding polygons and optionally reference specific layout element types.
Fields
pageRefs[]

object (GoogleCloudDocumentaiV1beta1DocumentPageAnchorPageRef)

One or more references to visual page elements.

GoogleCloudDocumentaiV1beta1DocumentPageAnchorPageRef

Represents a weak reference to a page element within a document.
Fields
boundingPoly

object (GoogleCloudDocumentaiV1beta1BoundingPoly)

Optional. Identifies the bounding polygon of a layout element on the page.

confidence

number (float format)

Optional. Confidence of detected page element, if applicable. Range [0, 1].

layoutId

string

Optional. Deprecated. Use PageRef.bounding_poly instead.

layoutType

enum

Optional. The type of the layout element that is being referenced if any.

Enum type. Can be one of the following:
LAYOUT_TYPE_UNSPECIFIED Layout Unspecified.
BLOCK References a Page.blocks element.
PARAGRAPH References a Page.paragraphs element.
LINE References a Page.lines element.
TOKEN References a Page.tokens element.
VISUAL_ELEMENT References a Page.visual_elements element.
TABLE References a Page.tables element.
FORM_FIELD References a Page.form_fields element.
page

string (int64 format)

Required. Index into the Document.pages element, for example using Document.pages to locate the related page element. This field is skipped when its value is the default 0. See https://developers.google.com/protocol-buffers/docs/proto3#json.
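
A minimal sketch of resolving page references against a JSON Document follows. It assumes the int64 page index arrives as a string and is absent when it is 0, as the description above states. The function name and sample data are hypothetical.

```python
def pages_for_anchor(document: dict, page_anchor: dict) -> list:
    """Return the page objects referenced by a PageAnchor."""
    pages = document.get("pages", [])
    # "page" is an int64 serialized as a string; a missing key means index 0.
    return [pages[int(ref.get("page", 0))] for ref in page_anchor.get("pageRefs", [])]


# Hypothetical two-page document and an anchor that spans both pages.
doc = {"pages": [{"pageNumber": 1}, {"pageNumber": 2}]}
anchor = {"pageRefs": [{}, {"page": "1"}]}
referenced = [p["pageNumber"] for p in pages_for_anchor(doc, anchor)]
```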

GoogleCloudDocumentaiV1beta1DocumentPageBlock

A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for Block.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta1DocumentPageDetectedBarcode

A detected barcode.
Fields
barcode

object (GoogleCloudDocumentaiV1beta1Barcode)

Detailed barcode information of the DetectedBarcode.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for DetectedBarcode.

GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage

Detected language for a structural component.
Fields
confidence

number (float format)

Confidence of detected language. Range [0, 1].

languageCode

string

The BCP-47 language code, such as en-US or sr-Latn. For more information, see https://www.unicode.org/reports/tr35/#Unicode_locale_identifier.

GoogleCloudDocumentaiV1beta1DocumentPageDimension

Dimension for the page.
Fields
height

number (float format)

Page height.

unit

string

Dimension unit.

width

number (float format)

Page width.

GoogleCloudDocumentaiV1beta1DocumentPageFormField

A form field detected on the page.
Fields
correctedKeyText

string

Created for Labeling UI to export key text. If corrections were made to the text identified by the field_name.text_anchor, this field will contain the correction.

correctedValueText

string

Created for Labeling UI to export value text. If corrections were made to the text identified by the field_value.text_anchor, this field will contain the correction.

fieldName

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for the FormField name. e.g. Address, Email, Grand total, Phone number, etc.

fieldValue

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for the FormField value.

nameDetectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages for name together with confidence.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this annotation.

valueDetectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages for value together with confidence.

valueType

string

If the value is non-textual, this field represents the type. Current valid values are: - blank (this indicates the field_value is normal text) - unfilled_checkbox - filled_checkbox

GoogleCloudDocumentaiV1beta1DocumentPageImage

Rendered image contents for this page.
Fields
content

string (bytes format)

Raw byte content of the image.

height

integer (int32 format)

Height of the image in pixels.

mimeType

string

Encoding mime type for the image.

width

integer (int32 format)

Width of the image in pixels.

GoogleCloudDocumentaiV1beta1DocumentPageImageQualityScores

Image quality scores for the page image.
Fields
detectedDefects[]

object (GoogleCloudDocumentaiV1beta1DocumentPageImageQualityScoresDetectedDefect)

A list of detected defects.

qualityScore

number (float format)

The overall quality score. Range [0, 1] where 1 is perfect quality.

GoogleCloudDocumentaiV1beta1DocumentPageImageQualityScoresDetectedDefect

Image Quality Defects
Fields
confidence

number (float format)

Confidence of detected defect. Range [0, 1] where 1 indicates strong confidence that the defect exists.

type

string

Name of the defect type. Supported values are: - quality/defect_blurry - quality/defect_noisy - quality/defect_dark - quality/defect_faint - quality/defect_text_too_small - quality/defect_document_cutoff - quality/defect_text_cutoff - quality/defect_glare

GoogleCloudDocumentaiV1beta1DocumentPageLayout

Visual element describing a layout unit on a page.
Fields
boundingPoly

object (GoogleCloudDocumentaiV1beta1BoundingPoly)

The bounding polygon for the Layout.

confidence

number (float format)

Confidence of the current Layout within context of the object this layout is for. e.g. confidence can be for a single token, a table, a visual element, etc. depending on context. Range [0, 1].

orientation

enum

Detected orientation for the Layout.

Enum type. Can be one of the following:
ORIENTATION_UNSPECIFIED Unspecified orientation.
PAGE_UP Orientation is aligned with page up.
PAGE_RIGHT Orientation is aligned with page right. Turn the head 90 degrees clockwise from upright to read.
PAGE_DOWN Orientation is aligned with page down. Turn the head 180 degrees from upright to read.
PAGE_LEFT Orientation is aligned with page left. Turn the head 90 degrees counterclockwise from upright to read.
textAnchor

object (GoogleCloudDocumentaiV1beta1DocumentTextAnchor)

Text anchor indexing into the Document.text.

GoogleCloudDocumentaiV1beta1DocumentPageLine

A collection of tokens that a human would perceive as a line. Does not cross column boundaries, can be horizontal, vertical, etc.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for Line.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta1DocumentPageMatrix

Representation for transformation matrix, intended to be compatible and used with OpenCV format for image manipulation.
Fields
cols

integer (int32 format)

Number of columns in the matrix.

data

string (bytes format)

The matrix data.

rows

integer (int32 format)

Number of rows in the matrix.

type

integer (int32 format)

This encodes information about what data type the matrix uses. For example, 0 (CV_8U) is an unsigned 8-bit image. For the full list of OpenCV primitive data types, please refer to https://docs.opencv.org/4.3.0/d1/d1b/group__core__hal__interface.html
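
The following sketch decodes a matrix from its JSON form using only the standard library. It relies on OpenCV's convention that the low three bits of the type give the element depth and the remaining bits give the channel count minus one. The little-endian byte order, row-major layout, and single-channel restriction are assumptions of this example, not statements from the API description.

```python
import base64
import struct

# OpenCV depth codes (CV_8U .. CV_64F) mapped to struct format characters.
_DEPTH_CODES = {0: "B", 1: "b", 2: "H", 3: "h", 4: "i", 5: "f", 6: "d"}


def decode_matrix(matrix: dict) -> list:
    """Decode a single-channel matrix into a list of rows."""
    cv_type = matrix.get("type", 0)
    depth, channels = cv_type & 7, (cv_type >> 3) + 1
    if channels != 1 or depth not in _DEPTH_CODES:
        raise ValueError(f"unsupported matrix type: {cv_type}")
    rows, cols = matrix["rows"], matrix["cols"]
    raw = base64.b64decode(matrix["data"])
    # Assumption: values are little-endian and stored row by row.
    values = struct.unpack(f"<{rows * cols}{_DEPTH_CODES[depth]}", raw)
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]


# Hypothetical 2x3 matrix of 64-bit floats (type 6 is CV_64F).
sample = {
    "rows": 2,
    "cols": 3,
    "type": 6,
    "data": base64.b64encode(struct.pack("<6d", 1, 0, 0, 0, 1, 0)).decode("ascii"),
}
decoded_rows = decode_matrix(sample)
```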

GoogleCloudDocumentaiV1beta1DocumentPageParagraph

A collection of lines that a human would perceive as a paragraph.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for Paragraph.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta1DocumentPageSymbol

A detected symbol.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for Symbol.

GoogleCloudDocumentaiV1beta1DocumentPageTable

A table representation similar to HTML table structure.
Fields
bodyRows[]

object (GoogleCloudDocumentaiV1beta1DocumentPageTableTableRow)

Body rows of the table.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

headerRows[]

object (GoogleCloudDocumentaiV1beta1DocumentPageTableTableRow)

Header rows of the table.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for Table.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this table.

GoogleCloudDocumentaiV1beta1DocumentPageTableTableCell

A cell representation inside the table.
Fields
colSpan

integer (int32 format)

How many columns this cell spans.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for TableCell.

rowSpan

integer (int32 format)

How many rows this cell spans.

GoogleCloudDocumentaiV1beta1DocumentPageTableTableRow

A row of table cells.
Fields
cells[]

object (GoogleCloudDocumentaiV1beta1DocumentPageTableTableCell)

Cells that make up this row.

GoogleCloudDocumentaiV1beta1DocumentPageToken

A detected token.
Fields
detectedBreak

object (GoogleCloudDocumentaiV1beta1DocumentPageTokenDetectedBreak)

Detected break at the end of a Token.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for Token.

provenance

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta1DocumentPageTokenDetectedBreak

Detected break at the end of a Token.
Fields
type

enum

Detected break type.

Enum type. Can be one of the following:
TYPE_UNSPECIFIED Unspecified break type.
SPACE A single whitespace.
WIDE_SPACE A wider whitespace.
HYPHEN A hyphen that indicates that a token has been split across lines.

GoogleCloudDocumentaiV1beta1DocumentPageVisualElement

Detected non-text visual elements e.g. checkbox, signature etc. on the page.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta1DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta1DocumentPageLayout)

Layout for VisualElement.

type

string

Type of the VisualElement.

GoogleCloudDocumentaiV1beta1DocumentProvenance

Structure to identify provenance relationships between annotations in different revisions.
Fields
id

integer (int32 format)

The Id of this operation. Needs to be unique within the scope of the revision.

parents[]

object (GoogleCloudDocumentaiV1beta1DocumentProvenanceParent)

References to the original elements that are replaced.

revision

integer (int32 format)

The index of the revision that produced this element.

type

enum

The type of provenance operation.

Enum type. Can be one of the following:
OPERATION_TYPE_UNSPECIFIED Operation type unspecified. If no operation is specified a provenance entry is simply used to match against a parent.
ADD Add an element.
REMOVE Remove an element identified by parent.
REPLACE Replace an element identified by parent.
EVAL_REQUESTED Request human review for the element identified by parent.
EVAL_APPROVED Element is reviewed and approved at human review; confidence will be set to 1.0.
EVAL_SKIPPED Element is skipped in the validation process.

GoogleCloudDocumentaiV1beta1DocumentProvenanceParent

The parent element the current element is based on. Used for referencing/aligning, removal and replacement operations.
Fields
id

integer (int32 format)

The id of the parent provenance.

index

integer (int32 format)

The index of the parent item in the corresponding item list (e.g. list of entities, properties within entities, etc.) in the parent revision.

revision

integer (int32 format)

The index into the current revision's parent_ids list.

GoogleCloudDocumentaiV1beta1DocumentRevision

Contains past or forward revisions of this document.
Fields
agent

string

If the change was made by a person, specify the name or id of that person.

createTime

string (Timestamp format)

The time that the revision was created.

humanReview

object (GoogleCloudDocumentaiV1beta1DocumentRevisionHumanReview)

Human Review information of this revision.

id

string

Id of the revision. Unique within the context of the document.

parent[]

integer (int32 format)

The revisions that this revision is based on. This can include one or more parents (when documents are merged). This field represents the index into the revisions field.

parentIds[]

string

The revisions that this revision is based on. Must include all the ids that have anything to do with this revision, e.g. there are provenance.parent.revision fields that index into this field.

processor

string

If the annotation was made by a processor, identify the processor by its resource name.

GoogleCloudDocumentaiV1beta1DocumentRevisionHumanReview

Human Review information of the document.
Fields
state

string

Human review state. e.g. requested, succeeded, rejected.

stateMessage

string

A message providing more details about the current state of processing. For example, the rejection reason when the state is rejected.

GoogleCloudDocumentaiV1beta1DocumentShardInfo

For a large document, sharding may be performed to produce several document shards. Each document shard contains this field to detail which shard it is.
Fields
shardCount

string (int64 format)

Total number of shards.

shardIndex

string (int64 format)

The 0-based index of this shard.

textOffset

string (int64 format)

The index of the first character in Document.text in the overall document global text.
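
As a sketch of how these fields might be used, the function below orders shards by shardIndex and concatenates their text, checking each textOffset against the running length. It assumes int64 fields arrive as strings and are absent when 0, and it treats offsets as Python string indices, which may differ from the service's character indexing for some text. The function name and sample shards are hypothetical.

```python
def merge_shard_text(shards: list) -> str:
    """Concatenate Document.text across shards in shardIndex order."""
    ordered = sorted(
        shards, key=lambda d: int(d.get("shardInfo", {}).get("shardIndex", 0))
    )
    parts, length = [], 0
    for doc in ordered:
        offset = int(doc.get("shardInfo", {}).get("textOffset", 0))
        if offset != length:
            raise ValueError(f"expected textOffset {length}, got {offset}")
        text = doc.get("text", "")
        parts.append(text)
        length += len(text)
    return "".join(parts)


# Hypothetical shards, deliberately listed out of order.
shards = [
    {"text": "world", "shardInfo": {"shardIndex": "1", "shardCount": "2", "textOffset": "6"}},
    {"text": "hello ", "shardInfo": {"shardCount": "2"}},
]
merged = merge_shard_text(shards)
```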

GoogleCloudDocumentaiV1beta1DocumentStyle

Annotation for common text style attributes. This adheres to CSS conventions as much as possible.
Fields
backgroundColor

object (GoogleTypeColor)

Text background color.

color

object (GoogleTypeColor)

Text color.

fontFamily

string

Font family such as Arial, Times New Roman. https://www.w3schools.com/cssref/pr_font_font-family.asp

fontSize

object (GoogleCloudDocumentaiV1beta1DocumentStyleFontSize)

Font size.

fontWeight

string

Font weight. Possible values are normal, bold, bolder, and lighter. https://www.w3schools.com/cssref/pr_font_weight.asp

textAnchor

object (GoogleCloudDocumentaiV1beta1DocumentTextAnchor)

Text anchor indexing into the Document.text.

textDecoration

string

Text decoration. Follows CSS standard. https://www.w3schools.com/cssref/pr_text_text-decoration.asp

textStyle

string

Text style. Possible values are normal, italic, and oblique. https://www.w3schools.com/cssref/pr_font_font-style.asp

GoogleCloudDocumentaiV1beta1DocumentStyleFontSize

Font size with unit.
Fields
size

number (float format)

Font size for the text.

unit

string

Unit for the font size. Follows CSS naming (in, px, pt, etc.).

GoogleCloudDocumentaiV1beta1DocumentTextAnchor

Text reference indexing into the Document.text.
Fields
content

string

Contains the content of the text span so that users do not have to look it up in the text_segments. It is always populated for formFields.

textSegments[]

object (GoogleCloudDocumentaiV1beta1DocumentTextAnchorTextSegment)

The text segments from the Document.text.

GoogleCloudDocumentaiV1beta1DocumentTextAnchorTextSegment

A text segment in the Document.text. The indices may be out of bounds, which indicates that the text extends into another document shard for large sharded documents. See ShardInfo.text_offset.
Fields
endIndex

string (int64 format)

TextSegment half-open end UTF-8 char index in the Document.text.

startIndex

string (int64 format)

TextSegment start UTF-8 char index in the Document.text.
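
A minimal sketch of reading the text behind a TextAnchor is shown below. It assumes the int64 indices arrive as strings, that a missing startIndex means 0, and that the indices can be used directly as Python string indices. The function name and sample values are hypothetical.

```python
def anchor_text(document_text: str, text_anchor: dict) -> str:
    """Join the slices of Document.text that a TextAnchor points at."""
    parts = []
    for segment in text_anchor.get("textSegments", []):
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        # The end index is half-open, so a plain slice matches it.
        parts.append(document_text[start:end])
    return "".join(parts)


# Hypothetical document text and an anchor with two segments.
text = "Invoice 42\nTotal: 10.00\n"
anchor = {"textSegments": [{"endIndex": "7"}, {"startIndex": "8", "endIndex": "10"}]}
extracted = anchor_text(text, anchor)
```

Python slicing truncates silently when an index runs past the end of the string, so segments that extend into another shard yield only the part present in this shard.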

GoogleCloudDocumentaiV1beta1DocumentTextChange

This message is used for text changes aka. OCR corrections.
Fields
changedText

string

The text that replaces the text identified in the text_anchor.

provenance[]

object (GoogleCloudDocumentaiV1beta1DocumentProvenance)

The history of this annotation.

textAnchor

object (GoogleCloudDocumentaiV1beta1DocumentTextAnchor)

Provenance of the correction. Text anchor indexing into the Document.text. There can only be a single TextAnchor.text_segments element. If the start and end index of the text segment are the same, the text change is inserted before that index.
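
The replacement rule described above can be sketched as follows: one text segment selects the span to replace, and equal start and end indices turn the change into an insertion. The sketch assumes string-encoded int64 indices that are absent when 0 and uses them as Python string indices. The function name and sample data are hypothetical.

```python
def apply_text_change(document_text: str, change: dict) -> str:
    """Apply a single TextChange to Document.text and return the new text."""
    segments = change["textAnchor"]["textSegments"]
    if len(segments) != 1:
        raise ValueError("a text change must reference exactly one text segment")
    start = int(segments[0].get("startIndex", 0))
    end = int(segments[0].get("endIndex", 0))
    # When start == end nothing is removed, so changedText is inserted before start.
    return document_text[:start] + change["changedText"] + document_text[end:]


# Hypothetical OCR correction: the digit "1" should be the letter "l".
fixed = apply_text_change(
    "Tota1: 10",
    {"changedText": "l", "textAnchor": {"textSegments": [{"startIndex": "4", "endIndex": "5"}]}},
)
# Hypothetical insertion: start and end are equal.
inserted = apply_text_change(
    fixed,
    {"changedText": " amount", "textAnchor": {"textSegments": [{"startIndex": "5", "endIndex": "5"}]}},
)
```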

GoogleCloudDocumentaiV1beta1GcsDestination

The Google Cloud Storage location where the output file will be written to.
Fields
uri

string

(No description provided)

GoogleCloudDocumentaiV1beta1GcsSource

The Google Cloud Storage location where the input file will be read from.
Fields
uri

string

(No description provided)

GoogleCloudDocumentaiV1beta1InputConfig

The desired input location and metadata.
Fields
gcsSource

object (GoogleCloudDocumentaiV1beta1GcsSource)

The Google Cloud Storage location to read the input from. This must be a single file.

mimeType

string

Required. MIME type of the input. Currently supported MIME types are application/pdf, image/tiff, and image/gif. In addition, the application/json type is supported for requests with the ProcessDocumentRequest.automl_params field set. The JSON file needs to be in Document format.

GoogleCloudDocumentaiV1beta1NormalizedVertex

A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
Fields
x

number (float format)

X coordinate.

y

number (float format)

Y coordinate (starts from the top of the image).
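
As a sketch of the relationship between normalized and absolute coordinates, the function below scales normalized vertices by an image size supplied by the caller. It assumes a missing x or y means 0. The function name and the sample image size are hypothetical.

```python
def to_pixel_vertices(normalized_vertices: list, width: int, height: int) -> list:
    """Scale normalized [0, 1] vertices to integer pixel coordinates."""
    return [
        {
            "x": round(v.get("x", 0.0) * width),
            "y": round(v.get("y", 0.0) * height),
        }
        for v in normalized_vertices
    ]


# Hypothetical bounding polygon on a 1000 x 2000 pixel image.
normalized = [{"x": 0.25, "y": 0.5}, {"x": 0.75, "y": 0.5}, {"x": 0.75}, {}]
pixels = to_pixel_vertices(normalized, width=1000, height=2000)
```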

GoogleCloudDocumentaiV1beta1OperationMetadata

Contains metadata for the BatchProcessDocuments operation.
Fields
createTime

string (Timestamp format)

The creation time of the operation.

state

enum

The state of the current batch processing.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The default value. This value is used if the state is omitted.
ACCEPTED Request is received.
WAITING Request operation is waiting for scheduling.
RUNNING Request is being processed.
SUCCEEDED The batch processing completed successfully.
CANCELLED The batch processing was cancelled.
FAILED The batch processing has failed.
stateMessage

string

A message providing more details about the current state of processing.

updateTime

string (Timestamp format)

The last update time of the operation.
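
When polling the operation metadata, a caller usually needs to know whether the state can still change. The sketch below treats SUCCEEDED, CANCELLED, and FAILED as final. That grouping is this example's reading of the enum descriptions above, not a documented list.

```python
# Assumption: these three states are final; the others may still progress.
TERMINAL_STATES = {"SUCCEEDED", "CANCELLED", "FAILED"}


def is_finished(metadata: dict) -> bool:
    """Return True when the batch operation has reached a final state."""
    return metadata.get("state", "STATE_UNSPECIFIED") in TERMINAL_STATES


# Hypothetical metadata snapshots.
running = {"state": "RUNNING", "stateMessage": "Processing documents."}
done = {"state": "SUCCEEDED"}
status = (is_finished(running), is_finished(done))
```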

GoogleCloudDocumentaiV1beta1OutputConfig

The desired output location and metadata.
Fields
gcsDestination

object (GoogleCloudDocumentaiV1beta1GcsDestination)

The Google Cloud Storage location to write the output to.

pagesPerShard

integer (int32 format)

The maximum number of pages to include in each output Document shard JSON on Google Cloud Storage. The valid range is [1, 100]. If not specified, the default value is 20. For example, for one PDF file with 100 pages, 100 parsed pages will be produced. If pages_per_shard = 20, then 5 Document shard JSON files, each containing 20 parsed pages, will be written under the prefix OutputConfig.gcs_destination.uri and suffix pages-x-to-y.json, where x and y are 1-indexed page numbers. Example GCS outputs with 157 pages and pages_per_shard = 50: pages-001-to-050.json pages-051-to-100.json pages-101-to-150.json pages-151-to-157.json
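
The shard arithmetic above can be sketched as a small function that lists the expected file suffixes. The three-digit zero padding is inferred from the example file names and is an assumption. The function name is hypothetical.

```python
def shard_file_names(total_pages: int, pages_per_shard: int = 20) -> list:
    """List the expected shard suffixes for a document of total_pages pages."""
    if not 1 <= pages_per_shard <= 100:
        raise ValueError("pages_per_shard must be in the range [1, 100]")
    names = []
    for first in range(1, total_pages + 1, pages_per_shard):
        # The last shard may hold fewer pages than pages_per_shard.
        last = min(first + pages_per_shard - 1, total_pages)
        names.append(f"pages-{first:03d}-to-{last:03d}.json")
    return names


names_157 = shard_file_names(157, 50)
names_default = shard_file_names(100)
```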

GoogleCloudDocumentaiV1beta1ProcessDocumentResponse

Response to a single document processing request.
Fields
inputConfig

object (GoogleCloudDocumentaiV1beta1InputConfig)

Information about the input file. This is the same as the corresponding input config in the request.

outputConfig

object (GoogleCloudDocumentaiV1beta1OutputConfig)

The output location of the parsed responses. The responses are written to this location as JSON-serialized Document objects.

GoogleCloudDocumentaiV1beta1Vertex

A vertex represents a 2D point in the image. NOTE: the vertex coordinates are in the same scale as the original image.
Fields
x

integer (int32 format)

X coordinate.

y

integer (int32 format)

Y coordinate (starts from the top of the image).

GoogleCloudDocumentaiV1beta2Barcode

Encodes the detailed information of a barcode.
Fields
format

string

Format of a barcode. The supported formats are: - CODE_128: Code 128 type. - CODE_39: Code 39 type. - CODE_93: Code 93 type. - CODABAR: Codabar type. - DATA_MATRIX: 2D Data Matrix type. - ITF: ITF type. - EAN_13: EAN-13 type. - EAN_8: EAN-8 type. - QR_CODE: 2D QR code type. - UPC_A: UPC-A type. - UPC_E: UPC-E type. - PDF417: PDF417 type. - AZTEC: 2D Aztec code type. - DATABAR: GS1 DataBar code type.

rawValue

string

Raw value encoded in the barcode. For example: 'MEBKM:TITLE:Google;URL:https://www.google.com;;'.

valueFormat

string

Value format describes the format of the value that a barcode encodes. The supported formats are: - CONTACT_INFO: Contact information. - EMAIL: Email address. - ISBN: ISBN identifier. - PHONE: Phone number. - PRODUCT: Product. - SMS: SMS message. - TEXT: Text string. - URL: URL address. - WIFI: Wifi information. - GEO: Geo-localization. - CALENDAR_EVENT: Calendar event. - DRIVER_LICENSE: Driver's license.

GoogleCloudDocumentaiV1beta2BatchProcessDocumentsResponse

Response to a batch document processing request. This is returned in the LRO Operation after the operation is complete.
Fields
responses[]

object (GoogleCloudDocumentaiV1beta2ProcessDocumentResponse)

Responses for each individual document.

GoogleCloudDocumentaiV1beta2BoundingPoly

A bounding polygon for the detected image annotation.
Fields
normalizedVertices[]

object (GoogleCloudDocumentaiV1beta2NormalizedVertex)

The bounding polygon normalized vertices.

vertices[]

object (GoogleCloudDocumentaiV1beta2Vertex)

The bounding polygon vertices.

GoogleCloudDocumentaiV1beta2Document

Document represents the canonical document resource in Document AI. It is an interchange format that provides insights into documents and allows for collaboration between users and Document AI to iterate and optimize for quality.
Fields
content

string (bytes format)

Optional. Inline document content, represented as a stream of bytes. Note: As with all bytes fields, protobuffers use a pure binary representation, whereas JSON representations use base64.

entities[]

object (GoogleCloudDocumentaiV1beta2DocumentEntity)

A list of entities detected on Document.text. For document shards, entities in this list may cross shard boundaries.

entityRelations[]

object (GoogleCloudDocumentaiV1beta2DocumentEntityRelation)

Placeholder. Relationship among Document.entities.

error

object (GoogleRpcStatus)

Any error that occurred while processing this document.

labels[]

object (GoogleCloudDocumentaiV1beta2DocumentLabel)

Labels for this document.

mimeType

string

An IANA published MIME type (also referred to as media type). For more information, see https://www.iana.org/assignments/media-types/media-types.xhtml.

pages[]

object (GoogleCloudDocumentaiV1beta2DocumentPage)

Visual page layout for the Document.

revisions[]

object (GoogleCloudDocumentaiV1beta2DocumentRevision)

Placeholder. Revision history of this document.

shardInfo

object (GoogleCloudDocumentaiV1beta2DocumentShardInfo)

Information about the sharding if this document is sharded part of a larger document. If the document is not sharded, this message is not specified.

text

string

Optional. UTF-8 encoded text in reading order from the document.

textChanges[]

object (GoogleCloudDocumentaiV1beta2DocumentTextChange)

Placeholder. A list of text corrections made to Document.text. This is usually used for annotating corrections to OCR mistakes. Text changes for a given revision may not overlap with each other.

textStyles[]

object (GoogleCloudDocumentaiV1beta2DocumentStyle)

Styles for the Document.text.

uri

string

Optional. Currently supports Google Cloud Storage URI of the form gs://bucket_name/object_name. Object versioning is not supported. See Google Cloud Storage Request URIs for more info.
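
As a rough illustration of the base64 note on the content field, the sketch below decodes content from a JSON-serialized Document. The helper name decode_document_content and the sample payload are hypothetical; only the field names content and mimeType come from this reference.

```python
import base64
import json


def decode_document_content(document_json: str) -> bytes:
    """Return the raw bytes stored in the Document's `content` field.

    Assumes `document_json` is a JSON-serialized Document, in which bytes
    fields are base64-encoded strings. Returns b"" when the optional
    field is absent.
    """
    document = json.loads(document_json)
    encoded = document.get("content", "")
    return base64.b64decode(encoded)


# Hypothetical sample payload, built here only for illustration.
sample = json.dumps({
    "mimeType": "application/pdf",
    "content": base64.b64encode(b"%PDF-1.7").decode("ascii"),
})
raw = decode_document_content(sample)
```

The same decoding step would apply to any other field marked "bytes format" in this listing when it is read from JSON.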

GoogleCloudDocumentaiV1beta2DocumentEntity

An entity that could be a phrase in the text or a property that belongs to the document. It is a known entity type, such as a person, an organization, or location.
Fields
confidence

number (float format)

Optional. Confidence of detected Schema entity. Range [0, 1].

id

string

Optional. Canonical id. This will be a unique value in the entity list for this document.

mentionId

string

Optional. Deprecated. Use id field instead.

mentionText

string

Optional. Text value of the entity e.g. 1600 Amphitheatre Pkwy.

normalizedValue

object (GoogleCloudDocumentaiV1beta2DocumentEntityNormalizedValue)

Optional. Normalized entity value. Absent if the extracted value could not be converted or the type (e.g. address) is not supported for certain parsers. This field is also only populated for certain supported document types.

pageAnchor

object (GoogleCloudDocumentaiV1beta2DocumentPageAnchor)

Optional. Represents the provenance of this entity with respect to the location on the page where it was found.

properties[]

object (GoogleCloudDocumentaiV1beta2DocumentEntity)

Optional. Entities can be nested to form a hierarchical data structure representing the content in the document.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

Optional. The history of this annotation.

redacted

boolean

Optional. Whether the entity will be redacted for de-identification purposes.

textAnchor

object (GoogleCloudDocumentaiV1beta2DocumentTextAnchor)

Optional. Provenance of the entity. Text anchor indexing into the Document.text.

type

string

Required. Entity type from a schema e.g. Address.
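
Because properties[] nests DocumentEntity objects inside one another, a consumer typically has to walk the hierarchy recursively. The following is a minimal sketch over the JSON (dict) form of an entity list; iter_entities and the sample data are hypothetical names and values used only for illustration.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_entities(
    entities: List[Dict[str, Any]], depth: int = 0
) -> Iterator[Tuple[int, Dict[str, Any]]]:
    """Yield (depth, entity) pairs, descending into nested `properties`."""
    for entity in entities:
        yield depth, entity
        # `properties` is optional, so a missing key means no children.
        yield from iter_entities(entity.get("properties", []), depth + 1)


# Hypothetical entities, not taken from a real response.
sample_entities = [
    {
        "type": "Address",
        "mentionText": "1600 Amphitheatre Pkwy",
        "confidence": 0.92,
        "properties": [
            {"type": "street", "mentionText": "Amphitheatre Pkwy"},
        ],
    },
]
flattened = [(depth, entity["type"]) for depth, entity in iter_entities(sample_entities)]
```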

GoogleCloudDocumentaiV1beta2DocumentEntityNormalizedValue

Parsed and normalized entity value.
Fields
addressValue

object (GoogleTypePostalAddress)

Postal address. See also: https://github.com/googleapis/googleapis/blob/master/google/type/postal_address.proto

booleanValue

boolean

Boolean value. Can be used for entities with binary values, or for checkboxes.

dateValue

object (GoogleTypeDate)

Date value. Includes year, month, day. See also: https://github.com/googleapis/googleapis/blob/master/google/type/date.proto

datetimeValue

object (GoogleTypeDateTime)

DateTime value. Includes date, time, and timezone. See also: https://github.com/googleapis/googleapis/blob/master/google/type/datetime.proto

floatValue

number (float format)

Float value.

integerValue

integer (int32 format)

Integer value.

moneyValue

object (GoogleTypeMoney)

Money value. See also: https://github.com/googleapis/googleapis/blob/master/google/type/money.proto

text

string

Optional. An optional field to store a normalized string. For some entity types, one of the respective structured_value fields may also be populated. Not all types of structured_value are normalized. For example, some processors may not generate float or integer normalized text by default. Below are sample formats mapped to structured values:
- Money/Currency type (money_value) is in the ISO 4217 text format.
- Date type (date_value) is in the ISO 8601 text format.
- Datetime type (datetime_value) is in the ISO 8601 text format.

GoogleCloudDocumentaiV1beta2DocumentEntityRelation

Relationship between Entities.
Fields
objectId

string

Object entity id.

relation

string

Relationship description.

subjectId

string

Subject entity id.

GoogleCloudDocumentaiV1beta2DocumentLabel

Label attaches schema information and/or other metadata to segments within a Document. Multiple Labels on a single field can denote either different labels, different instances of the same label created at different times, or some combination of both.
Fields
automlModel

string

Label is generated by an AutoML model. This field stores the full resource name of the AutoML model. Format: projects/{project-id}/locations/{location-id}/models/{model-id}

confidence

number (float format)

Confidence score between 0 and 1 for label assignment.

name

string

Name of the label. When the label is generated from AutoML Text Classification model, this field represents the name of the category.

GoogleCloudDocumentaiV1beta2DocumentPage

A page in a Document.
Fields
blocks[]

object (GoogleCloudDocumentaiV1beta2DocumentPageBlock)

A list of visually detected text blocks on the page. A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.

detectedBarcodes[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedBarcode)

A list of detected barcodes.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

dimension

object (GoogleCloudDocumentaiV1beta2DocumentPageDimension)

Physical dimension of the page.

formFields[]

object (GoogleCloudDocumentaiV1beta2DocumentPageFormField)

A list of visually detected form fields on the page.

image

object (GoogleCloudDocumentaiV1beta2DocumentPageImage)

Rendered image for this page. This image is preprocessed to remove any skew, rotation, and distortions such that the annotation bounding boxes can be upright and axis-aligned.

imageQualityScores

object (GoogleCloudDocumentaiV1beta2DocumentPageImageQualityScores)

Image Quality Scores.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for the page.

lines[]

object (GoogleCloudDocumentaiV1beta2DocumentPageLine)

A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line.

pageNumber

integer (int32 format)

1-based index for current Page in a parent Document. Useful when a page is taken out of a Document for individual processing.

paragraphs[]

object (GoogleCloudDocumentaiV1beta2DocumentPageParagraph)

A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this page.

symbols[]

object (GoogleCloudDocumentaiV1beta2DocumentPageSymbol)

A list of visually detected symbols on the page.

tables[]

object (GoogleCloudDocumentaiV1beta2DocumentPageTable)

A list of visually detected tables on the page.

tokens[]

object (GoogleCloudDocumentaiV1beta2DocumentPageToken)

A list of visually detected tokens on the page.

transforms[]

object (GoogleCloudDocumentaiV1beta2DocumentPageMatrix)

Transformation matrices that were applied to the original document image to produce Page.image.

visualElements[]

object (GoogleCloudDocumentaiV1beta2DocumentPageVisualElement)

A list of detected non-text visual elements e.g. checkbox, signature etc. on the page.

GoogleCloudDocumentaiV1beta2DocumentPageAnchor

Referencing the visual context of the entity in the Document.pages. Page anchors can be cross-page, consist of multiple bounding polygons and optionally reference specific layout element types.
Fields
pageRefs[]

object (GoogleCloudDocumentaiV1beta2DocumentPageAnchorPageRef)

One or more references to visual page elements.

GoogleCloudDocumentaiV1beta2DocumentPageAnchorPageRef

Represents a weak reference to a page element within a document.
Fields
boundingPoly

object (GoogleCloudDocumentaiV1beta2BoundingPoly)

Optional. Identifies the bounding polygon of a layout element on the page.

confidence

number (float format)

Optional. Confidence of detected page element, if applicable. Range [0, 1].

layoutId

string

Optional. Deprecated. Use PageRef.bounding_poly instead.

layoutType

enum

Optional. The type of the layout element that is being referenced if any.

Enum type. Can be one of the following:
LAYOUT_TYPE_UNSPECIFIED Layout Unspecified.
BLOCK References a Page.blocks element.
PARAGRAPH References a Page.paragraphs element.
LINE References a Page.lines element.
TOKEN References a Page.tokens element.
VISUAL_ELEMENT References a Page.visual_elements element.
TABLE References a Page.tables element.
FORM_FIELD References a Page.form_fields element.
page

string (int64 format)

Required. Index into the Document.pages element, for example using Document.pages to locate the related page element. This field is skipped when its value is the default 0. See https://developers.google.com/protocol-buffers/docs/proto3#json.
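
Two details of the page field are easy to miss when reading JSON: int64 values are serialized as strings, and the key is absent when the value is the default 0. The sketch below handles both cases; resolve_page and the sample document are hypothetical and assume the dict form of a Document.

```python
from typing import Any, Dict


def resolve_page(document: Dict[str, Any], page_ref: Dict[str, Any]) -> Dict[str, Any]:
    """Return the Document.pages element that `page_ref` points at.

    A missing `page` key is treated as index 0, and string values are
    converted to int before indexing.
    """
    index = int(page_ref.get("page", 0))
    return document["pages"][index]


# Hypothetical two-page document for illustration.
sample_document = {"pages": [{"pageNumber": 1}, {"pageNumber": 2}]}
first = resolve_page(sample_document, {})
second = resolve_page(sample_document, {"page": "1"})
```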

GoogleCloudDocumentaiV1beta2DocumentPageBlock

A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for Block.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta2DocumentPageDetectedBarcode

A detected barcode.
Fields
barcode

object (GoogleCloudDocumentaiV1beta2Barcode)

Detailed barcode information of the DetectedBarcode.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for DetectedBarcode.

GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage

Detected language for a structural component.
Fields
confidence

number (float format)

Confidence of detected language. Range [0, 1].

languageCode

string

The BCP-47 language code, such as en-US or sr-Latn. For more information, see https://www.unicode.org/reports/tr35/#Unicode_locale_identifier.

GoogleCloudDocumentaiV1beta2DocumentPageDimension

Dimension for the page.
Fields
height

number (float format)

Page height.

unit

string

Dimension unit.

width

number (float format)

Page width.

GoogleCloudDocumentaiV1beta2DocumentPageFormField

A form field detected on the page.
Fields
correctedKeyText

string

Created for Labeling UI to export key text. If corrections were made to the text identified by the field_name.text_anchor, this field will contain the correction.

correctedValueText

string

Created for Labeling UI to export value text. If corrections were made to the text identified by the field_value.text_anchor, this field will contain the correction.

fieldName

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for the FormField name. e.g. Address, Email, Grand total, Phone number, etc.

fieldValue

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for the FormField value.

nameDetectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages for name together with confidence.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this annotation.

valueDetectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages for value together with confidence.

valueType

string

If the value is non-textual, this field represents the type. Current valid values are:
- blank (this indicates the field_value is normal text)
- unfilled_checkbox
- filled_checkbox

GoogleCloudDocumentaiV1beta2DocumentPageImage

Rendered image contents for this page.
Fields
content

string (bytes format)

Raw byte content of the image.

height

integer (int32 format)

Height of the image in pixels.

mimeType

string

Encoding mime type for the image.

width

integer (int32 format)

Width of the image in pixels.

GoogleCloudDocumentaiV1beta2DocumentPageImageQualityScores

Image Quality Scores for the page image.
Fields
detectedDefects[]

object (GoogleCloudDocumentaiV1beta2DocumentPageImageQualityScoresDetectedDefect)

A list of detected defects.

qualityScore

number (float format)

The overall quality score. Range [0, 1] where 1 is perfect quality.

GoogleCloudDocumentaiV1beta2DocumentPageImageQualityScoresDetectedDefect

Image Quality Defects
Fields
confidence

number (float format)

Confidence of detected defect. Range [0, 1] where 1 indicates strong confidence that the defect exists.

type

string

Name of the defect type. Supported values are:
- quality/defect_blurry
- quality/defect_noisy
- quality/defect_dark
- quality/defect_faint
- quality/defect_text_too_small
- quality/defect_document_cutoff
- quality/defect_text_cutoff
- quality/defect_glare
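
One possible way to consume imageQualityScores is to keep only the defects whose confidence clears a cutoff. The sketch below assumes the dict form of the object; the function name defects_above and the 0.5 default threshold are arbitrary choices for illustration, not values defined by the API.

```python
from typing import Any, Dict, List


def defects_above(quality_scores: Dict[str, Any], threshold: float = 0.5) -> List[str]:
    """Return the sorted defect type names with confidence >= threshold."""
    return sorted(
        defect["type"]
        for defect in quality_scores.get("detectedDefects", [])
        if defect.get("confidence", 0.0) >= threshold
    )


# Hypothetical scores for illustration.
sample_scores = {
    "qualityScore": 0.4,
    "detectedDefects": [
        {"type": "quality/defect_blurry", "confidence": 0.8},
        {"type": "quality/defect_glare", "confidence": 0.1},
    ],
}
likely_defects = defects_above(sample_scores)
```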

GoogleCloudDocumentaiV1beta2DocumentPageLayout

Visual element describing a layout unit on a page.
Fields
boundingPoly

object (GoogleCloudDocumentaiV1beta2BoundingPoly)

The bounding polygon for the Layout.

confidence

number (float format)

Confidence of the current Layout within context of the object this layout is for. e.g. confidence can be for a single token, a table, a visual element, etc. depending on context. Range [0, 1].

orientation

enum

Detected orientation for the Layout.

Enum type. Can be one of the following:
ORIENTATION_UNSPECIFIED Unspecified orientation.
PAGE_UP Orientation is aligned with page up.
PAGE_RIGHT Orientation is aligned with page right. Turn the head 90 degrees clockwise from upright to read.
PAGE_DOWN Orientation is aligned with page down. Turn the head 180 degrees from upright to read.
PAGE_LEFT Orientation is aligned with page left. Turn the head 90 degrees counterclockwise from upright to read.
textAnchor

object (GoogleCloudDocumentaiV1beta2DocumentTextAnchor)

Text anchor indexing into the Document.text.
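
The orientation descriptions above can be read as a clockwise head turn in degrees. The mapping below is a small sketch of that reading; head_turn_degrees is a hypothetical helper, and treating PAGE_UP as 0 degrees is an inference from the other three descriptions.

```python
from typing import Any, Dict, Optional

# Clockwise head turn, in degrees, needed to read text with each
# orientation, following the enum descriptions in this reference.
_HEAD_TURN_DEGREES = {
    "PAGE_UP": 0,
    "PAGE_RIGHT": 90,
    "PAGE_DOWN": 180,
    "PAGE_LEFT": 270,  # equivalent to 90 degrees counterclockwise
}


def head_turn_degrees(layout: Dict[str, Any]) -> Optional[int]:
    """Return the clockwise head turn for a Layout, or None if unspecified."""
    orientation = layout.get("orientation", "ORIENTATION_UNSPECIFIED")
    return _HEAD_TURN_DEGREES.get(orientation)
```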

GoogleCloudDocumentaiV1beta2DocumentPageLine

A collection of tokens that a human would perceive as a line. Does not cross column boundaries, can be horizontal, vertical, etc.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for Line.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta2DocumentPageMatrix

Representation for transformation matrix, intended to be compatible and used with OpenCV format for image manipulation.
Fields
cols

integer (int32 format)

Number of columns in the matrix.

data

string (bytes format)

The matrix data.

rows

integer (int32 format)

Number of rows in the matrix.

type

integer (int32 format)

This encodes information about what data type the matrix uses. For example, 0 (CV_8U) is an unsigned 8-bit image. For the full list of OpenCV primitive data types, please refer to https://docs.opencv.org/4.3.0/d1/d1b/groupcorehal__interface.html
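
The sketch below shows one way the rows, cols, type, and data fields might be turned into a nested list of numbers. It relies on the OpenCV type encoding (depth in the low 3 bits, channel count minus one in the bits above). The byte order (little-endian) and row-major layout are assumptions that this reference does not state, and decode_matrix is a hypothetical helper that handles single-channel matrices only.

```python
import base64
import struct
from typing import Any, Dict, List

# struct format characters for the OpenCV depths CV_8U .. CV_64F.
_DEPTH_FORMATS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "i", 5: "f", 6: "d"}


def decode_matrix(matrix: Dict[str, Any]) -> List[List[float]]:
    """Decode a JSON-form Matrix into a list of rows.

    Assumes little-endian, row-major data and a single channel.
    """
    type_code = matrix.get("type", 0)
    depth = type_code & 7
    channels = (type_code >> 3) + 1
    if channels != 1 or depth not in _DEPTH_FORMATS:
        raise ValueError(f"unsupported matrix type code: {type_code}")
    rows = matrix.get("rows", 0)
    cols = matrix.get("cols", 0)
    raw = base64.b64decode(matrix.get("data", ""))
    # struct.unpack raises struct.error if the byte count does not match.
    values = struct.unpack(f"<{rows * cols}{_DEPTH_FORMATS[depth]}", raw)
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]


# Hypothetical 2x3 CV_64F (type 6) matrix holding an identity transform.
sample_matrix = {
    "rows": 2,
    "cols": 3,
    "type": 6,
    "data": base64.b64encode(struct.pack("<6d", 1, 0, 0, 0, 1, 0)).decode("ascii"),
}
decoded = decode_matrix(sample_matrix)
```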

GoogleCloudDocumentaiV1beta2DocumentPageParagraph

A collection of lines that a human would perceive as a paragraph.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for Paragraph.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta2DocumentPageSymbol

A detected symbol.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for Symbol.

GoogleCloudDocumentaiV1beta2DocumentPageTable

A table representation similar to HTML table structure.
Fields
bodyRows[]

object (GoogleCloudDocumentaiV1beta2DocumentPageTableTableRow)

Body rows of the table.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

headerRows[]

object (GoogleCloudDocumentaiV1beta2DocumentPageTableTableRow)

Header rows of the table.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for Table.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this table.

GoogleCloudDocumentaiV1beta2DocumentPageTableTableCell

A cell representation inside the table.
Fields
colSpan

integer (int32 format)

How many columns this cell spans.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for TableCell.

rowSpan

integer (int32 format)

How many rows this cell spans.

GoogleCloudDocumentaiV1beta2DocumentPageTableTableRow

A row of table cells.
Fields
cells[]

object (GoogleCloudDocumentaiV1beta2DocumentPageTableTableCell)

Cells that make up this row.

GoogleCloudDocumentaiV1beta2DocumentPageToken

A detected token.
Fields
detectedBreak

object (GoogleCloudDocumentaiV1beta2DocumentPageTokenDetectedBreak)

Detected break at the end of a Token.

detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for Token.

provenance

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this annotation.

GoogleCloudDocumentaiV1beta2DocumentPageTokenDetectedBreak

Detected break at the end of a Token.
Fields
type

enum

Detected break type.

Enum type. Can be one of the following:
TYPE_UNSPECIFIED Unspecified break type.
SPACE A single whitespace.
WIDE_SPACE A wider whitespace.
HYPHEN A hyphen that indicates that a token has been split across lines.

GoogleCloudDocumentaiV1beta2DocumentPageVisualElement

Detected non-text visual elements e.g. checkbox, signature etc. on the page.
Fields
detectedLanguages[]

object (GoogleCloudDocumentaiV1beta2DocumentPageDetectedLanguage)

A list of detected languages together with confidence.

layout

object (GoogleCloudDocumentaiV1beta2DocumentPageLayout)

Layout for VisualElement.

type

string

Type of the VisualElement.

GoogleCloudDocumentaiV1beta2DocumentProvenance

Structure to identify provenance relationships between annotations in different revisions.
Fields
id

integer (int32 format)

The Id of this operation. Needs to be unique within the scope of the revision.

parents[]

object (GoogleCloudDocumentaiV1beta2DocumentProvenanceParent)

References to the original elements that are replaced.

revision

integer (int32 format)

The index of the revision that produced this element.

type

enum

The type of provenance operation.

Enum type. Can be one of the following:
OPERATION_TYPE_UNSPECIFIED Operation type unspecified. If no operation is specified a provenance entry is simply used to match against a parent.
ADD Add an element.
REMOVE Remove an element identified by parent.
REPLACE Replace an element identified by parent.
EVAL_REQUESTED Request human review for the element identified by parent.
EVAL_APPROVED Element is reviewed and approved at human review, confidence will be set to 1.0.
EVAL_SKIPPED Element is skipped in the validation process.

GoogleCloudDocumentaiV1beta2DocumentProvenanceParent

The parent element the current element is based on. Used for referencing/aligning, removal and replacement operations.
Fields
id

integer (int32 format)

The id of the parent provenance.

index

integer (int32 format)

The index of the parent item in the corresponding item list (e.g. list of entities, properties within entities, etc.) in the parent revision.

revision

integer (int32 format)

The index into the current revision's parent_ids list.

GoogleCloudDocumentaiV1beta2DocumentRevision

Contains past or forward revisions of this document.
Fields
agent

string

If the change was made by a person, specify the name or id of that person.

createTime

string (Timestamp format)

The time that the revision was created.

humanReview

object (GoogleCloudDocumentaiV1beta2DocumentRevisionHumanReview)

Human Review information of this revision.

id

string

Id of the revision. Unique within the context of the document.

parent[]

integer (int32 format)

The revisions that this revision is based on. This can include one or more parents (when documents are merged). This field represents the index into the revisions field.

parentIds[]

string

The revisions that this revision is based on. Must include all the ids that have anything to do with this revision, e.g. there are provenance.parent.revision fields that index into this field.

processor

string

If the annotation was made by a processor, identify the processor by its resource name.

GoogleCloudDocumentaiV1beta2DocumentRevisionHumanReview

Human Review information of the document.
Fields
state

string

Human review state. e.g. requested, succeeded, rejected.

stateMessage

string

A message providing more details about the current state of processing. For example, the rejection reason when the state is rejected.

GoogleCloudDocumentaiV1beta2DocumentShardInfo

For a large document, sharding may be performed to produce several document shards. Each document shard contains this field to detail which shard it is.
Fields
shardCount

string (int64 format)

Total number of shards.

shardIndex

string (int64 format)

The 0-based index of this shard.

textOffset

string (int64 format)

The index of the first character in Document.text in the overall document global text.
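
To illustrate how shardIndex and textOffset relate, the sketch below reassembles the global text from a set of shard Documents in dict form. It assumes that textOffset counts in the same units as Python's len() on the text string and that shards cover the text contiguously; neither is stated explicitly in this reference. merge_shard_text is a hypothetical helper.

```python
from typing import Any, Dict, List


def merge_shard_text(shards: List[Dict[str, Any]]) -> str:
    """Concatenate shard texts in shardIndex order, checking textOffset."""
    ordered = sorted(
        shards,
        key=lambda shard: int(shard.get("shardInfo", {}).get("shardIndex", 0)),
    )
    parts: List[str] = []
    length = 0
    for shard in ordered:
        info = shard.get("shardInfo", {})
        # int64 fields are strings in JSON and absent when they are 0.
        offset = int(info.get("textOffset", 0))
        if offset != length:
            raise ValueError(f"expected textOffset {length}, got {offset}")
        text = shard.get("text", "")
        parts.append(text)
        length += len(text)
    return "".join(parts)


# Hypothetical shards, deliberately listed out of order.
sample_shards = [
    {"text": "world", "shardInfo": {"shardIndex": "1", "shardCount": "2", "textOffset": "6"}},
    {"text": "hello ", "shardInfo": {"shardCount": "2"}},
]
merged = merge_shard_text(sample_shards)
```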

GoogleCloudDocumentaiV1beta2DocumentStyle

Annotation for common text style attributes. This adheres to CSS conventions as much as possible.
Fields
backgroundColor

object (GoogleTypeColor)

Text background color.

color

object (GoogleTypeColor)

Text color.

fontFamily

string

Font family such as Arial, Times New Roman. https://www.w3schools.com/cssref/pr_font_font-family.asp

fontSize

object (GoogleCloudDocumentaiV1beta2DocumentStyleFontSize)

Font size.

fontWeight

string

Font weight. Possible values are normal, bold, bolder, and lighter. https://www.w3schools.com/cssref/pr_font_weight.asp

textAnchor

object (GoogleCloudDocumentaiV1beta2DocumentTextAnchor)

Text anchor indexing into the Document.text.

textDecoration

string

Text decoration. Follows CSS standard. https://www.w3schools.com/cssref/pr_text_text-decoration.asp

textStyle

string

Text style. Possible values are normal, italic, and oblique. https://www.w3schools.com/cssref/pr_font_font-style.asp

GoogleCloudDocumentaiV1beta2DocumentStyleFontSize

Font size with unit.
Fields
size

number (float format)

Font size for the text.

unit

string

Unit for the font size. Follows CSS naming (in, px, pt, etc.).

GoogleCloudDocumentaiV1beta2DocumentTextAnchor

Text reference indexing into the Document.text.
Fields
content

string

Contains the content of the text span so that users do not have to look it up in the text_segments. It is always populated for formFields.

textSegments[]

object (GoogleCloudDocumentaiV1beta2DocumentTextAnchorTextSegment)

The text segments from the Document.text.

GoogleCloudDocumentaiV1beta2DocumentTextAnchorTextSegment

A text segment in the Document.text. The indices may be out of bounds, which indicates that the text extends into another document shard for large sharded documents. See ShardInfo.text_offset.
Fields
endIndex

string (int64 format)

TextSegment half open end UTF-8 char index in the Document.text.

startIndex

string (int64 format)

TextSegment start UTF-8 char index in the Document.text.
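
A text anchor is resolved by slicing Document.text with each segment's half-open [startIndex, endIndex) range. The sketch below does this over the dict form. It assumes the indices can be applied directly to a Python str, which may not hold if they turn out to be byte offsets. anchor_text is a hypothetical helper.

```python
from typing import Any, Dict


def anchor_text(document_text: str, text_anchor: Dict[str, Any]) -> str:
    """Return the text covered by all segments of a TextAnchor."""
    pieces = []
    for segment in text_anchor.get("textSegments", []):
        # int64 fields are strings in JSON and absent when they are 0.
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        # Slicing clamps out-of-bounds indices, which can occur when a
        # segment extends into another shard.
        pieces.append(document_text[start:end])
    return "".join(pieces)


# Hypothetical text and anchor for illustration.
sample_text = "Invoice total: 42.00"
sample_anchor = {
    "textSegments": [
        {"endIndex": "7"},
        {"startIndex": "15", "endIndex": "99"},
    ]
}
extracted = anchor_text(sample_text, sample_anchor)
```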

GoogleCloudDocumentaiV1beta2DocumentTextChange

This message is used for text changes, also known as OCR corrections.
Fields
changedText

string

The text that replaces the text identified in the text_anchor.

provenance[]

object (GoogleCloudDocumentaiV1beta2DocumentProvenance)

The history of this annotation.

textAnchor

object (GoogleCloudDocumentaiV1beta2DocumentTextAnchor)

Provenance of the correction. Text anchor indexing into the Document.text. There can only be a single TextAnchor.text_segments element. If the start and end index of the text segment are the same, the text change is inserted before that index.

GoogleCloudDocumentaiV1beta2GcsDestination

The Google Cloud Storage location where the output file will be written to.
Fields
uri

string

(No description provided)

GoogleCloudDocumentaiV1beta2GcsSource

The Google Cloud Storage location where the input file will be read from.
Fields
uri

string

(No description provided)

GoogleCloudDocumentaiV1beta2InputConfig

The desired input location and metadata.
Fields
contents

string (bytes format)

Content in bytes, represented as a stream of bytes. Note: As with all bytes fields, proto buffer messages use a pure binary representation, whereas JSON representations use base64. This field only works for the synchronous ProcessDocument method.

gcsSource

object (GoogleCloudDocumentaiV1beta2GcsSource)

The Google Cloud Storage location to read the input from. This must be a single file.

mimeType

string

Required. Mimetype of the input. Current supported mimetypes are application/pdf, image/tiff, and image/gif. In addition, application/json type is supported for requests with ProcessDocumentRequest.automl_params field set. The JSON file needs to be in Document format.

GoogleCloudDocumentaiV1beta2NormalizedVertex

A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
Fields
x

number (float format)

X coordinate.

y

number (float format)

Y coordinate (starts from the top of the image).
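
Since normalized coordinates range from 0 to 1 relative to the original image, pixel positions can be recovered by scaling with the image width and height. This is a minimal sketch over the dict form of a BoundingPoly; to_pixel_vertices is a hypothetical helper, and rounding to the nearest integer is an illustrative choice.

```python
from typing import Any, Dict, List, Tuple


def to_pixel_vertices(
    bounding_poly: Dict[str, Any], image_width: int, image_height: int
) -> List[Tuple[int, int]]:
    """Scale normalizedVertices to integer pixel coordinates."""
    return [
        (
            round(vertex.get("x", 0.0) * image_width),
            round(vertex.get("y", 0.0) * image_height),
        )
        for vertex in bounding_poly.get("normalizedVertices", [])
    ]


# Hypothetical polygon on a 1000 x 800 pixel page image.
sample_poly = {
    "normalizedVertices": [
        {"x": 0.25, "y": 0.5},
        {"x": 0.75, "y": 0.5},
        {"x": 0.75, "y": 1.0},
        {"x": 0.25, "y": 1.0},
    ]
}
pixels = to_pixel_vertices(sample_poly, 1000, 800)
```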

GoogleCloudDocumentaiV1beta2OperationMetadata

Contains metadata for the BatchProcessDocuments operation.
Fields
createTime

string (Timestamp format)

The creation time of the operation.

state

enum

The state of the current batch processing.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The default value. This value is used if the state is omitted.
ACCEPTED Request is received.
WAITING Request operation is waiting for scheduling.
RUNNING Request is being processed.
SUCCEEDED The batch processing completed successfully.
CANCELLED The batch processing was cancelled.
FAILED The batch processing has failed.
stateMessage

string

A message providing more details about the current state of processing.

updateTime

string (Timestamp format)

The last update time of the operation.
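
When polling a batch operation, a caller usually wants to know whether the state can still change. The sketch below treats SUCCEEDED, CANCELLED, and FAILED as final, which is an assumption based on the enum descriptions rather than something this reference states. is_terminal is a hypothetical helper.

```python
from typing import Any, Dict

# States assumed to be final, judging by their descriptions.
_TERMINAL_STATES = frozenset({"SUCCEEDED", "CANCELLED", "FAILED"})


def is_terminal(operation_metadata: Dict[str, Any]) -> bool:
    """Return True if the operation metadata reports a final state."""
    state = operation_metadata.get("state", "STATE_UNSPECIFIED")
    return state in _TERMINAL_STATES
```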

GoogleCloudDocumentaiV1beta2OutputConfig

The desired output location and metadata.
Fields
gcsDestination

object (GoogleCloudDocumentaiV1beta2GcsDestination)

The Google Cloud Storage location to write the output to.

pagesPerShard

integer (int32 format)

The maximum number of pages to include in each output Document shard JSON on Google Cloud Storage. The valid range is [1, 100]. If not specified, the default value is 20. For example, one PDF file with 100 pages produces 100 parsed pages. If pages_per_shard = 20, then 5 Document shard JSON files, each containing 20 parsed pages, are written under the prefix OutputConfig.gcs_destination.uri with the suffix pages-x-to-y.json, where x and y are 1-indexed page numbers. Example GCS outputs with 157 pages and pages_per_shard = 50:
pages-001-to-050.json
pages-051-to-100.json
pages-101-to-150.json
pages-151-to-157.json
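
The shard naming described for pagesPerShard can be sketched as follows. The three-digit zero padding is inferred from the 157-page example and may differ for other page counts; shard_suffixes and its pad parameter are hypothetical.

```python
from typing import List


def shard_suffixes(total_pages: int, pages_per_shard: int = 20, pad: int = 3) -> List[str]:
    """Return the expected pages-x-to-y.json suffixes for a document."""
    if not 1 <= pages_per_shard <= 100:
        raise ValueError("pages_per_shard must be in the range [1, 100]")
    suffixes = []
    # Page numbers are 1-indexed; the last shard may be shorter.
    for first in range(1, total_pages + 1, pages_per_shard):
        last = min(first + pages_per_shard - 1, total_pages)
        suffixes.append(f"pages-{first:0{pad}d}-to-{last:0{pad}d}.json")
    return suffixes


example = shard_suffixes(157, pages_per_shard=50)
```

With the default of 20 pages per shard, a 100-page file yields five suffixes, matching the count given in the description.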

GoogleCloudDocumentaiV1beta2ProcessDocumentResponse

Response to a single document processing request.
Fields
inputConfig

object (GoogleCloudDocumentaiV1beta2InputConfig)

Information about the input file. This is the same as the corresponding input config in the request.

outputConfig

object (GoogleCloudDocumentaiV1beta2OutputConfig)

The output location of the parsed responses. The responses are written to this location as JSON-serialized Document objects.

GoogleCloudDocumentaiV1beta2Vertex

A vertex represents a 2D point in the image. NOTE: the vertex coordinates are in the same scale as the original image.
Fields
x

integer (int32 format)

X coordinate.

y

integer (int32 format)

Y coordinate (starts from the top of the image).

GoogleCloudDocumentaiV1beta3BatchProcessMetadata

The long running operation metadata for batch process method.
Fields
createTime

string (Timestamp format)

The creation time of the operation.

individualProcessStatuses[]

object (GoogleCloudDocumentaiV1beta3BatchProcessMetadataIndividualProcessStatus)

The list of response details of each document.

state

enum

The state of the current batch processing.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The default value. This value is used if the state is omitted.
WAITING Request operation is waiting for scheduling.
RUNNING Request is being processed.
SUCCEEDED The batch processing completed successfully.
CANCELLING The batch processing is being cancelled.
CANCELLED The batch processing was cancelled.
FAILED The batch processing has failed.
stateMessage

string

A message providing more details about the current state of processing. For example, the error message if the operation failed.

updateTime

string (Timestamp format)

The last update time of the operation.

GoogleCloudDocumentaiV1beta3BatchProcessMetadataIndividualProcessStatus

The status of each individual document in the batch process.
Fields
humanReviewOperation

string

The name of the operation triggered by the processed document. If the human review process is not triggered, this field will be empty. It has the same response type and metadata as the long running operation returned by ReviewDocument method.

humanReviewStatus

object (GoogleCloudDocumentaiV1beta3HumanReviewStatus)

The status of human review on the processed document.

inputGcsSource

string

The source of the document, same as the [input_gcs_source] field in the request when the batch process started. The batch process starts by taking a snapshot of that document, since a user can move or change that document during the process.

outputGcsDestination

string

The Cloud Storage output destination of the processed document (output_gcs_destination in the request) if processing was successful; otherwise empty.

status

object (GoogleRpcStatus)

The status of processing the document.
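
The per-document fields can be used to separate successful documents from failed ones. The sketch below is a hedged example over plain dictionaries. It assumes that a missing status or a status code of 0 means success (0 is OK in google.rpc.Code). The bucket paths and the function name are hypothetical.

```python
def split_by_outcome(metadata: dict) -> tuple[list[str], list[tuple[str, str]]]:
    """Return (output destinations of successes, (source, message) of failures)."""
    succeeded: list[str] = []
    failed: list[tuple[str, str]] = []
    for item in metadata.get("individualProcessStatuses", []):
        status = item.get("status", {})
        # Assumption: an absent code is the default value 0 (OK).
        if status.get("code", 0) == 0:
            succeeded.append(item.get("outputGcsDestination", ""))
        else:
            failed.append((item.get("inputGcsSource", ""), status.get("message", "")))
    return succeeded, failed


# Hypothetical metadata with one success and one failure.
sample = {
    "individualProcessStatuses": [
        {
            "inputGcsSource": "gs://example-bucket/in/a.pdf",
            "outputGcsDestination": "gs://example-bucket/out/a",
            "status": {},
        },
        {
            "inputGcsSource": "gs://example-bucket/in/b.pdf",
            "status": {"code": 3, "message": "Unsupported file type"},
        },
    ]
}
ok, bad = split_by_outcome(sample)
```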

GoogleCloudDocumentaiV1beta3CommonOperationMetadata

The common metadata for long running operations.
Fields
createTime

string (Timestamp format)

The creation time of the operation.

resource

string

A related resource to this operation.

state

enum

The state of the operation.

Enum type. Can be one of the following:
STATE_UNSPECIFIED Unspecified state.
RUNNING Operation is still running.
CANCELLING Operation is being cancelled.
SUCCEEDED Operation succeeded.
FAILED Operation failed.
CANCELLED Operation is cancelled.
stateMessage

string

A message providing more details about the current state of processing.

updateTime

string (Timestamp format)

The last update time of the operation.

GoogleCloudDocumentaiV1beta3DeleteProcessorMetadata

The long running operation metadata for delete processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1beta3DeleteProcessorVersionMetadata

The long running operation metadata for delete processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1beta3DeployProcessorVersionMetadata

The long running operation metadata for deploy processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1beta3DisableProcessorMetadata

The long running operation metadata for disable processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1beta3EnableProcessorMetadata

The long running operation metadata for enable processor method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1beta3EvaluateProcessorVersionMetadata

Metadata of the EvaluateProcessorVersion method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1beta3EvaluateProcessorVersionResponse

Response of the EvaluateProcessorVersion method.
Fields
evaluation

string

The resource name of the created evaluation.

GoogleCloudDocumentaiV1beta3HumanReviewStatus

The status of human review on a processed document.
Fields
humanReviewOperation

string

The name of the operation triggered by the processed document. This field is populated only when state is IN_PROGRESS. It has the same response type and metadata as the long running operation returned by the ReviewDocument method.

state

enum

The state of human review on the processing request.

Enum type. Can be one of the following:
STATE_UNSPECIFIED Human review state is unspecified. Most likely due to an internal error.
SKIPPED Human review is skipped for the document. This can happen because human review is not enabled on the processor or the processing request has been set to skip this document.
VALIDATION_PASSED Human review validation is triggered and passed, so no review is needed.
IN_PROGRESS Human review validation is triggered and the document is under review.
ERROR An error occurred while triggering human review; see stateMessage for details.
stateMessage

string

A message providing more details about the human review state.

GoogleCloudDocumentaiV1beta3ReviewDocumentOperationMetadata

The long running operation metadata for review document method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

createTime

string (Timestamp format)

The creation time of the operation.

questionId

string

The Crowd Compute question ID.

state

enum

Used only when Operation.done is false.

Enum type. Can be one of the following:
STATE_UNSPECIFIED Unspecified state.
RUNNING Operation is still running.
CANCELLING Operation is being cancelled.
SUCCEEDED Operation succeeded.
FAILED Operation failed.
CANCELLED Operation is cancelled.
stateMessage

string

A message providing more details about the current state of processing. For example, the error message if the operation failed.

updateTime

string (Timestamp format)

The last update time of the operation.

GoogleCloudDocumentaiV1beta3ReviewDocumentResponse

Response message for review document method.
Fields
gcsDestination

string

The Cloud Storage URI of the human-reviewed document if the review succeeded.

rejectionReason

string

The reason why the reviewer rejected the review.

state

enum

The state of the review operation.

Enum type. Can be one of the following:
STATE_UNSPECIFIED The default value. This value is used if the state is omitted.
REJECTED The review operation was rejected by the reviewer.
SUCCEEDED The review operation succeeded.

GoogleCloudDocumentaiV1beta3SetDefaultProcessorVersionMetadata

The long running operation metadata for set default processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudDocumentaiV1beta3TrainProcessorVersionMetadata

The metadata that represents a processor version being created.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

testDatasetValidation

object (GoogleCloudDocumentaiV1beta3TrainProcessorVersionMetadataDatasetValidation)

The test dataset validation information.

trainingDatasetValidation

object (GoogleCloudDocumentaiV1beta3TrainProcessorVersionMetadataDatasetValidation)

The training dataset validation information.

GoogleCloudDocumentaiV1beta3TrainProcessorVersionMetadataDatasetValidation

The dataset validation information. This includes any and all errors with documents and the dataset.
Fields
datasetErrorCount

integer (int32 format)

The total number of dataset errors.

datasetErrors[]

object (GoogleRpcStatus)

Error information for the dataset as a whole. A maximum of 10 dataset errors will be returned. A single dataset error is terminal for training.

documentErrorCount

integer (int32 format)

The total number of document errors.

documentErrors[]

object (GoogleRpcStatus)

Error information pertaining to specific documents. A maximum of 10 document errors will be returned. Any document with errors will not be used throughout training.
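
Because at most 10 errors of each kind are returned, the count fields can exceed the length of the error lists. The following Python sketch, with a hypothetical function name, summarizes a validation dictionary on that basis. It treats any dataset error as blocking training, following the description of datasetErrors above.

```python
def summarize_validation(validation: dict) -> dict:
    """Summarize a DatasetValidation dictionary."""
    dataset_errors = validation.get("datasetErrors", [])
    document_errors = validation.get("documentErrors", [])
    dataset_count = validation.get("datasetErrorCount", 0)
    document_count = validation.get("documentErrorCount", 0)
    return {
        # A single dataset error is terminal for training.
        "training_blocked": dataset_count > 0 or bool(dataset_errors),
        # Errors counted but not returned because of the 10-item cap.
        "unlisted_dataset_errors": max(dataset_count - len(dataset_errors), 0),
        "unlisted_document_errors": max(document_count - len(document_errors), 0),
    }


# Hypothetical validation result: 12 document errors, only 10 returned.
example = {
    "datasetErrorCount": 0,
    "documentErrorCount": 12,
    "documentErrors": [{"code": 3, "message": "Invalid label"}] * 10,
}
summary = summarize_validation(example)
```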

GoogleCloudDocumentaiV1beta3TrainProcessorVersionResponse

The response for the TrainProcessorVersion method.
Fields
processorVersion

string

The resource name of the processor version produced by training.

GoogleCloudDocumentaiV1beta3UndeployProcessorVersionMetadata

The long running operation metadata for the undeploy processor version method.
Fields
commonMetadata

object (GoogleCloudDocumentaiV1beta3CommonOperationMetadata)

The basic metadata of the long running operation.

GoogleCloudLocationListLocationsResponse

The response message for Locations.ListLocations.
Fields
locations[]

object (GoogleCloudLocationLocation)

A list of locations that matches the specified filter in the request.

nextPageToken

string

The standard List next-page token.

GoogleCloudLocationLocation

A resource that represents a Google Cloud Platform location.
Fields
displayName

string

The friendly name for this location, typically a nearby city name. For example, "Tokyo".

labels

map (key: string, value: string)

Cross-service attributes for the location. For example {"cloud.googleapis.com/region": "us-east1"}

locationId

string

The canonical id for this location. For example: "us-east1".

metadata

map (key: string, value: any)

Service-specific metadata. For example the available capacity at the given location.

name

string

Resource name for the location, which may vary between implementations. For example: "projects/example-project/locations/us-east1"

GoogleLongrunningListOperationsResponse

The response message for Operations.ListOperations.
Fields
nextPageToken

string

The standard List next-page token.

operations[]

object (GoogleLongrunningOperation)

A list of operations that matches the specified filter in the request.
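
A paginated list response is usually consumed by passing nextPageToken back into the next request until no token is returned. The sketch below shows that loop in Python. The fetch_page callable and the fake pages are hypothetical stand-ins for a real list call, and the loop assumes the token is absent or empty on the last page.

```python
from typing import Callable, Iterator, Optional


def iterate_operations(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every operation across pages until no nextPageToken remains."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("operations", [])
        token = page.get("nextPageToken")
        if not token:
            return


# Hypothetical stand-in for a real list call, keyed by the incoming page token.
_FAKE_PAGES = {
    None: {"operations": [{"name": "operations/1"}], "nextPageToken": "t1"},
    "t1": {"operations": [{"name": "operations/2"}]},
}


def fake_fetch_page(page_token: Optional[str]) -> dict:
    return _FAKE_PAGES[page_token]


names = [op["name"] for op in iterate_operations(fake_fetch_page)]
```

The same loop applies to locations[] in GoogleCloudLocationListLocationsResponse by reading a different list key.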

GoogleLongrunningOperation

This resource represents a long-running operation that is the result of a network API call.
Fields
done

boolean

If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.

error

object (GoogleRpcStatus)

The error result of the operation in case of failure or cancellation.

metadata

map (key: string, value: any)

Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any.

name

string

The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the name should be a resource name ending with operations/{unique_id}.

response

map (key: string, value: any)

The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is standard Get/Create/Update, the response should be the resource. For other methods, the response should have the type XxxResponse, where Xxx is the original method name. For example, if the original method name is TakeSnapshot(), the inferred response type is TakeSnapshotResponse.
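
The relationship between done, error, and response can be sketched as a small Python helper over the operation as a dictionary. The function name and the sample values are hypothetical; the branching follows the field descriptions above.

```python
def operation_outcome(operation: dict) -> tuple[str, object]:
    """Classify an operation as pending, error, or success."""
    if not operation.get("done", False):
        # Still in progress: only metadata (if any) is meaningful.
        return ("pending", operation.get("metadata"))
    if "error" in operation:
        return ("error", operation["error"])
    return ("success", operation.get("response"))


# Hypothetical operations in each of the three situations.
pending = operation_outcome({"name": "operations/1", "done": False})
failed = operation_outcome(
    {"name": "operations/2", "done": True, "error": {"code": 13, "message": "Internal"}}
)
succeeded = operation_outcome(
    {"name": "operations/3", "done": True, "response": {"evaluation": "example"}}
)
```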

GoogleRpcStatus

The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC. Each Status message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the API Design Guide.
Fields
code

integer (int32 format)

The status code, which should be an enum value of google.rpc.Code.

details[]

object

A list of messages that carry the error details. There is a common set of message types for APIs to use.

message

string

A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
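
For logging, the numeric code can be mapped back to its google.rpc.Code name. The Python sketch below lists the canonical code names by index; the format_status helper and its fallback label for unknown codes are hypothetical.

```python
# Canonical google.rpc.Code names, indexed by their numeric value.
RPC_CODE_NAMES = [
    "OK", "CANCELLED", "UNKNOWN", "INVALID_ARGUMENT", "DEADLINE_EXCEEDED",
    "NOT_FOUND", "ALREADY_EXISTS", "PERMISSION_DENIED", "RESOURCE_EXHAUSTED",
    "FAILED_PRECONDITION", "ABORTED", "OUT_OF_RANGE", "UNIMPLEMENTED",
    "INTERNAL", "UNAVAILABLE", "DATA_LOSS", "UNAUTHENTICATED",
]


def format_status(status: dict) -> str:
    """Render a status dictionary as 'NAME: message'."""
    code = status.get("code", 0)
    if 0 <= code < len(RPC_CODE_NAMES):
        name = RPC_CODE_NAMES[code]
    else:
        name = f"CODE_{code}"  # Fallback for values outside the known range.
    message = status.get("message", "")
    return f"{name}: {message}" if message else name
```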

GoogleTypeColor

Represents a color in the RGBA color space. This representation is designed for simplicity of conversion to/from color representations in various languages over compactness. For example, the fields of this representation can be trivially provided to the constructor of java.awt.Color in Java; it can also be trivially provided to UIColor's +colorWithRed:green:blue:alpha method in iOS; and, with just a little work, it can be easily formatted into a CSS rgba() string in JavaScript.

This reference page doesn't carry information about the absolute color space that should be used to interpret the RGB value (e.g. sRGB, Adobe RGB, DCI-P3, BT.2020, etc.). By default, applications should assume the sRGB color space.

When color equality needs to be decided, implementations, unless documented otherwise, treat two colors as equal if all their red, green, blue, and alpha values each differ by at most 1e-5.

Example (Java):

```java
import com.google.protobuf.FloatValue;
import com.google.type.Color;

// ...
public static java.awt.Color fromProto(Color protocolor) {
  float alpha = protocolor.hasAlpha()
      ? protocolor.getAlpha().getValue()
      : 1.0f;

  return new java.awt.Color(
      protocolor.getRed(),
      protocolor.getGreen(),
      protocolor.getBlue(),
      alpha);
}

public static Color toProto(java.awt.Color color) {
  float red = (float) color.getRed();
  float green = (float) color.getGreen();
  float blue = (float) color.getBlue();
  float denominator = 255.0f;
  Color.Builder resultBuilder =
      Color
          .newBuilder()
          .setRed(red / denominator)
          .setGreen(green / denominator)
          .setBlue(blue / denominator);
  int alpha = color.getAlpha();
  if (alpha != 255) {
    // Set alpha only when the color is not fully opaque.
    resultBuilder.setAlpha(
        FloatValue
            .newBuilder()
            .setValue(((float) alpha) / denominator)
            .build());
  }
  return resultBuilder.build();
}
// ...
```

Example (iOS / Obj-C):

```objc
// ...
static UIColor* fromProto(Color* protocolor) {
  float red = [protocolor red];
  float green = [protocolor green];
  float blue = [protocolor blue];
  FloatValue* alpha_wrapper = [protocolor alpha];
  float alpha = 1.0;
  if (alpha_wrapper != nil) {
    alpha = [alpha_wrapper value];
  }
  return [UIColor colorWithRed:red green:green blue:blue alpha:alpha];
}

static Color* toProto(UIColor* color) {
  CGFloat red, green, blue, alpha;
  if (![color getRed:&red green:&green blue:&blue alpha:&alpha]) {
    return nil;
  }
  Color* result = [[Color alloc] init];
  [result setRed:red];
  [result setGreen:green];
  [result setBlue:blue];
  if (alpha <= 0.9999) {
    [result setAlpha:floatWrapperWithValue(alpha)];
  }
  [result autorelease];
  return result;
}
// ...
```

Example (JavaScript):

```javascript
// ...
var protoToCssColor = function(rgb_color) {
  var redFrac = rgb_color.red || 0.0;
  var greenFrac = rgb_color.green || 0.0;
  var blueFrac = rgb_color.blue || 0.0;
  var red = Math.floor(redFrac * 255);
  var green = Math.floor(greenFrac * 255);
  var blue = Math.floor(blueFrac * 255);

  if (!('alpha' in rgb_color)) {
    return rgbToCssColor(red, green, blue);
  }

  var alphaFrac = rgb_color.alpha.value || 0.0;
  var rgbParams = [red, green, blue].join(',');
  return ['rgba(', rgbParams, ',', alphaFrac, ')'].join('');
};

var rgbToCssColor = function(red, green, blue) {
  var rgbNumber = new Number((red << 16) | (green << 8) | blue);
  var hexString = rgbNumber.toString(16);
  var missingZeros = 6 - hexString.length;
  var resultBuilder = ['#'];
  for (var i = 0; i < missingZeros; i++) {
    resultBuilder.push('0');
  }
  resultBuilder.push(hexString);
  return resultBuilder.join('');
};
// ...
```
Fields
alpha

number (float format)

The fraction of this color that should be applied to the pixel. That is, the final pixel color is defined by the equation:

pixel color = alpha * (this color) + (1.0 - alpha) * (background color)

This means that a value of 1.0 corresponds to a solid color, whereas a value of 0.0 corresponds to a completely transparent color. This uses a wrapper message rather than a simple float scalar so that it is possible to distinguish between a default value and the value being unset. If omitted, this color object is rendered as a solid color (as if the alpha value had been explicitly given a value of 1.0).
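
The blending equation for alpha can be worked through per channel. This Python sketch assumes the color arrives as a dictionary with plain numeric fields, that omitted color channels default to 0.0, and that an omitted alpha means 1.0 as described above. The function name is hypothetical.

```python
def blend(color: dict, background: tuple[float, float, float]) -> tuple[float, ...]:
    """Apply: pixel = alpha * color + (1.0 - alpha) * background, per channel."""
    alpha = color.get("alpha", 1.0)  # Omitted alpha is rendered as solid.
    channels = (
        color.get("red", 0.0),
        color.get("green", 0.0),
        color.get("blue", 0.0),
    )
    return tuple(alpha * c + (1.0 - alpha) * b for c, b in zip(channels, background))


# Half-transparent red over a blue background.
mixed = blend({"red": 1.0, "alpha": 0.5}, (0.0, 0.0, 1.0))
```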

blue

number (float format)

The amount of blue in the color as a value in the interval [0, 1].

green

number (float format)

The amount of green in the color as a value in the interval [0, 1].

red

number (float format)

The amount of red in the color as a value in the interval [0, 1].

GoogleTypeDate

Represents a whole or partial calendar date, such as a birthday. The time of day and time zone are either specified elsewhere or are insignificant. The date is relative to the Gregorian Calendar. This can represent one of the following:
* A full date, with non-zero year, month, and day values.
* A month and day, with a zero year (for example, an anniversary).
* A year on its own, with a zero month and a zero day.
* A year and month, with a zero day (for example, a credit card expiration date).

Related types:
* google.type.TimeOfDay
* google.type.DateTime
* google.protobuf.Timestamp
Fields
day

integer (int32 format)

Day of a month. Must be from 1 to 31 and valid for the year and month, or 0 to specify a year by itself or a year and month where the day isn't significant.

month

integer (int32 format)

Month of a year. Must be from 1 to 12, or 0 to specify a year without a month and day.

year

integer (int32 format)

Year of the date. Must be from 1 to 9999, or 0 to specify a date without a year.
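
The four supported combinations of zero and non-zero fields can be told apart with a few checks. The Python sketch below uses a hypothetical function name, treats missing keys as 0, and labels any other combination as unsupported.

```python
def date_kind(date: dict) -> str:
    """Classify a date dictionary by which of year, month, day are non-zero."""
    year = date.get("year", 0)
    month = date.get("month", 0)
    day = date.get("day", 0)
    if year and month and day:
        return "full date"
    if not year and month and day:
        return "month and day"
    if year and not month and not day:
        return "year only"
    if year and month and not day:
        return "year and month"
    return "unsupported"
```

For example, a credit card expiration such as {"year": 2030, "month": 6} is classified as "year and month".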

GoogleTypeDateTime

Represents civil time (or occasionally physical time). This type can represent a civil time in one of a few possible ways:
* When utc_offset is set and time_zone is unset: a civil time on a calendar day with a particular offset from UTC.
* When time_zone is set and utc_offset is unset: a civil time on a calendar day in a particular time zone.
* When neither time_zone nor utc_offset is set: a civil time on a calendar day in local time.

The date is relative to the Proleptic Gregorian Calendar. If year, month, or day are 0, the DateTime is considered not to have a specific year, month, or day respectively. This type may also be used to represent a physical time if all the date and time fields are set and either case of the time_offset oneof is set. Consider using Timestamp message for physical time instead. If your use case also would like to store the user's timezone, that can be done in another field. This type is more flexible than some applications may want. Make sure to document and validate your application's limitations.
Fields
day

integer (int32 format)

Optional. Day of month. Must be from 1 to 31 and valid for the year and month, or 0 if specifying a datetime without a day.

hours

integer (int32 format)

Optional. Hours of day in 24 hour format. Should be from 0 to 23, defaults to 0 (midnight). An API may choose to allow the value "24:00:00" for scenarios like business closing time.

minutes

integer (int32 format)

Optional. Minutes of hour of day. Must be from 0 to 59, defaults to 0.

month

integer (int32 format)

Optional. Month of year. Must be from 1 to 12, or 0 if specifying a datetime without a month.

nanos

integer (int32 format)

Optional. Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999, defaults to 0.

seconds

integer (int32 format)

Optional. Seconds of minutes of the time. Must normally be from 0 to 59, defaults to 0. An API may allow the value 60 if it allows leap-seconds.

timeZone

object (GoogleTypeTimeZone)

Time zone.

utcOffset

string (Duration format)

UTC offset. Must be whole seconds, between -18 hours and +18 hours. For example, a UTC offset of -4:00 would be represented as { seconds: -14400 }.

year

integer (int32 format)

Optional. Year of date. Must be from 1 to 9999, or 0 if specifying a datetime without a year.
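
A fully specified value can be converted to a Python datetime as a rough sketch. This example assumes utcOffset arrives in the JSON Duration string form (seconds with an "s" suffix, such as "-14400s"), requires non-zero year, month, and day, ignores timeZone, and does not handle the special values 24:00:00 or a leap second of 60. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_datetime(value: dict) -> datetime:
    """Convert a fully specified DateTime dictionary to datetime.datetime."""
    tzinfo = None
    offset = value.get("utcOffset")
    if offset is not None:
        # Assumption: Duration is serialized as "<seconds>s".
        tzinfo = timezone(timedelta(seconds=float(offset.rstrip("s"))))
    return datetime(
        value["year"],
        value["month"],
        value["day"],
        value.get("hours", 0),
        value.get("minutes", 0),
        value.get("seconds", 0),
        value.get("nanos", 0) // 1000,  # datetime stores microseconds.
        tzinfo=tzinfo,
    )


stamp = to_datetime(
    {"year": 2024, "month": 3, "day": 9, "hours": 14, "minutes": 30, "utcOffset": "-14400s"}
)
```

When utcOffset is absent the result is a naive datetime, which matches the "local time" case described for this type.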

GoogleTypeMoney

Represents an amount of money with its currency type.
Fields
currencyCode

string

The three-letter currency code defined in ISO 4217.

nanos

integer (int32 format)

Number of nano (10^-9) units of the amount. The value must be between -999,999,999 and +999,999,999 inclusive. If units is positive, nanos must be positive or zero. If units is zero, nanos can be positive, zero, or negative. If units is negative, nanos must be negative or zero. For example $-1.75 is represented as units=-1 and nanos=-750,000,000.

units

string (int64 format)

The whole units of the amount. For example if currencyCode is "USD", then 1 unit is one US dollar.
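
The units and nanos fields combine into a single decimal amount, subject to the sign rules given for nanos. The Python sketch below uses a hypothetical function name and assumes units is delivered as a string, as the int64 format above indicates.

```python
from decimal import Decimal


def money_to_decimal(money: dict) -> Decimal:
    """Combine units and nanos into one Decimal, enforcing the sign rules."""
    units = int(money.get("units", "0"))
    nanos = money.get("nanos", 0)
    if not -999_999_999 <= nanos <= 999_999_999:
        raise ValueError("nanos out of range")
    if (units > 0 and nanos < 0) or (units < 0 and nanos > 0):
        raise ValueError("units and nanos must have the same sign")
    # scaleb(-9) shifts the decimal point nine places: nanos * 10^-9.
    return Decimal(units) + Decimal(nanos).scaleb(-9)


amount = money_to_decimal({"currencyCode": "USD", "units": "-1", "nanos": -750000000})
```

Using Decimal rather than float avoids rounding drift when amounts are summed.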

GoogleTypePostalAddress

Represents a postal address, e.g. for postal delivery or payments addresses. Given a postal address, a postal service can deliver items to a premise, P.O. Box or similar. It is not intended to model geographical locations (roads, towns, mountains). In typical usage an address would be created via user input or from importing existing data, depending on the type of process.

Advice on address input / editing:
- Use an internationalization-ready address widget such as https://github.com/google/libaddressinput.
- Users should not be presented with UI elements for input or editing of fields outside countries where that field is used.

For more guidance on how to use this schema, please see: https://support.google.com/business/answer/6397478
Fields
addressLines[]

string

Unstructured address lines describing the lower levels of an address. Because values in address_lines do not have type information and may sometimes contain multiple values in a single field (e.g. "Austin, TX"), it is important that the line order is clear. The order of address lines should be "envelope order" for the country/region of the address. In places where this can vary (e.g. Japan), address_language is used to make it explicit (e.g. "ja" for large-to-small ordering and "ja-Latn" or "en" for small-to-large). This way, the most specific line of an address can be selected based on the language. The minimum permitted structural representation of an address consists of a region_code with all remaining information placed in the address_lines. It would be possible to format such an address very approximately without geocoding, but no semantic reasoning could be made about any of the address components until it was at least partially resolved. Creating an address only containing a region_code and address_lines, and then geocoding is the recommended way to handle completely unstructured addresses (as opposed to guessing which parts of the address should be localities or administrative areas).

administrativeArea

string

Optional. Highest administrative subdivision which is used for postal addresses of a country or region. For example, this can be a state, a province, an oblast, or a prefecture. Specifically, for Spain this is the province and not the autonomous community (e.g. "Barcelona" and not "Catalonia"). Many countries don't use an administrative area in postal addresses. E.g. in Switzerland this should be left unpopulated.

languageCode

string

Optional. BCP-47 language code of the contents of this address (if known). This is often the UI language of the input form or is expected to match one of the languages used in the address' country/region, or their transliterated equivalents. This can affect formatting in certain countries, but is not critical to the correctness of the data and will never affect any validation or other non-formatting related operations. If this value is not known, it should be omitted (rather than specifying a possibly incorrect default). Examples: "zh-Hant", "ja", "ja-Latn", "en".

locality

string

Optional. Generally refers to the city/town portion of the address. Examples: US city, IT comune, UK post town. In regions of the world where localities are not well defined or do not fit into this structure well, leave locality empty and use address_lines.

organization

string

Optional. The name of the organization at the address.

postalCode

string

Optional. Postal code of the address. Not all countries use or require postal codes to be present, but where they are used, they may trigger additional validation with other parts of the address (e.g. state/zip validation in the U.S.A.).

recipients[]

string

Optional. The recipient at the address. This field may, under certain circumstances, contain multiline information. For example, it might contain "care of" information.

regionCode

string

Required. CLDR region code of the country/region of the address. This is never inferred and it is up to the user to ensure the value is correct. See https://cldr.unicode.org/ and https://www.unicode.org/cldr/charts/30/supplemental/territory_information.html for details. Example: "CH" for Switzerland.

revision

integer (int32 format)

The schema revision of the PostalAddress. This must be set to 0, which is the latest revision. All new revisions must be backward compatible with old revisions.

sortingCode

string

Optional. Additional, country-specific, sorting code. This is not used in most regions. Where it is used, the value is either a string like "CEDEX", optionally followed by a number (e.g. "CEDEX 7"), or just a number alone, representing the "sector code" (Jamaica), "delivery area indicator" (Malawi) or "post office indicator" (e.g. Côte d'Ivoire).

sublocality

string

Optional. Sublocality of the address. For example, this can be neighborhoods, boroughs, districts.
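
The minimum structural representation described under addressLines (a regionCode plus unstructured lines) can be built as follows. This is a small Python sketch with a hypothetical function name and a made-up sample address; it only checks that regionCode is present and sets revision to 0.

```python
def minimal_address(region_code: str, address_lines: list[str]) -> dict:
    """Build the smallest valid PostalAddress dictionary."""
    if not region_code:
        raise ValueError("regionCode is required")
    return {
        "revision": 0,  # The only permitted revision.
        "regionCode": region_code,
        "addressLines": list(address_lines),
    }


# Hypothetical unstructured address in envelope order.
address = minimal_address("CH", ["Beispielstrasse 1", "8000 Zürich"])
```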

GoogleTypeTimeZone

Represents a time zone from the IANA Time Zone Database.
Fields
id

string

IANA Time Zone Database time zone, e.g. "America/New_York".

version

string

Optional. IANA Time Zone Database version number, e.g. "2019a".