BatchProcessDocumentsResponse

Response to a batch document processing request. This is returned in the LRO Operation after the operation is complete.

JSON representation
{
  "responses": [
    {
      object (ProcessDocumentResponse)
    }
  ]
}
Fields
responses[]

object (ProcessDocumentResponse)

Responses for each individual document.

ProcessDocumentResponse

Response to a single document processing request.

JSON representation
{
  "inputConfig": {
    object (InputConfig)
  },
  "outputConfig": {
    object (OutputConfig)
  }
}
Fields
inputConfig

object (InputConfig)

Information about the input file. This is the same as the corresponding input config in the request.

outputConfig

object (OutputConfig)

The output location of the parsed responses. The responses are written to this location as JSON-serialized Document objects.
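As an illustration of the nesting described above, the following minimal sketch walks a BatchProcessDocumentsResponse that has already been obtained as a JSON string and pairs each input URI with its output URI. The helper name `list_outputs` and the bucket paths are hypothetical; only the field names come from this reference.

```python
import json


def list_outputs(response_json: str) -> list:
    """Return (input URI, output URI) pairs from a batch response.

    Hypothetical helper: it only reads the fields documented here and
    falls back to an empty string when a field is absent.
    """
    response = json.loads(response_json)
    pairs = []
    for item in response.get("responses", []):
        src = item.get("inputConfig", {}).get("gcsSource", {}).get("uri", "")
        dst = item.get("outputConfig", {}).get("gcsDestination", {}).get("uri", "")
        pairs.append((src, dst))
    return pairs


# Illustrative payload; the bucket and object names are made up.
sample = json.dumps({
    "responses": [
        {
            "inputConfig": {
                "mimeType": "application/pdf",
                "gcsSource": {"uri": "gs://example-bucket/in/a.pdf"},
            },
            "outputConfig": {
                "pagesPerShard": 20,
                "gcsDestination": {"uri": "gs://example-bucket/out/a/"},
            },
        }
    ]
})

for src, dst in list_outputs(sample):
    print(src, "->", dst)
```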

InputConfig

The desired input location and metadata.

JSON representation
{
  "mimeType": string,
  "gcsSource": {
    object (GcsSource)
  }
}
Fields
mimeType

string

Required. MIME type of the input. Currently supported MIME types are application/pdf, image/tiff, and image/gif. In addition, application/json is supported for requests that set the ProcessDocumentRequest.automl_params field. The JSON file must be in Document format.

gcsSource

object (GcsSource)

The Google Cloud Storage location to read the input from. This must be a single file.
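A minimal sketch of assembling an InputConfig body and checking the MIME type on the client side before sending a request. The function name `make_input_config`, the `automl` flag, and the example URI are assumptions made for illustration; the service performs its own validation, and this check only mirrors the list given above.

```python
# MIME types listed in this reference for regular requests.
SUPPORTED_MIME_TYPES = {"application/pdf", "image/tiff", "image/gif"}


def make_input_config(gcs_uri: str, mime_type: str, automl: bool = False) -> dict:
    """Build an InputConfig dict for a single input file.

    `automl` is a hypothetical flag standing in for "the request sets
    automl_params", in which case application/json is also accepted.
    """
    allowed = set(SUPPORTED_MIME_TYPES)
    if automl:
        allowed.add("application/json")
    if mime_type not in allowed:
        raise ValueError(f"unsupported mimeType: {mime_type!r}")
    return {"mimeType": mime_type, "gcsSource": {"uri": gcs_uri}}


# Example URI is made up.
config = make_input_config("gs://example-bucket/in/invoice.pdf", "application/pdf")
print(config)
```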

GcsSource

The Google Cloud Storage location where the input file will be read from.

JSON representation
{
  "uri": string
}
Fields
uri

string

OutputConfig

The desired output location and metadata.

JSON representation
{
  "pagesPerShard": integer,
  "gcsDestination": {
    object (GcsDestination)
  }
}
Fields
pagesPerShard

integer

The maximum number of pages to include in each output Document shard JSON file on Google Cloud Storage.

The valid range is [1, 100]. If not specified, the default value is 20.

For example, one PDF file with 100 pages produces 100 parsed pages. If pagesPerShard = 20, then 5 Document shard JSON files, each containing 20 parsed pages, are written under the prefix OutputConfig.gcs_destination.uri with the suffix pages-x-to-y.json, where x and y are 1-indexed page numbers.

Example Google Cloud Storage outputs with 157 pages and pagesPerShard = 50:

pages-001-to-050.json
pages-051-to-100.json
pages-101-to-150.json
pages-151-to-157.json
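The sharding arithmetic can be sketched as follows. This is an illustration and not the service's implementation: the function name `shard_names` is hypothetical, and the three-digit zero padding is an assumption inferred from the example file names above.

```python
def shard_names(total_pages: int, pages_per_shard: int = 20) -> list:
    """Compute the expected shard file suffixes for one input document.

    Page numbers are 1-indexed. The default of 20 and the [1, 100]
    range come from the pagesPerShard description.
    """
    if not 1 <= pages_per_shard <= 100:
        raise ValueError("pagesPerShard must be in the range [1, 100]")
    names = []
    for start in range(1, total_pages + 1, pages_per_shard):
        # The last shard may hold fewer pages than pages_per_shard.
        end = min(start + pages_per_shard - 1, total_pages)
        # Assumption: page numbers are zero-padded to three digits.
        names.append(f"pages-{start:03d}-to-{end:03d}.json")
    return names


for name in shard_names(157, 50):
    print(name)
```

With 157 pages and pagesPerShard = 50, the last shard covers only pages 151 to 157, so the sketch yields four names rather than three full shards plus a remainder file of a different shape.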

gcsDestination

object (GcsDestination)

The Google Cloud Storage location to write the output to.

GcsDestination

The Google Cloud Storage location where the output file will be written to.

JSON representation
{
  "uri": string
}
Fields
uri

string