OutputConfig

The desired output location and metadata.

JSON representation
{
  "pagesPerShard": integer,
  "gcsDestination": {
    object (GcsDestination)
  }
}
Fields
pagesPerShard

integer

The maximum number of pages to include in each output Document shard JSON file on Google Cloud Storage.

The valid range is [1, 100]. If not specified, the default value is 20.

For example, a PDF file with 100 pages produces 100 parsed pages. If pagesPerShard = 20, then 5 Document shard JSON files, each containing 20 parsed pages, are written under the prefix OutputConfig.gcs_destination.uri with the suffix pages-x-to-y.json, where x and y are 1-indexed page numbers.

Example outputs for a document with 157 pages and pagesPerShard = 50:

  • pages-001-to-050.json
  • pages-051-to-100.json
  • pages-101-to-150.json
  • pages-151-to-157.json
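
The sharding rule above can be sketched in a few lines of Python. This is an illustration only, not the service's implementation: the helper name shard_names is hypothetical, and the three-digit zero padding is an assumption inferred from the example file names rather than a documented guarantee.

```python
def shard_names(total_pages: int, pages_per_shard: int = 20, width: int = 3) -> list[str]:
    """Return the expected shard file suffixes for a document.

    `width` (zero-padding of page numbers) is an assumption taken from
    the example listing, not from a documented rule.
    """
    if not 1 <= pages_per_shard <= 100:
        raise ValueError("pagesPerShard must be in the range [1, 100]")
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")

    names = []
    # Page numbers are 1-indexed; each shard covers pages start..end inclusive.
    for start in range(1, total_pages + 1, pages_per_shard):
        end = min(start + pages_per_shard - 1, total_pages)
        names.append(f"pages-{start:0{width}d}-to-{end:0{width}d}.json")
    return names


if __name__ == "__main__":
    for name in shard_names(157, 50):
        print(name)
```

Called with 157 pages and pagesPerShard = 50, the sketch yields the four names listed above; the last shard is shorter because only 7 pages remain.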
gcsDestination

object (GcsDestination)

The Google Cloud Storage location to write the output to.

GcsDestination

The Google Cloud Storage location where the output file will be written.

JSON representation
{
  "uri": string
}
Fields
uri

string
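
As a rough sketch, an OutputConfig body matching the JSON representation above could be assembled as follows. The helper name build_output_config, the bucket name, and the gs:// path are hypothetical placeholders; the only rule enforced is the documented [1, 100] range for pagesPerShard.

```python
import json
from typing import Optional


def build_output_config(uri: str, pages_per_shard: Optional[int] = None) -> dict:
    """Build a dict shaped like the OutputConfig JSON representation.

    When pages_per_shard is None the field is omitted, so the documented
    default (20) would apply.
    """
    config: dict = {"gcsDestination": {"uri": uri}}
    if pages_per_shard is not None:
        if not 1 <= pages_per_shard <= 100:
            raise ValueError("pagesPerShard must be in the range [1, 100]")
        config["pagesPerShard"] = pages_per_shard
    return config


if __name__ == "__main__":
    # Hypothetical destination; replace with a real bucket and prefix.
    body = build_output_config("gs://example-bucket/output/", pages_per_shard=50)
    print(json.dumps(body, indent=2))
```

Omitting pagesPerShard instead of sending 20 explicitly keeps the request minimal and leaves the default to the service.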