DocumentOutputConfig

Configuration that controls the output of documents. All documents are written as JSON files.

JSON representation
{
  // Union field destination can be only one of the following:
  "gcsOutputConfig": {
    object (GcsOutputConfig)
  }
  // End of list of possible types for union field destination.
}
Fields
Union field destination. The destination of the results. destination can be only one of the following:
gcsOutputConfig

object (GcsOutputConfig)

Output config to write the results to Cloud Storage.
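
As a rough illustration, the structure above could be assembled as a plain Python dictionary and serialized to JSON. This is a minimal sketch: the helper name make_document_output_config and the bucket path gs://example-bucket/output/ are hypothetical placeholders, not part of the API.

```python
import json


def make_document_output_config(gcs_uri: str) -> dict:
    """Build a DocumentOutputConfig body (hypothetical helper).

    The union field `destination` allows only one member to be set;
    gcsOutputConfig is the only member listed in this reference.
    """
    return {"gcsOutputConfig": {"gcsUri": gcs_uri}}


# Hypothetical bucket and prefix, used only for illustration.
config = make_document_output_config("gs://example-bucket/output/")
print(json.dumps(config, indent=2))
```

Keeping the destination under a single key mirrors the union constraint: exactly one destination object appears in the serialized body.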

GcsOutputConfig

The configuration used when outputting documents.

JSON representation
{
  "gcsUri": string,
  "fieldMask": string,
  "shardingConfig": {
    object (ShardingConfig)
  }
}
Fields
gcsUri

string

The Cloud Storage URI (a directory) of the output.

fieldMask

string (FieldMask format)

Specifies which fields to include in the output documents. Only top-level document fields and fields of pages are supported, so each path must be in the form {document_field_name} or pages.{page_field_name}.

This is a comma-separated list of fully qualified names of fields. Example: "user.displayName,photo".

shardingConfig

object (ShardingConfig)

Specifies the sharding config for the output document.
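
The fieldMask restriction described above (top-level document fields, or fields directly under pages) can be checked on the client before a request is sent. The sketch below is one possible way to do that; the helpers is_supported_path and build_field_mask are hypothetical, and the field names used in the example (text, entities, pages.pageNumber) are illustrative only.

```python
import re

# A single path segment: letters, digits and underscores, not starting with a digit.
_SEGMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def is_supported_path(path: str) -> bool:
    """Return True for {document_field_name} or pages.{page_field_name}."""
    parts = path.split(".")
    if not all(_SEGMENT.match(part) for part in parts):
        return False
    if len(parts) == 1:
        return True  # top-level document field
    return len(parts) == 2 and parts[0] == "pages"


def build_field_mask(paths: list[str]) -> str:
    """Join validated paths into a comma-separated FieldMask string."""
    unsupported = [p for p in paths if not is_supported_path(p)]
    if unsupported:
        raise ValueError(f"unsupported field mask paths: {unsupported}")
    return ",".join(paths)


# Illustrative field names; substitute the fields you actually need.
mask = build_field_mask(["text", "entities", "pages.pageNumber"])
print(mask)
```

Paths nested deeper than one level under pages, or nested under any other field, are rejected here because the description above allows only the two listed forms.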

ShardingConfig

The sharding config for the output document.

JSON representation
{
  "pagesPerShard": integer,
  "pagesOverlap": integer
}
Fields
pagesPerShard

integer

The number of pages per shard.

pagesOverlap

integer

The number of overlapping pages between consecutive shards.
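
To get a feel for how pagesPerShard and pagesOverlap interact, the sketch below computes page ranges under one plausible interpretation: each shard holds at most pagesPerShard pages, and each new shard starts pagesOverlap pages before the previous shard ended. This reference does not specify the exact boundaries the service produces, so treat the function shard_page_ranges and its behavior as an assumption for illustration, not as the service's algorithm.

```python
def shard_page_ranges(
    total_pages: int, pages_per_shard: int, pages_overlap: int
) -> list[tuple[int, int]]:
    """Return half-open (start, end) page index ranges, one per shard.

    Assumption: a sliding window where consecutive shards share
    `pages_overlap` pages. The real service may split differently.
    """
    if pages_per_shard <= 0:
        raise ValueError("pages_per_shard must be positive")
    if not 0 <= pages_overlap < pages_per_shard:
        raise ValueError("pages_overlap must be in [0, pages_per_shard)")

    step = pages_per_shard - pages_overlap
    ranges = []
    start = 0
    while start < total_pages:
        end = min(start + pages_per_shard, total_pages)
        ranges.append((start, end))
        if end == total_pages:
            break  # last shard reached the end of the document
        start += step
    return ranges


# A 25-page document, 10 pages per shard, 2 overlapping pages.
print(shard_page_ranges(25, 10, 2))  # [(0, 10), (8, 18), (16, 25)]
```

Under this interpretation a larger overlap produces more shards for the same document, since each shard advances by only pagesPerShard minus pagesOverlap pages.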