Class ExportToCdwPipeline (0.7.7)

ExportToCdwPipeline(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The configuration of the pipeline that exports documents from Document Warehouse to CDW.

Attributes

Name Description
documents MutableSequence[str]
The list of all the resource names of the documents to be processed. Format: projects/{project_number}/locations/{location}/documents/{document_id}.
export_folder_path str
The Cloud Storage folder path used to store the exported documents before being sent to CDW. Format: gs://.
doc_ai_dataset str
Optional. The CDW dataset resource name. If not set, the documents will be exported to Cloud Storage only. Format: projects/{project}/locations/{location}/processors/{processor}/dataset.
training_split_ratio float
The ratio of the training dataset split. When documents are imported into Document AI Workbench, they are automatically split into training and test sets according to this ratio. This field is required if doc_ai_dataset is set.
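
The field constraints above can be checked before a pipeline message is built. The following is a minimal sketch, not part of the library: build_export_kwargs is a hypothetical helper, the regular expressions are derived only from the "Format:" notes in the attribute descriptions, and the 0.0 to 1.0 bound on training_split_ratio is an assumption that the reference does not state.

```python
import re

# Patterns derived from the "Format:" notes in the attribute descriptions.
_DOCUMENT_NAME = re.compile(r"^projects/[^/]+/locations/[^/]+/documents/[^/]+$")
_DATASET_NAME = re.compile(
    r"^projects/[^/]+/locations/[^/]+/processors/[^/]+/dataset$"
)


def build_export_kwargs(documents, export_folder_path,
                        doc_ai_dataset="", training_split_ratio=None):
    """Hypothetical helper: validate the values and return keyword arguments."""
    for name in documents:
        if not _DOCUMENT_NAME.match(name):
            raise ValueError(f"unexpected document resource name: {name!r}")
    if not export_folder_path.startswith("gs://"):
        raise ValueError("export_folder_path must start with gs://")

    kwargs = {
        "documents": list(documents),
        "export_folder_path": export_folder_path,
    }
    if doc_ai_dataset:
        if not _DATASET_NAME.match(doc_ai_dataset):
            raise ValueError(f"unexpected dataset resource name: {doc_ai_dataset!r}")
        # The reference states the ratio is required once doc_ai_dataset is set.
        if training_split_ratio is None:
            raise ValueError(
                "training_split_ratio is required when doc_ai_dataset is set"
            )
        # Assumption: a split ratio is a fraction between 0 and 1.
        if not 0.0 <= training_split_ratio <= 1.0:
            raise ValueError("training_split_ratio must be between 0.0 and 1.0")
        kwargs["doc_ai_dataset"] = doc_ai_dataset
        kwargs["training_split_ratio"] = training_split_ratio
    return kwargs


# Example values are placeholders, not real resources.
export_kwargs = build_export_kwargs(
    documents=["projects/123/locations/us/documents/abc"],
    export_folder_path="gs://example-bucket/exports",
)

# If the client library is installed (assumed here to be importable as
# google.cloud.contentwarehouse_v1), the result could be passed on as:
#   from google.cloud import contentwarehouse_v1
#   pipeline = contentwarehouse_v1.ExportToCdwPipeline(**export_kwargs)
```

Omitting doc_ai_dataset yields a Cloud Storage-only export configuration, which matches the behavior described for that field. Supplying it without training_split_ratio raises ValueError locally rather than deferring the failure to the service.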