Class Dataset (2.27.0)

Dataset(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A singleton resource under a Processor that configures a collection of documents.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
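
The oneof behaviour described above can be modelled in a few lines of plain Python. This is a toy sketch, not the library's implementation: `OneofGroup` and its methods are hypothetical names invented for illustration, and the config values are placeholder strings rather than real config messages. Only the member field names come from the `storage_source` oneof documented on this page.

```python
from typing import Any, Optional


class OneofGroup:
    """Toy model of a oneof: at most one member holds a value at a time.

    Hypothetical helper for illustration only; the real client relies on
    proto-plus and protobuf to enforce this.
    """

    def __init__(self, *members: str) -> None:
        self._members = set(members)
        self._active: Optional[str] = None
        self._value: Any = None

    def set(self, member: str, value: Any) -> None:
        if member not in self._members:
            raise KeyError(f"{member!r} is not a member of this oneof")
        # Setting one member replaces whichever member was set before.
        self._active, self._value = member, value

    def which(self) -> Optional[str]:
        """Return the name of the member that is currently set, if any."""
        return self._active

    def get(self, member: str) -> Any:
        """Return the member's value, or None if that member is not set."""
        return self._value if member == self._active else None


# Member names taken from the storage_source oneof of Dataset.
storage_source = OneofGroup(
    "gcs_managed_config",
    "document_warehouse_config",
    "unmanaged_dataset_config",
)

storage_source.set("unmanaged_dataset_config", "placeholder-unmanaged")
storage_source.set("gcs_managed_config", "placeholder-gcs")  # clears the other

print(storage_source.which())
print(storage_source.get("unmanaged_dataset_config"))
```

After the second `set` call, only `gcs_managed_config` remains set. The `indexing_source` oneof is independent of `storage_source`, so setting `spanner_indexing_config` would not clear any storage member.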

Attributes

Name Description
gcs_managed_config google.cloud.documentai_v1beta3.types.Dataset.GCSManagedConfig
Optional. User-managed Cloud Storage dataset configuration. Use this configuration if the dataset documents are stored under a user-managed Cloud Storage location. This field is a member of oneof_ storage_source.
document_warehouse_config google.cloud.documentai_v1beta3.types.Dataset.DocumentWarehouseConfig
Optional. Deprecated. Warehouse-based dataset configuration is not supported. This field is a member of oneof_ storage_source.
unmanaged_dataset_config google.cloud.documentai_v1beta3.types.Dataset.UnmanagedDatasetConfig
Optional. Unmanaged dataset configuration. Use this configuration if the dataset documents are managed by the document service internally (not user-managed). This field is a member of oneof_ storage_source.
spanner_indexing_config google.cloud.documentai_v1beta3.types.Dataset.SpannerIndexingConfig
Optional. A lightweight indexing source with low latency and high reliability, but lacking advanced features like CMEK and content-based search. This field is a member of oneof_ indexing_source.
name str
Dataset resource name. Format: projects/{project}/locations/{location}/processors/{processor}/dataset
state google.cloud.documentai_v1beta3.types.Dataset.State
Required. State of the dataset. Ignored when updating the dataset.
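
The `name` format above lends itself to a small build-and-parse helper. The following is a minimal sketch using only the standard library. The function names `dataset_name` and `parse_dataset_name` are hypothetical and not part of the client library, and the sketch assumes that no path segment contains a `/`.

```python
import re

# Format string copied from the `name` attribute description.
DATASET_NAME_TEMPLATE = (
    "projects/{project}/locations/{location}/processors/{processor}/dataset"
)

# Assumption: each segment is a non-empty run of characters other than "/".
_DATASET_NAME_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/processors/(?P<processor>[^/]+)"
    r"/dataset$"
)


def dataset_name(project: str, location: str, processor: str) -> str:
    """Build a Dataset resource name from its three path components."""
    return DATASET_NAME_TEMPLATE.format(
        project=project, location=location, processor=processor
    )


def parse_dataset_name(name: str) -> dict:
    """Split a Dataset resource name into its components.

    Raises ValueError if the name does not match the documented format.
    """
    match = _DATASET_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a Dataset resource name: {name!r}")
    return match.groupdict()


# Placeholder identifiers, for illustration only.
name = dataset_name("my-project", "us", "my-processor")
print(name)
print(parse_dataset_name(name))
```

Because the dataset is a singleton under its processor, the name ends in the fixed segment `dataset` and carries no dataset ID of its own.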

Classes

DocumentWarehouseConfig

DocumentWarehouseConfig(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Configuration specific to the Document AI Warehouse-based implementation.

GCSManagedConfig

GCSManagedConfig(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Configuration specific to the Cloud Storage-based implementation.

SpannerIndexingConfig

SpannerIndexingConfig(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Configuration specific to Spanner-based indexing.

State

State(value)

Different states of a dataset.

Values:
    STATE_UNSPECIFIED (0): Default unspecified enum, should not be used.
    UNINITIALIZED (1): Dataset has not been initialized.
    INITIALIZING (2): Dataset is being initialized.
    INITIALIZED (3): Dataset has been initialized.
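
As a rough illustration of how these values might be consumed, the sketch below mirrors them in a local `IntEnum` and adds a readiness check. `DatasetState` and `is_ready` are hypothetical names, not the library's `Dataset.State` class; only the member names and numbers are taken from the list above.

```python
from enum import IntEnum


class DatasetState(IntEnum):
    """Local mirror of the documented state values (illustration only)."""

    STATE_UNSPECIFIED = 0
    UNINITIALIZED = 1
    INITIALIZING = 2
    INITIALIZED = 3


def is_ready(state: int) -> bool:
    """Return True only when the dataset has finished initializing."""
    return DatasetState(state) is DatasetState.INITIALIZED


# Walk through every documented state and report whether it counts as ready.
for state in DatasetState:
    print(f"{state.name} ({state.value}): ready={is_ready(state)}")
```

Passing a number outside 0 to 3 makes `DatasetState(state)` raise `ValueError`, which may be preferable to silently treating an unknown state as not ready.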

UnmanagedDatasetConfig

UnmanagedDatasetConfig(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Configuration specific to an unmanaged dataset.