Class ImportDataConfig (0.4.0)

ImportDataConfig(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Describes the location from which data is imported into a Dataset, together with the labels that will be applied to the DataItems and the Annotations.


gcs_source `.io.GcsSource`
The Google Cloud Storage location for the input content.
data_item_labels Sequence[`.dataset.ImportDataConfig.DataItemLabelsEntry`]
Labels that will be applied to newly imported DataItems. If a DataItem identical to one being imported already exists in the Dataset, these labels are appended to those of the existing DataItem, and if a label with the same key was imported earlier, the old label value is overwritten. If two DataItems in the same import data operation are identical, their labels are combined, and if a key collision happens in this case, one of the values is picked at random. Two DataItems are considered identical if their content bytes are identical (e.g. image bytes or PDF bytes). These labels are overridden by Annotation labels specified inside the index file referenced by ``import_schema_uri``, e.g. a JSONL file.
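The merge rules above can be illustrated with a small, self-contained sketch. It is not the service's implementation; it only models the described behavior with plain dictionaries. The names `content_key` and `apply_import_labels` are hypothetical, and using a SHA-256 digest to stand in for "identical content bytes" is an assumption made for the illustration. The random choice on key collisions within a single import operation is not modeled.

```python
import hashlib


def content_key(content: bytes) -> str:
    """Identify a DataItem by a digest of its content bytes (assumption for this sketch)."""
    return hashlib.sha256(content).hexdigest()


def apply_import_labels(dataset, content, data_item_labels, index_file_labels=None):
    """Merge labels for one imported item into `dataset` (a dict keyed by content digest).

    - New keys are added to the labels of an already existing identical item.
    - A key that was imported before has its old value overwritten.
    - Labels from the index file, if given, take precedence over data_item_labels.
    """
    labels = dataset.setdefault(content_key(content), {})
    labels.update(data_item_labels)
    if index_file_labels:
        labels.update(index_file_labels)
    return labels


dataset = {}
apply_import_labels(dataset, b"image-bytes", {"source": "batch-1", "split": "train"})
# Importing the same bytes again: "split" is overwritten, "owner" is appended.
merged = apply_import_labels(dataset, b"image-bytes", {"split": "test", "owner": "team-a"})
```

After the second call, the dataset still holds a single entry, because both imports carry identical content bytes.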
import_schema_uri str
Required. Points to a YAML file stored on Google Cloud Storage describing the import format. Validation will be done against the schema. The schema is defined as an OpenAPI 3.0.2 Schema Object.
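As a minimal sketch, the three fields above could be supplied either as a single mapping or as keyword arguments. The import path below is an assumption about how the installed library exposes the class, and the bucket and file names are hypothetical placeholders. The import is guarded so that the snippet still runs where the library is not installed.

```python
# Assumption: the class is importable from this module path in the installed
# google-cloud-aiplatform release. The guard keeps the sketch runnable without it.
try:
    from google.cloud.aiplatform_v1beta1.types import ImportDataConfig
except ImportError:
    ImportDataConfig = None

# Field values as a plain mapping. Bucket and file names are hypothetical.
config_mapping = {
    "gcs_source": {"uris": ["gs://example-bucket/import/items.jsonl"]},
    "data_item_labels": {"source": "batch-1"},
    "import_schema_uri": "gs://example-bucket/schemas/import_format.yaml",
}

config = None
if ImportDataConfig is not None:
    # Pass the values as one mapping ...
    config = ImportDataConfig(mapping=config_mapping)
    # ... or as keyword arguments, one per field.
    same_config = ImportDataConfig(**config_mapping)
    # With ignore_unknown_fields=True, a key that is not a field of the
    # message is skipped instead of raising an error.
    lenient = ImportDataConfig(
        mapping={**config_mapping, "not_a_field": 1},
        ignore_unknown_fields=True,
    )
```

The mapping form is convenient when the configuration is loaded from a file, while keyword arguments read more naturally in hand-written code.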


builtins.object > proto.message.Message > ImportDataConfig



DataItemLabelsEntry(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The abstract base class for a message.

kwargs dict

Keys and values corresponding to the fields of the message.

mapping Union[dict, `.Message`]

A dictionary or message to be used to determine the values for this message.

ignore_unknown_fields Optional[bool]

If True, do not raise errors for unknown fields. Only applied if mapping is a mapping type or there are keyword parameters.