Discovery Engine V1BETA API - Class Google::Cloud::DiscoveryEngine::V1beta::TrainCustomModelRequest::GcsTrainingInput (v0.14.1)

Reference documentation and code samples for the Discovery Engine V1BETA API class Google::Cloud::DiscoveryEngine::V1beta::TrainCustomModelRequest::GcsTrainingInput.

Cloud Storage training data input.

Inherits

Object

Extended By

Google::Protobuf::MessageExts::ClassMethods

Includes

Google::Protobuf::MessageExts

Methods

#corpus_data_path

def corpus_data_path() -> ::String

Returns

(::String) — The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
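The corpus line format described above can be produced with Ruby's standard json library. This is a minimal sketch, not part of the client library: the helper name corpus_jsonl and the constant REQUIRED_CORPUS_KEYS are hypothetical, and uploading the resulting file to Cloud Storage is left out.

```ruby
require "json"

# Hypothetical constant: the three keys the reference text names for corpus rows.
REQUIRED_CORPUS_KEYS = %w[_id title text].freeze

# Hypothetical helper: serialize documents as newline-delimited JSON,
# one object per line, rejecting documents that lack a required key.
def corpus_jsonl(docs)
  lines = docs.map do |doc|
    row = doc.transform_keys(&:to_s)
    missing = REQUIRED_CORPUS_KEYS - row.keys
    raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

    # Keep only the documented keys, in a stable order.
    JSON.generate(row.slice(*REQUIRED_CORPUS_KEYS))
  end
  lines.join("\n") + "\n"
end

puts corpus_jsonl([{ _id: "doc1", title: "relevant doc", text: "relevant text" }])
```

The generated text can be written to a local file and then copied to the bucket referenced by corpus_data_path.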

#corpus_data_path=

def corpus_data_path=(value) -> ::String

Parameter

value (::String) — The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

Returns

(::String) — The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

#query_data_path

def query_data_path() -> ::String

Returns

(::String) — The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}
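Before uploading a query file, it can help to check each line against the format described above. The following is a sketch using only the standard json library. The helper name invalid_query_lines is hypothetical, and the check that both values are strings is an assumption drawn from the example row rather than a documented rule.

```ruby
require "json"

# Hypothetical validator: returns the 1-based numbers of lines that are not
# JSON objects with string "_id" and "text" values. Blank lines are ignored.
def invalid_query_lines(jsonl)
  jsonl.each_line.with_index(1).filter_map do |line, number|
    next if line.strip.empty?

    row = begin
      JSON.parse(line)
    rescue JSON::ParserError
      nil # unparseable lines are reported as invalid
    end
    valid = row.is_a?(Hash) && row["_id"].is_a?(String) && row["text"].is_a?(String)
    number unless valid
  end
end

sample = %({"_id": "query1", "text": "example query"}\n{"_id": "query2"}\n)
p invalid_query_lines(sample)
```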

#query_data_path=

def query_data_path=(value) -> ::String

Parameter

value (::String) — The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}

Returns

(::String) — The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}

#test_data_path

def test_data_path() -> ::String

Returns

(::String) — Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
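Because test_data_path is optional, caller code may want to omit it rather than send an empty value. The sketch below collects the four path fields into a plain Hash and drops the test path when none is given. The helper name gcs_training_input_fields, the GCS_URI pattern, and the bucket and file names are all hypothetical, and the pattern is only a loose shape check, not the service's own validation.

```ruby
# Hypothetical pattern: "gs://", a bucket name, a slash, and an object name.
GCS_URI = %r{\Ags://[^/]+/.+\z}

# Hypothetical helper: build the field values for a GcsTrainingInput.
# test_data_path is left out when nil, so the random 80/20 split
# described above would apply.
def gcs_training_input_fields(corpus:, query:, train:, test: nil)
  fields = {
    corpus_data_path: corpus,
    query_data_path: query,
    train_data_path: train,
    test_data_path: test
  }.compact

  fields.each do |name, uri|
    raise ArgumentError, "#{name} is not a gs:// object path: #{uri}" unless uri.to_s.match?(GCS_URI)
  end
  fields
end

p gcs_training_input_fields(
  corpus: "gs://example-bucket/corpus.jsonl",
  query: "gs://example-bucket/queries.jsonl",
  train: "gs://example-bucket/train.tsv"
)
```

The keys match the attribute names on this class, so the resulting Hash is intended to be usable as the initial field values when constructing a GcsTrainingInput. That step needs the client library gem and is not shown here.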

#test_data_path=

def test_data_path=(value) -> ::String

Parameter

value (::String) — Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.

Returns

(::String) — Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.

#train_data_path

def train_data_path() -> ::String

Returns

(::String) — Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have a query ID, a document ID, and a numeric score.

For a search-tuning model, the file should have query-id, corpus-id, and score as its TSV header columns. The score should be a number in [0, +inf). A larger score means a more relevant pair. Example:
- query-id\tcorpus-id\tscore
- query1\tdoc1\t1
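The TSV layout in the example above can be generated with plain string joins. This is a sketch under stated assumptions: the names TRAIN_HEADER and train_tsv are hypothetical, and the score check simply mirrors the stated range by accepting finite, non-negative integers and floats.

```ruby
# Header columns taken from the example above.
TRAIN_HEADER = %w[query-id corpus-id score].freeze

# Hypothetical helper: render [query_id, corpus_id, score] triples as
# tab-separated lines under the documented header.
def train_tsv(pairs)
  rows = pairs.map do |query_id, corpus_id, score|
    numeric = score.is_a?(Integer) || score.is_a?(Float)
    unless numeric && score.finite? && score >= 0
      raise ArgumentError, "score must be a finite number >= 0, got #{score.inspect}"
    end

    [query_id, corpus_id, score].join("\t")
  end
  ([TRAIN_HEADER.join("\t")] + rows).join("\n") + "\n"
end

print train_tsv([["query1", "doc1", 1]])
```

The IDs in each row are expected to match _id values in the query and corpus files, which this sketch does not verify.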

#train_data_path=

def train_data_path=(value) -> ::String

Parameter

value (::String) — Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have a query ID, a document ID, and a numeric score.

For a search-tuning model, the file should have query-id, corpus-id, and score as its TSV header columns. The score should be a number in [0, +inf). A larger score means a more relevant pair. Example:
- query-id\tcorpus-id\tscore
- query1\tdoc1\t1

Returns

(::String) — Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have a query ID, a document ID, and a numeric score.

For a search-tuning model, the file should have query-id, corpus-id, and score as its TSV header columns. The score should be a number in [0, +inf). A larger score means a more relevant pair. Example:
- query-id\tcorpus-id\tscore
- query1\tdoc1\t1