Interface TrainCustomModelRequest.GcsTrainingInputOrBuilder (0.32.0)

public static interface TrainCustomModelRequest.GcsTrainingInputOrBuilder extends MessageOrBuilder

Implements

MessageOrBuilder

Methods

getCorpusDataPath()

public abstract String getCorpusDataPath()

The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

  • For a search-tuning model, each line should have the _id, title, and text. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

string corpus_data_path = 1;

Returns
Type        Description
String      The corpusDataPath.
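
The corpus line format described above can be produced with a few lines of plain Java. The following is a minimal sketch, not part of this library: the class CorpusJsonl and its methods are hypothetical helpers, and the escaping covers only the characters JSON requires inside a string literal.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of the client library) that renders
// corpus documents as newline-delimited JSON with _id, title, and text.
final class CorpusJsonl {

    // Escape the characters that JSON does not allow unescaped in a string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Render one corpus document as a single JSON line.
    static String toLine(String id, String title, String text) {
        return "{\"_id\": \"" + escape(id)
                + "\", \"title\": \"" + escape(title)
                + "\", \"text\": \"" + escape(text) + "\"}";
    }

    // Join already-rendered lines into the content of a JSONL file.
    static String toFile(List<String> lines) {
        return lines.stream().collect(Collectors.joining("\n"));
    }
}
```

Calling CorpusJsonl.toLine("doc1", "relevant doc", "relevant text") yields a line with all three keys quoted. In production code a JSON library would usually be a safer choice than hand-written escaping.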

getCorpusDataPathBytes()

public abstract ByteString getCorpusDataPathBytes()

The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

  • For a search-tuning model, each line should have the _id, title, and text. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

string corpus_data_path = 1;

Returns
Type        Description
ByteString  The bytes for corpusDataPath.

getQueryDataPath()

public abstract String getQueryDataPath()

The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

  • For a search-tuning model, each line should have the _id and text. Example: {"_id": "query1", "text": "example query"}

string query_data_path = 2;

Returns
Type        Description
String      The queryDataPath.
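
The gs://<bucket_to_data>/<jsonl_file_name> path format described above can be sanity-checked locally before a request is built. The following is a minimal sketch under stated assumptions: the class GcsPaths and its methods are hypothetical helpers, not part of this library, and the check is purely structural (a gs:// prefix, a non-empty bucket, a non-empty object name). The service may apply additional rules of its own.

```java
// Hypothetical helper (not part of the client library) for checking
// the shape of a gs://<bucket>/<object> path.
final class GcsPaths {
    private static final String PREFIX = "gs://";

    // True when the path has the gs:// prefix, a bucket, and an object name.
    static boolean isValid(String path) {
        if (path == null || !path.startsWith(PREFIX)) {
            return false;
        }
        String rest = path.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        // The first slash must come after a bucket name and before an object name.
        return slash > 0 && slash < rest.length() - 1;
    }

    // Returns {bucket, object}; throws if the path is malformed.
    static String[] split(String path) {
        if (!isValid(path)) {
            throw new IllegalArgumentException("Not a gs://<bucket>/<object> path: " + path);
        }
        String rest = path.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        return new String[] {rest.substring(0, slash), rest.substring(slash + 1)};
    }
}
```

For example, GcsPaths.split("gs://my-bucket/data/queries.jsonl") separates the illustrative bucket name my-bucket from the object name data/queries.jsonl.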

getQueryDataPathBytes()

public abstract ByteString getQueryDataPathBytes()

The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

  • For a search-tuning model, each line should have the _id and text. Example: {"_id": "query1", "text": "example query"}

string query_data_path = 2;

Returns
Type        Description
ByteString  The bytes for queryDataPath.

getTestDataPath()

public abstract String getTestDataPath()

Cloud Storage path to the test data, in the same format as train_data_path. If not provided, a random 80/20 train/test split is performed on the data at train_data_path.

string test_data_path = 4;

Returns
Type        Description
String      The testDataPath.
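
To make the 80/20 fallback concrete, the sketch below shuffles the data rows and cuts them at the 80% mark. It only illustrates what a random 80/20 split means; how the service actually performs its split is not documented here. The class TrainTestSplit, its method, and the seed parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of a random 80/20 train/test split.
final class TrainTestSplit {

    // Returns a two-element list: index 0 is the train part, index 1 the test part.
    // The rows are expected to be data lines only, without the TSV header.
    static List<List<String>> split(List<String> rows, long seed) {
        List<String> shuffled = new ArrayList<>(rows);
        // A fixed seed makes the shuffle reproducible for a given input.
        Collections.shuffle(shuffled, new Random(seed));

        // Integer arithmetic avoids floating-point rounding at the cut point.
        int cut = shuffled.size() * 4 / 5;

        List<List<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }
}
```

With 10 input rows this yields 8 training rows and 2 test rows. Supplying an explicit test_data_path avoids relying on a random split.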

getTestDataPathBytes()

public abstract ByteString getTestDataPathBytes()

Cloud Storage path to the test data, in the same format as train_data_path. If not provided, a random 80/20 train/test split is performed on the data at train_data_path.

string test_data_path = 4;

Returns
Type        Description
ByteString  The bytes for testDataPath.

getTrainDataPath()

public abstract String getTrainDataPath()

Cloud Storage path to the training data, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should contain a query ID, a document ID, and a numeric score.

  • For a search-tuning model, the TSV file header should be query-id, corpus-id, score (tab-separated). The score should be a number in [0, +inf). The larger the number, the more relevant the pair. Example (header line, then one data line): query-id\tcorpus-id\tscore\nquery1\tdoc1\t1

string train_data_path = 3;

Returns
Type        Description
String      The trainDataPath.
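
The TSV layout described above (a header row, three tab-separated columns, a non-negative finite score) can be checked locally before the file is uploaded. The following is a minimal sketch, not the service's own validation: the class TrainTsv and its validate method are hypothetical, and requiring exactly three columns is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical local check for the documented training TSV layout.
final class TrainTsv {
    static final String HEADER = "query-id\tcorpus-id\tscore";

    // Returns a list of problems; an empty list means every check passed.
    static List<String> validate(String content) {
        List<String> problems = new ArrayList<>();
        String[] lines = content.split("\\R"); // split on any line terminator

        if (lines.length == 0 || !lines[0].equals(HEADER)) {
            problems.add("line 1: expected header query-id<TAB>corpus-id<TAB>score");
        }
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue; // tolerate blank lines
            }
            String[] cols = lines[i].split("\t", -1);
            if (cols.length != 3) {
                problems.add("line " + (i + 1) + ": expected 3 columns, found " + cols.length);
                continue;
            }
            try {
                double score = Double.parseDouble(cols[2]);
                // The documented range is [0, +inf): reject negatives, NaN, and infinity.
                if (!(score >= 0) || Double.isInfinite(score)) {
                    problems.add("line " + (i + 1) + ": score out of range: " + cols[2]);
                }
            } catch (NumberFormatException e) {
                problems.add("line " + (i + 1) + ": score is not a number: " + cols[2]);
            }
        }
        return problems;
    }
}
```

Passing the documented example (the header followed by query1, doc1, 1) returns an empty list, while a negative score or a missing header each produce one reported problem.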

getTrainDataPathBytes()

public abstract ByteString getTrainDataPathBytes()

Cloud Storage path to the training data, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should contain a query ID, a document ID, and a numeric score.

  • For a search-tuning model, the TSV file header should be query-id, corpus-id, score (tab-separated). The score should be a number in [0, +inf). The larger the number, the more relevant the pair. Example (header line, then one data line): query-id\tcorpus-id\tscore\nquery1\tdoc1\t1

string train_data_path = 3;

Returns
Type        Description
ByteString  The bytes for trainDataPath.