Interface TrainCustomModelRequest.GcsTrainingInputOrBuilder (0.37.0)

public static interface TrainCustomModelRequest.GcsTrainingInputOrBuilder extends MessageOrBuilder

Implements

MessageOrBuilder

Methods

getCorpusDataPath()

public abstract String getCorpusDataPath()

The Cloud Storage path to the corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
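The corpus line format described above can be produced with a few lines of standard-library Java. This is a minimal sketch, not part of the client library: the class name CorpusJsonl and its methods are hypothetical, and the hand-written escaping covers only quotes, backslashes, and control characters. A real pipeline would more likely use a JSON library.

```java
// Hypothetical helper (not part of the Discovery Engine client library).
final class CorpusJsonl {
    private CorpusJsonl() {}

    // Escapes a string so it can sit inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \u00XX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds one JSONL line with the _id, title, and text fields.
    static String corpusLine(String id, String title, String text) {
        return "{\"_id\": \"" + escape(id)
            + "\", \"title\": \"" + escape(title)
            + "\", \"text\": \"" + escape(text) + "\"}";
    }
}
```

Each call to corpusLine yields one line of the file. Lines are joined with a single newline, with no enclosing array and no commas between them.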

string corpus_data_path = 1;

Returns
Type: String
Description: The corpusDataPath.

getCorpusDataPathBytes()

public abstract ByteString getCorpusDataPathBytes()

The Cloud Storage path to the corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

string corpus_data_path = 1;

Returns
Type: ByteString
Description: The bytes for corpusDataPath.

getQueryDataPath()

public abstract String getQueryDataPath()

The Cloud Storage path to the query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}
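Because the path fields are plain strings, it can help to sanity-check them before building a request. The sketch below only checks the gs://<bucket>/<object> shape given in the description. The class GcsPathCheck and its pattern are assumptions made for illustration, not validation rules taken from the service.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of the Discovery Engine client library).
final class GcsPathCheck {
    private GcsPathCheck() {}

    // Assumed shape: "gs://", a non-empty bucket name, "/", a non-empty object name.
    private static final Pattern GCS_PATH = Pattern.compile("^gs://[^/]+/.+$");

    // Returns true when the path has the gs://<bucket>/<object> shape.
    static boolean looksLikeGcsPath(String path) {
        return path != null && GCS_PATH.matcher(path).matches();
    }
}
```

The check is purely syntactic. It does not confirm that the bucket or object exists, or that the file contents match the expected format.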

string query_data_path = 2;

Returns
Type: String
Description: The queryDataPath.

getQueryDataPathBytes()

public abstract ByteString getQueryDataPathBytes()

The Cloud Storage path to the query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).

For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}

string query_data_path = 2;

Returns
Type: ByteString
Description: The bytes for queryDataPath.

getTestDataPath()

public abstract String getTestDataPath()

Cloud Storage path to the test data, in the same format as train_data_path. If not provided, a random 80/20 train/test split is performed on the data at train_data_path.
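To make the "random 80/20 train/test split" concrete, here is a small illustration of what such a split means for a list of rows. It is not the service's implementation, which is not documented here. The class TrainTestSplit, the seed parameter, and the shuffle-then-cut approach are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration (not the service's actual split procedure).
final class TrainTestSplit {
    private TrainTestSplit() {}

    // Returns [train, test]: the first 80% of a shuffled copy, then the rest.
    static List<List<String>> split(List<String> rows, long seed) {
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        // Integer arithmetic avoids floating-point rounding at the cut point.
        int cut = shuffled.size() * 4 / 5;
        List<String> train = new ArrayList<>(shuffled.subList(0, cut));
        List<String> test = new ArrayList<>(shuffled.subList(cut, shuffled.size()));
        return List.of(train, test);
    }
}
```

Supplying test_data_path instead gives control over exactly which rows are held out.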

string test_data_path = 4;

Returns
Type: String
Description: The testDataPath.

getTestDataPathBytes()

public abstract ByteString getTestDataPathBytes()

Cloud Storage path to the test data, in the same format as train_data_path. If not provided, a random 80/20 train/test split is performed on the data at train_data_path.

string test_data_path = 4;

Returns
Type: ByteString
Description: The bytes for testDataPath.

getTrainDataPath()

public abstract String getTrainDataPath()

Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have the doc_id, query_id, and score (a number).

For a search-tuning model, the TSV file should have query-id, corpus-id, and score as its header columns. The score should be a number in [0, +inf). The larger the number, the more relevant the pair. Example:

  • query-id\tcorpus-id\tscore
  • query1\tdoc1\t1
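The row format in the example above can be checked locally before uploading the file. The following is a minimal sketch under stated assumptions: the class TrainTsv and its Row type are hypothetical, the header line is expected to be skipped by the caller, and the score check simply mirrors the [0, +inf) range from the description.

```java
// Hypothetical helper (not part of the Discovery Engine client library).
final class TrainTsv {
    private TrainTsv() {}

    // One parsed data row: query-id, corpus-id, score.
    static final class Row {
        final String queryId;
        final String corpusId;
        final double score;

        Row(String queryId, String corpusId, double score) {
            this.queryId = queryId;
            this.corpusId = corpusId;
            this.score = score;
        }
    }

    // Parses one tab-separated data row; the header row is not handled here.
    static Row parseRow(String line) {
        String[] cols = line.split("\t", -1);
        if (cols.length != 3) {
            throw new IllegalArgumentException("expected 3 columns, got " + cols.length);
        }
        // Throws NumberFormatException (an IllegalArgumentException) on bad input.
        double score = Double.parseDouble(cols[2]);
        // Rejects negative, NaN, and infinite scores.
        if (!(score >= 0) || Double.isInfinite(score)) {
            throw new IllegalArgumentException("score must be a number in [0, +inf): " + cols[2]);
        }
        return new Row(cols[0], cols[1], score);
    }
}
```

For instance, TrainTsv.parseRow("query1\tdoc1\t1") yields a row with queryId "query1", corpusId "doc1", and score 1.0, while a negative score raises IllegalArgumentException.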

string train_data_path = 3;

Returns
Type: String
Description: The trainDataPath.

getTrainDataPathBytes()

public abstract ByteString getTrainDataPathBytes()

Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have the doc_id, query_id, and score (a number).

For a search-tuning model, the TSV file should have query-id, corpus-id, and score as its header columns. The score should be a number in [0, +inf). The larger the number, the more relevant the pair. Example:

  • query-id\tcorpus-id\tscore
  • query1\tdoc1\t1

string train_data_path = 3;

Returns
Type: ByteString
Description: The bytes for trainDataPath.