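Tune text embeddings

This page shows you how to tune text embedding models.

Foundation embedding models are pre-trained on a massive dataset of text, which gives them a strong baseline for many tasks. For scenarios that require specialized knowledge or highly tailored performance, model tuning lets you fine-tune the model's representations using your own relevant data. Tuning is supported for the following text embedding models:

Model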
`text-embedding-004`
`text-embedding-005`
`text-multilingual-embedding-002`

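Text embedding models support supervised tuning. Supervised tuning uses labeled examples that demonstrate the type of output you want from your text embedding model during inference.

To learn more about model tuning, see "How model tuning works".

Expected quality improvements

Vertex AI uses a parameter-efficient tuning method for customization. In experiments on public retrieval benchmark datasets, this method shows quality gains of up to 41% (12% on average).

Use case for tuning an embedding model

Tuning a text embedding model lets the model adapt its embeddings to a specific domain or task. This is useful if the pre-trained embedding model doesn't suit your specific needs. For example, you might fine-tune an embedding model on a dataset of your company's customer support tickets. This could help a chatbot understand the kinds of support issues your customers typically have and answer their questions more effectively. Without tuning, the model doesn't know the specifics of your customer support tickets or the solutions to problems specific to your product.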

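Tuning workflow

The model tuning workflow on Vertex AI is as follows:

Prepare your model tuning dataset.
Upload the model tuning dataset to a Cloud Storage bucket.
Configure your project for Vertex AI Pipelines.
Create a model tuning job.
Deploy the tuned model to a Vertex AI endpoint of the same name. Unlike text or Codey model tuning jobs, a text embedding tuning job doesn't deploy the tuned model to a Vertex AI endpoint for you, so you perform this step yourself.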

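Prepare your embeddings dataset

The dataset used to tune an embedding model includes data that aligns with the task that you want the model to perform.

Dataset format for tuning an embedding model

The training dataset consists of the following files, which must be in Cloud Storage. The file paths are defined by parameters when you launch the tuning pipeline. The three types of files are the corpus file, the query file, and the labels. Only training labels are required, but you can also provide validation and test labels for greater control.

Corpus file: The path is defined by the corpus_path parameter. It's a JSONL file in which each line has the fields _id, title, and text, all with string values. _id and text are required, and title is optional. Here's an example corpus.jsonl file: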

{"_id": "doc1", "title": "Get an introduction to generative AI on Vertex AI", "text": "Vertex AI Studio offers a Google Cloud console tool for rapidly prototyping and testing generative AI models. Learn how you can use Vertex AI Studio to test models using prompt samples, design and save prompts, tune a foundation model, and convert between speech and text."}
{"_id": "doc2", "title": "Use gen AI for summarization, classification, and extraction", "text": "Learn how to create text prompts for handling any number of tasks with Vertex AI's generative AI support. Some of the most common tasks are classification, summarization, and extraction. Vertex AI's PaLM API for text lets you design prompts with flexibility in terms of their structure and format."}
{"_id": "doc3", "title": "Custom ML training overview and documentation", "text": "Get an overview of the custom training workflow in Vertex AI, the benefits of custom training, and the various training options that are available. This page also details every step involved in the ML training workflow from preparing data to predictions."}
{"_id": "doc4", "text": "Text embeddings are useful for clustering, information retrieval, retrieval-augmented generation (RAG), and more."}
{"_id": "doc5", "title": "Text embedding tuning", "text": "Google's text embedding models can be tuned on Vertex AI."}

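Query file: The query file contains your example queries. The path is defined by the queries_path parameter. The query file is in JSONL format and has the same fields as the corpus file. Here's an example queries.jsonl file: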

{"_id": "query1", "text": "Does Vertex support generative AI?"}
{"_id": "query2", "text": "What can I do with Vertex GenAI offerings?"}
{"_id": "query3", "text": "How do I train my models using Vertex?"}
{"_id": "query4", "text": "What is a text embedding?"}
{"_id": "query5", "text": "Can text embedding models be tuned on Vertex?"}
{"_id": "query6", "text": "embeddings"}
{"_id": "query7", "text": "embeddings for rag"}
{"_id": "query8", "text": "custom model training"}
{"_id": "query9", "text": "Google Cloud PaLM API"}

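Training labels: The path is defined by the train_label_path parameter. The train_label_path is the Cloud Storage URI of the training label data, and you specify it when you create the tuning job. The labels must be a TSV file with a header row. The training labels file must cover a subset of the queries and the corpus. The file must have the columns query-id, corpus-id, and score. The query-id is a string that matches the _id key in the query file, and the corpus-id is a string that matches the _id in the corpus file. The score is a non-negative integer. If a query and a document aren't related, you can either leave the pair out of the training labels file or include it with a score of zero. Any score greater than zero indicates that the document is related to the query, and larger numbers indicate greater relevance. If the score is omitted, the default value is 1. Here's an example train_labels.tsv file: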
```
query-id  corpus-id  score
query1    doc1       1
query2    doc2       1
query3    doc3       2
query3    doc5       1
query4    doc4       1
query4    doc5       1
query5    doc5       2
query6    doc4       1
query6    doc5       1
query7    doc4       1
query8    doc3       1
query9    doc2       1
```
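Test labels: Optional. The test labels have the same format as the training labels and are specified by the test_label_path parameter. If no test_label_path is provided, the test labels are split automatically from the training labels.
Validation labels: Optional. The validation labels have the same format as the training labels and are specified by the validation_label_path parameter. If no validation_label_path is provided, the validation labels are split automatically from the training labels.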
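
The file formats described above can be checked locally before you upload anything. The following is a minimal sketch, not part of the tuning pipeline: the helper names `parse_jsonl`, `parse_labels`, and `find_unknown_ids` are hypothetical, and it assumes the files are small enough to hold in memory as text.

```python
import csv
import io
import json


def parse_jsonl(text):
    """Parse corpus or query JSONL text into a dict keyed by _id."""
    records = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # _id and text are required strings; title is optional.
        for field in ("_id", "text"):
            if not isinstance(record.get(field), str):
                raise ValueError(f"missing or non-string {field!r}: {line!r}")
        records[record["_id"]] = record
    return records


def parse_labels(text):
    """Parse a TSV label file (with header) into (query-id, corpus-id, score) tuples."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    labels = []
    for row in rows:
        raw_score = (row.get("score") or "").strip()
        score = int(raw_score) if raw_score else 1  # an omitted score defaults to 1
        if score < 0:
            raise ValueError(f"score must be non-negative: {row}")
        labels.append((row["query-id"], row["corpus-id"], score))
    return labels


def find_unknown_ids(labels, queries, corpus):
    """Return label rows that reference an _id missing from the query or corpus file."""
    return [row for row in labels if row[0] not in queries or row[1] not in corpus]


corpus = parse_jsonl(
    '{"_id": "doc4", "text": "Text embeddings are useful for clustering."}\n'
    '{"_id": "doc5", "title": "Text embedding tuning", "text": "Models can be tuned."}'
)
queries = parse_jsonl('{"_id": "query4", "text": "What is a text embedding?"}')
labels = parse_labels(
    "query-id\tcorpus-id\tscore\nquery4\tdoc4\t1\nquery4\tdoc5\nquery9\tdoc5\t2\n"
)
print(find_unknown_ids(labels, queries, corpus))
```

In this toy input, the row for query9 is reported because query9 doesn't appear in the query file; the row with no score is read with the default score of 1.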

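Dataset size requirements

The dataset files that you provide must meet the following constraints:

The number of queries must be between 9 and 10,000.
The number of documents in the corpus must be between 9 and 500,000.
Each dataset label file must include at least 3 query IDs, and across all dataset splits there must be at least 9 query IDs.
The total number of labels must be less than 500,000.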
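
These limits are simple to check before launching a job. The sketch below is illustrative only: `check_dataset_sizes` is a hypothetical helper, and it assumes that "across all dataset splits" means the number of distinct query IDs over every label file.

```python
def check_dataset_sizes(num_queries, num_documents, labels_by_split):
    """Return a list of limit violations (empty if the dataset passes).

    labels_by_split maps a split name ("train", "test", "validation") to a
    list of (query_id, corpus_id, score) tuples.
    """
    problems = []
    if not 9 <= num_queries <= 10_000:
        problems.append(f"query count {num_queries} is outside 9..10,000")
    if not 9 <= num_documents <= 500_000:
        problems.append(f"document count {num_documents} is outside 9..500,000")

    all_query_ids = set()
    total_labels = 0
    for split, labels in labels_by_split.items():
        query_ids = {query_id for query_id, _, _ in labels}
        if len(query_ids) < 3:
            problems.append(f"{split} labels cover {len(query_ids)} query IDs (need at least 3)")
        all_query_ids |= query_ids
        total_labels += len(labels)

    # Assumption: the 9-query minimum counts distinct IDs over all label files.
    if len(all_query_ids) < 9:
        problems.append(f"all splits cover {len(all_query_ids)} query IDs (need at least 9)")
    if total_labels >= 500_000:
        problems.append(f"{total_labels} labels in total (must be fewer than 500,000)")
    return problems


splits = {
    "train": [(f"q{i}", "doc1", 1) for i in range(6)],
    "test": [(f"q{i}", "doc2", 1) for i in range(6, 9)],
}
print(check_dataset_sizes(9, 9, splits))
```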

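Configure your project for Vertex AI Pipelines

Tuning runs within your project on the Vertex AI Pipelines platform.

Configure permissions

The pipeline runs training code under two service agents. These service agents must be granted specific roles before they can begin training with your project and dataset.

Compute Engine default service account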

PROJECT_NUMBER-compute@developer.gserviceaccount.com

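This service account requires:

Storage Object Viewer access to each dataset file that you created in Cloud Storage.
Storage Object User access to the output Cloud Storage directory of your pipeline, PIPELINE_OUTPUT_DIRECTORY.
Vertex AI User access to your project.

Instead of the Compute Engine default service account, you can specify a custom service account. For more information, see "Configure a service account with granular permissions".

Vertex AI Tuning Service Agent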

service-PROJECT_NUMBER@gcp-sa-aiplatform-ft.iam.gserviceaccount.com

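This service account requires:

Storage Object Viewer access to each dataset file that you created in Cloud Storage.
Storage Object User access to the output Cloud Storage directory of your pipeline, PIPELINE_OUTPUT_DIRECTORY.

For more information about configuring Cloud Storage dataset permissions, see "Configure a Cloud Storage bucket for pipeline artifacts".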
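
As a quick reference, the two principals and their roles can be laid out as data. This is a sketch only: `required_roles` is a hypothetical helper, and the role IDs are my mapping of the role names above to IAM identifiers (Storage Object Viewer to `roles/storage.objectViewer`, Storage Object User to `roles/storage.objectUser`, Vertex AI User to `roles/aiplatform.user`), so verify them against the IAM documentation before granting anything.

```python
def required_roles(project_number):
    """Map each service account email to the (role, resource) pairs it needs."""
    compute_default = f"{project_number}-compute@developer.gserviceaccount.com"
    tuning_agent = (
        f"service-{project_number}@gcp-sa-aiplatform-ft.iam.gserviceaccount.com"
    )
    # Both principals need read access to the dataset and write access to the output.
    storage_roles = [
        ("roles/storage.objectViewer", "each dataset file in Cloud Storage"),
        ("roles/storage.objectUser", "PIPELINE_OUTPUT_DIRECTORY"),
    ]
    return {
        compute_default: storage_roles + [("roles/aiplatform.user", "the project")],
        tuning_agent: list(storage_roles),
    }


# "123456789012" is a placeholder project number.
for member, grants in required_roles("123456789012").items():
    for role, resource in grants:
        print(f"{member}: {role} on {resource}")
```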

Use accelerators

Tuning requires GPU accelerators. Any of the following accelerators can be used for the text embedding tuning pipeline:

NVIDIA_L4
NVIDIA_TESLA_A100
NVIDIA_TESLA_T4
NVIDIA_TESLA_V100
NVIDIA_TESLA_P100

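To launch a tuning job, you need sufficient Restricted image training GPUs quota for the accelerator type and region that you select, for example, Restricted image training Nvidia V100 GPUs per region. To increase your project's quota, see "Request additional quota".

Not all accelerators are available in all regions. For more information, see "Using accelerators in Vertex AI".

Create an embedding model tuning job

You can create an embedding model tuning job by using the Google Cloud console, the REST API, or the client libraries.

REST

To create an embedding model tuning job, use the projects.locations.pipelineJobs.create method.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
PIPELINE_OUTPUT_DIRECTORY: Path for the pipeline output artifacts, starting with "gs://".

HTTP method and URL: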

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/pipelineJobs

Request JSON body:

{
  "displayName": "tune_text_embeddings_model_sample",
  "runtimeConfig": {
    "gcsOutputDirectory": "PIPELINE_OUTPUT_DIRECTORY",
    "parameterValues": {
      "corpus_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
      "queries_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
      "train_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
      "test_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
      "base_model_version_id":"text-embedding-004",
      "task_type": "DEFAULT",
      "batch_size": "128",
      "train_steps": "1000",
      "output_dimensionality": "768",
      "learning_rate_multiplier": "1.0"
    }
  },
  "templateUri": "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.3"
}
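
If you prefer to generate the request body rather than edit it by hand, a small script can fill in the placeholders. The following is a sketch under stated assumptions: `build_tuning_request` is a hypothetical helper, the parameter values mirror the sample body above (including numbers passed as strings), and the bucket and file paths in the usage lines are placeholders.

```python
import json

TEMPLATE_URI = (
    "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/"
    "tune-text-embedding-model/v1.1.3"
)


def build_tuning_request(output_dir, corpus_path, queries_path, train_label_path,
                         test_label_path=None, base_model="text-embedding-004"):
    """Build a pipelineJobs.create request body shaped like the sample above."""
    if not output_dir.startswith("gs://"):
        raise ValueError("output_dir must start with gs://")
    params = {
        "corpus_path": corpus_path,
        "queries_path": queries_path,
        "train_label_path": train_label_path,
        "base_model_version_id": base_model,
        "task_type": "DEFAULT",
        "batch_size": "128",
        "train_steps": "1000",
        "output_dimensionality": "768",
        "learning_rate_multiplier": "1.0",
    }
    if test_label_path:  # optional; omitted when no test labels are supplied
        params["test_label_path"] = test_label_path
    return {
        "displayName": "tune_text_embeddings_model_sample",
        "runtimeConfig": {
            "gcsOutputDirectory": output_dir,
            "parameterValues": params,
        },
        "templateUri": TEMPLATE_URI,
    }


body = build_tuning_request(
    "gs://my-bucket/output-dir",
    "gs://my-bucket/data/corpus.jsonl",
    "gs://my-bucket/data/queries.jsonl",
    "gs://my-bucket/data/train.tsv",
)
print(json.dumps(body, indent=2))
```

Saving the printed JSON as request.json gives you a file that the curl and PowerShell commands in this section can send.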

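To send your request, use one of the following options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which logs you in to the gcloud CLI automatically. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: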

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/pipelineJobs"

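PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: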

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/pipelineJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/123456789012/locations/us-central1/pipelineJobs/tune-text-embedding-20231003231411",
  "displayName": "tune_text_embeddings_model_sample",
  "createTime": "2023-10-03T23:14:11.705749Z",
  "updateTime": "2023-10-03T23:14:11.705749Z",
  "pipelineSpec": { ... },
  "state": "PIPELINE_STATE_PENDING",
  "labels": {
    "vertex-ai-pipelines-run-billing-id": "1234567890123456789"
  },
  "runtimeConfig": {
    "gcsOutputDirectory": "gs://my-bucket/output-dir",
    "parameterValues": {
      "corpus_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
      "queries_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
      "train_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
      "test_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
      "base_model_version_id": "text-embedding-004",
      "task_type": "DEFAULT",
      "batch_size": "128",
      "train_steps": "1000",
      "output_dimensionality": "768",
      "learning_rate_multiplier": "1.0"
    }
  },
  "serviceAccount": "123456789-compute@developer.gserviceaccount.com",
  "templateUri": "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.3"
}

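After launching the pipeline, follow the progress of your tuning job in the Google Cloud console.

Python

To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.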

import re

from google.cloud.aiplatform import initializer as aiplatform_init
from vertexai.language_models import TextEmbeddingModel


def tune_embedding_model(
    api_endpoint: str,
    base_model_name: str = "text-embedding-005",
    corpus_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
    queries_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
    train_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
    test_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
):  # noqa: ANN201
    """Tune an embedding model using the specified parameters.
    Args:
        api_endpoint (str): The API endpoint for the Vertex AI service.
        base_model_name (str): The name of the base model to use for tuning.
        corpus_path (str): GCS URI of the JSONL file containing the corpus data.
        queries_path (str): GCS URI of the JSONL file containing the queries data.
        train_label_path (str): GCS URI of the TSV file containing the training labels.
        test_label_path (str): GCS URI of the TSV file containing the test labels.
    """
    match = re.search(r"^(\w+-\w+)", api_endpoint)
    location = match.group(1) if match else "us-central1"
    base_model = TextEmbeddingModel.from_pretrained(base_model_name)
    tuning_job = base_model.tune_model(
        task_type="DEFAULT",
        corpus_data=corpus_path,
        queries_data=queries_path,
        training_data=train_label_path,
        test_data=test_label_path,
        batch_size=128,  # The batch size to use for training.
        train_steps=1000,  # The number of training steps.
        tuned_model_location=location,
        output_dimensionality=768,  # The dimensionality of the output embeddings.
        learning_rate_multiplier=1.0,  # The multiplier for the learning rate.
    )
    return tuning_job

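Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".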

import com.google.cloud.aiplatform.v1.CreatePipelineJobRequest;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.PipelineJob;
import com.google.cloud.aiplatform.v1.PipelineJob.RuntimeConfig;
import com.google.cloud.aiplatform.v1.PipelineServiceClient;
import com.google.cloud.aiplatform.v1.PipelineServiceSettings;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmbeddingModelTuningSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running this sample.
    String apiEndpoint = "us-central1-aiplatform.googleapis.com:443";
    String project = "PROJECT";
    String baseModelVersionId = "BASE_MODEL_VERSION_ID";
    String taskType = "DEFAULT";
    String pipelineJobDisplayName = "PIPELINE_JOB_DISPLAY_NAME";
    String outputDir = "OUTPUT_DIR";
    String queriesPath = "QUERIES_PATH";
    String corpusPath = "CORPUS_PATH";
    String trainLabelPath = "TRAIN_LABEL_PATH";
    String testLabelPath = "TEST_LABEL_PATH";
    double learningRateMultiplier = 1.0;
    int outputDimensionality = 768;
    int batchSize = 128;
    int trainSteps = 1000;

    createEmbeddingModelTuningPipelineJob(
        apiEndpoint,
        project,
        baseModelVersionId,
        taskType,
        pipelineJobDisplayName,
        outputDir,
        queriesPath,
        corpusPath,
        trainLabelPath,
        testLabelPath,
        learningRateMultiplier,
        outputDimensionality,
        batchSize,
        trainSteps);
  }

  public static PipelineJob createEmbeddingModelTuningPipelineJob(
      String apiEndpoint,
      String project,
      String baseModelVersionId,
      String taskType,
      String pipelineJobDisplayName,
      String outputDir,
      String queriesPath,
      String corpusPath,
      String trainLabelPath,
      String testLabelPath,
      double learningRateMultiplier,
      int outputDimensionality,
      int batchSize,
      int trainSteps)
      throws IOException {
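    // Use find() to match only the region prefix; matches() would require the
    // whole endpoint string to match and would always fall back to us-central1.
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(apiEndpoint);
    String location = matcher.find() ? matcher.group("Location") : "us-central1";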
    String templateUri =
        "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.4";
    PipelineServiceSettings settings =
        PipelineServiceSettings.newBuilder().setEndpoint(apiEndpoint).build();
    try (PipelineServiceClient client = PipelineServiceClient.create(settings)) {
      Map<String, Value> parameterValues =
          Map.of(
              "base_model_version_id", valueOf(baseModelVersionId),
              "task_type", valueOf(taskType),
              "queries_path", valueOf(queriesPath),
              "corpus_path", valueOf(corpusPath),
              "train_label_path", valueOf(trainLabelPath),
              "test_label_path", valueOf(testLabelPath),
              "learning_rate_multiplier", valueOf(learningRateMultiplier),
              "output_dimensionality", valueOf(outputDimensionality),
              "batch_size", valueOf(batchSize),
              "train_steps", valueOf(trainSteps));
      PipelineJob pipelineJob =
          PipelineJob.newBuilder()
              .setTemplateUri(templateUri)
              .setDisplayName(pipelineJobDisplayName)
              .setRuntimeConfig(
                  RuntimeConfig.newBuilder()
                      .setGcsOutputDirectory(outputDir)
                      .putAllParameterValues(parameterValues)
                      .build())
              .build();
      CreatePipelineJobRequest request =
          CreatePipelineJobRequest.newBuilder()
              .setParent(LocationName.of(project, location).toString())
              .setPipelineJob(pipelineJob)
              .build();
      return client.createPipelineJob(request);
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }

  private static Value valueOf(double n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}

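Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".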

async function main(
  apiEndpoint,
  project,
  outputDir,
  pipelineJobDisplayName = 'embedding-customization-pipeline-sample',
  baseModelVersionId = 'text-embedding-005',
  taskType = 'DEFAULT',
  corpusPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl',
  queriesPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl',
  trainLabelPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv',
  testLabelPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv',
  outputDimensionality = 768,
  learningRateMultiplier = 1.0,
  batchSize = 128,
  trainSteps = 1000
) {
  const aiplatform = require('@google-cloud/aiplatform');
  const {PipelineServiceClient} = aiplatform.v1;
  const {helpers} = aiplatform; // helps construct protobuf.Value objects.

  const client = new PipelineServiceClient({apiEndpoint});
  const match = apiEndpoint.match(/(?<L>\w+-\w+)/);
  const location = match ? match.groups.L : 'us-central1';
  const parent = `projects/${project}/locations/${location}`;
  const params = {
    base_model_version_id: baseModelVersionId,
    task_type: taskType,
    queries_path: queriesPath,
    corpus_path: corpusPath,
    train_label_path: trainLabelPath,
    test_label_path: testLabelPath,
    batch_size: batchSize,
    train_steps: trainSteps,
    output_dimensionality: outputDimensionality,
    learning_rate_multiplier: learningRateMultiplier,
  };
  const runtimeConfig = {
    gcsOutputDirectory: outputDir,
    parameterValues: Object.fromEntries(
      Object.entries(params).map(([k, v]) => [k, helpers.toValue(v)])
    ),
  };
  const pipelineJob = {
    templateUri:
      'https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.4',
    displayName: pipelineJobDisplayName,
    runtimeConfig,
  };
  async function createTuneJob() {
    const [response] = await client.createPipelineJob({parent, pipelineJob});
    console.log(`job_name: ${response.name}`);
    console.log(`job_state: ${response.state}`);
  }

  await createTuneJob();
}

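Console

To tune a text embedding model by using the Google Cloud console, launch a customization pipeline with the following steps:

In the Vertex AI section of the Google Cloud console, go to the Vertex AI Pipelines page.
Click Create run to open the Create pipeline run pane.
Click Select from existing pipelines and enter the following details:
1. Select "ml-pipeline" from the Select a resource drop-down.
2. Select "llm-text-embedding" from the Repository drop-down.
3. Select "tune-text-embedding-model" from the Pipeline or component drop-down.
4. Select the version labeled "v1.1.3" from the Version drop-down.
Specify a Run name to uniquely identify the pipeline run.
In the Region drop-down list, select the region in which to create the pipeline run. Your tuned model is created in the same region.
Click Continue. The Runtime configuration pane appears.
Under Cloud storage location, click Browse to select the Cloud Storage bucket for storing the pipeline output artifacts, and then click Select.
Under Pipeline parameters, specify the parameters for the tuning pipeline. The three required parameters are corpus_path, queries_path, and train_label_path, with the formats described in "Prepare your embeddings dataset". For more detail about each parameter, see the REST tab of this section.
Click Submit to create your pipeline run.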

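Other supported features

Text embedding tuning supports VPC Service Controls. You can configure it to run within a Virtual Private Cloud (VPC) by passing the network parameter when you create the PipelineJob.

To use customer-managed encryption keys (CMEK), pass the key to both the parameterValues.encryption_spec_key_name pipeline parameter and the encryptionSpec.kmsKeyName parameter when you create the PipelineJob.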
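
The sketch below shows where those fields would sit in a PipelineJob request body. It is an illustration, not an official sample: `add_network_and_cmek` is a hypothetical helper, and the network and key resource names in the usage lines are placeholders that you would replace with your own.

```python
import copy


def add_network_and_cmek(pipeline_job, network=None, kms_key_name=None):
    """Return a copy of a PipelineJob body with VPC and CMEK settings applied."""
    job = copy.deepcopy(pipeline_job)  # leave the caller's dict untouched
    if network:
        job["network"] = network
    if kms_key_name:
        # The key is passed twice: on the job itself and as a pipeline parameter.
        job["encryptionSpec"] = {"kmsKeyName": kms_key_name}
        params = job.setdefault("runtimeConfig", {}).setdefault("parameterValues", {})
        params["encryption_spec_key_name"] = kms_key_name
    return job


base_job = {
    "displayName": "tune_text_embeddings_model_sample",
    "runtimeConfig": {"parameterValues": {"task_type": "DEFAULT"}},
}
secured_job = add_network_and_cmek(
    base_job,
    network="projects/123456789012/global/networks/my-vpc",  # placeholder
    kms_key_name=(
        "projects/my-project/locations/us-central1/"
        "keyRings/my-ring/cryptoKeys/my-key"  # placeholder
    ),
)
print(sorted(secured_job))
```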

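Use your tuned model

View tuned models in Model Registry

When your tuning job completes, the tuned model isn't automatically deployed to an endpoint. Instead, it's stored as a Model resource in Model Registry. You can use the Google Cloud console to view the list of models in your current project, including your tuned models.

To view your tuned models in the Google Cloud console, go to the Vertex AI Model Registry page.

Deploy your model

After you tune the embedding model, you need to deploy the Model resource. To deploy your tuned embedding model, see "Deploy a model to an endpoint".

Unlike foundation models, tuned text embedding models are managed by you. This includes managing serving resources, such as the machine type and accelerators. To prevent out-of-memory errors during prediction, we recommend deploying with the NVIDIA_TESLA_A100 GPU type, which can support batch sizes of up to 5 for any input length.

Tuned models support up to 3072 tokens and can truncate longer inputs.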
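
Given the recommended batch size of up to 5, a client can split a larger set of texts into several requests. This is a minimal sketch: `batch_requests` is a hypothetical helper, it uses the `content` field shown in the curl examples in the next section, and it doesn't count tokens, so inputs longer than the limit are left for the model to truncate.

```python
def batch_requests(texts, max_batch_size=5):
    """Yield predict request bodies with at most max_batch_size instances each."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(texts), max_batch_size):
        chunk = texts[start:start + max_batch_size]
        yield {"instances": [{"content": text} for text in chunk]}


texts = [f"document {i}" for i in range(12)]
bodies = list(batch_requests(texts))
print([len(body["instances"]) for body in bodies])
```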

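Get predictions on a deployed model

After you deploy your tuned model, you can use one of the following commands to send requests to the tuned model endpoint.

Example curl command for tuned `textembedding-gecko@001` models

To get predictions from a tuned version of textembedding-gecko@001, use the following example curl command.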

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
ENDPOINT_URI=https://${LOCATION}-aiplatform.googleapis.com
MODEL_ENDPOINT=TUNED_MODEL_ENDPOINT_ID

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json"  \
    ${ENDPOINT_URI}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/${MODEL_ENDPOINT}:predict \
    -d '{
  "instances": [
    {
      "content": "Dining in New York City"
    },
    {
      "content": "Best resorts on the east coast"
    }
  ]
}'

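Example curl command for models other than `textembedding-gecko@001`

Tuned versions of other models (for example, textembedding-gecko@003 and textembedding-gecko-multilingual@001) require two additional inputs: task_type and title. For more information about these parameters, see "curl command".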

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
ENDPOINT_URI=https://${LOCATION}-aiplatform.googleapis.com
MODEL_ENDPOINT=TUNED_MODEL_ENDPOINT_ID

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json"  \
    ${ENDPOINT_URI}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/${MODEL_ENDPOINT}:predict \
    -d '{
  "instances": [
    {
      "content": "Dining in New York City",
      "task_type": "DEFAULT",
      "title": ""
    },
    {
      "content": "There are many resorts to choose from on the East coast...",
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "East Coast Resorts"
    }
  ]
}'

Example output

When you send a prediction request to a deployed tuned model, the output format differs from the output of a request to the Text Embedding API.

{
 "predictions": [
   [ ... ],
   [ ... ],
   ...
 ],
 "deployedModelId": "...",
 "model": "projects/.../locations/.../models/...",
 "modelDisplayName": "tuned-text-embedding-model",
 "modelVersionId": "1"
}
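
To illustrate how such a response might be consumed, the sketch below compares the first two embeddings with cosine similarity. It assumes, based on the sample above, that each entry in `predictions` is a flat list of numbers; the two-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` is a helper written for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cannot compare a zero vector")
    return dot / norm


# Stand-in response with toy 2-dimensional vectors instead of real embeddings.
response = {
    "predictions": [[1.0, 0.0], [0.6, 0.8]],
    "modelDisplayName": "tuned-text-embedding-model",
    "modelVersionId": "1",
}
first, second = response["predictions"][:2]
similarity = cosine_similarity(first, second)
print(f"similarity: {similarity:.3f}")
```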

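What's next

To get batch predictions for embeddings, see "Get batch text embeddings predictions".
To learn more about multimodal embeddings, see "Get multimodal embeddings".
For information about text-only use cases (text-based semantic search, clustering, long-form document analysis, and other text retrieval or question-answering use cases), see "Get text embeddings".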