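Tune text embeddings

This page shows you how to tune text embedding models such as textembedding-gecko and textembedding-gecko-multilingual.

Foundation embedding models are pre-trained on a massive text dataset and provide a strong baseline for many tasks. For scenarios that require specialized knowledge or highly tailored performance, model tuning lets you fine-tune the model's representations with your own relevant data. Tuning is supported for stable versions of the textembedding-gecko and textembedding-gecko-multilingual models.

Text embedding models support supervised tuning. Supervised tuning uses labeled examples that demonstrate the type of output you want from the text embedding model at inference time.

To learn more about model tuning, see How model tuning works.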

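Expected quality improvement

Vertex AI uses a parameter-efficient tuning method for customization. In tests on public retrieval benchmark datasets, this method improved quality by up to 41% (12% on average).

Use cases for tuning an embedding model

Tuning a text embedding model lets the model adapt its embeddings to a specific domain or task. This is useful when the pre-trained embedding model doesn't fit your needs. For example, you might fine-tune an embedding model on a dataset of your company's customer support tickets. A chatbot could then better understand the kinds of support issues that your customers commonly face and answer their questions more effectively. Without tuning, the model doesn't know the specifics of your support tickets or the solutions to particular problems with your product.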

Tuning workflow

The workflow for tuning the textembedding-gecko and textembedding-gecko-multilingual models on Vertex AI is as follows:

Prepare your model tuning dataset.
Upload the model tuning dataset to a Cloud Storage bucket.
Configure your project for Vertex AI Pipelines.
Create a model tuning job.
Deploy the tuned model to a Vertex AI endpoint of the same name. Unlike text or Codey model tuning jobs, a text embedding tuning job doesn't deploy the tuned model to a Vertex AI endpoint, so you deploy it yourself.

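Prepare your embeddings dataset

The dataset used to tune an embedding model contains data that corresponds to the task you want the model to perform.

Dataset format for tuning an embedding model

The training dataset consists of the following files, which must be stored in Cloud Storage. The file paths are defined by parameters when you launch the tuning pipeline. There are three types of files: the corpus file, the query file, and the labels. Of the label files, only training labels are required, but you can also provide validation and test labels for more control.

Corpus file: The path is defined by the corpus_path parameter. It's a JSONL file in which each line has the fields _id, title, and text, all with string values. _id and text are required, and title is optional. The following is an example corpus.jsonl file: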

{"_id": "doc1", "title": "Get an introduction to generative AI on Vertex AI", "text": "Vertex AI Studio offers a Google Cloud console tool for rapidly prototyping and testing generative AI models. Learn how you can use Vertex AI Studio to test models using prompt samples, design and save prompts, tune a foundation model, and convert between speech and text."}
{"_id": "doc2", "title": "Use gen AI for summarization, classification, and extraction", "text": "Learn how to create text prompts for handling any number of tasks with Vertex AI's generative AI support. Some of the most common tasks are classification, summarization, and extraction. Vertex AI's PaLM API for text lets you design prompts with flexibility in terms of their structure and format."}
{"_id": "doc3", "title": "Custom ML training overview and documentation", "text": "Get an overview of the custom training workflow in Vertex AI, the benefits of custom training, and the various training options that are available. This page also details every step involved in the ML training workflow from preparing data to predictions."}
{"_id": "doc4", "text": "Text embeddings are useful for clustering, information retrieval, retrieval-augmented generation (RAG), and more."}
{"_id": "doc5", "title": "Text embedding tuning", "text": "Google's text embedding models can be tuned on Vertex AI."}

Query file: The query file contains your example queries. The path is defined by the queries_path parameter. The query file is in JSONL format and has the same fields as the corpus file. The following is an example queries.jsonl file:

{"_id": "query1", "text": "Does Vertex support generative AI?"}
{"_id": "query2", "text": "What can I do with Vertex GenAI offerings?"}
{"_id": "query3", "text": "How do I train my models using Vertex?"}
{"_id": "query4", "text": "What is a text embedding?"}
{"_id": "query5", "text": "Can text embedding models be tuned on Vertex?"}
{"_id": "query6", "text": "embeddings"}
{"_id": "query7", "text": "embeddings for rag"}
{"_id": "query8", "text": "custom model training"}
{"_id": "query9", "text": "Google Cloud PaLM API"}
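
A quick local check of the corpus and query files before uploading them can catch formatting problems early. The following is a minimal sketch, not part of the tuning pipeline: the function name `validate_jsonl_records` is hypothetical, and the duplicate `_id` check and the tolerance for blank lines are assumptions rather than documented rules.

```python
import json


def validate_jsonl_records(lines):
    """Parse JSONL lines and check the _id, text, and optional title fields.

    Returns the parsed records; raises ValueError on the first problem found.
    """
    records = []
    seen_ids = set()
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # Assumption: blank lines are skipped rather than rejected.
        record = json.loads(line)  # Raises a ValueError subclass on invalid JSON.
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        for field in ("_id", "text"):  # Required string fields.
            if not isinstance(record.get(field), str):
                raise ValueError(f"line {line_number}: missing or non-string {field!r}")
        if "title" in record and not isinstance(record["title"], str):
            raise ValueError(f"line {line_number}: 'title' must be a string")
        if record["_id"] in seen_ids:
            # Assumption: IDs should be unique so labels can reference them unambiguously.
            raise ValueError(f"line {line_number}: duplicate _id {record['_id']!r}")
        seen_ids.add(record["_id"])
        records.append(record)
    return records


queries = validate_jsonl_records([
    '{"_id": "query4", "text": "What is a text embedding?"}',
    '{"_id": "query6", "text": "embeddings"}',
])
print(len(queries), "valid records")
```

The same function can be applied to a corpus file, for example by passing an open file object, because iterating over a file yields its lines.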

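Training labels: The path is defined by the train_label_path parameter. train_label_path is the Cloud Storage URI of the training label data and is specified when you create the tuning job. The labels must be a TSV file with a header. The training labels file must cover a subset of the queries and of the corpus. The file must have the columns query-id, corpus-id, and score. query-id is a string that matches the _id key in the query file, and corpus-id is a string that matches the _id in the corpus file. score is a non-negative integer. If a query and document pair isn't related, either leave it out of the training labels file or include it with a score of 0. Any score greater than 0 indicates that the document is related to the query, and larger numbers indicate greater relevance. If the score is omitted, the default value is 1. The following is an example train_labels.tsv file: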
```
query-id	corpus-id	score
query1	doc1	1
query2	doc2	1
query3	doc3	2
query3	doc5	1
query4	doc4	1
query4	doc5	1
query5	doc5	2
query6	doc4	1
query6	doc5	1
query7	doc4	1
query8	doc3	1
query9	doc2	1
```
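Test labels: Optional. The test labels have the same format as the training labels and are specified by the test_label_path parameter. If no test_label_path is provided, the test labels are split automatically from the training labels.
Validation labels: Optional. The validation labels have the same format as the training labels and are specified by the validation_label_path parameter. If no validation_label_path is provided, the validation labels are split automatically from the training labels.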
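
The label rules above can be checked locally before a tuning job is created. The following is a minimal sketch under stated assumptions: `parse_label_tsv` is a hypothetical helper, the columns are assumed to be separated by tab characters, and a missing or empty score is treated as 1, following the default described above.

```python
import csv
import io


def parse_label_tsv(tsv_text, query_ids, corpus_ids):
    """Parse a label TSV and return (query-id, corpus-id, score) tuples.

    query_ids and corpus_ids are sets of the _id values found in the
    query and corpus files.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    if "query-id" not in header or "corpus-id" not in header:
        raise ValueError("header must contain query-id and corpus-id columns")
    labels = []
    for row in reader:
        score_text = (row.get("score") or "").strip()
        score = int(score_text) if score_text else 1  # Omitted score defaults to 1.
        if score < 0:
            raise ValueError(f"negative score for {row['query-id']}/{row['corpus-id']}")
        if row["query-id"] not in query_ids:
            raise ValueError(f"unknown query-id {row['query-id']!r}")
        if row["corpus-id"] not in corpus_ids:
            raise ValueError(f"unknown corpus-id {row['corpus-id']!r}")
        labels.append((row["query-id"], row["corpus-id"], score))
    return labels


sample = "query-id\tcorpus-id\tscore\nquery1\tdoc1\t1\nquery3\tdoc3\t2\n"
print(parse_label_tsv(sample, {"query1", "query3"}, {"doc1", "doc3"}))
```

Pairs with a score of 0 are kept by this sketch, because the format allows unrelated pairs to be listed explicitly.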

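Dataset size requirements

The dataset files that you provide must meet the following constraints:

The number of queries must be between 9 and 10,000.
The number of documents in the corpus must be between 9 and 500,000.
Each dataset label file must include at least 3 query IDs, and across all dataset splits there must be at least 9 query IDs.
The total number of labels must be less than 500,000.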
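
The constraints above translate directly into a pre-flight check. The following is a minimal sketch: the function name and the shape of its inputs are hypothetical, and it reads "at least 9 query IDs across all dataset splits" as 9 distinct query IDs in the union of the label files, which is an interpretation rather than a documented definition.

```python
def check_dataset_sizes(num_queries, num_documents, label_query_ids_by_split):
    """Return a list of human-readable constraint violations (empty if none).

    label_query_ids_by_split maps a split name (for example "train") to the
    query-id value of every label row in that split's file.
    """
    problems = []
    if not 9 <= num_queries <= 10_000:
        problems.append("number of queries must be between 9 and 10,000")
    if not 9 <= num_documents <= 500_000:
        problems.append("number of documents must be between 9 and 500,000")
    all_query_ids = set()
    total_labels = 0
    for split, query_ids in label_query_ids_by_split.items():
        query_ids = list(query_ids)
        total_labels += len(query_ids)  # One label per row.
        distinct = set(query_ids)
        if len(distinct) < 3:
            problems.append(f"{split} labels must include at least 3 query IDs")
        all_query_ids |= distinct
    if len(all_query_ids) < 9:
        problems.append("all splits together must include at least 9 query IDs")
    if total_labels >= 500_000:
        problems.append("total number of labels must be less than 500,000")
    return problems


train_rows = ["query1", "query2", "query3", "query3", "query4", "query4",
              "query5", "query6", "query6", "query7", "query8", "query9"]
for problem in check_dataset_sizes(9, 5, {"train": train_rows}):
    print(problem)
```

Run against the small example files shown earlier (9 queries, 5 documents), this check reports that the corpus is too small, so those files illustrate the format only.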

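Configure your project for Vertex AI Pipelines

Tuning runs within your project by using the Vertex AI Pipelines platform.

Configure permissions

The pipeline runs training code under two service agents. These service agents must be granted specific roles before they can start training with your project and dataset.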

Compute Engine default service account

PROJECT_NUMBER-compute@developer.gserviceaccount.com

This service account requires the following:

Storage Object Viewer access to each dataset file that you created in Cloud Storage.
Storage Object User access to the pipeline's output Cloud Storage directory, PIPELINE_OUTPUT_DIRECTORY.
Vertex AI User access to your project.

Instead of the Compute Engine default service account, you can specify a custom service account. For more information, see Configure a service account with granular permissions.

Vertex AI Tuning Service Agent

service-PROJECT_NUMBER@gcp-sa-aiplatform-ft.iam.gserviceaccount.com

This service account requires the following:

Storage Object Viewer access to each dataset file that you created in Cloud Storage.
Storage Object User access to the pipeline's output Cloud Storage directory, PIPELINE_OUTPUT_DIRECTORY.

For more information about configuring Cloud Storage dataset permissions, see Configure a Cloud Storage bucket for pipeline artifacts.
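
One way to keep the role grants above consistent is to generate the gcloud commands from the project details. The following is a minimal sketch that only builds command strings and doesn't run anything. The function and parameter names are hypothetical, and it grants the Cloud Storage roles at the bucket level, which is a simplification of the per-file and per-directory access described above and may be broader than you want.

```python
def iam_binding_commands(project_id, project_number, dataset_bucket, output_bucket):
    """Build gcloud commands that grant the roles listed in this section."""
    compute_sa = f"{project_number}-compute@developer.gserviceaccount.com"
    tuning_agent = (
        f"service-{project_number}@gcp-sa-aiplatform-ft.iam.gserviceaccount.com"
    )
    commands = []
    for account in (compute_sa, tuning_agent):
        member = f"serviceAccount:{account}"
        # Storage Object Viewer on the bucket that holds the dataset files.
        commands.append(
            f"gcloud storage buckets add-iam-policy-binding gs://{dataset_bucket} "
            f"--member={member} --role=roles/storage.objectViewer"
        )
        # Storage Object User on the bucket that holds the pipeline output.
        commands.append(
            f"gcloud storage buckets add-iam-policy-binding gs://{output_bucket} "
            f"--member={member} --role=roles/storage.objectUser"
        )
    # Vertex AI User on the project, for the Compute Engine default account only.
    commands.append(
        f"gcloud projects add-iam-policy-binding {project_id} "
        f"--member=serviceAccount:{compute_sa} --role=roles/aiplatform.user"
    )
    return commands


for command in iam_binding_commands("my-project", "123456789012", "my-data", "my-output"):
    print(command)
```

Review the printed commands before running them, especially if the dataset and output locations share a bucket with unrelated data.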

Use accelerators

Tuning requires GPU accelerators. You can use any of the following accelerators for the text embedding tuning pipeline:

NVIDIA_L4
NVIDIA_TESLA_A100
NVIDIA_TESLA_T4
NVIDIA_TESLA_V100
NVIDIA_TESLA_P100

To start a tuning job, you need enough Restricted image training GPUs quota (for example, Restricted image training Nvidia V100 GPUs per region) for the accelerator type and region that you selected. To increase your project's quota, see Request a quota increase.

Not all accelerators are available in all regions. For more information, see Using accelerators in Vertex AI.

Create an embedding model tuning job

You can create an embedding model tuning job by using the Google Cloud console, the REST API, or the client libraries.

REST

To create an embedding model tuning job, use the projects.locations.pipelineJobs.create method.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
PIPELINE_OUTPUT_DIRECTORY: The path for the pipeline output artifacts, starting with "gs://".

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/pipelineJobs

Request JSON body:

{
  "displayName": "tune_text_embeddings_model_sample",
  "runtimeConfig": {
    "gcsOutputDirectory": "PIPELINE_OUTPUT_DIRECTORY",
    "parameterValues": {
      "corpus_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
      "queries_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
      "train_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
      "test_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
      "base_model_version_id":"text-embedding-004",
      "task_type": "DEFAULT",
      "batch_size": "128",
      "train_steps": "1000",
      "output_dimensionality": "768",
      "learning_rate_multiplier": "1.0"
    }
  },
  "templateUri": "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.3"
}
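
If you create tuning jobs repeatedly, it can help to generate request.json from a small script instead of editing it by hand. The following is a minimal sketch: `build_tuning_request` is a hypothetical helper, its defaults mirror the values in the request body above (including the string-typed numeric parameters), and it doesn't send anything to the API.

```python
import json

TEMPLATE_URI = (
    "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/"
    "tune-text-embedding-model/v1.1.3"
)


def build_tuning_request(output_directory, corpus_path, queries_path,
                         train_label_path, test_label_path=None):
    """Return a pipelineJobs.create request body as a dictionary."""
    if not output_directory.startswith("gs://"):
        raise ValueError("output_directory must start with gs://")
    parameter_values = {
        "corpus_path": corpus_path,
        "queries_path": queries_path,
        "train_label_path": train_label_path,
        "base_model_version_id": "text-embedding-004",
        "task_type": "DEFAULT",
        "batch_size": "128",
        "train_steps": "1000",
        "output_dimensionality": "768",
        "learning_rate_multiplier": "1.0",
    }
    if test_label_path is not None:
        # Test labels are optional; they are split from the training labels if omitted.
        parameter_values["test_label_path"] = test_label_path
    return {
        "displayName": "tune_text_embeddings_model_sample",
        "runtimeConfig": {
            "gcsOutputDirectory": output_directory,
            "parameterValues": parameter_values,
        },
        "templateUri": TEMPLATE_URI,
    }


body = build_tuning_request(
    "gs://my-bucket/output-dir",  # Placeholder bucket name.
    "gs://my-bucket/corpus.jsonl",
    "gs://my-bucket/queries.jsonl",
    "gs://my-bucket/train.tsv",
)
print(json.dumps(body, indent=2))
```

Writing the printed JSON to a file named request.json gives a body in the same shape as the one shown above.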

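To send your request, use one of the following options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which logs you in to the gcloud CLI automatically. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: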

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/pipelineJobs"

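PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and then run the following command: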

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/pipelineJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/123456789012/locations/us-central1/pipelineJobs/tune-text-embedding-20231003231411",
  "displayName": "tune_text_embeddings_model_sample",
  "createTime": "2023-10-03T23:14:11.705749Z",
  "updateTime": "2023-10-03T23:14:11.705749Z",
  "pipelineSpec": { ... },
  "state": "PIPELINE_STATE_PENDING",
  "labels": {
    "vertex-ai-pipelines-run-billing-id": "1234567890123456789"
  },
  "runtimeConfig": {
    "gcsOutputDirectory": "gs://my-bucket/output-dir",
    "parameterValues": {
      "corpus_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
      "queries_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
      "train_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
      "test_label_path": "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
      "base_model_version_id": "text-embedding-004",
      "task_type": "DEFAULT",
      "batch_size": "128",
      "train_steps": "1000",
      "output_dimensionality": "768",
      "learning_rate_multiplier": "1.0"
    }
  },
  "serviceAccount": "123456789-compute@developer.gserviceaccount.com",
  "templateUri": "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.3"
}

After you launch the pipeline, follow the progress of your tuning job in the Google Cloud console.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

import re

from vertexai.language_models import TextEmbeddingModel


def tune_embedding_model(
    api_endpoint: str,
    base_model_name: str = "text-embedding-004",
    corpus_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
    queries_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
    train_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
    test_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
):  # noqa: ANN201
    """Tune an embedding model using the specified parameters.
    Args:
        api_endpoint (str): The API endpoint for the Vertex AI service.
        base_model_name (str): The name of the base model to use for tuning.
        corpus_path (str): GCS URI of the JSONL file containing the corpus data.
        queries_path (str): GCS URI of the JSONL file containing the queries data.
        train_label_path (str): GCS URI of the TSV file containing the training labels.
        test_label_path (str): GCS URI of the TSV file containing the test labels.
    """
    match = re.search(r"^(\w+-\w+)", api_endpoint)
    location = match.group(1) if match else "us-central1"
    base_model = TextEmbeddingModel.from_pretrained(base_model_name)
    tuning_job = base_model.tune_model(
        task_type="DEFAULT",
        corpus_data=corpus_path,
        queries_data=queries_path,
        training_data=train_label_path,
        test_data=test_label_path,
        batch_size=128,  # The batch size to use for training.
        train_steps=1000,  # The number of training steps.
        tuned_model_location=location,
        output_dimensionality=768,  # The dimensionality of the output embeddings.
        learning_rate_multiplier=1.0,  # The multiplier for the learning rate.
    )
    return tuning_job

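Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.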

import com.google.cloud.aiplatform.v1.CreatePipelineJobRequest;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.PipelineJob;
import com.google.cloud.aiplatform.v1.PipelineJob.RuntimeConfig;
import com.google.cloud.aiplatform.v1.PipelineServiceClient;
import com.google.cloud.aiplatform.v1.PipelineServiceSettings;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmbeddingModelTuningSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running this sample.
    String apiEndpoint = "us-central1-aiplatform.googleapis.com:443";
    String project = "PROJECT";
    String baseModelVersionId = "BASE_MODEL_VERSION_ID";
    String taskType = "DEFAULT";
    String pipelineJobDisplayName = "PIPELINE_JOB_DISPLAY_NAME";
    String outputDir = "OUTPUT_DIR";
    String queriesPath = "QUERIES_PATH";
    String corpusPath = "CORPUS_PATH";
    String trainLabelPath = "TRAIN_LABEL_PATH";
    String testLabelPath = "TEST_LABEL_PATH";
    double learningRateMultiplier = 1.0;
    int outputDimensionality = 768;
    int batchSize = 128;
    int trainSteps = 1000;

    createEmbeddingModelTuningPipelineJob(
        apiEndpoint,
        project,
        baseModelVersionId,
        taskType,
        pipelineJobDisplayName,
        outputDir,
        queriesPath,
        corpusPath,
        trainLabelPath,
        testLabelPath,
        learningRateMultiplier,
        outputDimensionality,
        batchSize,
        trainSteps);
  }

  public static PipelineJob createEmbeddingModelTuningPipelineJob(
      String apiEndpoint,
      String project,
      String baseModelVersionId,
      String taskType,
      String pipelineJobDisplayName,
      String outputDir,
      String queriesPath,
      String corpusPath,
      String trainLabelPath,
      String testLabelPath,
      double learningRateMultiplier,
      int outputDimensionality,
      int batchSize,
      int trainSteps)
      throws IOException {
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(apiEndpoint);
    // find() matches the leading region prefix; matches() would require the whole endpoint to match.
    String location = matcher.find() ? matcher.group("Location") : "us-central1";
    String templateUri =
        "https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.3";
    PipelineServiceSettings settings =
        PipelineServiceSettings.newBuilder().setEndpoint(apiEndpoint).build();
    try (PipelineServiceClient client = PipelineServiceClient.create(settings)) {
      Map<String, Value> parameterValues =
          Map.of(
              "base_model_version_id", valueOf(baseModelVersionId),
              "task_type", valueOf(taskType),
              "queries_path", valueOf(queriesPath),
              "corpus_path", valueOf(corpusPath),
              "train_label_path", valueOf(trainLabelPath),
              "test_label_path", valueOf(testLabelPath),
              "learning_rate_multiplier", valueOf(learningRateMultiplier),
              "output_dimensionality", valueOf(outputDimensionality),
              "batch_size", valueOf(batchSize),
              "train_steps", valueOf(trainSteps));
      PipelineJob pipelineJob =
          PipelineJob.newBuilder()
              .setTemplateUri(templateUri)
              .setDisplayName(pipelineJobDisplayName)
              .setRuntimeConfig(
                  RuntimeConfig.newBuilder()
                      .setGcsOutputDirectory(outputDir)
                      .putAllParameterValues(parameterValues)
                      .build())
              .build();
      CreatePipelineJobRequest request =
          CreatePipelineJobRequest.newBuilder()
              .setParent(LocationName.of(project, location).toString())
              .setPipelineJob(pipelineJob)
              .build();
      return client.createPipelineJob(request);
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }

  private static Value valueOf(double n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

async function main(
  apiEndpoint,
  project,
  outputDir,
  pipelineJobDisplayName = 'embedding-customization-pipeline-sample',
  baseModelVersionId = 'text-embedding-004',
  taskType = 'DEFAULT',
  corpusPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl',
  queriesPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl',
  trainLabelPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv',
  testLabelPath = 'gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv',
  outputDimensionality = 768,
  learningRateMultiplier = 1.0,
  batchSize = 128,
  trainSteps = 1000
) {
  const aiplatform = require('@google-cloud/aiplatform');
  const {PipelineServiceClient} = aiplatform.v1;
  const {helpers} = aiplatform; // helps construct protobuf.Value objects.

  const client = new PipelineServiceClient({apiEndpoint});
  const match = apiEndpoint.match(/(?<L>\w+-\w+)/);
  const location = match ? match.groups.L : 'us-central1';
  const parent = `projects/${project}/locations/${location}`;
  const params = {
    base_model_version_id: baseModelVersionId,
    task_type: taskType,
    queries_path: queriesPath,
    corpus_path: corpusPath,
    train_label_path: trainLabelPath,
    test_label_path: testLabelPath,
    batch_size: batchSize,
    train_steps: trainSteps,
    output_dimensionality: outputDimensionality,
    learning_rate_multiplier: learningRateMultiplier,
  };
  const runtimeConfig = {
    gcsOutputDirectory: outputDir,
    parameterValues: Object.fromEntries(
      Object.entries(params).map(([k, v]) => [k, helpers.toValue(v)])
    ),
  };
  const pipelineJob = {
    templateUri:
      'https://us-kfp.pkg.dev/ml-pipeline/llm-text-embedding/tune-text-embedding-model/v1.1.3',
    displayName: pipelineJobDisplayName,
    runtimeConfig,
  };
  async function createTuneJob() {
    const [response] = await client.createPipelineJob({parent, pipelineJob});
    console.log(`job_name: ${response.name}`);
    console.log(`job_state: ${response.state}`);
  }

  await createTuneJob();
}

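Console

To tune a text embedding model by using the Google Cloud console, launch a customization pipeline with the following steps:

In the [Vertex AI] section of the Google Cloud console, go to the [Vertex AI Pipelines] page.
Click [Create run] to open the [Create pipeline run] pane.
Click [Select from existing pipelines] and enter the following details:
1. Select [ml-pipeline] from the [Select a resource] drop-down.
2. Select [llm-text-embedding] from the [Repository] drop-down.
3. Select [tune-text-embedding-model] from the [Pipeline or component] drop-down.
4. Select the version labeled "v1.1.3" from the [Version] drop-down.
Specify a [Run name] that uniquely identifies the pipeline run.
In the [Region] drop-down list, select the region in which to create the pipeline run. This is the same region in which your tuned model is created.
Click [Continue]. The [Runtime configuration] pane appears.
Under [Cloud Storage location], click [Browse] to select the Cloud Storage bucket for storing the pipeline output artifacts, and then click [Select].
Under [Pipeline parameters], specify the parameters for the tuning pipeline. The three required parameters are corpus_path, queries_path, and train_label_path. For their formats, see Prepare your embeddings dataset. For more information about each parameter, see the REST tab of this section.
Click [Submit] to create your pipeline run.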

Other supported features

Text embedding tuning supports VPC Service Controls. You can configure it to run within a Virtual Private Cloud (VPC) by passing the network parameter when you create the PipelineJob.

To use CMEK (customer-managed encryption keys), pass the key both to the parameterValues.encryption_spec_key_name pipeline parameter and to the encryptionSpec.kmsKeyName parameter when you create the PipelineJob.
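
To make the placement of these fields concrete, the following is a minimal sketch that adds them to a PipelineJob request body held as a Python dictionary. The helper name `with_network_and_cmek` is hypothetical, and the network and key names in the usage lines are placeholders whose exact format you should confirm against the PipelineJob reference.

```python
import copy


def with_network_and_cmek(pipeline_job, network=None, kms_key_name=None):
    """Return a copy of a PipelineJob request body with VPC and CMEK settings."""
    job = copy.deepcopy(pipeline_job)  # Leave the caller's dictionary untouched.
    if network:
        job["network"] = network
    if kms_key_name:
        # The key is passed twice: once on the job, once as a pipeline parameter.
        job["encryptionSpec"] = {"kmsKeyName": kms_key_name}
        runtime_config = job.setdefault("runtimeConfig", {})
        parameter_values = runtime_config.setdefault("parameterValues", {})
        parameter_values["encryption_spec_key_name"] = kms_key_name
    return job


base = {
    "displayName": "tune_text_embeddings_model_sample",
    "runtimeConfig": {"parameterValues": {"task_type": "DEFAULT"}},
}
secured = with_network_and_cmek(
    base,
    network="projects/PROJECT_NUMBER/global/networks/NETWORK_NAME",
    kms_key_name="projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY",
)
print(sorted(secured))
```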

Use your tuned model

View tuned models in Model Registry

When a tuning job completes, the tuned model isn't automatically deployed to an endpoint. It's available as a Model resource in Model Registry. You can use the Google Cloud console to view a list of the models in your current project, including your tuned models.

To view your tuned models in the Google Cloud console, go to the [Vertex AI Model Registry] page.

Deploy your model

After you tune the embedding model, you need to deploy the Model resource. To deploy your tuned embedding model, see Deploy a model to an endpoint.

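Unlike foundation models, tuned text embedding models are managed by you. This includes managing serving resources such as the machine type and accelerators. To prevent out-of-memory errors during prediction, we recommend that you deploy with the NVIDIA_TESLA_A100 GPU type, which can support batch sizes of up to 5 for any input length.

Similar to the textembedding-gecko foundation model, your tuned model supports up to 3,072 tokens and can truncate longer inputs.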
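
Because the recommended deployment supports batch sizes of up to 5, a client that has more inputs than that can split them into several prediction requests. The following is a minimal sketch of that splitting step only. The helper name is hypothetical, and the right batch size for your deployment depends on the machine type and accelerator you chose.

```python
def batch_instances(instances, max_batch_size=5):
    """Split a list of prediction instances into batches of at most max_batch_size."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        instances[start:start + max_batch_size]
        for start in range(0, len(instances), max_batch_size)
    ]


texts = [f"document {n}" for n in range(12)]
instances = [{"content": text} for text in texts]
for batch in batch_instances(instances):
    print(len(batch))  # Each batch would become the "instances" list of one request.
```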

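Get predictions on a deployed model

After you deploy your tuned model, you can use one of the following commands to send requests to the tuned model's endpoint.

Example curl command for tuned `textembedding-gecko@001` models

To get predictions from a tuned version of textembedding-gecko@001, use the following example curl command.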

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
ENDPOINT_URI=https://${LOCATION}-aiplatform.googleapis.com
MODEL_ENDPOINT=TUNED_MODEL_ENDPOINT_ID

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json"  \
    ${ENDPOINT_URI}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/${MODEL_ENDPOINT}:predict \
    -d '{
  "instances": [
    {
      "content": "Dining in New York City"
    },
    {
      "content": "Best resorts on the east coast"
    }
  ]
}'

Example curl command for models other than `textembedding-gecko@001`

Tuned versions of other models (for example, textembedding-gecko@003 and textembedding-gecko-multilingual@001) require two additional inputs: task_type and title. For more information about these parameters, see curl command.

PROJECT_ID=PROJECT_ID
LOCATION=LOCATION
ENDPOINT_URI=https://${LOCATION}-aiplatform.googleapis.com
MODEL_ENDPOINT=TUNED_MODEL_ENDPOINT_ID

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json"  \
    ${ENDPOINT_URI}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/${MODEL_ENDPOINT}:predict \
    -d '{
  "instances": [
    {
      "content": "Dining in New York City",
      "task_type": "DEFAULT",
      "title": ""
    },
    {
      "content": "There are many resorts to choose from on the East coast...",
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "East Coast Resorts"
    }
  ]
}'
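
The request body in the preceding command can also be assembled programmatically. The following is a minimal sketch that builds the same JSON shape, with content, task_type, and title on every instance. The helper name and its argument layout are hypothetical, and it only builds the payload without calling the endpoint.

```python
import json


def build_predict_payload(items):
    """Build a predict request body from (content, task_type, title) tuples."""
    instances = []
    for content, task_type, title in items:
        instances.append({
            "content": content,
            "task_type": task_type,
            "title": title,  # An empty string when there is no title, as in the example above.
        })
    return {"instances": instances}


payload = build_predict_payload([
    ("Dining in New York City", "DEFAULT", ""),
    ("There are many resorts to choose from on the East coast...",
     "RETRIEVAL_DOCUMENT", "East Coast Resorts"),
])
print(json.dumps(payload, indent=2))
```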

Example output

This output applies to both textembedding-gecko and textembedding-gecko-multilingual models, regardless of version.

{
 "predictions": [
   [ ... ],
   [ ... ],
   ...
 ],
 "deployedModelId": "...",
 "model": "projects/.../locations/.../models/...",
 "modelDisplayName": "tuned-text-embedding-model",
 "modelVersionId": "1"
}
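
A common next step with the returned embeddings is to compare them. The following is a minimal sketch that computes cosine similarity between two entries of the predictions list. It assumes that each prediction is a flat list of floats, which the elided output above doesn't confirm, and the toy vectors in the usage lines are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy stand-in for a parsed response; real embeddings have many more dimensions.
response = {"predictions": [[1.0, 0.0, 1.0], [0.5, 0.5, 1.0]]}
first, second = response["predictions"]
print(cosine_similarity(first, second))
```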

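What's next

To get batch predictions for embeddings, see Get batch text embeddings predictions.
To learn more about multimodal embeddings, see Get multimodal embeddings.
For information about text-only use cases (text-based semantic search, clustering, long-form document analysis, and other text retrieval or question-answering use cases), see Get text embeddings.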