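Starting April 29, 2025, the Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have not previously used them, including new projects. For details, see Model versions and lifecycle.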

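Text embeddings API

This guide shows how to use the Text embeddings API to convert text into numerical vectors. It covers the following topics:

Syntax: Call the API by using cURL or the Python SDK.
Request and response: Learn about the request and response parameters for text embedding models.
Examples: View code samples that show how to embed a text string.
What's next: See related documentation.

The Text embeddings API converts text into numerical vectors called embeddings. These vector representations capture the semantic meaning and context of the text.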
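
Because embeddings are plain numeric vectors, closeness in meaning is usually measured with a vector similarity such as cosine similarity. The following is a minimal sketch only: the three-dimensional vectors are made up for illustration and are not real model output, and cosine_similarity is a hypothetical helper rather than part of any SDK.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only; they are not real model output.
banana_bread = [0.9, 0.1, 0.0]
banana_muffins = [0.8, 0.2, 0.1]
tax_forms = [0.0, 0.1, 0.9]

related = cosine_similarity(banana_bread, banana_muffins)
unrelated = cosine_similarity(banana_bread, tax_forms)
print(related > unrelated)  # prints True
```

With real embeddings the same comparison applies; only the vector length changes.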

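Supported models:

You can get text embeddings by using the following models:

Model name	Description	Output dimensions	Maximum sequence length	Supported text languages
`gemini-embedding-001`	Performs strongly on English, multilingual, and code tasks. It unifies the earlier specialized models, such as `text-embedding-005` and `text-multilingual-embedding-002`, and achieves better performance in their respective domains. For details, see our technical report.	Up to 3072	2048 tokens	Supported text languages
`text-embedding-005`	Specialized in English and code tasks.	Up to 768	2048 tokens	English
`text-multilingual-embedding-002`	Specialized in multilingual tasks.	Up to 768	2048 tokens	Supported text languages

For the best embedding quality, use gemini-embedding-001, our large model designed to provide the highest performance. Note that gemini-embedding-001 supports only one instance per request.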
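
One practical consequence of the single-instance limit is that client code has to split its inputs into per-request batches. The helper below is a hypothetical sketch: chunk_texts, MAX_INSTANCES, and DEFAULT_MAX_INSTANCES are illustrative names, not part of any SDK. The limit of 1 for gemini-embedding-001 comes from the note above; the fallback of 5 is an assumption based on the five-texts-per-request limit quoted in the REST example on this page.

```python
from collections.abc import Iterator

# Assumed per-request instance limits; verify against the current limits docs.
MAX_INSTANCES = {"gemini-embedding-001": 1}
DEFAULT_MAX_INSTANCES = 5


def chunk_texts(texts: list[str], model_id: str) -> Iterator[list[str]]:
    """Yield successive batches of texts sized for one predict request."""
    size = MAX_INSTANCES.get(model_id, DEFAULT_MAX_INSTANCES)
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


texts = ["banana bread?", "banana muffins?", "banana pancakes?"]
print(list(chunk_texts(texts, "gemini-embedding-001")))  # three batches of one
print(list(chunk_texts(texts, "text-embedding-005")))    # one batch of three
```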

Syntax

curl

# Replace the placeholder values before running the command.
PROJECT_ID="PROJECT_ID"
REGION="us-central1"
MODEL_ID="MODEL_ID"

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:predict -d \
  '{
    "instances": [
      ...
    ],
    "parameters": {
      ...
    }
  }'

Python

# Replace the placeholder values before running the sample.
PROJECT_ID = "PROJECT_ID"
REGION = "us-central1"
MODEL_ID = "MODEL_ID"

import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project=PROJECT_ID, location=REGION)

model = TextEmbeddingModel.from_pretrained(MODEL_ID)
embeddings = model.get_embeddings(...)

Request and response

Request body

{
  "instances": [
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "document title",
      "content": "I would like embeddings for this text!"
    }
  ]
}

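Request parameters

instances: Required. A list of objects that contain the text to embed. The following fields are supported:
- content (string): The text to generate embeddings for.
- task_type (string): Optional. Specifies the intended downstream application, which helps the model produce better-quality embeddings. If not specified, the default is RETRIEVAL_QUERY. For more information about task types, see Choose an embeddings task type.
- title (string): Optional. A title for the text content. This field applies only when task_type is RETRIEVAL_DOCUMENT.
parameters: Optional. An object that contains the following fields:
- autoTruncate (bool): If true, input text that exceeds the model's maximum length is truncated. If false, input that is too long causes an error. The default is true.
- outputDimensionality (int): The desired embedding size. If set, output embeddings are truncated to this dimension.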
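
The fields above can be assembled into a request body programmatically. The following is a minimal sketch, not an SDK function: build_embedding_request is a hypothetical helper, and its choice to reject a title for any task type other than RETRIEVAL_DOCUMENT is a design decision of this sketch (the description above only says the field does not apply in that case).

```python
import json
from typing import Optional


def build_embedding_request(
    content: str,
    task_type: Optional[str] = None,
    title: Optional[str] = None,
    auto_truncate: bool = True,
    output_dimensionality: Optional[int] = None,
) -> dict:
    """Assemble a predict request body from the fields described above."""
    # Sketch-level choice: fail loudly instead of sending a title that won't apply.
    if title is not None and task_type != "RETRIEVAL_DOCUMENT":
        raise ValueError("title only applies when task_type is RETRIEVAL_DOCUMENT")
    instance = {"content": content}
    if task_type is not None:
        instance["task_type"] = task_type
    if title is not None:
        instance["title"] = title
    parameters = {"autoTruncate": auto_truncate}
    if output_dimensionality is not None:
        parameters["outputDimensionality"] = output_dimensionality
    return {"instances": [instance], "parameters": parameters}


body = build_embedding_request(
    "I would like embeddings for this text!",
    task_type="RETRIEVAL_DOCUMENT",
    title="document title",
    output_dimensionality=768,
)
print(json.dumps(body, indent=2))
```

Serializing with json.dumps guarantees valid JSON, which avoids mistakes such as trailing commas in hand-written request files.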

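Task types

The following table describes the task_type parameter values and their use cases:

`task_type`	Description	Use case
`RETRIEVAL_QUERY`	The input text is a query in a search or retrieval setting.	Use for the query text when searching a document collection. Pair with `RETRIEVAL_DOCUMENT` for the documents.
`RETRIEVAL_DOCUMENT`	The input text is a document in a search or retrieval setting.	Use for the documents in the collection being searched. Pair with `RETRIEVAL_QUERY` for the search query.
`SEMANTIC_SIMILARITY`	The input text is used for semantic textual similarity (STS).	Compare two pieces of text to determine whether they are similar in meaning.
`CLASSIFICATION`	The embeddings are used for a classification task.	Train a model to classify text into predefined categories.
`CLUSTERING`	The embeddings are used for a clustering task.	Group similar texts together without predefined labels.
`QUESTION_ANSWERING`	The input text is a query for a question-answering system.	Find answers to a question in a set of documents. Use `RETRIEVAL_DOCUMENT` for the documents.
`FACT_VERIFICATION`	The input text is a claim to be verified against a set of documents.	Verify whether a statement is true. Use `RETRIEVAL_DOCUMENT` for the documents.
`CODE_RETRIEVAL_QUERY`	The input text is a query for retrieving relevant code snippets (Java and Python).	Search a codebase for relevant functions or code snippets. Use `RETRIEVAL_DOCUMENT` for the code documents.

Retrieval tasks:
- Query: Use task_type=RETRIEVAL_QUERY for input text that is a search query.
- Corpus: Use task_type=RETRIEVAL_DOCUMENT for input text that belongs to the document collection being searched.
Similarity tasks:
- Semantic similarity: Use task_type=SEMANTIC_SIMILARITY for both input texts to assess their overall similarity in meaning.
Note: SEMANTIC_SIMILARITY is not intended for retrieval use cases such as document search and information retrieval. For those cases, use RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY, QUESTION_ANSWERING, and FACT_VERIFICATION.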
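
The pairing rules in the table can be captured in a small lookup so that queries and documents are always embedded with matching task types. This is a sketch under stated assumptions: task_types_for and the two constants are hypothetical names, and treating SEMANTIC_SIMILARITY, CLASSIFICATION, and CLUSTERING as "same task type on both sides" is this sketch's reading of the table, not an API rule.

```python
# Query-side task type -> document-side task type, per the table above.
DOCUMENT_TASK_FOR_QUERY = {
    "RETRIEVAL_QUERY": "RETRIEVAL_DOCUMENT",
    "QUESTION_ANSWERING": "RETRIEVAL_DOCUMENT",
    "FACT_VERIFICATION": "RETRIEVAL_DOCUMENT",
    "CODE_RETRIEVAL_QUERY": "RETRIEVAL_DOCUMENT",
}

# Task types where every input is embedded with the same task type.
SYMMETRIC_TASKS = {"SEMANTIC_SIMILARITY", "CLASSIFICATION", "CLUSTERING"}


def task_types_for(query_task: str) -> tuple[str, str]:
    """Return (task type for the query side, task type for the corpus side)."""
    if query_task in DOCUMENT_TASK_FOR_QUERY:
        return query_task, DOCUMENT_TASK_FOR_QUERY[query_task]
    if query_task in SYMMETRIC_TASKS:
        return query_task, query_task
    raise ValueError(f"unknown task type: {query_task}")


print(task_types_for("QUESTION_ANSWERING"))
print(task_types_for("SEMANTIC_SIMILARITY"))
```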

Response body

{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": boolean,
          "token_count": integer
        },
        "values": [ number ]
      }
    }
  ]
}

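Response parameters

predictions: A list of objects, each corresponding to an input instance in the request. Each object contains the following fields:
- embeddings: The embedding generated from the input text. It contains the following fields:
  - values: A list of floats that represents the embedding vector of the input text.
  - statistics: Statistics computed from the input text. It contains the following fields:
    - truncated (bool): true if the input text exceeded the maximum number of tokens allowed by the model and was truncated.
    - token_count (int): The number of tokens in the input text.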
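
To make the nesting concrete, the sketch below walks a response that has already been decoded into a Python dict and pulls out the vector and statistics for each prediction. parse_predictions is a hypothetical helper, and the sample dict is a shortened stand-in with a three-value vector, not real output.

```python
def parse_predictions(response: dict) -> list[dict]:
    """Flatten a predict response into one record per input instance."""
    records = []
    for prediction in response.get("predictions", []):
        embeddings = prediction["embeddings"]
        stats = embeddings.get("statistics", {})
        records.append(
            {
                "values": embeddings["values"],
                "token_count": stats.get("token_count"),
                "truncated": stats.get("truncated", False),
            }
        )
    return records


# Shortened stand-in for a real response; the vector is cut to 3 values.
sample = {
    "predictions": [
        {
            "embeddings": {
                "values": [0.0058, 0.0118, 0.0322],
                "statistics": {"token_count": 4, "truncated": False},
            }
        }
    ]
}

for record in parse_predictions(sample):
    if record["truncated"]:
        print("warning: input was truncated before embedding")
    print(len(record["values"]), record["token_count"])  # prints 3 4
```

Checking truncated is worthwhile whenever autoTruncate is left at its default, because a truncated input still returns a successful response.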

Sample response

{
  "predictions": [
    {
      "embeddings": {
        "values": [
          0.0058424929156899452,
          0.011848051100969315,
          0.032247550785541534,
          -0.031829461455345154,
          -0.055369812995195389,
          ...
        ],
        "statistics": {
          "token_count": 4,
          "truncated": false
        }
      }
    }
  ]
}

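Examples

Embed a text string

The following example shows how to get the embedding of a text string.

REST

After you set up your environment, you can use REST to test a text prompt. The following sample sends a request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
TEXT: The text to generate embeddings for. Limit: five texts of up to 2,048 tokens each for all models except textembedding-gecko@001. The maximum input token length for textembedding-gecko@001 is 3072. For gemini-embedding-001, each request can include only a single input text. For details, see Text embedding limits.
AUTO_TRUNCATE: If set to false, text that exceeds the token limit causes the request to fail. The default value is true.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict

Request JSON body: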

{
  "instances": [
    { "content": "TEXT"}
  ],
  "parameters": { 
    "autoTruncate": AUTO_TRUNCATE 
  }
}

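To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: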

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict"

PowerShell

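Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: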

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001:predict" | Select-Object -Expand Content

You should receive a JSON response similar to the following. Note that values has been truncated to save space.

Response

{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": false,
          "token_count": 6
        },
        "values": [ ... ]
      }
    }
  ]
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from __future__ import annotations

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel


def embed_text() -> list[list[float]]:
    """Embeds texts with a pre-trained, foundational model.

    Returns:
        A list of lists containing the embedding vectors for each input text
    """

    # A list of texts to be embedded.
    texts = ["banana muffins? ", "banana bread? banana muffins?"]
    # The dimensionality of the output embeddings.
    dimensionality = 3072
    # The task type for embedding. Check the available tasks in the model's documentation.
    task = "RETRIEVAL_DOCUMENT"

    model = TextEmbeddingModel.from_pretrained("gemini-embedding-001")
    kwargs = dict(output_dimensionality=dimensionality) if dimensionality else {}

    embeddings = []
    # gemini-embedding-001 takes one input at a time
    for text in texts:
        text_input = TextEmbeddingInput(text, task)
        embedding = model.get_embeddings([text_input], **kwargs)
        print(embedding)
        # Example response:
        # [[0.006135190837085247, -0.01462465338408947, 0.004978656303137541, ...]]
        embeddings.append(embedding[0].values)

    return embeddings

Go

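Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.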

import (
	"context"
	"fmt"
	"io"

	aiplatform "cloud.google.com/go/aiplatform/apiv1"
	"cloud.google.com/go/aiplatform/apiv1/aiplatformpb"

	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

// embedTexts shows how embeddings are set for gemini-embedding-001 model
func embedTexts(w io.Writer, project, location string) error {
	// location := "us-central1"
	ctx := context.Background()

	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)
	dimensionality := 3072
	model := "gemini-embedding-001"
	texts := []string{"banana muffins? ", "banana bread? banana muffins?"}

	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		return err
	}
	defer client.Close()

	endpoint := fmt.Sprintf("projects/%s/locations/%s/publishers/google/models/%s", project, location, model)
	allEmbeddings := make([][]float32, 0, len(texts))
	// gemini-embedding-001 takes 1 input at a time
	for _, text := range texts {
		instances := make([]*structpb.Value, 1)
		instances[0] = structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"content":   structpb.NewStringValue(text),
				"task_type": structpb.NewStringValue("QUESTION_ANSWERING"),
			},
		})

		params := structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"outputDimensionality": structpb.NewNumberValue(float64(dimensionality)),
			},
		})

		req := &aiplatformpb.PredictRequest{
			Endpoint:   endpoint,
			Instances:  instances,
			Parameters: params,
		}
		resp, err := client.Predict(ctx, req)
		if err != nil {
			return err
		}

		// Process the prediction for the single text
		// The response will contain one prediction because we sent one instance.
		if len(resp.Predictions) == 0 {
			return fmt.Errorf("no predictions returned for text \"%s\"", text)
		}

		prediction := resp.Predictions[0]
		embeddingValues := prediction.GetStructValue().Fields["embeddings"].GetStructValue().Fields["values"].GetListValue().Values

		currentEmbedding := make([]float32, len(embeddingValues))
		for j, value := range embeddingValues {
			currentEmbedding[j] = float32(value.GetNumberValue())
		}
		allEmbeddings = append(allEmbeddings, currentEmbedding)
	}

	if len(allEmbeddings) > 0 {
		fmt.Fprintf(w, "Dimensionality: %d. Embeddings length: %d", len(allEmbeddings[0]), len(allEmbeddings))
	} else {
		fmt.Fprintln(w, "No texts were processed.")
	}
	return nil
}

Java

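Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.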

import static java.util.stream.Collectors.toList;

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredictTextEmbeddingsSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // Details about text embedding request structure and supported models are available in:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings
    String endpoint = "us-central1-aiplatform.googleapis.com:443";
    String project = "YOUR_PROJECT_ID";
    String model = "gemini-embedding-001";
    predictTextEmbeddings(
        endpoint,
        project,
        model,
        List.of("banana bread?", "banana muffins?"),
        "QUESTION_ANSWERING",
        OptionalInt.of(3072));
  }

  // Gets text embeddings from a pretrained, foundational model.
  public static List<List<Float>> predictTextEmbeddings(
      String endpoint,
      String project,
      String model,
      List<String> texts,
      String task,
      OptionalInt outputDimensionality)
      throws IOException {
    PredictionServiceSettings settings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(endpoint);
    // find() matches the region prefix; matches() would require the whole endpoint to match.
    String location = matcher.find() ? matcher.group("Location") : "us-central1";
    EndpointName endpointName =
        EndpointName.ofProjectLocationPublisherModelName(project, location, "google", model);

    List<List<Float>> floats = new ArrayList<>();
    // You can use this prediction service client for multiple requests.
    try (PredictionServiceClient client = PredictionServiceClient.create(settings)) {
      // gemini-embedding-001 takes one input at a time.
      for (int i = 0; i < texts.size(); i++) {
        PredictRequest.Builder request = 
            PredictRequest.newBuilder().setEndpoint(endpointName.toString());
        if (outputDimensionality.isPresent()) {
          request.setParameters(
              Value.newBuilder()
                  .setStructValue(
                      Struct.newBuilder()
                          .putFields(
                              "outputDimensionality", valueOf(outputDimensionality.getAsInt()))
                          .build()));
        }
        request.addInstances(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("content", valueOf(texts.get(i)))
                        .putFields("task_type", valueOf(task))
                        .build()));
        PredictResponse response = client.predict(request.build());

        for (Value prediction : response.getPredictionsList()) {
          Value embeddings = prediction.getStructValue().getFieldsOrThrow("embeddings");
          Value values = embeddings.getStructValue().getFieldsOrThrow("values");
          floats.add(
              values.getListValue().getValuesList().stream()
                  .map(Value::getNumberValue)
                  .map(Double::floatValue)
                  .collect(toList()));
        }
      }
      return floats;
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}

Node.js

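Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.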

async function main(
  project,
  model = 'gemini-embedding-001',
  texts = 'banana bread?;banana muffins?',
  task = 'QUESTION_ANSWERING',
  dimensionality = 0,
  apiEndpoint = 'us-central1-aiplatform.googleapis.com'
) {
  const aiplatform = require('@google-cloud/aiplatform');
  const {PredictionServiceClient} = aiplatform.v1;
  const {helpers} = aiplatform; // helps construct protobuf.Value objects.
  const clientOptions = {apiEndpoint: apiEndpoint};
  const location = 'us-central1';
  const endpoint = `projects/${project}/locations/${location}/publishers/google/models/${model}`;

  async function callPredict() {
    const instances = texts
      .split(';')
      .map(e => helpers.toValue({content: e, task_type: task}));

    const client = new PredictionServiceClient(clientOptions);
    const parameters = helpers.toValue(
      dimensionality > 0 ? {outputDimensionality: parseInt(dimensionality)} : {}
    );
    const allEmbeddings = []
    // gemini-embedding-001 takes one input at a time.
    for (const instance of instances) {
      const request = {endpoint, instances: [instance], parameters};
      const [response] = await client.predict(request);
      const predictions = response.predictions;

      const embeddings = predictions.map(p => {
        const embeddingsProto = p.structValue.fields.embeddings;
        const valuesProto = embeddingsProto.structValue.fields.values;
        return valuesProto.listValue.values.map(v => v.numberValue);
      });

      allEmbeddings.push(embeddings[0])
    }


    console.log('Got embeddings: \n' + JSON.stringify(allEmbeddings));
  }

  // Await the call so that request errors propagate to the caller of main().
  await callPredict();
}

Supported text languages

All text embedding models support English-language text and have been evaluated on it.

The text-multilingual-embedding-002 model also supports the following languages. It has been evaluated on the languages in the Evaluated languages list.

Evaluated languages: Arabic (ar), Bengali (bn), English (en), Spanish (es), German (de), Persian (fa), Finnish (fi), French (fr), Hindi (hi), Indonesian (id), Japanese (ja), Korean (ko), Russian (ru), Swahili (sw), Telugu (te), Thai (th), Yoruba (yo), Chinese (zh)
Supported languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.

The gemini-embedding-001 model supports the following languages:

Arabic, Bengali, Bulgarian, Chinese (Simplified and Traditional), Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Thai, Turkish, Ukrainian, Vietnamese, Afrikaans, Amharic, Assamese, Azerbaijani, Belarusian, Bosnian, Catalan, Cebuano, Corsican, Welsh, Dhivehi, Esperanto, Basque, Persian, Filipino (Tagalog), Frisian, Irish, Scots Gaelic, Galician, Gujarati, Hausa, Hawaiian, Hmong, Haitian Creole, Armenian, Igbo, Icelandic, Javanese, Georgian, Kazakh, Khmer, Kannada, Krio, Kurdish, Kyrgyz, Latin, Luxembourgish, Lao, Malagasy, Maori, Macedonian, Malayalam, Mongolian, Meiteilon (Manipuri), Marathi, Malay, Maltese, Myanmar (Burmese), Nepali, Nyanja (Chichewa), Odia (Oriya), Punjabi, Pashto, Sindhi, Sinhala (Sinhalese), Samoan, Shona, Somali, Albanian, Sesotho, Sundanese, Tamil, Telugu, Tajik, Uyghur, Urdu, Uzbek, Xhosa, Yiddish, Yoruba, Zulu.

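Model versions

To use a current stable model, specify the model version number, for example gemini-embedding-001.

Specifying a model without a version number is not recommended, because the unversioned name is only a legacy pointer to another model and is not stable.

For details, see Model versions and lifecycle.

What's next

Learn more about text embeddings:

Text embeddings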
