텍스트 임베딩 가져오기

이 문서에서는 Vertex AI 텍스트 임베딩 API를 사용하여 텍스트 임베딩을 만드는 방법을 설명합니다.

Vertex AI 텍스트 임베딩 API는 밀집 벡터 표현을 사용합니다. 예를 들어 text-embedding-gecko에서는 768차원 벡터를 사용합니다. 고밀도 벡터 임베딩 모델은 대규모 언어 모델에서 사용하는 것과 유사한 심층 학습 방법을 사용합니다. 단어를 숫자에 직접 매핑하는 희소 벡터와 달리 밀집 벡터는 텍스트의 의미를 더 잘 나타내도록 설계되었습니다. 생성형 AI에서 밀집 벡터 임베딩을 사용할 때의 이점은 직접 단어 또는 문법 일치를 검색하는 대신 문구의 언어가 다른 경우에도 쿼리의 의미와 일치하는 문구를 더 효과적으로 검색할 수 있다는 것입니다.

임베딩에 관한 자세한 내용은 임베딩 API 개요를 참조하세요.
텍스트 임베딩 모델에 대한 자세한 내용은 텍스트 임베딩을 참조하세요.
각 임베딩 모델에서 지원하는 언어에 대한 자세한 내용은 지원되는 텍스트 언어를 참조하세요.

시작하기 전에

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Go to project selector

Enable the Vertex AI API.

Enable the API

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Go to project selector

Enable the Vertex AI API.

Enable the API

임베딩 작업의 태스크 유형을 선택합니다.

지원되는 모델

다음 모델을 사용하여 텍스트 임베딩을 가져올 수 있습니다.

영어 모델	다국어 모델
`textembedding-gecko@001`	`textembedding-gecko-multilingual@001`
`textembedding-gecko@003`	`text-multilingual-embedding-002`
`text-embedding-004`
`text-embedding-preview-0815`

이러한 모델을 처음 사용하는 경우 최신 버전을 사용하는 것이 좋습니다. 영어 텍스트에는 text-embedding-004를 사용합니다. 다국어 텍스트에는 text-multilingual-embedding-002를 사용합니다.

텍스트 스니펫의 텍스트 임베딩 가져오기

Vertex AI API 또는 Python용 Vertex AI SDK를 사용하여 텍스트 스니펫의 텍스트 임베딩을 가져올 수 있습니다. 요청마다 us-central1에서는 입력 텍스트가 250개로 제한되고 다른 리전에서는 최대 입력 텍스트가 5개로 제한됩니다. API의 최대 입력 토큰 한도는 20,000개입니다. 이 한도를 초과하는 입력은 500 오류가 발생합니다. 각 개별 입력 텍스트는 토큰 2,048개로 제한되며 초과하는 부분은 자동으로 잘립니다. autoTruncate를 false로 설정하여 자동 잘림을 사용 중지할 수도 있습니다.

모든 모델은 기본적으로 768개의 측정기준이 있는 출력을 생성합니다. 하지만 다음 모델에서는 사용자가 1~768 사이의 출력 크기를 선택할 수 있습니다. 더 작은 출력 크기를 선택하면 메모리와 저장용량을 절약할 수 있으므로 보다 효율적인 계산을 수행할 수 있습니다.

text-embedding-004
text-multilingual-embedding-002
text-embedding-preview-0815

다음 예시에서는 text-embedding-004 모델을 사용합니다.

REST

텍스트 임베딩을 가져오려면 게시자 모델의 모델 ID를 지정하여 POST 요청을 전송합니다.

요청 데이터를 사용하기 전에 다음을 바꿉니다.

PROJECT_ID: 프로젝트 ID
TEXT: 임베딩을 생성하려는 텍스트입니다. 한도: textembedding-gecko@001를 제외한 모든 모델에 대해 텍스트당 최대 2,048개의 토큰으로 구성된 텍스트 5개입니다. textembedding-gecko@001의 최대 입력 토큰 길이는 3072입니다.
AUTO_TRUNCATE: false로 설정하면 토큰 한도를 초과하는 텍스트로 인해 요청이 실패합니다. 기본값은 true입니다.

HTTP 메서드 및 URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-embedding-004:predict

JSON 요청 본문:

{
  "instances": [
    { "content": "TEXT"}
  ],
  "parameters": { 
    "autoTruncate": AUTO_TRUNCATE 
  }
}

요청을 보내려면 다음 옵션 중 하나를 선택합니다.

curl

참고: 다음 명령어는 gcloud init 또는 gcloud auth login을 실행하거나 gcloud CLI에 자동으로 로그인하는 Cloud Shell을 사용하여 사용자 계정으로 gcloud CLI에 로그인했다고 가정합니다. gcloud auth list를 실행하면 현재 활성 계정을 확인할 수 있습니다.

요청 본문을 request.json 파일에 저장하고 다음 명령어를 실행합니다.

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-embedding-004:predict"

PowerShell

참고: 다음 명령어는 gcloud init 또는 gcloud auth login을 실행하여 사용자 계정으로 gcloud CLI에 로그인했다고 가정합니다. gcloud auth list를 실행하면 현재 활성 계정을 확인할 수 있습니다.

요청 본문을 request.json 파일에 저장하고 다음 명령어를 실행합니다.

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-embedding-004:predict" | Select-Object -Expand Content

다음과 비슷한 JSON 응답이 수신됩니다. values는 공간 절약을 위해 잘렸습니다.

응답

{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": false,
          "token_count": 6
        },
        "values": [ ... ]
      }
    }
  ]
}

curl 명령어 예시

MODEL_ID="text-embedding-004"
PROJECT_ID=PROJECT_ID

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/${MODEL_ID}:predict -d \
$'{
  "instances": [
    { "content": "What is life?"}
  ],
}'

Python

Vertex AI SDK for Python을 설치하거나 업데이트하는 방법은 Vertex AI SDK for Python 설치를 참조하세요. 자세한 내용은 Python API 참고 문서를 확인하세요.

from typing import List, Optional

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel


def embed_text(
    texts: list = None,
    task: str = "RETRIEVAL_DOCUMENT",
    dimensionality: Optional[int] = 256,
) -> List[List[float]]:
    """Embeds texts with a pre-trained, foundational model.
    Args:
        texts (List[str]): A list of texts to be embedded.
        task (str): The task type for embedding. Check the available tasks in the model's documentation.
        dimensionality (Optional[int]): The dimensionality of the output embeddings.
    Returns:
        List[List[float]]: A list of lists containing the embedding vectors for each input text
    """
    if texts is None:
        texts = ["banana muffins? ", "banana bread? banana muffins?"]
    model = TextEmbeddingModel.from_pretrained("text-embedding-004")
    inputs = [TextEmbeddingInput(text, task) for text in texts]
    kwargs = dict(output_dimensionality=dimensionality) if dimensionality else {}
    embeddings = model.get_embeddings(inputs, **kwargs)
    # Example response:
    # [[0.006135190837085247, -0.01462465338408947, 0.004978656303137541, ...],
    return [embedding.values for embedding in embeddings]

Go

이 샘플을 사용해 보기 전에 Vertex AI 빠른 시작: 클라이언트 라이브러리 사용의 Go 설정 안내를 따르세요. 자세한 내용은 Vertex AI Go API 참고 문서를 참조하세요.

Vertex AI에 인증하려면 애플리케이션 기본 사용자 인증 정보를 설정합니다. 자세한 내용은 로컬 개발 환경의 인증 설정을 참조하세요.

import (
	"context"
	"fmt"
	"io"

	aiplatform "cloud.google.com/go/aiplatform/apiv1"
	"cloud.google.com/go/aiplatform/apiv1/aiplatformpb"

	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

// embedTexts shows how embeddings are set for text-embedding-preview-0409 model
func embedTexts(w io.Writer, project, location string) error {
	// location := "us-central1"
	ctx := context.Background()

	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)
	dimensionality := 5
	model := "text-embedding-004"
	texts := []string{"banana muffins? ", "banana bread? banana muffins?"}

	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		return err
	}
	defer client.Close()

	endpoint := fmt.Sprintf("projects/%s/locations/%s/publishers/google/models/%s", project, location, model)
	instances := make([]*structpb.Value, len(texts))
	for i, text := range texts {
		instances[i] = structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"content":   structpb.NewStringValue(text),
				"task_type": structpb.NewStringValue("QUESTION_ANSWERING"),
			},
		})
	}

	params := structpb.NewStructValue(&structpb.Struct{
		Fields: map[string]*structpb.Value{
			"outputDimensionality": structpb.NewNumberValue(float64(dimensionality)),
		},
	})

	req := &aiplatformpb.PredictRequest{
		Endpoint:   endpoint,
		Instances:  instances,
		Parameters: params,
	}
	resp, err := client.Predict(ctx, req)
	if err != nil {
		return err
	}
	embeddings := make([][]float32, len(resp.Predictions))
	for i, prediction := range resp.Predictions {
		values := prediction.GetStructValue().Fields["embeddings"].GetStructValue().Fields["values"].GetListValue().Values
		embeddings[i] = make([]float32, len(values))
		for j, value := range values {
			embeddings[i][j] = float32(value.GetNumberValue())
		}
	}

	fmt.Fprintf(w, "Dimensionality: %d. Embeddings length: %d", len(embeddings[0]), len(embeddings))
	return nil
}

Java

이 샘플을 사용해 보기 전에 Vertex AI 빠른 시작: 클라이언트 라이브러리 사용의 Java 설정 안내를 따르세요. 자세한 내용은 Vertex AI Java API 참고 문서를 참조하세요.

Vertex AI에 인증하려면 애플리케이션 기본 사용자 인증 정보를 설정합니다. 자세한 내용은 로컬 개발 환경의 인증 설정을 참조하세요.

import static java.util.stream.Collectors.toList;

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredictTextEmbeddingsSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // Details about text embedding request structure and supported models are available in:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings
    String endpoint = "us-central1-aiplatform.googleapis.com:443";
    String project = "YOUR_PROJECT_ID";
    String model = "text-embedding-004";
    predictTextEmbeddings(
        endpoint,
        project,
        model,
        List.of("banana bread?", "banana muffins?"),
        "QUESTION_ANSWERING",
        OptionalInt.of(256));
  }

  // Gets text embeddings from a pretrained, foundational model.
  public static List<List<Float>> predictTextEmbeddings(
      String endpoint,
      String project,
      String model,
      List<String> texts,
      String task,
      OptionalInt outputDimensionality)
      throws IOException {
    PredictionServiceSettings settings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(endpoint);
    String location = matcher.matches() ? matcher.group("Location") : "us-central1";
    EndpointName endpointName =
        EndpointName.ofProjectLocationPublisherModelName(project, location, "google", model);

    // You can use this prediction service client for multiple requests.
    try (PredictionServiceClient client = PredictionServiceClient.create(settings)) {
      PredictRequest.Builder request =
          PredictRequest.newBuilder().setEndpoint(endpointName.toString());
      if (outputDimensionality.isPresent()) {
        request.setParameters(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("outputDimensionality", valueOf(outputDimensionality.getAsInt()))
                        .build()));
      }
      for (int i = 0; i < texts.size(); i++) {
        request.addInstances(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("content", valueOf(texts.get(i)))
                        .putFields("task_type", valueOf(task))
                        .build()));
      }
      PredictResponse response = client.predict(request.build());
      List<List<Float>> floats = new ArrayList<>();
      for (Value prediction : response.getPredictionsList()) {
        Value embeddings = prediction.getStructValue().getFieldsOrThrow("embeddings");
        Value values = embeddings.getStructValue().getFieldsOrThrow("values");
        floats.add(
            values.getListValue().getValuesList().stream()
                .map(Value::getNumberValue)
                .map(Double::floatValue)
                .collect(toList()));
      }
      return floats;
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}

Node.js

이 샘플을 사용해 보기 전에 Vertex AI 빠른 시작: 클라이언트 라이브러리 사용의 Node.js 설정 안내를 따르세요. 자세한 내용은 Vertex AI Node.js API 참고 문서를 참조하세요.

Vertex AI에 인증하려면 애플리케이션 기본 사용자 인증 정보를 설정합니다. 자세한 내용은 로컬 개발 환경의 인증 설정을 참조하세요.

async function main(
  project,
  model = 'text-embedding-004',
  texts = 'banana bread?;banana muffins?',
  task = 'QUESTION_ANSWERING',
  dimensionality = 0,
  apiEndpoint = 'us-central1-aiplatform.googleapis.com'
) {
  const aiplatform = require('@google-cloud/aiplatform');
  const {PredictionServiceClient} = aiplatform.v1;
  const {helpers} = aiplatform; // helps construct protobuf.Value objects.
  const clientOptions = {apiEndpoint: apiEndpoint};
  const location = 'us-central1';
  const endpoint = `projects/${project}/locations/${location}/publishers/google/models/${model}`;

  async function callPredict() {
    const instances = texts
      .split(';')
      .map(e => helpers.toValue({content: e, task_type: task}));
    const parameters = helpers.toValue(
      dimensionality > 0 ? {outputDimensionality: parseInt(dimensionality)} : {}
    );
    const request = {endpoint, instances, parameters};
    const client = new PredictionServiceClient(clientOptions);
    const [response] = await client.predict(request);
    const predictions = response.predictions;
    const embeddings = predictions.map(p => {
      const embeddingsProto = p.structValue.fields.embeddings;
      const valuesProto = embeddingsProto.structValue.fields.values;
      return valuesProto.listValue.values.map(v => v.numberValue);
    });
    console.log('Got embeddings: \n' + JSON.stringify(embeddings));
  }

  callPredict();
}

최근 모델

하나의 임베딩 모델을 미리보기로 사용할 수 있습니다.

text-embedding-preview-0815

이 모델은 일반 텍스트 쿼리를 사용하여 관련성 높은 코드 블록을 검색하는 데 사용할 수 있는 새로운 태스크 유형 CODE_RETRIEVAL_QUERY를 지원합니다. 이 기능을 사용하려면 코드 블록은 RETRIEVAL_DOCUMENT 작업 유형을 사용하여 임베디드하고 텍스트 쿼리는 CODE_RETRIEVAL_QUERY를 사용하여 임베디드해야 합니다.

모든 작업 유형을 살펴보려면 모델 참조를 참조하세요.

예를 들면 다음과 같습니다.

REST

PROJECT_ID=PROJECT_ID

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-embedding-preview-0815:predict -d \
$'{
  "instances": [
    {
      "task_type": "CODE_RETRIEVAL_QUERY",
      "content": "Function to add two numbers"
    }
  ],
}'

Python

Vertex AI SDK for Python을 설치하거나 업데이트하는 방법은 Vertex AI SDK for Python 설치를 참조하세요. 자세한 내용은 Python API 참고 문서를 확인하세요.

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

MODEL_NAME = "text-embedding-preview-0815"
DIMENSIONALITY = 256


def embed_text(
    texts: list[str] = ["Retrieve a function that adds two numbers"],
    task: str = "CODE_RETRIEVAL_QUERY",
    model_name: str = "text-embedding-preview-0815",
    dimensionality: int | None = 256,
) -> list[list[float]]:
    """Embeds texts with a pre-trained, foundational model."""
    model = TextEmbeddingModel.from_pretrained(model_name)
    inputs = [TextEmbeddingInput(text, task) for text in texts]
    kwargs = dict(output_dimensionality=dimensionality) if dimensionality else {}
    embeddings = model.get_embeddings(inputs, **kwargs)
    # Example response:
    # [[0.025890009477734566, -0.05553026497364044, 0.006374752148985863,...],
    return [embedding.values for embedding in embeddings]


if __name__ == "__main__":
    # Embeds code block with a pre-trained, foundational model.
    # Using this function to calculate the embedding for corpus.
    texts = ["Retrieve a function that adds two numbers"]
    task = "CODE_RETRIEVAL_QUERY"
    code_block_embeddings = embed_text(
        texts=texts, task=task, model_name=MODEL_NAME, dimensionality=DIMENSIONALITY
    )

    # Embeds code retrieval with a pre-trained, foundational model.
    # Using this function to calculate the embedding for query.
    texts = [
        "def func(a, b): return a + b",
        "def func(a, b): return a - b",
        "def func(a, b): return (a ** 2 + b ** 2) ** 0.5",
    ]
    task = "RETRIEVAL_DOCUMENT"
    code_query_embeddings = embed_text(
        texts=texts, task=task, model_name=MODEL_NAME, dimensionality=DIMENSIONALITY
    )

Go

Vertex AI에 인증하려면 애플리케이션 기본 사용자 인증 정보를 설정합니다. 자세한 내용은 로컬 개발 환경의 인증 설정을 참조하세요.

import (
	"context"
	"fmt"
	"io"

	aiplatform "cloud.google.com/go/aiplatform/apiv1"
	"cloud.google.com/go/aiplatform/apiv1/aiplatformpb"

	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

// Embeds code query with a pre-trained, foundational model by specifying the task type as 'CODE_RETRIEVAL_QUERY'. e.g. 'Retrieve a function that adds two numbers'.
// Embeds code block with a pre-trained, foundational model by specifying the task type as 'RETRIEVAL_DOCUMENT'. e.g. 'texts := []string{"def func(a, b): return a + b", "def func(a, b): return a - b", "def func(a, b): return (a ** 2 + b ** 2) ** 0.5"}'.
// embedTextsPreview shows how embeddings are set for text-embedding-preview-0815 model
func embedTextsPreview(w io.Writer, projectID, location string) error {
	// location := "us-central1"
	ctx := context.Background()

	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)
	dimensionality := 5
	model := "text-embedding-preview-0815"
	texts := []string{"banana muffins? ", "banana bread? banana muffins?"}

	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		return err
	}
	defer client.Close()

	endpoint := fmt.Sprintf("projects/%s/locations/%s/publishers/google/models/%s", projectID, location, model)
	instances := make([]*structpb.Value, len(texts))
	for i, text := range texts {
		instances[i] = structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"content":   structpb.NewStringValue(text),
				"task_type": structpb.NewStringValue("CODE_RETRIEVAL_QUERY"),
			},
		})
	}

	params := structpb.NewStructValue(&structpb.Struct{
		Fields: map[string]*structpb.Value{
			"outputDimensionality": structpb.NewNumberValue(float64(dimensionality)),
		},
	})

	req := &aiplatformpb.PredictRequest{
		Endpoint:   endpoint,
		Instances:  instances,
		Parameters: params,
	}
	resp, err := client.Predict(ctx, req)
	if err != nil {
		return err
	}
	embeddings := make([][]float32, len(resp.Predictions))
	for i, prediction := range resp.Predictions {
		values := prediction.GetStructValue().Fields["embeddings"].GetStructValue().Fields["values"].GetListValue().Values
		embeddings[i] = make([]float32, len(values))
		for j, value := range values {
			embeddings[i][j] = float32(value.GetNumberValue())
		}
	}

	fmt.Fprintf(w, "Dimensionality: %d. Embeddings length: %d", len(embeddings[0]), len(embeddings))
	return nil
}

Java

Vertex AI에 인증하려면 애플리케이션 기본 사용자 인증 정보를 설정합니다. 자세한 내용은 로컬 개발 환경의 인증 설정을 참조하세요.

import static java.util.stream.Collectors.toList;

import com.google.cloud.aiplatform.v1beta1.EndpointName;
import com.google.cloud.aiplatform.v1beta1.PredictRequest;
import com.google.cloud.aiplatform.v1beta1.PredictResponse;
import com.google.cloud.aiplatform.v1beta1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1beta1.PredictionServiceSettings;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PredictTextEmbeddingsSamplePreview {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // Details about text embedding request structure and supported models are
    // available in:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings
    String endpoint = "us-central1-aiplatform.googleapis.com";
    String project = "YOUR_PROJECT_ID";
    String model = "text-embedding-preview-0815";
    // Calculate the embedding for a code retrieval query. Using 'CODE_RETRIEVAL_QUERY' for query.
    predictTextEmbeddings(
        endpoint,
        project,
        model,
        List.of("Retrieve a function that adds two numbers"),
        "CODE_RETRIEVAL_QUERY",
        OptionalInt.of(256));

    // Calculate the embedding for code blocks. Using 'RETRIEVAL_DOCUMENT' for corpus.
    predictTextEmbeddings(
        endpoint,
        project,
        model,
        List.of(
            "def func(a, b): return a + b",
            "def func(a, b): return a - b",
            "def func(a, b): return (a ** 2 + b ** 2) ** 0.5"),
        "RETRIEVAL_DOCUMENT",
        OptionalInt.of(256));
  }

  // Gets text embeddings from a pretrained, foundational model.
  public static List<List<Float>> predictTextEmbeddings(
      String endpoint,
      String project,
      String model,
      List<String> texts,
      String task,
      OptionalInt outputDimensionality)
      throws IOException {
    PredictionServiceSettings settings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();
    Matcher matcher = Pattern.compile("^(?<Location>\\w+-\\w+)").matcher(endpoint);
    String location = matcher.matches() ? matcher.group("Location") : "us-central1";
    EndpointName endpointName =
        EndpointName.ofProjectLocationPublisherModelName(project, location, "google", model);

    // You can use this prediction service client for multiple requests.
    try (PredictionServiceClient client = PredictionServiceClient.create(settings)) {
      PredictRequest.Builder request =
          PredictRequest.newBuilder().setEndpoint(endpointName.toString());
      if (outputDimensionality.isPresent()) {
        request.setParameters(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("outputDimensionality", valueOf(outputDimensionality.getAsInt()))
                        .build()));
      }
      for (int i = 0; i < texts.size(); i++) {
        request.addInstances(
            Value.newBuilder()
                .setStructValue(
                    Struct.newBuilder()
                        .putFields("content", valueOf(texts.get(i)))
                        .putFields("task_type", valueOf(task))
                        .build()));
      }
      PredictResponse response = client.predict(request.build());
      List<List<Float>> floats = new ArrayList<>();
      for (Value prediction : response.getPredictionsList()) {
        Value embeddings = prediction.getStructValue().getFieldsOrThrow("embeddings");
        Value values = embeddings.getStructValue().getFieldsOrThrow("values");
        floats.add(
            values.getListValue().getValuesList().stream()
                .map(Value::getNumberValue)
                .map(Double::floatValue)
                .collect(toList()));
      }
      return floats;
    }
  }

  private static Value valueOf(String s) {
    return Value.newBuilder().setStringValue(s).build();
  }

  private static Value valueOf(int n) {
    return Value.newBuilder().setNumberValue(n).build();
  }
}

Node.js

Vertex AI에 인증하려면 애플리케이션 기본 사용자 인증 정보를 설정합니다. 자세한 내용은 로컬 개발 환경의 인증 설정을 참조하세요.


// TODO(developer): Update the following for your own use case.
const project = 'long-door-651';
const model = 'text-embedding-preview-0815';
const location = 'us-central1';
// Calculate the embedding for code blocks. Using 'RETRIEVAL_DOCUMENT' for corpus.
// Specify the task type as 'CODE_RETRIEVAL_QUERY' for query, e.g. 'Retrieve a function that adds two numbers'.
const texts =
  'def func(a, b): return a + b;def func(a, b): return a - b;def func(a, b): return (a ** 2 + b ** 2) ** 0.5';
const task = 'RETRIEVAL_DOCUMENT';
const dimensionality = 3;
const apiEndpoint = 'us-central1-aiplatform.googleapis.com';

const aiplatform = require('@google-cloud/aiplatform');
const {PredictionServiceClient} = aiplatform.v1;
const {helpers} = aiplatform; // helps construct protobuf.Value objects.
const clientOptions = {apiEndpoint: apiEndpoint};
const endpoint = `projects/${project}/locations/${location}/publishers/google/models/${model}`;
const parameters = helpers.toValue({
  outputDimensionality: parseInt(dimensionality),
});

async function callPredict() {
  const instances = texts
    .split(';')
    .map(e => helpers.toValue({content: e, task_type: task}));
  const request = {endpoint, instances, parameters};
  const client = new PredictionServiceClient(clientOptions);
  const [response] = await client.predict(request);
  const predictions = response.predictions;
  const embeddings = predictions.map(p => {
    const embeddingsProto = p.structValue.fields.embeddings;
    const valuesProto = embeddingsProto.structValue.fields.values;
    return valuesProto.listValue.values.map(v => v.numberValue);
  });
  console.log('Got embeddings: \n' + JSON.stringify(embeddings));
}
await callPredict();

이러한 모델을 사용할 때는 다음과 같은 제한사항이 적용됩니다.

미션 크리티컬 시스템 또는 프로덕션 시스템에서는 이러한 미리보기 모델을 사용하지 마세요.
이 모델은 us-central1에서만 사용할 수 있습니다.
일괄 예측은 지원되지 않습니다.
맞춤설정은 지원되지 않습니다.

벡터 데이터베이스에 임베딩 추가

임베딩을 생성한 후 벡터 검색과 같은 벡터 데이터베이스에 임베딩을 추가할 수 있습니다. 이렇게 하면 지연 시간이 짧은 검색이 가능하며, 데이터 크기가 커질수록 매우 중요합니다.

벡터 검색에 대한 자세한 내용은 벡터 검색 개요를 참조하세요.

다음 단계

비율 제한에 대한 자세한 내용은 Vertex AI의 생성형 AI 비율 제한을 참조하세요.
임베딩을 일괄 예측하려면 일괄 텍스트 임베딩 예측 가져오기를 참조하세요.
멀티모달 임베딩에 대한 자세한 내용은 멀티모달 임베딩 가져오기를 참조하세요.
임베딩을 조정하려면 텍스트 임베딩 조정을 참조하세요.
text-embedding-004 및 text-multilingual-embedding-002에 대한 관련 연구에 대해 자세히 알아보려면 연구 논문 Gecko: 대규모 언어 모델에서 추출한 다목적 텍스트 임베딩을 참조하세요.