This document describes how to create a text embedding using the Vertex AI Text embeddings API.
The Vertex AI text embeddings API uses dense vector representations: textembedding-gecko, for example, uses 768-dimensional vectors. Dense vector embedding models use deep-learning methods similar to the ones used by large language models. Unlike sparse vectors, which tend to directly map words to numbers, dense vectors are designed to better represent the meaning of a piece of text. The benefit of using dense vector embeddings in generative AI is that instead of searching for direct word or syntax matches, you can better search for passages that align to the meaning of the query, even if the passages don't use the same language.
The vectors are normalized, so you can use cosine similarity, dot product, or Euclidean distance to provide the same similarity rankings.
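To illustrate why normalization makes these metrics interchangeable, the following sketch (plain Python, with small made-up vectors standing in for real 768-dimensional embeddings) shows that for unit-length vectors, dot product equals cosine similarity, and ranking by Euclidean distance produces the same order:

```python
import math

def normalize(v):
    """Scale a vector to unit length, as the embedding models already do."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up 3-dimensional vectors standing in for real embeddings.
query = normalize([0.9, 0.1, 0.3])
docs = [normalize(v) for v in ([0.8, 0.2, 0.4], [0.1, 0.9, 0.2], [0.5, 0.5, 0.5])]

# For unit vectors, the dot product equals cosine similarity ...
for d in docs:
    assert abs(dot(query, d) - cosine_similarity(query, d)) < 1e-9

# ... and ranking by descending similarity matches ranking by ascending distance.
by_dot = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))
by_dist = sorted(range(len(docs)), key=lambda i: euclidean_distance(query, docs[i]))
assert by_dot == by_dist
```

This works because for unit vectors the squared Euclidean distance is `2 - 2 * dot(a, b)`, which decreases exactly as the dot product increases.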
- To learn more about embeddings, see the embeddings APIs overview.
- To learn about text embedding models, see Text embeddings.
- For information about which languages each embeddings model supports, see Supported text languages.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Enable the Vertex AI API.
- Choose a task type for your embeddings job.
Supported models
You can get text embeddings by using the following models:
| English models | Multilingual models |
|---|---|
| textembedding-gecko@001 | textembedding-gecko-multilingual@001 |
| textembedding-gecko@002 | text-multilingual-embedding-002 |
| textembedding-gecko@003 | |
| text-embedding-004 | |
| text-embedding-005 | |
If you are new to these models, we recommend that you use the latest versions. For English text, use text-embedding-005. For multilingual text, use text-multilingual-embedding-002.
Get text embeddings for a snippet of text
You can get text embeddings for a snippet of text by using the Vertex AI API or the Vertex AI SDK for Python. Each request is limited to 250 input texts in us-central1; in other regions, the maximum is 5 input texts per request.

The API has a maximum input token limit of 20,000. Inputs exceeding this limit result in a 500 error. Each individual input text is further limited to 2048 tokens; any excess is silently truncated. You can disable silent truncation by setting autoTruncate to false.
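As a sketch of where this setting lives, the request body below follows the REST API's instances/parameters shape for the model's predict endpoint (the input text is illustrative); with autoTruncate set to false, inputs over 2048 tokens are rejected instead of silently truncated:

```python
import json

# Sketch of a predict request body for a text embeddings model endpoint
# (POST .../publishers/google/models/text-embedding-005:predict).
payload = {
    "instances": [
        {"content": "What is life?"},  # illustrative input text
    ],
    "parameters": {
        # Reject over-length inputs instead of silently truncating them.
        "autoTruncate": False,
    },
}

body = json.dumps(payload)
```

Sending this body requires an authenticated request to the endpoint, which the SDK examples later in this document handle for you.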
All models produce an output with 768 dimensions by default. However, the following models give you the option to choose an output dimensionality between 1 and 768. By selecting a smaller output dimensionality, you can save memory and storage space, leading to more efficient computations:

- text-embedding-005
- text-multilingual-embedding-002
The following examples use the text-embedding-005 model.
Gen AI SDK for Python
Learn how to install or update the Google Gen AI SDK for Python.
For more information, see the
Gen AI SDK for Python API reference documentation or the
python-genai
GitHub repository.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
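With those variables set, a minimal sketch of an embeddings request with the Gen AI SDK looks like the following (the input text and task type are illustrative; running it requires the google-genai package and valid credentials):

```python
from google import genai
from google.genai.types import EmbedContentConfig

# The client reads project, location, and Vertex AI mode from the
# environment variables set above.
client = genai.Client()

response = client.models.embed_content(
    model="text-embedding-005",
    contents=["How do I get a driver's license?"],  # illustrative input
    config=EmbedContentConfig(task_type="QUESTION_ANSWERING"),
)

# Each input text yields one embedding; print the first few values.
print(response.embeddings[0].values[:5])
```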
Vertex AI SDK for Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.
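As a minimal sketch with the Vertex AI SDK (the project ID is a placeholder, and the input text and task type are illustrative; running it requires the google-cloud-aiplatform package and valid credentials):

```python
import vertexai
from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

# Placeholder project ID; replace with your own.
vertexai.init(project="your-project-id", location="us-central1")

model = TextEmbeddingModel.from_pretrained("text-embedding-005")
inputs = [TextEmbeddingInput("What is life?", task_type="RETRIEVAL_QUERY")]

# output_dimensionality is optional; text-embedding-005 supports values
# from 1 to 768, with 768 as the default.
embeddings = model.get_embeddings(inputs, output_dimensionality=256)

print(len(embeddings[0].values))  # the chosen dimensionality
```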
Go
Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Add an embedding to a vector database
After you've generated your embedding you can add embeddings to a vector database, like Vector Search. This enables low-latency retrieval, and is critical as the size of your data increases.
To learn more about Vector Search, see Overview of Vector Search.
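Vector Search is a managed service, but the core idea it implements, ranking stored embeddings by similarity to a query embedding, can be shown with a small brute-force sketch (made-up 3-dimensional vectors stand in for real embeddings). A real index exists precisely to avoid this linear scan, which is why it stays fast as your data grows:

```python
# Tiny in-memory "database" of unit-length vectors keyed by document ID.
database = {
    "doc-a": [0.98, 0.10, 0.18],
    "doc-b": [0.10, 0.97, 0.22],
    "doc-c": [0.60, 0.60, 0.53],
}

def top_k(query, db, k=2):
    """Rank stored vectors by dot product (cosine similarity for unit vectors)
    and return the IDs of the k closest matches."""
    scored = sorted(
        db.items(),
        key=lambda item: sum(q * x for q, x in zip(query, item[1])),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

print(top_k([1.0, 0.0, 0.0], database))  # doc-a aligns best with this query
```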
What's next
- To learn more about rate limits, see Generative AI on Vertex AI rate limits.
- To get batch predictions for embeddings, see Get batch text embeddings predictions.
- To learn more about multimodal embeddings, see Get multimodal embeddings.
- To tune an embedding, see Tune text embeddings.
- To learn more about the research behind text-embedding-005 and text-multilingual-embedding-002, see the research paper Gecko: Versatile Text Embeddings Distilled from Large Language Models.