This guide shows you how to create a text embedding by using the Vertex AI Text embeddings API.

The Vertex AI text embeddings API uses dense vector representations of text. These embeddings are created using deep-learning methods similar to those used by large language models. Unlike sparse vectors, which tend to map words directly to numbers, dense vectors are designed to represent the meaning of a piece of text. This lets you search for passages that align with the meaning of a query, even if the passages don't use the same words.

Before you begin

1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
2. Enable the Vertex AI API.
Supported models
You can get text embeddings by using the following models:
| Model name | Description | Output dimensions | Max sequence length | Supported text languages |
|---|---|---|---|---|
| gemini-embedding-001 | State-of-the-art performance across English, multilingual, and code tasks. It unifies previously specialized models such as text-embedding-005 and text-multilingual-embedding-002 and achieves better performance in their respective domains. Read our Tech Report for more detail. | Up to 3072 | 2048 tokens | Supported text languages |
| text-embedding-005 | Specialized in English and code tasks. | Up to 768 | 2048 tokens | English |
| text-multilingual-embedding-002 | Specialized in multilingual tasks. | Up to 768 | 2048 tokens | Supported text languages |
For superior embedding quality, use gemini-embedding-001, our large model designed to provide the highest performance. Note that gemini-embedding-001 supports only one input instance per request.
Only use the model names as they are listed in the supported models table. Do not specify a model name without the @version suffix or use @latest, as these formats are not valid.
Get text embeddings for a snippet of text
You can get text embeddings for a snippet of text by using the Vertex AI API or the Vertex AI SDK for Python.
API limits
Each request is limited to 250 input texts and a total of 20,000 input tokens. If a request exceeds the token limit, the API returns a 400 error. Each individual input text is limited to 2048 tokens; by default, any excess tokens are silently truncated. To disable silent truncation, set autoTruncate to false.
For more information, see Text embedding limits.
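As an illustration only, the following sketch uses the Gen AI SDK (described later on this page) to send several input texts in one request and disable silent truncation. The sample texts are placeholders, and auto_truncate is assumed to be the SDK-level spelling of the autoTruncate parameter; verify it against the SDK reference before relying on it.

```python
from google import genai
from google.genai.types import EmbedContentConfig

# Reads the Vertex AI environment variables shown later on this page.
client = genai.Client()

# Up to 250 texts and 20,000 total tokens per request (gemini-embedding-001
# accepts only one text per request). Each text is capped at 2048 tokens.
response = client.models.embed_content(
    model="text-embedding-005",
    contents=["first passage", "second passage", "third passage"],
    # Return an error instead of silently truncating texts over 2048 tokens
    # (assumed SDK field name for the autoTruncate parameter).
    config=EmbedContentConfig(auto_truncate=False),
)

# One embedding is returned per input text, in the same order.
print(len(response.embeddings))
```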
Choose an embedding dimension
By default, all models produce a full-length embedding vector. For gemini-embedding-001, this vector has 3072 dimensions, and for other models, it's 768 dimensions. To control the size of the output embedding vector, you can use the output_dimensionality parameter. A smaller output dimensionality can save storage space and increase computational efficiency for downstream applications, with a potential trade-off in quality.
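For example, here is a minimal sketch, assuming the Gen AI SDK setup shown in the next section, that requests a truncated 768-dimensional vector from gemini-embedding-001 instead of the default 3072. The input text is illustrative.

```python
from google import genai
from google.genai.types import EmbedContentConfig

# Uses the Vertex AI environment variables shown in the next section.
client = genai.Client()

# Request a smaller 768-dimensional vector instead of the default 3072.
response = client.models.embed_content(
    model="gemini-embedding-001",
    contents="What is the meaning of life?",
    config=EmbedContentConfig(output_dimensionality=768),
)

print(len(response.embeddings[0].values))  # 768
```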
The following examples use the gemini-embedding-001 model.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
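With the environment variables set, the following minimal sketch requests an embedding for a snippet of text with gemini-embedding-001. The sample text is illustrative; the call shape follows the Gen AI SDK's embed_content method.

```python
from google import genai
from google.genai.types import EmbedContentConfig

# Picks up GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and
# GOOGLE_GENAI_USE_VERTEXAI from the environment variables set above.
client = genai.Client()

response = client.models.embed_content(
    model="gemini-embedding-001",
    contents="How do I get a text embedding with Vertex AI?",
    config=EmbedContentConfig(output_dimensionality=3072),
)

# gemini-embedding-001 accepts one instance per request, so the response
# contains a single embedding.
embedding = response.embeddings[0].values
print(len(embedding))  # 3072
```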
Add an embedding to a vector database
After you generate your embedding, you can add it to a vector database, like Vector Search. This enables low-latency retrieval and is critical as the size of your data increases.
To learn more about Vector Search, see Overview of Vector Search.
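As a rough sketch of that step, Vector Search batch indexing typically ingests newline-delimited JSON records that pair an ID with an embedding vector, which you then upload to Cloud Storage. The record fields and file name below are assumptions; check the Vector Search input data format documentation before use.

```python
import json

# Illustrative records: IDs paired with embedding vectors returned by the
# embeddings API (vectors shortened here for readability).
records = [
    {"id": "doc-1", "embedding": [0.013, -0.072, 0.441]},
    {"id": "doc-2", "embedding": [0.102, 0.007, -0.310]},
]

# Write one JSON object per line; upload the file to a Cloud Storage bucket
# and point your Vector Search index at that location.
with open("embeddings.json", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```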
What's next
- Learn about Generative AI on Vertex AI rate limits.
- Learn how to get batch text embeddings predictions.
- Learn how to get multimodal embeddings.
- Learn how to tune text embeddings.
- Learn about the research behind text-embedding-005 and text-multilingual-embedding-002 in the research paper Gecko: Versatile Text Embeddings Distilled from Large Language Models.