This guide shows you how to choose the optimal task type when generating embeddings with Vertex AI to improve their quality for your specific use case. It covers the following topics:
- Benefits of using task types: Learn how task types improve embedding quality for use cases like Retrieval Augmented Generation (RAG).
- Supported task types: Get a comparative overview of all available task types to help you choose the right one.
- Use case deep dive: Explore detailed explanations and examples for each task type, including classification, clustering, retrieval, and semantic similarity.
With Vertex AI embedding models, you can generate embeddings optimized for various tasks, such as document retrieval, question answering, and fact verification. A task type is a parameter that you specify to optimize the embeddings that the model generates for your intended use case.
Supported models
Task types are supported by the following models:
- text-embedding-005
- text-multilingual-embedding-002
- gemini-embedding-001
The following limitations apply when using these models:
- Don't use these preview models on mission-critical or production systems.
- These models are available in us-central1 only.
- Batch predictions are not supported.
- Customization is not supported.
Benefits of task types
Using task types can improve the quality of the embeddings that a model generates.

For example, when building Retrieval Augmented Generation (RAG) systems, a common design is to use text embeddings and Vector Search to perform a similarity search. In some cases, this approach can result in lower search quality because questions and their answers are not semantically similar. For example, a question like "Why is the sky blue?" and its answer "The scattering of sunlight causes the blue color," have distinctly different meanings as statements. This means that a RAG system might not automatically recognize their relation, as demonstrated in figure 1.
Without task types, a RAG developer would need to either train their own model to learn the relationship between queries and answers, which requires advanced data science skills, or use LLM-based query expansion or HyDE (Hypothetical Document Embeddings), which can introduce high latency and costs.

Task types enable you to generate optimized embeddings for specific tasks, which saves you the time and cost it would take to develop your own task-specific embeddings. The generated embedding for a query "Why is the sky blue?" and its answer "The scattering of sunlight causes the blue color" would be in the shared embedding space that represents the relationship between them, as demonstrated in figure 2. In this RAG example, the optimized embeddings would lead to improved similarity searches.
In addition to the query and answer use case, task types also provide an optimized embedding space for tasks such as classification, clustering, and fact verification.
Supported task types
The following table describes the supported task types. The best task type for your project depends on your use case. To explore all task types, see the model reference.
Task Type Category | Task Type(s) | Description | Common Use Cases |
---|---|---|---|
Retrieval (Asymmetric) | For queries: RETRIEVAL_QUERY, QUESTION_ANSWERING, FACT_VERIFICATION, CODE_RETRIEVAL_QUERY. For corpus: RETRIEVAL_DOCUMENT | Generates embeddings for a short query to search against a large corpus of documents. The query and document embeddings are optimized to work together. | Document search, RAG systems, question answering, fact verification, code retrieval. |
Classification (Symmetric) | CLASSIFICATION | Generates an embedding for a single text input, optimized for classification models. | Sentiment analysis, topic classification, spam detection. |
Clustering (Symmetric) | CLUSTERING | Generates an embedding for a single text input, optimized for clustering algorithms. | Topic modeling, customer segmentation, identifying duplicate content. |
Semantic Similarity (Symmetric) | SEMANTIC_SIMILARITY | Generates embeddings for comparing two pieces of text to determine their semantic similarity. Not intended for retrieval. | Calculating similarity scores for recommendations or paraphrasing detection. |
Determine your embeddings use case
Embeddings use cases typically fall into one of four categories: assessing text similarity, classifying texts, clustering texts, or retrieving information from texts. If your use case doesn't align with a documented use case, use the RETRIEVAL_QUERY task type by default.
Task types can be symmetric or asymmetric. Depending on your use case, you use either a symmetric or asymmetric task type.
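As a rough illustration of the selection rule described above, the following sketch maps a use case to a task type and falls back to RETRIEVAL_QUERY when nothing matches. The task type strings come from the supported task types table; the use-case keys and the function names `choose_task_type` and `is_asymmetric` are hypothetical names invented for this example and are not part of any Vertex AI API.

```python
# Task type values are taken from the supported task types table.
# The dictionary keys are hypothetical labels chosen for this sketch.
TASK_TYPE_BY_USE_CASE = {
    "classification": "CLASSIFICATION",
    "clustering": "CLUSTERING",
    "semantic_similarity": "SEMANTIC_SIMILARITY",
    "document_indexing": "RETRIEVAL_DOCUMENT",
    "search_query": "RETRIEVAL_QUERY",
    "question_answering": "QUESTION_ANSWERING",
    "fact_verification": "FACT_VERIFICATION",
    "code_search_query": "CODE_RETRIEVAL_QUERY",
}

# Task types that belong to the asymmetric (retrieval) category.
ASYMMETRIC_TASK_TYPES = {
    "RETRIEVAL_DOCUMENT",
    "RETRIEVAL_QUERY",
    "QUESTION_ANSWERING",
    "FACT_VERIFICATION",
    "CODE_RETRIEVAL_QUERY",
}

DEFAULT_TASK_TYPE = "RETRIEVAL_QUERY"


def choose_task_type(use_case: str) -> str:
    """Return the task type for a use case, or the documented default."""
    key = use_case.strip().lower()
    return TASK_TYPE_BY_USE_CASE.get(key, DEFAULT_TASK_TYPE)


def is_asymmetric(task_type: str) -> bool:
    """Report whether a task type is used for query-versus-corpus retrieval."""
    return task_type in ASYMMETRIC_TASK_TYPES
```

For example, `choose_task_type("clustering")` returns `"CLUSTERING"`, while an unrecognized use case returns the default, `"RETRIEVAL_QUERY"`.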
Symmetric use cases
In symmetric use cases, the texts being compared are of similar length and content, such as two sentences that you compare for similarity.
Classification
To classify texts according to preset labels, use the CLASSIFICATION task type. This task type generates embeddings that are optimized for classification models. For example, if you generate an embedding for the social media post "I don't like traveling on airplanes," a classification model could use the embedding to classify the sentiment as negative.
Clustering
To cluster texts based on their similarities, use the CLUSTERING task type. This task type generates embeddings that are optimized for clustering algorithms. For example, after generating and clustering embeddings for news articles, you can suggest additional sports-related articles to users who read a lot about sports.
Additional use cases for clustering include the following:
- Customer segmentation: Group customers with similar embeddings generated from their profiles or activities for targeted marketing and personalized experiences.
- Product segmentation: Cluster product embeddings based on their product title and description, product images, or customer reviews to help businesses do segment analysis on their products.
- Market research: Cluster consumer survey responses or social media data embeddings to reveal hidden patterns and trends in consumer opinions, preferences, and behaviors.
- Healthcare: Cluster patient embeddings derived from medical data to help identify groups with similar conditions or treatment responses, leading to more personalized healthcare plans.
- Customer feedback trends: Cluster customer feedback from various channels (surveys, social media, support tickets) to help identify common pain points, feature requests, and areas for product improvement.
Semantic similarity
To assess text similarity, use the SEMANTIC_SIMILARITY task type. This task type generates embeddings that are optimized for comparing the semantic similarity between two pieces of text. For example, when comparing embeddings for "The cat is sleeping" and "The feline is napping," the similarity score would be high because both texts have nearly the same meaning.
A real-world scenario for assessing input similarity is a recommendation system that identifies items (for example, products, articles, movies) that are semantically similar to a user's preferred items to provide personalized recommendations.
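The similarity score between two embeddings is commonly computed as cosine similarity. The following is a minimal sketch of that calculation in plain Python. It assumes that you already have two embedding vectors of equal length, and the function name `cosine_similarity` is a hypothetical helper, not part of the Vertex AI SDK. The source doesn't state which similarity metric to use, so treat cosine similarity here as an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("Embeddings must have the same number of dimensions.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for a zero vector.")
    return dot / (norm_a * norm_b)
```

A score close to 1 indicates that the two vectors point in nearly the same direction, which is what you would expect for paraphrases embedded with SEMANTIC_SIMILARITY.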
Asymmetric use cases
In asymmetric use cases, you compare a short query text against a large body of longer documents.
Information retrieval
When you build a search or retrieval system, you work with two types of text:
- Corpus: The collection of documents that you want to search over.
- Query: The text that a user provides to search for information within the corpus.
To get the best performance, use different task types to generate embeddings for your corpus and your queries.
First, generate embeddings for your entire collection of documents using the RETRIEVAL_DOCUMENT task type. You typically perform this step once to index your entire corpus and then store the resulting embeddings in a vector database.
Next, when a user submits a search, generate an embedding for their query text in real time using a task type that matches the user's intent. Your system then uses this query embedding to find the most similar document embeddings in your vector database.
The following task types are used for queries:
- RETRIEVAL_QUERY: Use this for a standard search query where you want to find relevant documents.
- QUESTION_ANSWERING: Use this when queries are expected to be proper questions, such as "Why is the sky blue?".
- FACT_VERIFICATION: Use this when you want to retrieve a document from your corpus that proves or disproves a statement.
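The two-phase workflow described above (index the corpus once with RETRIEVAL_DOCUMENT, then embed each query with a query task type) can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the `embed` argument is a hypothetical callable that you supply, taking a list of texts and a task type and returning one vector per text. An in-memory dictionary stands in for the vector database, and ranking by cosine similarity is an assumption. The names `index_corpus` and `search` are invented for this example.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical signature for an embedding function: (texts, task_type) -> vectors.
Embedder = Callable[[List[str], str], List[Sequence[float]]]

# Query-side task types listed in the supported task types table.
QUERY_TASK_TYPES = {
    "RETRIEVAL_QUERY",
    "QUESTION_ANSWERING",
    "FACT_VERIFICATION",
    "CODE_RETRIEVAL_QUERY",
}


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def index_corpus(documents: Dict[str, str], embed: Embedder) -> Dict[str, Sequence[float]]:
    """Embed every document once with the corpus-side task type."""
    doc_ids = list(documents)
    vectors = embed([documents[doc_id] for doc_id in doc_ids], "RETRIEVAL_DOCUMENT")
    return dict(zip(doc_ids, vectors))


def search(
    query: str,
    index: Dict[str, Sequence[float]],
    embed: Embedder,
    query_task_type: str = "RETRIEVAL_QUERY",
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Embed the query with a query-side task type and return the top-k matches."""
    if query_task_type not in QUERY_TASK_TYPES:
        raise ValueError(f"Not a query task type: {query_task_type}")
    query_vector = embed([query], query_task_type)[0]
    scored = [(doc_id, _cosine(query_vector, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Passing the embedding function as a parameter keeps the sketch independent of any particular client library, so you can substitute a real embedding call or a stub for testing.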
Code retrieval
text-embedding-005 supports the CODE_RETRIEVAL_QUERY task type, which you can use to retrieve relevant code blocks using plain text queries. To use this feature, embed code blocks using the RETRIEVAL_DOCUMENT task type, and embed text queries using CODE_RETRIEVAL_QUERY.
Here is an example:
REST
# Replace the value with your Google Cloud project ID.
PROJECT_ID=PROJECT_ID
curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/text-embedding-005:predict" -d \
  $'{
    "instances": [
      {
        "task_type": "CODE_RETRIEVAL_QUERY",
        "content": "Function to add two numbers"
      }
    ]
  }'
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
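The following is a minimal sketch of the same code retrieval query in Python, not an official sample. It assumes the `vertexai.language_models` module of the Vertex AI SDK for Python, with its `TextEmbeddingModel` and `TextEmbeddingInput` classes, and it assumes that you have already called `vertexai.init()` with your project and location. The helper names `build_predict_body` and `embed_texts` are hypothetical. `build_predict_body` only mirrors the JSON body of the REST request and doesn't call the service.

```python
from typing import Dict, List


def build_predict_body(texts: List[str], task_type: str) -> Dict[str, list]:
    """Build a request body shaped like the one in the REST example."""
    return {
        "instances": [{"task_type": task_type, "content": text} for text in texts]
    }


def embed_texts(
    texts: List[str],
    task_type: str,
    model_name: str = "text-embedding-005",
) -> List[List[float]]:
    """Embed texts with the given task type (assumes the Vertex AI SDK is set up)."""
    # Imported here so that the helper above works without the SDK installed.
    from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

    model = TextEmbeddingModel.from_pretrained(model_name)
    inputs = [TextEmbeddingInput(text=text, task_type=task_type) for text in texts]
    embeddings = model.get_embeddings(inputs)
    return [embedding.values for embedding in embeddings]


if __name__ == "__main__":
    # Code blocks in the corpus use RETRIEVAL_DOCUMENT; the plain text query
    # uses CODE_RETRIEVAL_QUERY, as described in the Code retrieval section.
    code_vectors = embed_texts(["def add(a, b):\n    return a + b"], "RETRIEVAL_DOCUMENT")
    query_vectors = embed_texts(["Function to add two numbers"], "CODE_RETRIEVAL_QUERY")
    print(len(code_vectors[0]), len(query_vectors[0]))
```

The code block passed to `embed_texts` in the usage section is an illustrative placeholder. In practice, you would embed your own code corpus once and store the vectors for later searches.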
What's next
- Learn how to get text embeddings.