
Introducing new Vertex AI text embedding models

April 10, 2024
Xiaoqi Ren

Software Engineer, Cloud AI and Industry Solutions

Jinhyuk Lee

Research Scientist, Google DeepMind


Embeddings — numerical representations of real-world data such as text, speech, images, or video — are how the foundation models powering generative AI understand relationships within data. They are expressed as fixed-dimensional vectors, where the geometric distance between two vectors in the vector space is a projection of the relationship between the two real-world objects those vectors represent.
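As a toy illustration of that idea, the snippet below uses made-up 4-dimensional vectors in place of real embeddings and computes cosine similarity, a standard way to turn geometric closeness into a relatedness score:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 = same direction (closely related), near 0 = unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for embeddings of the words below; real models
# output hundreds of dimensions (e.g., 768).
cat    = np.array([0.90, 0.10, 0.00, 0.20])
kitten = np.array([0.85, 0.15, 0.05, 0.25])
car    = np.array([0.10, 0.90, 0.30, 0.00])

print(cosine_similarity(cat, kitten))  # ~0.99: semantically close
print(cosine_similarity(cat, car))     # ~0.20: semantically distant
```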

Text embedding models are essential for many diverse natural-language processing (NLP) applications, from document retrieval and similarity measurement to classification and clustering. Our text embedding models power applications across Google Cloud including BigQuery, Cloud Database, Vertex AI Search, and Workspace.

Today, at Google Cloud Next '24, we are introducing two new Vertex AI text embedding models, with improved performance across a range of tasks, for public preview, as shown in the usage sketch below:

  • English only: text-embedding-preview-0409

  • Multilingual: text-multilingual-embedding-preview-0409
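As a minimal usage sketch, the snippet below calls the English-only preview model through the Vertex AI Python SDK. The project ID and region are placeholders, and the exact SDK surface may differ slightly from this example:

```python
import vertexai
from vertexai.language_models import TextEmbeddingModel

# Placeholders: substitute your own project and region.
vertexai.init(project="your-project-id", location="us-central1")

# Load the new English-only preview model by its published name.
model = TextEmbeddingModel.from_pretrained("text-embedding-preview-0409")

# Each input string yields one fixed-dimensional embedding vector.
embeddings = model.get_embeddings(["What are text embeddings?"])
print(len(embeddings[0].values))  # e.g., 768 dimensions
```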

New text embedding models with stronger performance

We evaluated our new models and published the metrics and more technical details in the Google study: Gecko: Versatile text embeddings distilled from large language models.

Compared to our previous versions, our new English-language embedding model increased its average score on the MTEB benchmark (a commonly used benchmark for English-language tasks, covering eight task categories) to 66.31%. In the Google study, it outperformed all existing MTEB entries with a 768-dimensional embedding size, and often outperformed models that are up to 7x larger or use embeddings with 5x more dimensions — all feats that attest to the new model's performance in downstream tasks like retrieval, reranking, clustering, classification, and semantic similarity.

Our new i18n, or multilingual, embedding model increased its average score on the MIRACL benchmark (a commonly used multilingual retrieval benchmark covering 18 different languages) to 56.2%.

The pricing for our text embedding models is $0.000025 per 1,000 characters for online requests and $0.00002 per 1,000 characters for batch requests. For more details, please refer to https://cloud.google.com/vertex-ai/generative-ai/pricing. Online prediction with the new models is supported now; batch prediction support is coming soon.
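As a quick worked example of those rates:

```python
# Cost of embedding a 50-million-character corpus at the rates above.
chars = 50_000_000
online_rate = 0.000025 / 1_000  # dollars per character, online requests
batch_rate = 0.00002 / 1_000    # dollars per character, batch requests

print(f"online: ${chars * online_rate:.2f}")  # online: $1.25
print(f"batch:  ${chars * batch_rate:.2f}")   # batch:  $1.00
```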

Dynamic embedding dimensions

Our new text embedding models also offer dynamic embedding sizes¹: users can choose to output smaller embedding dimensions, with minor performance loss, to save on computing and storage costs. See the table below for the performance tradeoff, and the truncation sketch that follows it.

Model                       | MTEB score, 256 dim | MTEB score, 768 dim
text-embedding-preview-0409 | 64.37               | 66.31
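Conceptually, a Matryoshka-trained model packs the most informative coordinates at the front of the vector, so a shorter embedding is obtained by keeping only the leading dimensions. The sketch below illustrates that idea with a random stand-in vector; the renormalization step is our assumption for cosine-similarity use, not something the post specifies:

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates and re-normalize to unit length."""
    truncated = vec[:dim]
    # Re-normalizing keeps cosine similarities well behaved (our assumption).
    return truncated / np.linalg.norm(truncated)

full = np.random.randn(768)            # stand-in for a 768-dim embedding
small = truncate_embedding(full, 256)  # 256-dim version, cheaper to store
print(small.shape)                     # (256,)
```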

Other offerings

We also offer text embedding customization for our stable model versions; customization support for the two new models above is coming soon.

We use a parameter-efficient tuning method for customization. In experiments on public retrieval benchmark datasets, this methodology shows significant quality gains of up to 41% (12% on average).
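The post does not say which parameter-efficient method is used, so the following is only an illustration of the general idea using a LoRA-style low-rank adapter: the large pretrained weight stays frozen, and only two small matrices are trained. All names and sizes here are hypothetical:

```python
import numpy as np

d, r = 768, 4                      # hypothetical layer width and adapter rank

W = np.random.randn(d, d)          # frozen pretrained weight (not trained)
A = np.random.randn(d, r) * 0.01   # trainable low-rank factor
B = np.zeros((r, d))               # trainable; zero-init so the adapter
                                   # starts as a no-op

def adapted_forward(x: np.ndarray) -> np.ndarray:
    # Frozen layer output plus the learned low-rank update A @ B.
    return x @ (W + A @ B)

# Only d*r*2 = 6,144 adapter values are trained, vs. 589,824 in W (~1%).
print(adapted_forward(np.ones(d)).shape)  # (768,)
```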

Next steps

Please give our latest models a try by following our public documentation and Colab notebook, and let us know your feedback.


1. We use Matryoshka Representation Learning for training dynamic embedding sizes.
