Vertex AI Matching Engine provides high-scale low latency vector database (a.k.a, vector similarity-matching or approximate nearest neighbor service). Matching Engine provides tooling to build use cases that match semantically similar items. This ability to search for semantically similar or semantically related items out of millions has many real world use cases and is a vital part of applications such as:
- Recommendation engines
- Search engines
- Ad targeting systems
- Image classification or image search
- Text classification
- Question answering
- Chat bots
The state of the art paradigm for building semantic matching systems is by computing vector representations of the items. These vector representations are often called embeddings. Embeddings are computed by using machine learning models. The models are trained to learn an embedding space where similar examples are close while dissimilar ones are far apart. The closer two items are in the embedding space, the more similar they are.
The following illustration shows how this technique can be applied to the problem of finding books, from a database, that are the best semantic match to an input query. To answer a query with this approach, the system must first map each database item to an embedding, then map the query to the embedding space. The system must then find, among all database embeddings, the ones closest to the query; this is the nearest neighbor search problem (which is sometimes also referred to as vector similarity search).
The use of embeddings is not limited to words or text. With the use of machine learning models (often deep learning models), you can generate semantic embeddings for multiple types of data, for example, photos, audio, movies, and user preferences.
At a high level, semantic matching can be simplified into two critical steps:
- Generate embedding representations of items.
- Perform nearest neighbor searches on embeddings.
Vertex AI Matching Engine is a vector database that leverages the unique characteristics of embedding vectors to efficiently index them, for easy and scalable search and retrieval of similar embeddings. It enables high-scale, high queries per second (QPS) cost-efficient and low latency querying over indexes with more than a billion embedding vectors.
What's next
Quotas
Learn about Vertex AI Matching Engine quotas and how to request quota increases.