Vertex AI Matching Engine overview

Summary: Vertex AI Matching Engine provides an industry-leading, high-scale, low-latency vector-similarity matching service (also known as approximate nearest neighbor, or ANN, search), along with industry-leading algorithms for training semantic embeddings for similarity-matching use cases.

Matching Engine provides tooling to build use cases that involve matching semantically similar items. More specifically, given a query item, Matching Engine finds the items most semantically similar to it from a large corpus of candidate items. The ability to search for semantically similar or semantically related items has many real-world use cases and is a vital part of applications such as:

- Recommendation engines
- Search engines
- Ad targeting systems
- Image classification or image search
- Text classification
- Question answering
- Chatbots

The state-of-the-art paradigm for building such semantic matching systems is to compute vector representations of the items. These vector representations are often called embeddings. Embeddings are computed with machine learning models, increasingly deep learning models, which are trained to learn an embedding space in which similar examples are close together while dissimilar ones are far apart. Thus, the closer two items are in the embedding space, the more similar they are.
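As a concrete illustration of "closer means more similar," the sketch below compares embeddings with cosine similarity, a common closeness measure for embedding spaces. The three-dimensional toy vectors are made up for illustration; real embeddings typically have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy embeddings: "king" and "queen" should land near each
# other in the embedding space, while "pizza" should land far away.
king  = np.array([0.90, 0.80, 0.10])
queen = np.array([0.85, 0.82, 0.12])
pizza = np.array([0.10, 0.05, 0.95])

print(cosine_similarity(king, queen))  # high: semantically similar
print(cosine_similarity(king, pizza))  # low: semantically dissimilar
```

Cosine similarity ignores vector magnitude and compares direction only; Euclidean distance is another common choice, and on unit-normalized embeddings the two induce the same nearest-neighbor ranking.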

The following illustration shows how this technique can be applied to the problem of finding the books in a database that best match an input query semantically. To answer a query with this approach, the system must first map each database item to an embedding, then map the query into the same embedding space. The system must then find, among all database embeddings, the ones closest to the query embedding; this is the nearest neighbor search problem (sometimes also referred to as vector similarity search).
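The nearest neighbor search step can be stated precisely as a brute-force computation. The sketch below (the function name and toy vectors are illustrative) finds the exact k nearest database embeddings to a query by cosine similarity. This is tractable for small corpora, but its cost grows linearly with corpus size, which is what motivates approximate methods at scale.

```python
import numpy as np

def nearest_neighbors(query, db, k=3):
    """Exact k-nearest-neighbor search by cosine similarity (brute force).

    query: 1-D array, the query embedding.
    db:    2-D array, one database embedding per row.
    Returns the indices of the k most similar rows, best first.
    """
    q = query / np.linalg.norm(query)
    d = db / np.linalg.norm(db, axis=1, keepdims=True)
    sims = d @ q                      # cosine similarity to every database item
    return np.argsort(-sims)[:k]      # indices sorted by descending similarity

# Toy database of three embeddings and a query close to the first one
db = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])
query = np.array([1.0, 0.1])
print(nearest_neighbors(query, db))
```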

Figure: query and database points in the embedding space.

The use of embeddings is not limited to words or text. Machine learning models (often deep learning models) can generate semantic embeddings for many types of data, such as photos, audio, movies, and user preferences.

Thus, at a high level, the semantic matching paradigm can be simplified into two critical steps:

+ Generate embedding representations of items.
+ Perform nearest neighbor searches on the embeddings.

Vertex AI Matching Engine provides tooling for both of these phases. Specifically, it provides:

+ Out-of-the-box models that can be trained to produce embedding representations of items.
+ A high-scale, low-latency Approximate Nearest Neighbor (ANN) service for finding similar embeddings.
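To give a feel for how an ANN service trades a little accuracy for a large speedup, here is a toy locality-sensitive-hashing sketch: items are hashed into buckets by their sign pattern against a few random hyperplanes, and a query is scored only against items in its own bucket instead of the whole corpus. This is for intuition only; it is not Matching Engine's actual algorithm, which uses far more sophisticated techniques.

```python
from collections import defaultdict
import numpy as np

rng = np.random.default_rng(0)
dim, n = 16, 1000

# Toy database of n unit-normalized embeddings
db = rng.normal(size=(n, dim))
db /= np.linalg.norm(db, axis=1, keepdims=True)

# 8 random hyperplanes -> up to 2**8 = 256 hash buckets
planes = rng.normal(size=(8, dim))

def bucket(v):
    """Hash a vector to its sign pattern against the random hyperplanes."""
    return tuple(int(s) for s in (planes @ v > 0))

# Build the index: map each bucket to the item ids it contains
index = defaultdict(list)
for i, v in enumerate(db):
    index[bucket(v)].append(i)

def ann_search(query):
    """Score only the candidates in the query's bucket (approximate!)."""
    candidates = index.get(bucket(query), [])
    return max(candidates, key=lambda i: float(db[i] @ query), default=None)

print(ann_search(db[42]))  # searching with item 42 itself recovers item 42
```

Nearby vectors usually share a sign pattern and land in the same bucket, so each query touches only a small fraction of the corpus; the price is that a true neighbor on the other side of a hyperplane can be missed.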

These capabilities are provided as independent, modular offerings; they don't have to be used together. For example, you can produce embeddings using another service and ingest them into Matching Engine's ANN service to perform nearest neighbor queries. It is common, for instance, for customers to use models such as BERT or ResNet from TF-Hub to produce text or image embeddings. Similarly, you can train embeddings using Matching Engine's Two-Tower model and export them for use in another service.
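As a sketch of the ingestion path, embeddings produced elsewhere can be written out as newline-delimited JSON records with `id` and `embedding` fields, along the lines of what Matching Engine accepts as index input data. Check the current documentation for the exact format; the file name and embedding values below are made up.

```python
import json

# Hypothetical embeddings produced by some other service or model,
# e.g. a TF-Hub text encoder; ids and vectors here are illustrative.
embeddings = {
    "doc-1": [0.1, 0.2, 0.3],
    "doc-2": [0.4, 0.5, 0.6],
}

# One JSON record per line, each with an "id" and its "embedding" vector
with open("embeddings.json", "w") as f:
    for item_id, vector in embeddings.items():
        f.write(json.dumps({"id": item_id, "embedding": vector}) + "\n")
```

The resulting file would then be uploaded to Cloud Storage and referenced when creating the index.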

What's next