Introducing ScaNN vector indexing in AlloyDB, bringing 12 years of Google research to speed up vector search

April 10, 2024
Sandy Ghai

Group Product Manager, AlloyDB

Over the past year, vector databases have skyrocketed in popularity, and have become the backbone of new semantic search and generative AI experiences. Developers use vector search for everything from product recommendations, to image search, to enhancing LLM-powered chatbots with retrieval augmented generation (RAG).

PostgreSQL is one of the most popular operational databases on the market, used by 49% of developers according to Stack Overflow's 2023 Developer Survey, and growing. So, it's no surprise that pgvector, the most popular PostgreSQL extension for vector search, has become one of the most-loved vector databases. That's why we launched support for pgvector in Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL in July of last year, adding a few enhancements in AlloyDB AI to optimize performance.

The PostgreSQL community has come a long way since then, introducing support for HNSW, a state-of-the-art graph-based algorithm used in many popular databases. HNSW is supported in both AlloyDB and Cloud SQL. While HNSW offers good query performance for many vector workloads, we've heard from some customers that it isn't always a fit for their real-world use cases. Some customers with larger corpuses experience issues with index build time and high memory usage; others need fast, real-time index updates or better vector query performance.

That’s why this week we announced the new ScaNN index for AlloyDB, bringing 12 years of Google research and innovation in approximate nearest neighbor algorithms to AlloyDB. This new index uses the same technology that powers some of Google’s most popular services to deliver up to 4x faster vector queries, up to 8x faster index build times, and typically a 3-4x smaller memory footprint than the HNSW index in standard PostgreSQL, along with up to 10x higher write throughput.

The new ScaNN index is available in technology preview in AlloyDB Omni, and will become available in the AlloyDB for PostgreSQL managed service in Google Cloud shortly thereafter. 

Vector indexing using ANN algorithms

The most common use case for vectors is to find similar or relevant data. This is accomplished by querying the database for the k vectors that are closest to the query vector in terms of a distance metric such as inner product, cosine similarity, or Euclidean distance. This kind of query is referred to as a “k (exact) nearest neighbors” or “KNN” query.
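
For concreteness, here is roughly what an exact KNN query looks like with pgvector; the documents table and the 3-dimensional vectors are illustrative stand-ins (real embeddings typically have hundreds of dimensions).

    -- Enable pgvector and create a table with an embedding column.
    CREATE EXTENSION IF NOT EXISTS vector;

    CREATE TABLE documents (
      id        bigserial PRIMARY KEY,
      content   text,
      embedding vector(3)
    );

    -- Exact KNN: with no vector index, PostgreSQL compares the query vector
    -- against every row, orders by Euclidean distance (<->), and returns the
    -- 10 closest. pgvector also offers <#> (negative inner product) and
    -- <=> (cosine distance).
    SELECT id, content
    FROM documents
    ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
    LIMIT 10;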

Unfortunately, KNN queries don’t scale. This is where Approximate Nearest Neighbor (ANN) search comes in. ANN trades off some accuracy (specifically recall — the algorithm might miss some of the actual nearest neighbors) for big improvements in speed. For many use cases, this tradeoff is worthwhile. Consider, for example, user expectations from a search engine: they’ll happily accept 10 results that are approximately (if not perfectly) the most relevant, if it means they’ll get them in a fraction of a second rather than hours or days. 

In the database, ANN search uses vector indexes. Although database performance depends on many factors, the underlying ANN index plays a large role in indexing time, query performance, and memory footprint, and determines the fundamental tradeoffs between recall (i.e., accuracy) and latency. 

There are two popular types of ANN indices: graph-based and tree-quantization-based. Graph-based algorithms construct a network of nodes, which are connected by edges based on similarity. pgvector’s HNSW index implements the state-of-the-art Hierarchical Navigable Small Worlds (HNSW) graph algorithm used in many popular vector databases. HNSW builds a multi-layer hierarchical graph that can be traversed very efficiently to find nearest neighbors. These types of algorithms perform well, especially for small datasets, but have higher memory footprints and longer index build times than tree-quantization-based algorithms.
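
As an illustration, here is how an HNSW index is typically created and tuned with pgvector on the hypothetical documents table above; the parameter values shown are pgvector's defaults.

    -- Graph-based HNSW index; m controls graph connectivity and
    -- ef_construction controls build-time search effort.
    CREATE INDEX documents_embedding_hnsw
      ON documents
      USING hnsw (embedding vector_l2_ops)
      WITH (m = 16, ef_construction = 64);

    -- Queries keep the same ORDER BY ... LIMIT shape; the planner can now
    -- answer them approximately via the index instead of scanning every row.
    SET hnsw.ef_search = 40;  -- recall vs. latency knob at query time
    SELECT id FROM documents ORDER BY embedding <-> '[0.1, 0.2, 0.3]' LIMIT 10;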

Tree-quantization-based vector indexes, at a high level, structure the data such that clusters of nearby vectors are grouped together and compressed (quantized). Tree-quantization indices have smaller memory footprints and faster index build times than graph-based ANN indices. Google’s ScaNN (Scalable Nearest Neighbors) is able to achieve these benefits without sacrificing excellent query performance, thanks to key innovations around (a) geometry awareness for smarter clustering and redundancy and (b) taking advantage of modern CPU hardware.

AlloyDB’s ScaNN index brings Google's state-of-the-art ScaNN algorithm into AlloyDB. Deeper integration between the index and the AlloyDB query execution engine further improves performance, as does AlloyDB’s tiered caching architecture. Read our ScaNN for AlloyDB whitepaper for a deep dive into Google’s ScaNN algorithm and how we’ve implemented it in PostgreSQL and AlloyDB.
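
As a sketch of what this looks like in practice, the snippet below creates a ScaNN index in the AlloyDB Omni technology preview. The extension name, operator class, and num_leaves parameter are assumptions based on the preview documentation and may change, so confirm the exact syntax against the AlloyDB docs.

    -- Sketch only: names and parameters are assumptions based on the
    -- AlloyDB Omni technology preview; verify against the documentation.
    CREATE EXTENSION IF NOT EXISTS alloydb_scann;

    -- Tree-quantization index: num_leaves controls how many partitions
    -- (clusters) the vectors are grouped into at build time.
    CREATE INDEX documents_embedding_scann
      ON documents
      USING scann (embedding cosine)
      WITH (num_leaves = 100);

    -- Query syntax is unchanged from pgvector; <=> is cosine distance.
    SELECT id FROM documents ORDER BY embedding <=> '[0.1, 0.2, 0.3]' LIMIT 10;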

Key benefits of the ScaNN index

In short, the new ScaNN index for AlloyDB gives you all of the benefits of pgvector plus access to state-of-the-art vector indexing:

  • Smaller memory footprint: With the ScaNN index, AlloyDB AI typically has a 3-4x smaller memory footprint than the HNSW index in standard PostgreSQL. That means we can offer in-memory performance for larger workloads on smaller machines. It also means more memory is available for other database activities, like the buffer cache for transactional workloads.

  • Faster index build times: AlloyDB AI’s ScaNN index has up to 8x faster index build times than the HNSW index in standard PostgreSQL, which is important for developer productivity — especially when corpus sizes are larger, or when developers need to test multiple index configurations or embeddings models.
  • Higher write throughput: Up to 10x higher write throughput than the HNSW index in standard PostgreSQL makes the ScaNN index better suited to workloads that need real-time index updates. 
  • Faster vector queries: AlloyDB AI offers up to 4x faster vector queries than the HNSW index in standard PostgreSQL.
  • Full PostgreSQL compatibility: AlloyDB’s ScaNN index is compatible with pgvector, so it works with existing vector embeddings and query syntax, and can be used as either a drop-in replacement for or a complement to existing HNSW indices.
  • Excellent developer experience with SQL: Developers building semantic search and generative AI applications can leverage their existing SQL skill set for vector similarity search and take advantage of full PostgreSQL querying capabilities like joins, filters, and more (see the sketch following this list). They can also perform vector queries directly on their operational data, simplifying their technology stack and leveraging that real-time data to create the richest, most relevant experiences, all without sacrificing performance.
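
As a hedged sketch of that last point, the query below combines vector similarity with ordinary SQL joins and filters; the products and orders tables and their columns are hypothetical.

    -- Rank products from a category that were ordered in the last 30 days
    -- by similarity to a query embedding; the vector index can serve the
    -- ORDER BY ... LIMIT while standard filters and joins apply as usual.
    SELECT p.id, p.name
    FROM products AS p
    JOIN orders   AS o ON o.product_id = p.id
    WHERE p.category = 'outdoor'
      AND o.created_at > now() - interval '30 days'
    ORDER BY p.embedding <=> '[0.1, 0.2, 0.3]'  -- cosine distance to the query vector
    LIMIT 10;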

    "At AbemaTV, we use AlloyDB AI for embeddings generation and vector search in 「ABEMA」, our streaming service, to make video recommendations. We're excited to see the rapid expansion of model support and vector capabilities in AlloyDB, and plan to use the new model catalog to more easily access the latest embeddings models like Gemini Pro. We're also looking forward to trying the newly announced scann index to speed up vector search.” - Shunya Suga, Engineering Manager, AbemaTV inc.

Getting started

AlloyDB now gives you the richest set of native vector search options in SQL databases, by offering both the graph-based HNSW index from pgvector and the pgvector-compatible ScaNN index for AlloyDB based on tree-quantization. 

ScaNN for AlloyDB is available today as a technology preview via the downloadable AlloyDB Omni. Follow our quickstart guide to deploy AlloyDB Omni on a VM in GCP, on your server or laptop, or on the cloud of your choice. And then follow our documentation to get started with easy and fast vector queries.
