
Cloud SQL for PostgreSQL: Faster similarity search performance with pgvector indexes

December 1, 2023
Eeshan Gupta

Software Engineer


Update April 22, 2024: Cloud SQL for PostgreSQL version 12 and above now supports pgvector 0.6.0. This version brings multiple performance and stability improvements, with a focus on HNSW index building. Key improvements include:

  • Parallel HNSW index builds are now supported
  • Reduced memory usage and write-ahead log (WAL) generation during HNSW index builds
  • Improved index search performance

In the fast-evolving landscape of approximate nearest neighbor (ANN) search, one of the most important recent launches is pgvector extension version 0.5.0, which adds support for Hierarchical Navigable Small World (HNSW) indexes. In the constant quest for efficiency, speed, and performance, HNSW indexes significantly reduce ANN query latency at a given recall target. We are excited to announce support for pgvector 0.5.0 in Cloud SQL for PostgreSQL.

In this post, we explain pgvector indexes, clarify different configurations, and give hands-on coding examples for improving the performance and viability of a pgvector-based application using HNSW indexes.

The pgvector extension in Cloud SQL for PostgreSQL

The pgvector extension can be installed in an existing Cloud SQL for PostgreSQL instance using the CREATE EXTENSION command shown below. If you do not have an existing instance, create one today. Cloud SQL for PostgreSQL now supports pgvector 0.5.0, which adds HNSW indexes alongside the existing Inverted File Flat (IVFFlat) indexes.

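For example:

    CREATE EXTENSION IF NOT EXISTS vector;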

If you have already installed an older version of pgvector in your Cloud SQL for PostgreSQL instance, it can now be upgraded to pgvector 0.5.0 using the following command.

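    -- Upgrade an existing pgvector installation in place.
    ALTER EXTENSION vector UPDATE TO '0.5.0';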

The extension registers a new data type called “vector” in PostgreSQL and defines several new operators on it:

  • Element-wise addition (+)
  • Element-wise subtraction (-)
  • Element-wise multiplication (*)
  • Euclidean distance (<->)
  • Cosine distance (<=>)

The distance operators allow finding vectors that are semantically similar. You can find more in-depth details about how they work in our previous pgvector blog.

The commands below demonstrate how to insert vectors into a PostgreSQL table and compute row-wise cosine distances.

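A minimal sketch (the table name and the three-dimensional vectors are illustrative; real embeddings have hundreds of dimensions):

    -- Create a table with a three-dimensional vector column.
    CREATE TABLE embedding_examples (
      id bigserial PRIMARY KEY,
      embedding vector(3)
    );

    -- Insert a few vectors.
    INSERT INTO embedding_examples (embedding)
    VALUES ('[1, 2, 3]'), ('[4, 5, 6]'), ('[7, 8, 9]');

    -- Compute the cosine distance from each row to a query vector.
    SELECT id, embedding <=> '[3, 1, 2]' AS cosine_distance
    FROM embedding_examples
    ORDER BY cosine_distance;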

Indexes in pgvector

In the same way that database indexes allow for faster data retrieval, vector indexes can significantly speed up similarity search operations and avoid the brute-force exact nearest neighbor search that is used by default. Although a brute-force search across a large database provides perfect recall, it comes at a significant cost to performance.

While the pgvector extension already supported IVFFlat indexes, version 0.5.0 adds support for HNSW indexes. Unlike indexes used for exact retrieval, such as B-tree indexes, HNSW indexes perform approximate search and can therefore return different results for the same query.

HNSW index

The HNSW index uses the Hierarchical Navigable Small World (HNSW) algorithm to build optimized graphs for performing approximate nearest neighbor search. It is one of the top-performing vector indexing algorithms used in vector databases. It has better query performance than many other algorithms, including IVFFlat, albeit at the cost of slower build times and higher memory usage.

Another advantage of HNSW is that there is no training step when building the index. In other words, you can create the index on an empty table, and it builds up incrementally as data is added. This is different from IVFFlat, where the index must be rebuilt periodically as the indexed data changes in order to maintain good performance.

The following code snippet demonstrates how you can build HNSW indexes in pgvector with different distance functions and index parameters.

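    -- HNSW index using cosine distance.
    CREATE INDEX ON product_embeddings
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 24, ef_construction = 100);

    -- The operator class selects the distance function; for example,
    -- vector_l2_ops would build the index for Euclidean (L2) distance instead.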

This snippet creates an HNSW index on the embedding column of the product_embeddings table using the cosine distance function, with HNSW parameters m and ef_construction set to 24 and 100 respectively.

Before trying it out yourself, you should understand the following HNSW tuning parameters:

  • m: The maximum number of connections to neighboring data points in the graph. Higher values make the graph denser, providing faster lookups at the cost of longer build times and higher memory usage. According to the original HNSW paper, this is the most important parameter. A reasonable range for m is 5 to 48, with 16 being the default in pgvector. A larger value provides better performance at good recall, especially for high-dimensional data.
  • ef_construction: The size of the list that holds the closest candidates during graph traversal when building the index. Higher values lead the algorithm to consider more candidates, potentially producing a better index. Beyond a point, however, increasing this parameter yields diminishing returns.
  • ef_search: Unlike the other parameters, which configure index builds, this parameter configures query execution and limits the number of nearest-neighbor candidates maintained in the list during a search. Higher values lead to better recall at the cost of query performance, since the algorithm considers and returns more nodes. This parameter can be set at the instance, session, or even transaction level, as shown below.
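
For example, ef_search can be set at the session or transaction level as follows (the value 100 is illustrative):

    -- Set the search-time candidate list size for the current session.
    SET hnsw.ef_search = 100;

    -- Or scope the setting to a single transaction.
    BEGIN;
    SET LOCAL hnsw.ef_search = 100;
    -- ... run similarity queries here ...
    COMMIT;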

IVFFlat index

You can also use the IVFFlat index to index vector columns. It works by dividing the vectors into lists (clusters) and searching only the subset of lists closest to the input vector. IVFFlat indexes have faster build times and use less memory than HNSW indexes; however, HNSW indexes have better query performance.

The following code snippets demonstrate how to build IVFFlat indexes in pgvector:

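    -- IVFFlat index using cosine distance. The lists parameter controls how
    -- many clusters the vectors are divided into; 100 is an illustrative
    -- value (a common starting point is rows / 1000 for smaller tables).
    CREATE INDEX ON product_embeddings
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

    -- As with HNSW, the operator class selects the distance function,
    -- e.g. vector_l2_ops for Euclidean (L2) distance.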

Enriching the sample application with indexes

We created an interactive Google Colab application using pgvector and LLMs in our previous blog. As part of that, we loaded a dataset sampled from a larger public retail dataset, which is available on Kaggle. The dataset used in the application has around 800 toy products, with descriptions available for each product. This was loaded into a PostgreSQL table called products.

These long descriptions were split into chunks using LangChain, and embeddings were generated for each chunk using the Vertex AI Text Embedding model. These embeddings were then stored as a vector type in a table called product_embeddings. Finally, similarity search operators were used to find the nearest neighbors of embeddings generated from search terms.

In this post, we extend the same sample application to leverage pgvector indexes, making the application scalable to much larger numbers of vectors. Note that we use the same setup as the previous blog and build upon the same interactive Google Colab notebook.

Follow the instructions in the Colab notebook to set up your environment. If an instance with the required name does not exist, the notebook creates a Cloud SQL PostgreSQL instance for you. Running the notebook may incur Google Cloud charges. You may be eligible for a free trial that gets you credits for these costs.

Adding HNSW indexes to optimize vector similarity search

Following the steps of the previous blog, we’ve made an entire product dataset searchable using natural language!

All you have to do is generate vector embeddings for the incoming query and then use the pgvector cosine similarity search operator to find related products. Now let’s try to optimize it!

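The query looks something like the following sketch; the column names and the :query_embedding parameter are illustrative:

    -- :query_embedding holds the embedding generated for the search phrase.
    SELECT product_id, content
    FROM product_embeddings
    ORDER BY embedding <=> :query_embedding
    LIMIT 10;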

If we analyze the query plan by prepending EXPLAIN ANALYZE, we see that the query iterates over more than 800 rows and takes over 12 ms.

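Using the same illustrative query:

    EXPLAIN ANALYZE
    SELECT product_id, content
    FROM product_embeddings
    ORDER BY embedding <=> :query_embedding
    LIMIT 10;

    -- The plan reports a sequential scan over all ~800 rows and an
    -- execution time of a little over 12 ms.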

Instead, let’s try creating an HNSW index. Since the dataset has only 800 rows, we can consider smaller values for the index parameters.

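For example (the parameter values here are illustrative picks for a small table):

    -- Smaller graph parameters are sufficient for ~800 rows. Omitting the
    -- index name lets PostgreSQL generate product_embeddings_embedding_idx.
    CREATE INDEX ON product_embeddings
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 8, ef_construction = 32);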

Running the same query again returns the same results as the full scan. However, if we EXPLAIN ANALYZE the query again, PostgreSQL now uses the product_embeddings_embedding_idx index, resulting in a much cheaper query.

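    EXPLAIN ANALYZE
    SELECT product_id, content
    FROM product_embeddings
    ORDER BY embedding <=> :query_embedding
    LIMIT 10;

    -- The plan now reports an index scan using
    -- product_embeddings_embedding_idx instead of a sequential scan.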

Adding the index results in a massive improvement in query performance: in this example, the execution time dropped to around 0.5 ms, demonstrating the power of pgvector indexes.

This performance gain is on a dataset of only 800 rows, but it becomes even more pronounced with bigger datasets. In some cases, an index is the only realistic way to scale pgvector to real datasets with hundreds of thousands or even millions of rows.

For example, let’s use the 30,000 rows from the original dataset on Kaggle. At this scale, using an HNSW index is already much faster than a sequential scan. With even more rows and dimensions, the sequential scan becomes completely infeasible.

With sequential scan:

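One way to sketch the comparison is to temporarily steer the planner away from the index (enable_indexscan is a standard PostgreSQL planner setting; exact timings will depend on your instance):

    -- Force a sequential scan by disabling index scans for this session.
    SET enable_indexscan = off;

    EXPLAIN ANALYZE
    SELECT product_id, content
    FROM product_embeddings
    ORDER BY embedding <=> :query_embedding
    LIMIT 10;

    -- The plan reports a sequential scan over all ~30,000 rows.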

With HNSW index scan:

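Then run the same query with the planner free to use the index:

    -- Let the planner use the HNSW index again.
    SET enable_indexscan = on;

    EXPLAIN ANALYZE
    SELECT product_id, content
    FROM product_embeddings
    ORDER BY embedding <=> :query_embedding
    LIMIT 10;

    -- The plan reports an index scan using the HNSW index, with a much
    -- lower execution time than the sequential scan at this scale.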

The ANN-Benchmarks package on GitHub provides tools to benchmark several of these approximate nearest neighbor (ANN) algorithms. Using this package, you can measure the performance characteristics and behavior of pgvector with HNSW indexes, IVFFlat indexes, and full scans on a variety of real-world datasets.

Leveraging these indexes, you can run similarity search at scale with appreciable queries-per-second throughput on real-world data, even with millions of vector embeddings.

Summary

We hope this post demonstrates the power of indexes in pgvector and how adding indexes like HNSW makes vector similarity search fast and scalable with production data in PostgreSQL. The hands-on application, dataset, and step-by-step instructions should make it easier for you to build your own application.

We’ve updated the existing Google Colab notebook to include the benefits of pgvector indexes described in this blog. Get started today!

Getting started

You can get started with generative AI applications using the jump-start solution in its GitHub repository. The solution leverages pgvector and is built using Cloud SQL, GKE, and Vertex AI. It contains Terraform templates to set up infrastructure following security best practices, along with sample application code for an LLM-powered chatbot. You can deploy it in a few clicks and then customize it based on your needs. Try the jump-start solution today.
