pgvector is an extension for PostgreSQL (also called Postgres) that simplifies working with vectors—enabling you to store, search, and index them directly in your relational database.
With pgvector, adding advanced capabilities like similarity search to your applications and AI agents can be both straightforward and scalable, without having to move data around or change your application architecture to accommodate the new vector data type.
pgvector is an open source extension for PostgreSQL that helps you store, index, and search high-dimensional vectors directly within your existing PostgreSQL database.
A vector represents data numerically in a way that captures its key characteristics, mapping it into a virtual mathematical space. In this space, similar items—like words, images, or objects—are positioned close together.
For example, consider the words “coat” and “jacket.” Traditional keyword-based searches would not connect these two words as similar, because their letters are quite different. An e-commerce system that wants to unite these keywords would need to do so manually. However, the vector representations of these two would be very close because they share similar meanings—delivering more accurate search results for users and saving time for developers.
Similarly, if you take two different pictures of cats, then pixel by pixel they might be vastly different. However, their vector embeddings would place them very close together in the mathematical space, just as a human would easily identify both of these as images of cats.

To make this work, an embedding model transforms raw data—such as images or text—into vector embeddings. pgvector stores these embeddings in your database. When a user submits a query, that input is also converted into a vector. pgvector then calculates the distance between the query vector and stored vectors to efficiently identify the "nearest neighbors" with the highest similarity scores.
Curious about different types of nearest neighbor searches? Check out our guide to generative AI app development.
PostgreSQL is a robust, open source relational database management system designed to handle structured data using tables, rows, and columns.
pgvector is an extension that runs inside PostgreSQL. It adds “vector,” a new data type, to the database, allowing storage and processing of vector embeddings alongside your standard operational data.
pgvector is not a separate database. It is an extension that integrates directly into your existing PostgreSQL database, which lets you add advanced AI and search capabilities without managing new or separate infrastructure.
To support today's AI-driven features, you need the ability to store and manage vector embeddings.
PostgreSQL is powerful on its own, but because its data is structured into tables, rows, and columns, its built-in search is largely limited to exact values, keywords, and pattern matching.
In the world of AI, complex data like text, images, and audio is encoded as vector representations. These encodings enable AI models to grasp the context and semantic relationships within your data, forming the backbone of features like intelligent search, recommendations, and gen AI.
The pgvector extension brings semantic search to PostgreSQL, using vector embeddings to find results based on a query's meaning rather than on keyword matches alone, as a conventional SQL query would. This process, known as similarity search, makes it straightforward to add advanced search capabilities directly into your applications without needing to re-architect or move data to a separate vector database.
Want to learn more about vector embeddings? Check out our guide to generative AI app development.
With its ability to handle high-dimensional vectors, pgvector supports a range of advanced applications.
Keyword matching in traditional relational databases often fails to identify meaningful connections in data. Similarity search compares vector proximity using metrics like Euclidean distance and cosine distance to find deeper patterns, critical for applications like image recognition and semantic search, where results are ranked by meaning. In e-commerce, for example, similarity search enables product recommendations by analyzing user behavior and finding related items.
Vector-based natural language processing allows AI agents to understand context, leading to more personalized conversations and more accurate responses. Multi-lingual support enhances their performance as virtual assistants and customer service platforms.
pgvector enhances AI workflows by enabling the storage and querying of vector embeddings, which are essential for identifying unusual patterns in data. By analyzing vector proximity, it helps detect anomalies in real time for fraud prevention, network security, or quality control.
Sentiment analysis gauges the tone and intent of a message, so you can route negative comments for faster action and offer tailored resolutions.
By leveraging PostgreSQL’s scalability, transaction support, and robust reliability, pgvector efficiently manages high-dimensional datasets. Additionally, its use of familiar SQL syntax makes it accessible for existing teams, eliminating the need for additional tools or infrastructure dedicated to vector indexing and search.
Easily integrates into existing PostgreSQL-based apps.
Improves PostgreSQL’s scalability for growing datasets.
Offers customizable features like distance metrics and indexing.
Inherits PostgreSQL’s trusted security and reliability.
Allows you to seamlessly query across structured and unstructured data.
Provides a developer-friendly solution for working with large-scale, high-dimensional data.
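As an illustration of querying across structured and unstructured data, the sketch below combines ordinary SQL filters with a vector similarity ordering in a single statement. The `products` table, its columns, and the embedding values are hypothetical and made up for this example. The vectors use three dimensions only for readability, whereas real embeddings typically have hundreds or thousands.

```sql
-- Assumes you have permission to enable extensions in this database.
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical table mixing standard columns with an embedding column.
CREATE TABLE products (
    id        bigserial PRIMARY KEY,
    name      text NOT NULL,
    category  text NOT NULL,
    price     numeric(10, 2) NOT NULL,
    embedding vector(3)
);

-- Made-up sample rows; in practice the embeddings come from an embedding model.
INSERT INTO products (name, category, price, embedding) VALUES
    ('Wool coat',     'outerwear', 120.00, '[0.9, 0.1, 0.2]'),
    ('Denim jacket',  'outerwear',  80.00, '[0.8, 0.2, 0.1]'),
    ('Running shoes', 'footwear',   95.00, '[0.1, 0.9, 0.3]');

-- Structured filters (category, price) and vector ranking in one query.
SELECT name, price
FROM products
WHERE category = 'outerwear'
  AND price < 100
ORDER BY embedding <-> '[0.85, 0.15, 0.15]'  -- query vector, also made up
LIMIT 5;
```

The `WHERE` clause narrows the candidates using regular columns, and the `ORDER BY` clause ranks what remains by distance to the query vector.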
For a single database that excels in both traditional SQL queries and modern vector search, consider AlloyDB for PostgreSQL. AlloyDB uses the ScaNN (Scalable Nearest Neighbor) vector similarity search algorithm developed by Google, delivering significantly higher performance than other cloud-based PostgreSQL services for transactional and analytical workloads within large databases.
Learn how AlloyDB performs simultaneous search on structured and unstructured data.
Cloud SQL and AlloyDB for PostgreSQL support pgvector, allowing you to store and query vector embeddings using standard SQL commands.
Use your preferred PostgreSQL client (such as psql, pgAdmin, or the Google Cloud console) to connect to your Cloud SQL or AlloyDB instance.
Run the following SQL command to enable the extension on your database. You only need to do this once per database.
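A minimal sketch of that command, assuming your database user has permission to create extensions. pgvector registers itself under the extension name `vector`:

```sql
-- Enable pgvector in the current database.
-- IF NOT EXISTS makes the statement safe to run more than once.
CREATE EXTENSION IF NOT EXISTS vector;
```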
Create a new table (or alter an existing one) to include a column for vector data. You must specify the dimensions of the vector. For example, to create a table for storing 3-dimensional embeddings:
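One way such a table might look. The table name `items`, the column names, and `my_existing_table` are illustrative assumptions:

```sql
-- Hypothetical table: each row holds an id and a 3-dimensional embedding.
CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    embedding vector(3)  -- the number in parentheses is the vector dimension
);

-- To add a vector column to a table you already have, something like:
-- ALTER TABLE my_existing_table ADD COLUMN embedding vector(3);
```

The dimension should match the output size of the embedding model you use. Inserting a vector with a different number of elements is rejected.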
You can insert vector embeddings just like standard data. Each vector is written as a quoted, comma-separated list of numbers enclosed in square brackets, such as '[1,2,3]'.
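A short sketch, assuming a table named `items` with an `embedding vector(3)` column as described in the previous step. The names and values are illustrative:

```sql
-- Insert two 3-dimensional vectors; each literal is a quoted, bracketed list.
INSERT INTO items (embedding)
VALUES ('[1,2,3]'),
       ('[4,5,6]');
```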
You can now query your data to find the nearest neighbors. The <-> operator calculates Euclidean distance (L2 distance), which is commonly used to find the most similar items.
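For example, a nearest-neighbor query might look like the following, again assuming the hypothetical `items` table with an `embedding vector(3)` column and an arbitrary query vector:

```sql
-- Return the five rows whose embeddings are closest to the query vector,
-- ordered by Euclidean (L2) distance, smallest first.
SELECT id,
       embedding,
       embedding <-> '[3,1,2]' AS distance
FROM items
ORDER BY embedding <-> '[3,1,2]'
LIMIT 5;
```

pgvector also provides `<=>` for cosine distance and `<#>` for negative inner product, so the same query shape works with a different metric by swapping the operator.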
For larger datasets, adding an index can significantly speed up search performance. HNSW, provided by pgvector, and ScaNN, available in AlloyDB, are commonly used options. Here’s an HNSW example:
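A minimal HNSW sketch, assuming the hypothetical `items` table with an `embedding` column and queries that use the `<->` (L2 distance) operator:

```sql
-- Build an HNSW index for L2-distance searches on the embedding column.
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);
```

The operator class should match the distance operator used in your queries: `vector_l2_ops` pairs with `<->`, while `vector_cosine_ops` pairs with `<=>`. Note that an HNSW index performs approximate nearest neighbor search, trading a small amount of recall for speed.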