Jump to Content
Databases

Building AI-powered apps on Google Cloud databases using pgvector, LLMs and LangChain

June 26, 2023
Saket Saurabh

Software Engineer

Editor's note, January 6, 2025: AlloyDB now supports the ScaNN index in addition to open source pgvector, bringing twelve years of Google Research to PostgreSQL for improved vector search performance and accuracy.


Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL now support the pgvector extension, bringing the power of vector search operations to PostgreSQL databases. You can read the full announcement here

In this step-by-step tutorial, we will show you how to add generative AI features to your own applications with just a few lines of code using pgvector, LangChain and LLMs on Google Cloud. You can also follow along with our guided tutorial video. We’ll build a sample Python application that can understand and respond to human language queries about the data stored in your PostgreSQL database. Then, we’ll further push the creative limits of the application by teaching it to create new AI-generated product descriptions based on our existing dataset about children's toys. Forget about boring “orange and white cat stuffed animal” descriptions of days past — with AI, you can now generate descriptions like “ferocious and cuddly companion to keep your little one company, from crib to pre-school”. Let's see what our AI-generated taglines can do!

How to install pgvector in Cloud SQL and AlloyDB for PostgreSQL

The pgvector extension can be installed within an existing instance of Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL using the CREATE EXTENSION command as shown below. If you do not have an existing instance, create one for Cloud SQL and AlloyDB.

Loading...

The new `vector` data type

Under the hood, the pgvector extension uses the PostgreSQL `CREATE TYPE` command to register a new data type called `vector`. PostgreSQL table columns can be defined using this new `vector` data type. A string with comma-separated numbers within square brackets can be used to insert values into this column as shown below. 

The following code snippet uses a three-dimensional vector as an example. In actual AI/ML applications, a vector has many more dimensions, typically in the order of hundreds.

Loading...

New similarity search operators in pgvector 

The pgvector extension also introduces new operators for performing similarity matches on vectors, allowing you to find vectors that are semantically similar. Two such operators are:

‘<->’: returns the Euclidean distance between the two vectors. Euclidean distance is a good choice for applications where the magnitude of the vectors is important — for example, in mapping and navigation applications, or when implementing the K-means clustering algorithm in machine learning. 

‘<=>’: returns the cosine distance between the two vectors. Cosine similarity is a good choice for applications where the direction of the vectors is important — for example, when trying to find the most similar document to a given document for implementing recommendation systems or natural language processing tasks. 

We use the cosine similarity search operator for our sample application.

Building the sample application

Let's get started with building our application with pgvector and LLMs. We’ll also use LangChain, which is an open-source framework that provides several pre-built components that make it easier to create complex applications using LLMs.

The entire application is available as an interactive Google Colab notebook for Cloud SQL PostgreSQL. You can directly run this sample application from your web browser without any additional installations, or writing a single line of code!

Follow the instructions in the Colab notebook to set up your environment. Note that if an instance with the required name does not exist, the notebook creates a Cloud SQL PostgreSQL instance for you. Running the notebook may incur Google Cloud charges. You may be eligible for a free trial that gets you credits for these costs. 

Loading our ‘toy’ dataset

The sample application uses an example of an e-commerce company that runs an online marketplace for buying and selling children’s toys. The dataset for this notebook has been sampled and created from a larger public retail dataset available at Kaggle. The dataset used in this notebook has only about 800 toy products, while the public dataset has over 370,000 products in different categories. 

After you set up the environment using the steps mentioned in the Colab notebook, load the provided sample dataset into a Pandas data frame. The first five rows of the dataset are shown for your reference below.

lang-py
Loading...

https://storage.googleapis.com/gweb-cloudblog-publish/images/01-product-examples.max-2000x2000.jpg

Save the dataset in a PostgreSQL table called `products` that has a simple schema with four fields: product_id, product_name, description, and list_price.

lang-py
Loading...

Generating the vector embeddings using Vertex AI

We use the Vertex AI Text Embedding model to generate the vector embeddings for the text that describes various toys in our products table. At publication, the Vertex AI Text Embedding model only accepts 3,072 input tokens in a single API request. Therefore, as a first step, split long product descriptions into smaller chunks of 500 characters each.

Split long text into smaller chunks with LangChain

You can use the RecursiveCharacterTextSplitter method from LangChain library, which provides a nice and simple way to split the large text into smaller chunks.

lang-py
Loading...

Get the vector embeddings using Vertex AI

After you split long product descriptions into smaller chunks, you can generate vector embeddings for each chunk by using the Text Embedding Model available through Vertex AI. Notice how we can accomplish this in just a few lines of code!

lang-py
Loading...

Use pgvector to store the generate embeddings

After creating the pgvector extension and registering a new vector data type, you can store a NumPy array directly into a PostgreSQL table.

lang-py
Loading...

Finding similar toys using pgvector cosine search operator

By completing the steps above, you’ve just made an entire toy dataset searchable using simple English! Check it out in the image below. You can even filter toys based on a specific price range, demonstrating the hybrid search capabilities.

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/02-demo-pgvector-search.gif

So, how does this work? Let’s break it down. 

Step 1: Generate the vector embedding for the incoming input query.

lang-py
Loading...

Step 2: Use the new pgvector cosine similarity search operator to find related products 

Notice how you can combine the vector search operation with the regular SQL filters on the `list_price` column using the powerful PostgreSQL and pgvector query semantics.

lang-py
Loading...

Use case 1: Building an AI-curated contextual hybrid search

Now that you’ve learned how to find similar toy products, it’s time to super-charge your application with AI. We will use large language models (LLMs) to make our application more intelligent and capable of answering user queries about these products.

The following example shows how a grandparent uses the AI-powered search interface to find a perfect toy for their grandkid by describing their needs in simple English!

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/03-demo-ai-hybrid-search.gif

What is happening behind the scenes here?

First, retrieve all the matching products and their descriptions using pgvector, following the same steps that we showed above.

Then, use the MapReduce Chain from LangChain library to build a high-quality prompt context by combining summaries of all similar toy products. 

Finally, invoke the Vertex AI text generation LLM model to get a well-formatted answer. See the code snippet below for an example.

lang-py
Loading...

Use case 2: Adding AI-powered creative content generation

Building a hybrid semantic search is a common, powerful example for using LLMs with vector embeddings. But there’s so much more you can do with this new technology! 

You can create an AI-powered creative content generation tool by adjusting LLM prompt input and model temperature settings. Temperature is an input parameter for an LLM prompt that ranges from zero to one, and it defines the randomness of the model's output. A higher temperature will result in more creative response, while a lower temperature will result in more definitive and factual output.

In the following example, you can see how a seller uses generative AI to get auto-generated item descriptions for a new bicycle product that they want to add to the platform!

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/04-demo-ai-content-generation.gif

Like before, you can use the pgvector similarity search operator to find a similar product description, then use it as a prompt context to generate new creative output. Here is the LLM prompt code snippet that was used for the above output:

lang-py
Loading...

Summary

This blog demonstrated just two examples of powerful features that you can implement by combining the power of relational databases with LLMs. 

Generative AI is a powerful paradigm shift in application development that lets you create novel applications to serve users in new ways — from answering patients' complex medical questions to helping enterprises analyze cyberattacks. We've shown just one small example of what generative AI can unlock; we can't wait to see what you build with it.

Try out the Google Colab notebook described in this blog to get started today!

Posted in