Databases

Building AI-powered apps on Google Cloud databases using pgvector, LLMs and LangChain

June 26, 2023

Saket Saurabh

Software Engineer

Editor's note, January 6, 2025: AlloyDB now supports the ScaNN index in addition to open source pgvector, bringing twelve years of Google Research to PostgreSQL for improved vector search performance and accuracy.

Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL now support the pgvector extension, bringing the power of vector search operations to PostgreSQL databases. You can read the full announcement here.

In this step-by-step tutorial, we will show you how to add generative AI features to your own applications with just a few lines of code using pgvector, LangChain and LLMs on Google Cloud. You can also follow along with our guided tutorial video. We’ll build a sample Python application that can understand and respond to human language queries about the data stored in your PostgreSQL database. Then, we’ll further push the creative limits of the application by teaching it to create new AI-generated product descriptions based on our existing dataset about children's toys. Forget about boring “orange and white cat stuffed animal” descriptions of days past — with AI, you can now generate descriptions like “ferocious and cuddly companion to keep your little one company, from crib to pre-school”. Let's see what our AI-generated taglines can do!

How to install pgvector in Cloud SQL and AlloyDB for PostgreSQL

The pgvector extension can be installed within an existing instance of Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL using the CREATE EXTENSION command as shown below. If you do not have an existing instance, create one for Cloud SQL and AlloyDB.

The new `vector` data type

Under the hood, the pgvector extension uses the PostgreSQL `CREATE TYPE` command to register a new data type called `vector`. PostgreSQL table columns can be defined using this new `vector` data type. A string with comma-separated numbers within square brackets can be used to insert values into this column as shown below.

The following code snippet uses a three-dimensional vector as an example. In actual AI/ML applications, a vector has many more dimensions, typically in the order of hundreds.

New similarity search operators in pgvector

The pgvector extension also introduces new operators for performing similarity matches on vectors, allowing you to find vectors that are semantically similar. Two such operators are:

‘<->’: returns the Euclidean distance between the two vectors. Euclidean distance is a good choice for applications where the magnitude of the vectors is important — for example, in mapping and navigation applications, or when implementing the K-means clustering algorithm in machine learning.

‘<=>’: returns the cosine distance between the two vectors. Cosine similarity is a good choice for applications where the direction of the vectors is important — for example, when trying to find the most similar document to a given document for implementing recommendation systems or natural language processing tasks.

We use the cosine similarity search operator for our sample application.

Building the sample application

Let's get started with building our application with pgvector and LLMs. We’ll also use LangChain, which is an open-source framework that provides several pre-built components that make it easier to create complex applications using LLMs.

The entire application is available as an interactive Google Colab notebook for Cloud SQL PostgreSQL. You can directly run this sample application from your web browser without any additional installations, or writing a single line of code!

Follow the instructions in the Colab notebook to set up your environment. Note that if an instance with the required name does not exist, the notebook creates a Cloud SQL PostgreSQL instance for you. Running the notebook may incur Google Cloud charges. You may be eligible for a free trial that gets you credits for these costs.

Loading our ‘toy’ dataset

The sample application uses an example of an e-commerce company that runs an online marketplace for buying and selling children’s toys. The dataset for this notebook has been sampled and created from a larger public retail dataset available at Kaggle. The dataset used in this notebook has only about 800 toy products, while the public dataset has over 370,000 products in different categories.

After you set up the environment using the steps mentioned in the Colab notebook, load the provided sample dataset into a Pandas data frame. The first five rows of the dataset are shown for your reference below.

lang-py

https://storage.googleapis.com/gweb-cloudblog-publish/images/01-product-examples.max-2000x2000.jpg

Save the dataset in a PostgreSQL table called `products` that has a simple schema with four fields: product_id, product_name, description, and list_price.

lang-py

# Save the Pandas dataframe in a PostgreSQL table.
import asyncio
import asyncpg
from google.cloud.sql.connector import Connector

async def main():
    loop = asyncio.get_running_loop()
    async with Connector(loop=loop) as connector:
        # Create connection to Cloud SQL database
        conn: asyncpg.Connection = await connector.connect_async(
            f"{project_id}:{region}:{instance_name}",  # Cloud SQL instance connection name
            "asyncpg",
            user=f"{database_user}",
            password=f"{database_password}",
            db=f"{database_name}"
        )

# Create the `products` table.	 
        await conn.execute("""CREATE TABLE products(
                                product_id VARCHAR(1024) PRIMARY KEY,
                                product_name TEXT,
                                description TEXT,
                                list_price NUMERIC)""")

# Copy the dataframe to the `products` table.
        tuples = list(df.itertuples(index=False))
        await conn.copy_records_to_table('products', records=tuples, columns=list(df), timeout=10)
        await conn.close()

# Run the SQL commands now.
await main() # type: ignore

Generating the vector embeddings using Vertex AI

We use the Vertex AI Text Embedding model to generate the vector embeddings for the text that describes various toys in our products table. At publication, the Vertex AI Text Embedding model only accepts 3,072 input tokens in a single API request. Therefore, as a first step, split long product descriptions into smaller chunks of 500 characters each.

Split long text into smaller chunks with LangChain

You can use the RecursiveCharacterTextSplitter method from LangChain library, which provides a nice and simple way to split the large text into smaller chunks.

lang-py

Get the vector embeddings using Vertex AI

After you split long product descriptions into smaller chunks, you can generate vector embeddings for each chunk by using the Text Embedding Model available through Vertex AI. Notice how we can accomplish this in just a few lines of code!

lang-py

Use pgvector to store the generate embeddings

After creating the pgvector extension and registering a new vector data type, you can store a NumPy array directly into a PostgreSQL table.

lang-py

Finding similar toys using pgvector cosine search operator

By completing the steps above, you’ve just made an entire toy dataset searchable using simple English! Check it out in the image below. You can even filter toys based on a specific price range, demonstrating the hybrid search capabilities.

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/02-demo-pgvector-search.gif

So, how does this work? Let’s break it down.

Step 1: Generate the vector embedding for the incoming input query.

lang-py

Step 2: Use the new pgvector cosine similarity search operator to find related products

Notice how you can combine the vector search operation with the regular SQL filters on the `list_price` column using the powerful PostgreSQL and pgvector query semantics.

lang-py

Use case 1: Building an AI-curated contextual hybrid search

Now that you’ve learned how to find similar toy products, it’s time to super-charge your application with AI. We will use large language models (LLMs) to make our application more intelligent and capable of answering user queries about these products.

The following example shows how a grandparent uses the AI-powered search interface to find a perfect toy for their grandkid by describing their needs in simple English!

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/03-demo-ai-hybrid-search.gif

What is happening behind the scenes here?

First, retrieve all the matching products and their descriptions using pgvector, following the same steps that we showed above.

Then, use the MapReduce Chain from LangChain library to build a high-quality prompt context by combining summaries of all similar toy products.

Finally, invoke the Vertex AI text generation LLM model to get a well-formatted answer. See the code snippet below for an example.

lang-py

# Using LangChain for summarization and efficient context building.

from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain.llms import VertexAI
from langchain import PromptTemplate, LLMChain
from IPython.display import display, Markdown

llm = VertexAI()

map_prompt_template = """
              You will be given a detailed description of a toy product.
              This description is enclosed in triple backticks (```).
              Using this description only, extract the name of the toy,
              the price of the toy and its features.

```{text}```
              SUMMARY:
              """
map_prompt = PromptTemplate(template=map_prompt_template, input_variables=["text"])

combine_prompt_template = """
                You will be given a detailed description different toy products
                enclosed in triple backticks (```) and a question enclosed in
                double backticks(``).
                Select one toy that is most relevant to answer the question.
                Using that selected toy description, answer the following
                question in as much detail as possible.
                You should only use the information in the description.
                Your answer should include the name of the toy, the price of
                the toy and its features. 
                Your answer should be less than 200 words.
                Your answer should be in Markdown in a numbered list format.

Description:
                ```{text}```

Question:
                ``{user_query}``

Answer:
                """

combine_prompt = PromptTemplate(template=combine_prompt_template, input_variables=["text", "user_query"])

docs = [Document(page_content=t) for t in matches]
chain = load_summarize_chain(llm,
                             chain_type="map_reduce",
                             map_prompt=map_prompt,
                             combine_prompt=combine_prompt)
answer = chain.run({
                      'input_documents': docs,
                      'user_query': user_query,
                    })

display(Markdown(answer))

Use case 2: Adding AI-powered creative content generation

Building a hybrid semantic search is a common, powerful example for using LLMs with vector embeddings. But there’s so much more you can do with this new technology!

You can create an AI-powered creative content generation tool by adjusting LLM prompt input and model temperature settings. Temperature is an input parameter for an LLM prompt that ranges from zero to one, and it defines the randomness of the model's output. A higher temperature will result in more creative response, while a lower temperature will result in more definitive and factual output.

In the following example, you can see how a seller uses generative AI to get auto-generated item descriptions for a new bicycle product that they want to add to the platform!

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/04-demo-ai-content-generation.gif

Like before, you can use the pgvector similarity search operator to find a similar product description, then use it as a prompt context to generate new creative output. Here is the LLM prompt code snippet that was used for the above output:

lang-py

from langchain.llms import VertexAI
from langchain import PromptTemplate, LLMChain
from IPython.display import display, Markdown

template = """
            You are given descriptions about some similar kind of toys in the
            context. This context is enclosed in triple backticks (```).
            Combine these descriptions and adapt them to match the 
            specifications in the initial prompt. 
            All the information from the initial prompt must be included. 
            You are allowed to be as creative as possible,
            Describe the new toy in as much detail. Your answer should be
            less than 200 words.

Context:
            ```{context}```

Initial Prompt:
            {creative_prompt}

Answer:
        """
prompt = PromptTemplate(template=template, input_variables=["context", "creative_prompt"])

# Increase the `temperature` to allow more creative writing freedom.
llm = VertexAI(temperature=0.7)

llm_chain = LLMChain(prompt=prompt, llm=llm)
answer = llm_chain.run({
    "context": '\n'.join(matches),
    "creative_prompt": creative_prompt,
    })

display(Markdown(answer))

Summary

This blog demonstrated just two examples of powerful features that you can implement by combining the power of relational databases with LLMs.

Generative AI is a powerful paradigm shift in application development that lets you create novel applications to serve users in new ways — from answering patients' complex medical questions to helping enterprises analyze cyberattacks. We've shown just one small example of what generative AI can unlock; we can't wait to see what you build with it.

Try out the Google Colab notebook described in this blog to get started today!

Databases