Generate and manage vector embeddings

This page describes how to generate and store vector embeddings. For an overview, see Vector embedding storage.

Before you begin

You need to have a Cloud SQL instance with the vector database flags enabled.

Generate vector embeddings based on row data

You can generate a vector embedding for a given row's data by using a text embedding API such as Vertex AI or OpenAI. You can use any text embedding API with Cloud SQL vector embeddings. However, you must use the same text embedding API for query string vector generation. You can't combine different APIs for source data and query vectorization.

For example, you can generate a vector embedding from Vertex AI:

from vertexai.language_models import TextEmbeddingModel

def text_embedding() -> list:
    """Text embedding with a Large Language Model."""
    model = TextEmbeddingModel.from_pretrained("text-embedding-004")
    embeddings = model.get_embeddings(["What is life?"])
    for embedding in embeddings:
        vector = embedding.values
        print(f"Length of Embedding Vector: {len(vector)}")
    return vector

if __name__ == "__main__":
    text_embedding()

Store vector embeddings

This section provides example statements for storing vector embeddings in Cloud SQL.

Create a new table with a vector embedding column

Use the CREATE TABLE statement with a column that uses the VECTOR data type.

Use the following syntax to create the table:

CREATE TABLE TABLE_NAME(
  id INTEGER
  PRIMARY KEY
    AUTO_INCREMENT,
    title VARCHAR(60),
    EMBEDDING_COLUMN_NAME
      VECTOR(VECTOR_DIMENSIONS)
  USING VARBINARY);

Replace the following parameters:

  • TABLE_NAME: the name of the table you where you want to store the embeddings.
  • EMBEDDING_COLUMN_NAME: the name of column that stores the embedding.
  • VECTOR_DIMENSIONS: the number of dimensions to use for the embedding.

In the following example, the embedding column has a vector with three dimensions. The data stored in this column has the VARBINARY data type.

CREATE TABLE books(
  id INTEGER PRIMARY KEY AUTO_INCREMENT, title VARCHAR(60), embedding VECTOR(3) USING VARBINARY);

Add a vector embedding column to an existing table

Use the ALTER TABLE statement to add a vector embedding column to an existing table. The column must use the VECTOR data type to hold the embedding.

In the following example, an embedding column that has a vector with three dimensions is inserted into the table. The data stored in this column has the VARBINARY data type.

ALTER TABLE books
ADD COLUMN embedding
  VECTOR(3)
USING VARBINARY;

Insert a vector embedding

Use INSERT with the string_to_vector function to insert a vector embedding values into a table.

In the following example, a vector with three dimensions is inserted into the embedding column.

INSERT INTO books
  (
    title,
    embedding)
VALUES (('book title', string_to_vector('[1,2,3]')));

Insert multiple vector embeddings

Use INSERT with the string_to_vector function to insert a comma-separated list of vector embeddings.

In the following statement, two embeddings, each containing a vector with three dimensions and is inserted into the embedding column.

INSERT INTO books
  (
    title,
    embedding)
VALUES
  (
    (
      'book title',
      string_to_vector('[1,2,3]')),
    ('book title', string_to_vector('[4,5,6]')));

Upsert a vector embedding

Use an INSERT or UPDATE operation on a table with the string_to_vector function to add a vector embedding column, using the following syntax.

In the following statement, an upsert is used to insert or update the embedding column with an embedding that contains a vector with three dimensions.

INSERT INTO books
  (
    id,
    title,
    embedding)
VALUES
  (
    (
      1,
      'book title',
      string_to_vector('[1,2,3]')))
ON DUPLICATE KEY UPDATE embedding = string_to_vector('[1,2,3]');

Update a vector embedding

Use UPDATE with the string_to_vector function to update a vector embedding.

In the following statement, UPDATE is used to update the embedding column with a vector with three dimensions.

UPDATE books
SET embedding = string_to_vector('[7,8,9]')
WHERE id = 1;

Retrieve vector embeddings

To retrieve vector embeddings, use the Cloud SQL vector_to_string function along with the name of the embedding.

In the following statement, the embedding column is retrieved to view.

SELECT vector_to_string(embedding) FROM books WHERE id = 1;

Delete a vector embedding

Use DELETE with the string_to_vector function to remove a vector embedding from a table. If there's a vector index, you must first delete it. For more information, see Drop a vector index.

In the following statement, DELETE is used to delete the value in the embedding column.

DELETE FROM books
WHERE embedding = string_to_vector('[1,2,3]');

What's next