# An example embedding workflow

This page provides an example workflow that demonstrates how the `embedding()` function works together with your table-stored data and the `pgvector` feature. The example uses plain-text input to fetch a result from a database that relies on large language model (LLM)-driven semantic parsing of the text's meaning.
An example scenario
-------------------
Imagine a database running on AlloyDB with the following characteristics:
- The database contains a table, `items`. Each row in this table describes an item that your business sells.

- The `items` table contains a column, `complaints`. This `TEXT` column stores the buyer complaints logged about each item.

- The database integrates with Vertex AI Model Garden, giving it access to the `text-embedding` English models.
Even though this database stores complaints about items, these complaints are stored as plain text, which makes them challenging to query. For example, if you want to see which items draw the most complaints from customers who received the wrong color of merchandise, you can run ordinary SQL queries on the table, looking for various keyword matches. However, this approach matches only rows that contain those exact keywords.
For example, a basic SQL query such as `SELECT * FROM items WHERE complaints LIKE '%wrong color%'` doesn't return a row whose `complaints` field contains only `The picture shows a blue one, but the one I received was red`.
SQL queries that use LLM-powered embeddings can help bridge this gap. By applying embeddings, you can query the table in this example for items whose complaints are semantically similar to a given text prompt, such as "It was the wrong color".
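To see why distance between embeddings captures this kind of similarity, consider a minimal sketch in plain Python. The vectors below are made up and only 3-dimensional (real `text-embedding-005` vectors have 768 dimensions and are produced by the model, not written by hand), and `l2()` reimplements the Euclidean distance that `pgvector`'s `<->` operator computes:

```python
import math

def l2(a, b):
    # The same metric as pgvector's <-> operator: Euclidean (L2) distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

complaint = "The picture shows a blue one, but the one I received was red"

# Keyword search, the equivalent of LIKE '%wrong color%', misses the paraphrase:
print("wrong color" in complaint.lower())  # False

# Made-up 3-dimensional "embeddings" for two logged complaints.
vectors = {
    complaint: [0.9, 0.1, 0.0],
    "It arrived two weeks late": [0.0, 0.2, 0.9],
}
# Stands in for embedding('text-embedding-005', 'It was the wrong color').
query = [1.0, 0.0, 0.0]

# The equivalent of ORDER BY complaint_embedding <-> query: smallest distance first.
ranked = sorted(vectors, key=lambda text: l2(vectors[text], query))
print(ranked[0] == complaint)  # True: the color complaint ranks closest
```

The keyword test fails on the paraphrased complaint, while distance-based ordering ranks it first, which is exactly the gap the embedding-based query in the next section closes.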
Run LLM-powered queries with provided text
------------------------------------------

You can now run semantic nearest-neighbor queries on the `items` table. The following query uses the `<->` operator provided by `pgvector` to sort the table's rows by semantic proximity to the text `It was the wrong color` and return the ten closest complaints. The query displays the `id` and `name` values of each returned row.

    SELECT id, name FROM items
      ORDER BY complaint_embedding::vector
      <-> embedding('text-embedding-005', 'It was the wrong color')
      LIMIT 10;
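From application code you might issue the same nearest-neighbor query through a PostgreSQL driver. The helper below is a hypothetical sketch, not part of the AlloyDB API: it only builds the SQL text, so the statement can be inspected without a database, and the `psycopg2` usage in the trailing comment assumes a reachable AlloyDB instance and a valid `dsn`:

```python
def semantic_search_sql(model: str, limit: int) -> str:
    # Builds the nearest-neighbor statement shown above; the search text is
    # passed separately as a bind parameter (%s) to avoid SQL injection.
    return (
        "SELECT id, name FROM items "
        "ORDER BY complaint_embedding::vector "
        f"<-> embedding('{model}', %s) "
        f"LIMIT {int(limit)};"
    )

sql = semantic_search_sql("text-embedding-005", 10)
print(sql)

# Hypothetical usage (requires psycopg2 and a reachable AlloyDB instance):
# with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
#     cur.execute(sql, ("It was the wrong color",))
#     rows = cur.fetchall()
```

Passing the search text as a bind parameter rather than interpolating it into the string keeps user-supplied prompts out of the SQL itself.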
The preceding query works only after the following setup is in place.

Before you begin
----------------

Make sure that you meet the following requirements.

#### Required database extensions

- Ensure that the following extensions are installed on your AlloyDB database:

  - the `google_ml_integration` extension

  - the `vector` extension, version `0.5.0.google-1` or later

  For more information about installing and managing extensions, see [Configure database extensions](/alloydb/omni/15.7.0/docs/reference/extensions).

- Set the `google_ml_integration.enable_model_support` database flag to `off`.

#### Set up model access

Before you can generate embeddings from an AlloyDB database, you must configure AlloyDB to work with a text embedding model.

To work with the cloud-based `text-embedding-005` model, [integrate your database with Vertex AI](/alloydb/omni/15.7.0/docs/ai/install-with-alloydb-ai).

Prepare the table
-----------------

Before you run LLM-based queries on the `items` table's content, you must prepare the table to store and index embeddings based on your existing data.

### Create a column to store embeddings

Add a column to the table for storing embeddings:

    ALTER TABLE items ADD COLUMN complaint_embedding vector(768);

This example specifies `768` as an argument because that is the number of dimensions supported by the `text-embedding-005` English models. For more information, see [Generate an embedding](/alloydb/omni/15.7.0/docs/ai/work-with-embeddings).

The example applies the `vector` data type to the column so that `pgvector` functions and operators can be used with its values.

### Populate the new column

If you already have embeddings in CSV format, follow the steps in [Store vector embeddings](/alloydb/omni/15.7.0/docs/ai/store-embeddings) to store them.

Optionally, if you have text stored in the `complaints` column, use the `embedding()` function to populate the new column with embeddings. In this example setup, AlloyDB generates the embeddings using the `text-embedding` model, version `005`.

**Note:** If a table has more than 100,000 rows, we don't recommend using the `embedding()` function to generate stored embeddings for the existing data; it is best suited for inline embedding generation.

    UPDATE items SET complaint_embedding = embedding('text-embedding-005', complaints);

This example implicitly casts the `real[]` return value of `embedding()` into a `vector` value in order to store it in the `vector` column created earlier.

### Create an index

To improve performance, add an index to `items` that uses scalar-quantization techniques:

    CREATE INDEX complaint_embed_idx ON items
      USING scann (complaint_embedding l2)
      WITH (num_leaves=20);

For more information about creating a ScaNN index, see [Create indexes and query vectors](/alloydb/omni/15.7.0/docs/ai/store-index-query-vectors?resource=scann).

What's next
-----------

- For another example workflow involving AlloyDB and `pgvector`, see [Building AI-powered apps on Google Cloud databases using pgvector, LLMs and LangChain](https://cloud.google.com/blog/products/databases/using-pgvector-llms-and-langchain-with-google-cloud-databases).

*Last updated: 2025-09-02 (UTC).*