# An example embedding workflow

Last updated (UTC): 2025-08-22.
This page provides an example workflow that demonstrates how [the `embedding()` function](/alloydb/omni/15.7.1/docs/ai/work-with-embeddings) works together with your table-stored data and the `pgvector` feature. The example uses plain-text input to fetch a result from a database that relies on large language model (LLM)-driven semantic parsing of the text's meaning.

An example scenario
-------------------

Imagine a database running on AlloyDB with the following aspects:

- The database contains a table, `items`. Each row in this table describes an item that your business sells.

- The `items` table contains a column, `complaints`. This `TEXT` column stores buyer complaints logged about each item.

- The database integrates with the Vertex AI Model Garden, giving it access to the `text-embedding` English models.

Even though this database stores complaints about items, these complaints are stored as plain text, making them challenging to query. For example, if you want to see which items have the most complaints from customers who received the wrong color of merchandise, then you can perform ordinary SQL queries on the table, looking for various keyword matches. However, this approach only matches rows that contain those exact keywords.

For example, a basic SQL query such as `SELECT * FROM items WHERE complaints LIKE '%wrong color%'` doesn't return a row whose `complaints` field contains only `The picture shows a blue one, but the one I received was red`.

SQL queries using LLM-powered embeddings can help bridge this gap.
By applying embeddings, you can query the table in this example for items whose complaints have semantic similarity to a given text prompt, such as "It was the wrong color".

The following steps show how to enable this in the example setup described earlier.

Before you begin
----------------

Make sure that you meet the following requirements.

#### Required database extension

- Ensure that the following extensions are installed on your AlloyDB database:

  - `google_ml_integration` extension

  - `vector` extension, version `0.5.0.google-1` or later

  For more information about installing and managing extensions, see [Configure database extensions](/alloydb/omni/15.7.1/docs/reference/extensions).

- Set the `google_ml_integration.enable_model_support` database flag to `off`.

#### Set up model access

Before you can generate embeddings from an AlloyDB database, you must configure AlloyDB to work with a text embedding model.

To work with the cloud-based `text-embedding-005` model, you need to [integrate your database with Vertex AI](/alloydb/omni/15.7.1/docs/ai/install-with-alloydb-ai).

Prepare the table
-----------------

Before you run LLM-based queries on the `items` table's content, you must prepare the table to store and index embeddings based on your existing data.

### Create a column to store embeddings

Add a column to the table for storing embeddings.

```sql
ALTER TABLE items ADD COLUMN complaint_embedding vector(768);
```

This example specifies `768` as an argument, because that is the number of dimensions supported by the `text-embedding-005` English models.
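After running the `ALTER TABLE` statement, you can confirm that the column was created with the expected dimensionality by inspecting the PostgreSQL catalog. This check is a sketch and not part of the original setup; it assumes standard PostgreSQL system catalogs, which `pgvector` uses to record a vector column's dimensions.

```sql
-- Illustrative check: report the declared type of the new column.
-- For pgvector columns, the dimensionality is stored in atttypmod,
-- and format_type() renders it, e.g. vector(768).
SELECT format_type(atttypid, atttypmod) AS declared_type
FROM pg_attribute
WHERE attrelid = 'items'::regclass
  AND attname = 'complaint_embedding';
```

If the earlier statement succeeded, this should report `vector(768)`.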
For more information, see [Generate an embedding](/alloydb/omni/15.7.1/docs/ai/work-with-embeddings).

The example applies the `vector` data type to the column for ease of using `pgvector` functions and operators with its values.

### Populate the new column

If you already have embeddings in CSV format, follow the steps in [Store vector embeddings](/alloydb/omni/15.7.1/docs/ai/store-embeddings) to store your embeddings.

Optionally, if you have text stored in the `complaints` column, use the `embedding()` function to populate the new column with embeddings. In this example setup, AlloyDB generates the embeddings using the `text-embedding` model, version `005`.

**Note:** If a table has more than 100k rows, we don't recommend using the `embedding()` function to generate stored embeddings for its existing data. The function is best suited for inline embedding generation.

```sql
UPDATE items SET complaint_embedding = embedding('text-embedding-005', complaints);
```

This example implicitly casts the `real[]` return value of `embedding()` into a `vector` value, in order to store the value in the `vector` column created earlier.

### Create an index

To improve performance, add a ScaNN index to `items`; this index type uses scalar-quantization techniques.

```sql
CREATE INDEX complaint_embed_idx ON items
  USING scann (complaint_embedding l2)
  WITH (num_leaves=20);
```

For more information on creating a ScaNN index, see [Create indexes and query vectors](/alloydb/omni/15.7.1/docs/ai/store-index-query-vectors?resource=scann).

Run LLM-powered queries with provided text
------------------------------------------

You can now make semantic nearest-neighbor queries on the `items` table. The following query uses the `<->` operator provided by `pgvector` to sort the table's rows by semantic proximity to the text `It was the wrong color` and return the ten nearest complaints, displaying each returned row's `id` and `name` values.
```sql
SELECT id, name FROM items
  ORDER BY complaint_embedding::vector
    <-> embedding('text-embedding-005', 'It was the wrong color')
  LIMIT 10;
```

What's next
-----------

- For another example workflow involving AlloyDB and `pgvector`, see [Building AI-powered apps on Google Cloud databases using pgvector, LLMs and LangChain](https://cloud.google.com/blog/products/databases/using-pgvector-llms-and-langchain-with-google-cloud-databases).