此页面由 Cloud Translation API 翻译。

SemanticCacheLookup 政策

本页面适用于 Apigee 和 Apigee Hybrid。

查看 Apigee Edge 文档。

概览

SemanticCacheLookup 政策是一种高级缓存政策，旨在优化 AI 工作负载（尤其是涉及大语言模型 [LLM] 的工作负载）的性能。

该政策使用 Vertex AI 文本嵌入 API 为文本生成嵌入，并使用向量搜索根据语义相似度（而非完全匹配）查找相似的提示。

SemanticCacheLookup 政策可以通过减少对 LLM 的调用量来缩短重复查询的响应时间并优化成本。

此政策与 SemanticCachePopulate 政策结合使用。

此政策是一项可扩展政策，使用此政策可能会影响费用或使用情况，具体取决于您的 Apigee 许可。如需了解政策类型和使用情况影响，请参阅政策类型。

准备工作

在使用 SemanticCacheLookup 政策之前，请完成以下任务：

创建 Vertex AI 项目。
创建向量搜索索引。
为索引创建 Vertex AI 端点。
创建 SemanticCachePopulate 政策。

如需详细了解如何完成这些任务，请参阅语义缓存政策使用入门。

所需的角色

如需获得应用和使用 SemanticCacheLookup 政策所需的权限，请让管理员为您授予用于部署 Apigee 代理的服务账号的 AI Platform User (roles/aiplatform.user) IAM 角色。如需详细了解如何授予角色，请参阅管理对项目、文件夹和组织的访问权限。

您也可以通过自定义角色或其他预定义角色来获取所需的权限。

启用 API

Enable the Compute Engine, Vertex AI, and Cloud Storage APIs.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the APIs

`<SemanticCacheLookup>` 元素

定义 SemanticCacheLookup 政策。

默认值	请参阅下面的默认政策标签页
是否必需？	需要
类型	复杂对象
父元素	不适用
子元素	`<DisplayName>` `<IgnoreUnresolvedVariables>` `<UserPromptSource>` `<Embeddings>` `<SimilaritySearch>`

<SemanticCacheLookup> 元素使用以下语法：

语法

<SemanticCacheLookup> 元素使用以下语法：

<SemanticCacheLookup async="false" continueOnError="false" enabled="true" name="SCL-lookup">
  <DisplayName>SCL-lookup</DisplayName>
  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
  <UserPromptSource>{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}</UserPromptSource>
  <Embeddings>
    <VertexAI>
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>
    </VertexAI>
  </Embeddings>
  <SimilaritySearch>
    <VertexAI>
      <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>
      <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
      <Threshold>0.95</Threshold>
    </VertexAI>
  </SimilaritySearch>
</SemanticCacheLookup>

默认政策

以下示例展示了在 Apigee 界面中将 SemanticCacheLookup 政策添加到流时的默认设置：

<SemanticCacheLookup async="false" continueOnError="false"enabled="true" name="SCL-lookup">
  <DisplayName>SCL-lookup</DisplayName>
  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
  <UserPromptSource>{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}</UserPromptSource>
  <Embeddings>
    <VertexAI>
      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict
      </URL>
    </VertexAI>
  </Embeddings>
  <SimilaritySearch>
    <VertexAI>
      <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>
      <Threshold>0.9</Threshold>
      <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
    </VertexAI>
  </SimilaritySearch>
</SemanticCacheLookup>

当您在 Apigee 界面中插入新的 SemanticCacheLookup 政策时，模板包含所有可能操作的桩。如需了解所需的元素，请参阅下文。

此元素具有所有政策中常见的以下属性：

属性	默认	是否必需？	说明
`name`	无	必需	政策的内部名称。`name` 属性的值可以包含字母、数字、空格、连字符、下划线和英文句点。此值不能超过 255 个字符。（可选）使用 `<DisplayName>` 元素在管理界面代理编辑器中给政策添加不同的自然语言名称标签。
`continueOnError`	false	可选	设置为 `false` 可在政策失败时返回错误。这是大多数政策的预期行为。设置为 `true`，即使在政策失败后，仍可以继续执行流。另请参阅：故障规则仅在错误状态下触发（关于 continueOnError）处理当前流中的故障
`enabled`	true	可选	设置为 `true` 可实施政策。设为 `false` 可关闭政策。即使政策仍附加到某个流，也不会强制执行该政策。
`async`	false	已弃用	此属性已弃用。

下表提供了 <SemanticCacheLookup> 的子元素的简要说明：

子元素	是否必需？	说明
`<DisplayName>`	可选	政策的名称。
`<IgnoreUnresolvedVariables>`	可选	确定在变量无法解析时处理是否停止。设置为 `true` 可忽略无法解析的变量并继续处理。
`<UserPromptSource>`	可选	要提取的用户提示文本的载荷位置。仅支持字符串文本值。此字段支持 Apigee 消息模板语法，包括使用变量或 JSON 路径函数。例如： {jsonPath('$.contents[-1].parts[-1].text',request.content,true)}
`<Embeddings>`	必需	包含生成嵌入所需信息的元素。
`<SimilaritySearch>`	必需	包含执行相似度搜索所需信息的元素。如需了解详情，请参阅查询公共索引以获取最近邻。

子元素参考

本部分介绍 <SemanticCacheLookup> 的子元素。

`<DisplayName>`

除了用于 name 属性之外，还可用于在管理界面代理编辑器中使用其他更加自然的名称标记政策。

<DisplayName> 元素适用于所有政策。

默认值	不适用
是否必需？	可选。如果省略 `<DisplayName>`，则会使用政策的 `name` 属性的值
类型	字符串
父元素	<`PolicyElement`>
子元素	无

<DisplayName> 元素使用以下语法：

语法

<PolicyElement>
  <DisplayName>POLICY_DISPLAY_NAME</DisplayName>
  ...
</PolicyElement>

示例

<PolicyElement>
  <DisplayName>My Validation Policy</DisplayName>
</PolicyElement>

<DisplayName> 元素没有属性或子元素。

<IgnoreUnresolvedVariables>

确定在变量无法解析时处理是否停止。设置为 true 可忽略无法解析的变量并继续处理。

如果提供了 <DefaultValue>，则 IgnoreUnresolvedVariables 不适用。

默认值	错误
是否必需？	可选
类型	布尔值
父元素	`<SemanticCacheLookup>`
子元素	无

`<UserPromptSource>`

要提取的用户提示文本的载荷位置。仅支持字符串文本值。

此字段支持 Apigee 消息模板语法，包括使用变量或 JSON 路径函数。

例如：

{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}

默认值	{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}
是否必需？	可选
类型	字符串
父元素	`<SemanticCacheLookup>`
子元素	无

`<Embeddings>`

此元素包含生成文本嵌入所需的信息。

默认值	不适用
是否必需？	可选
类型	字符串
父元素	`<SemanticCacheLookup>`
子元素	`<VertexAI>`

<Embeddings> 元素使用以下语法：

<Embeddings>
  <VertexAI>
    <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>
  </VertexAI>
</Embeddings>

<VertexAI>（`<Embeddings>` 的子级）

包含 Vertex AI 特有属性的 <URL> 元素。

默认值	不适用
是否必需？	需要
类型	字符串
父元素	`<Embeddings>`
子元素	`<URL>`

VertexAI 元素使用以下语法：

<VertexAI>
  <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>
</VertexAI>

<URL>（`<VertexAI>` 的子级）

用于生成文本嵌入的网址。如需查看为 SemanticCacheLookup 政策提供文本嵌入的模型列表，请参阅支持的模型。

默认值	不适用
是否必需？	需要
类型	字符串
父元素	`<VertexAI>`
子元素	无

URL 元素使用以下语法：

<URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>

URL 元素支持使用网址模板。如果您愿意，可在此元素中提供一个变量来包含网址的值，如以下示例所示：

<URL>https://{URL_VARIABLE}</URL>

`<SimilaritySearch>`

此元素包含执行相似度搜索所需的信息。

如需了解详情，请参阅查询公共索引以获取最近邻。

默认值	不适用
是否必需？	需要
类型	字符串
父元素	`<SemanticCacheLookup>`
子元素	`<VertexAI>`

<SimilaritySearch> 元素使用以下语法：

<SimilaritySearch>
  <VertexAI>
    <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors
    </URL>
    <Threshold>0.9</Threshold>
    <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
  </VertexAI>
</SimilaritySearch>

<VertexAI>（`<SimilaritySearch>` 的子级）

包含 Vertex AI 特有属性的 <URL> 元素。

默认值	不适用
是否必需？	需要
类型	字符串
父元素	`<SimilaritySearch>`
子元素	`<URL>`

VertexAI 元素使用以下语法：

<VertexAI>
  <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>
  <Threshold>0.9</Threshold>
  <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>
</VertexAI>

下表提供了 <VertexAI> 的子元素的简要说明。

子元素是否必需？说明

子元素	是否必需？	说明
`<URL>`	需要	字符串用于执行相似度搜索的网址。根据相似度阈值确定的最高匹配数据点是所使用的唯一数据点。 `URL` 元素支持使用网址模板。如果您愿意，可在此元素中提供一个变量来包含网址的值，如以下示例所示： <URL>https://{URL_VARIABLE}</URL>
`<Threshold>`	可选	字符串用于确定两个提示是否被视为匹配的相似度得分。一个介于 0 到 1 之间的值。默认值为 0.9。请参阅
`<DeployedIndexID>`	必需	字符串部署在用于语义缓存的索引端点上的索引的 ID。

<URL>

需要

字符串

用于执行相似度搜索的网址。根据相似度阈值确定的最高匹配数据点是所使用的唯一数据点。

URL 元素支持使用网址模板。如果您愿意，可在此元素中提供一个变量来包含网址的值，如以下示例所示：

<URL>https://{URL_VARIABLE}</URL>

<Threshold>

可选

字符串

用于确定两个提示是否被视为匹配的相似度得分。一个介于 0 到 1 之间的值。

默认值为 0.9。

请参阅

<DeployedIndexID>

必需

字符串

部署在用于语义缓存的索引端点上的索引的 ID。

流变量

流变量可基于 HTTP 标头或消息内容或者流中提供的上下文，为政策和流配置动态运行时行为。如需详细了解流变量，请参阅流变量参考。

此政策在执行期间提供以下只读流变量集。您可以将这些流变量与 DataCapture 政策搭配使用，以创建自定义分析报告。如需了解详情，请参阅使用 Data Capture 政策收集客户数据。

变量名称	说明
`request.content`	包含传入 API 请求的完整内容。
`request.url`	包含传入 API 请求的网址。
`semanticcache.lookup.policy_name.user_prompt`	包含从请求提示中提取的特定组成部分，用于生成嵌入或执行相似度搜索。
`semanticcache.lookup.policy_name.embeddings_request`	包含发送到 Vertex AI Embeddings API 以便为输入文本生成文本嵌入的请求载荷。
`semanticcache.lookup.policy_name.embeddings_response`	包含 Vertex AI Embeddings API 的响应，其中包括生成的文本嵌入。
`semanticcache.lookup.policy_name.dense_embeddings`	包含由 Vertex AI Embeddings API 生成的实际数值嵌入值。
`semanticcache.lookup.policy_name.is_nearest_neighbor_hit`	指定是否在向量数据库中找到了给定请求的最近邻，以及数据点是否满足相似度阈值。
`semanticcache.lookup.policy_name.cache_hit`	指定是否在语义缓存中找到了响应。
`semanticcache.lookup.policy_name.cached_llm_response`	包含从语义缓存中检索到的响应（如果发生缓存命中）。

错误参考信息

本部分介绍了 Apigee 针对 <SemanticCacheLookup> 政策返回的故障代码和错误消息以及设置的故障变量。在开发故障规则以处理故障时，请务必了解此信息。如需了解详情，请参阅有关政策错误的注意事项和处理故障。

运行时错误

在政策执行时可能会发生以下错误。

故障代码	HTTP 状态	原因
`steps.semanticcache.lookup.MessageTemplateExtractionFailed`	`400`	无法使用 JSON 路径表达式从请求中提取数据。
`steps.semanticcache.lookup.FailedToExtractUserPrompt`	`500`	无法从 API 请求中提取用户提示。
`steps.semanticcache.lookup.EmbeddingsServiceUnavailable`	`400`	Vertex AI Embeddings 服务目前不可用。
`steps.semanticcache.lookup.EmbeddingsAPIFailed`	`400`	Vertex AI Embeddings 服务失败。
`steps.semanticcache.lookup.VectorSearchServiceUnavailable`	`400`	Vertex AI Vector Search 服务目前不可用。
`steps.semanticcache.lookup.VectorSearchAPIFailed`	`400`	Vertex AI Vector Search 服务失败。
`steps.semanticcache.lookup.AuthenticationFailure`	`500`	服务账号没有所需的权限。
`steps.semanticcache.lookup.InternalError`	`500`	SemanticCacheLookup 政策中发生了意外错误。
`steps.semanticcache.lookup.CalloutError`	`500`	Vertex AI 服务调用失败。

部署错误

在您部署包含此政策的代理时可能会发生以下错误。

错误名称	原因
`The Embeddings/VertexAI element is required.`	如果 <Embeddings> 中的 <VertexAI> 元素为空，则会发生此错误。
`The SimilaritySearch/VertexAI element is required.`	如果 <SimilaritySearch> 中的 <VertexAI> 元素为空，则会发生此错误。
`The Embeddings/URL element is required.`	如果 <Embeddings> 中的 <URL> 元素为空，则会发生此错误。
`The SimilaritySearch/URL element is required.`	如果 <SimilaritySearch> 中的 <URL> 元素为空，则会发生此错误。
`Embeddings URL {url} is invalid.`	如果 <Embeddings> 中的 <URL> 元素为空或无效，则会发生此错误。
`The SimilaritySearch URL {url} is invalid.`	如果 <SimilaritySearch> 中的 <URL> 元素为空或无效，则会发生此错误。
`The scheme {http-scheme} of Embeddings URL {url} must be one of http, https.`	如果嵌入 <URL> 元素的 `http` 架构无效，则会发生此错误。
`The scheme {http-scheme} of SimilaritySearch URL {url} must be one of http, https.`	如果 SimilaritySearch <URL> 元素的 `http` 方案无效，则会发生此错误。
`SimilaritySearch/Threshold element must be >= 0 and <= 1.`	如果属性不在 0 到 1 之间，则 API 代理的部署会失败。
`SimilaritySearch/DeployedIndexID element is required.`	如果 <SimilaritySearch> 中的 <DeployedIndexID> 元素为空，则会发生此错误。
`SimilaritySearch/DeployedIndexID element must not contain spaces.`	如果 <SimilaritySearch> 中的 <DeployedIndexID> 元素包含空格，则会发生此错误。

故障变量

当此政策在运行时触发错误时，它会设置以下这些变量。如需了解详情，请参阅您需要了解的有关政策错误的信息。

变量	其中	示例
`fault.name="FAULT_NAME"`	`FAULT_NAME` 是故障名称，如上面的运行时错误表中所列。故障名称是故障代码的最后一部分。	`fault.name Matches "UnresolvedVariable"`
`semanticcachelookup.POLICY_NAME.failed`	`POLICY_NAME` 是抛出故障的政策的用户指定名称。	`semanticcachelookup.SC-lookup.failed = true`

错误响应示例

注意：处理错误时，最佳做法是捕获错误响应的 errorcode 部分。不要依赖 faultstring 中的文本，因为它可能会发生变化。

{
  "fault": {
    "faultstring": "SemanticCacheLookup[SC-lookup]: unable to resolve variable [variable_name]",
    "detail": {
      "errorcode": "steps.semanticcachelookup.UnresolvedVariable"
    }
  }
}

故障规则示例

<FaultRule name="SemanticCacheLookup Faults">
    <Step>
        <Name>SCL-CustomSetVariableErrorResponse</Name>
        <Condition>(fault.name = "SetVariableFailed")</Condition>
    </Step>
    <Condition>(semanticcachelookup.failed = true)</Condition>
</FaultRule>

架构

每种政策类型均由 XML 架构 (.xsd) 定义。GitHub 提供了政策架构作为参考。

SemanticCacheLookup 政策

概览

准备工作

所需的角色

启用 API

<SemanticCacheLookup> 元素

语法

默认政策

子元素参考

<DisplayName>

语法

示例

<IgnoreUnresolvedVariables>

<UserPromptSource>

<Embeddings>

<VertexAI>（<Embeddings> 的子级）

<URL>（<VertexAI> 的子级）

<SimilaritySearch>

<VertexAI>（<SimilaritySearch> 的子级）

流变量

错误参考信息

运行时错误

部署错误

故障变量

错误响应示例

故障规则示例

架构

`<SemanticCacheLookup>` 元素

`<DisplayName>`

`<UserPromptSource>`

`<Embeddings>`

<VertexAI>（`<Embeddings>` 的子级）

<URL>（`<VertexAI>` 的子级）

`<SimilaritySearch>`

<VertexAI>（`<SimilaritySearch>` 的子级）