
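Use a context cache

You can use the REST API or the Python SDK to reference content stored in a context cache in a generative AI application. Before you can use a context cache, you must first create it.

The context cache object that you use in your code includes the following properties: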

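name - The context cache resource name. Its format is projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID. After you create a context cache, you can find its resource name in the response. The project number is the unique identifier of your project. The cache ID is the ID of your cache. When you specify a context cache in your code, you must use the full context cache resource name. The following example shows how to specify a cached content resource name in a request body:
```
"cached_content": "projects/123456789012/locations/us-central1/cachedContents/123456789012345678"
```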
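model - The resource name of the model used to create the cache. Its format is projects/PROJECT_NUMBER/locations/LOCATION/publishers/PUBLISHER_NAME/models/MODEL_ID.
createTime - A Timestamp that specifies when the context cache was created.
updateTime - A Timestamp that specifies when the context cache was most recently updated. After a context cache is created, and before it is updated, its createTime and updateTime are the same.
expireTime - A Timestamp that specifies when the context cache expires. The default expireTime is 60 minutes after the createTime. You can update the cache with a new expiration time. For more information, see Update the context cache.
After a cache expires, it is marked for deletion, so you shouldn't assume that it can still be used or updated. If you need to use a context cache that has expired, you must recreate it with an appropriate expiration time.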
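
The resource name format and the 60-minute default expiration described above can be checked locally before a request is sent. The following is a minimal sketch, not part of the Vertex AI SDK: the helper names parse_cache_name, default_expire_time, and is_expired are hypothetical, and the expiry check only mirrors the rule stated above rather than querying the service.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Full resource name format:
# projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID
_CACHE_NAME_RE = re.compile(
    r"^projects/(?P<project_number>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/cachedContents/(?P<cache_id>[^/]+)$"
)

# Default offset between createTime and expireTime stated in the text.
DEFAULT_TTL = timedelta(minutes=60)


def parse_cache_name(name: str) -> dict:
    """Split a full context cache resource name into its parts (hypothetical helper)."""
    match = _CACHE_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a full context cache resource name: {name!r}")
    return match.groupdict()


def default_expire_time(create_time: datetime) -> datetime:
    """Return the default expireTime for a cache created at create_time."""
    return create_time + DEFAULT_TTL


def is_expired(expire_time: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once the expiration time has been reached."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expire_time
```

A name that omits the cachedContents segment is rejected by parse_cache_name, which catches a common copy-and-paste mistake before the API does.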

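Context cache use restrictions

You can specify the following features when you create a context cache. Don't specify them again in your request:

The GenerativeModel.system_instructions property. This property specifies instructions to the model before the model receives instructions from a user. For more information, see System instructions.
The GenerativeModel.tool_config property. The tool_config property specifies the tools that the Gemini model uses, such as a tool used by the function calling feature.
The GenerativeModel.tools property. The GenerativeModel.tools property specifies the functions used to create a function calling application. For more information, see Function calling.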
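
One way to enforce this restriction in application code is to check request settings before calling the model. The sketch below is illustrative only: find_conflicting_fields and the request_kwargs dictionary are hypothetical, and the field names simply reuse the property names listed above.

```python
# Settings that belong to the context cache and must not be repeated
# in a request that uses the cache.
CACHE_OWNED_FIELDS = ("system_instructions", "tool_config", "tools")


def find_conflicting_fields(request_kwargs: dict, uses_cache: bool) -> list:
    """Return the cache-owned settings that a request tries to set again."""
    if not uses_cache:
        # Without a context cache, these settings are allowed on the request.
        return []
    return [
        field
        for field in CACHE_OWNED_FIELDS
        if request_kwargs.get(field) is not None
    ]


conflicts = find_conflicting_fields(
    {"tools": ["get_weather"], "temperature": 1}, uses_cache=True
)
print(conflicts)  # prints ['tools']
```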

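Use a context cache sample

The following samples show how to use a context cache. When you use a context cache, you can't specify the following properties: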

GenerativeModel.system_instructions
GenerativeModel.tool_config
GenerativeModel.tools

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.

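Streaming and non-streaming responses

You can choose whether the model generates streaming or non-streaming responses. For streaming responses, you receive each response as soon as its output tokens are generated. For non-streaming responses, you receive all responses after all of the output tokens are generated.

For a streaming response, use the stream parameter in generate_content.

  response = model.generate_content(contents=[...], stream=True)

For a non-streaming response, remove the parameter or set it to False.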
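
The two modes return differently shaped results, so calling code usually handles them separately. The sketch below assumes that a streamed call yields an iterable of response chunks, each exposing a text attribute, and that a non-streamed call returns one response with a text attribute. The helper response_text is hypothetical, and SimpleNamespace objects stand in for SDK responses so that the example runs without credentials.

```python
from types import SimpleNamespace


def response_text(result, stream: bool) -> str:
    """Return the full text of a streamed or non-streamed result."""
    if stream:
        # Streaming: consume chunks in arrival order and join their text.
        return "".join(chunk.text for chunk in result)
    # Non-streaming: the whole answer is available on a single object.
    return result.text


# Stand-ins for what generate_content is assumed to return in each mode.
streamed = iter([SimpleNamespace(text="Hello, "), SimpleNamespace(text="world")])
single = SimpleNamespace(text="Hello, world")

print(response_text(streamed, stream=True))   # prints "Hello, world"
print(response_text(single, stream=False))    # prints "Hello, world"
```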

Sample code

import vertexai

from vertexai.preview.generative_models import GenerativeModel
from vertexai.preview import caching

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# cache_id = "your-cache-id"

vertexai.init(project=PROJECT_ID, location="us-central1")

cached_content = caching.CachedContent(cached_content_name=cache_id)

model = GenerativeModel.from_cached_content(cached_content=cached_content)

response = model.generate_content("What are the papers about?")

print(response.text)
# Example response:
# The provided text is about a new family of multimodal models called Gemini, developed by Google.
# ...

Go

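Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart. For more information, see the Vertex AI Go SDK for Gemini reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

Streaming and non-streaming responses

You can choose whether the model generates streaming or non-streaming responses. For streaming responses, you receive each response as soon as its output tokens are generated. For non-streaming responses, you receive all responses after all of the output tokens are generated.

For a streaming response, use the GenerateContentStream method.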

  iter := model.GenerateContentStream(ctx, genai.Text("Tell me a story about a lumberjack and his giant ox. Keep it very short."))

For a non-streaming response, use the GenerateContent method.

  resp, err := model.GenerateContent(ctx, genai.Text("What is the average size of a swallow?"))

Sample code

import (
	"context"
	"errors"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
)

// useContextCache shows how to use an existing cached content, when prompting the model
// contentName is the ID of the cached content
func useContextCache(w io.Writer, contentName string, projectID, location, modelName string) error {
	// location := "us-central1"
	// modelName := "gemini-1.5-pro-001"
	ctx := context.Background()

	client, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		return fmt.Errorf("unable to create client: %w", err)
	}
	defer client.Close()

	model := client.GenerativeModel(modelName)
	model.CachedContentName = contentName
	prompt := genai.Text("What are the papers about?")

	res, err := model.GenerateContent(ctx, prompt)
	if err != nil {
		return fmt.Errorf("error generating content: %w", err)
	}

	if len(res.Candidates) == 0 ||
		len(res.Candidates[0].Content.Parts) == 0 {
		return errors.New("empty response from model")
	}

	fmt.Fprintf(w, "generated response: %s\n", res.Candidates[0].Content.Parts[0])
	return nil
}

REST

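You can use REST to use a context cache with a prompt by using the Vertex AI API to send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region where the request to create the context cache was processed.
PROMPT_TEXT: The text prompt to submit to the model.

HTTP method and URL: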

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-1.5-pro-002:generateContent

Request JSON body:

{
  "cachedContent": "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID",
  "contents": [
      {"role":"user","parts":[{"text":"PROMPT_TEXT"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95
  },
  "safetySettings": [
      {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_HARASSMENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      }
  ]
}
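
If you prefer to generate the request body rather than edit it by hand, the same structure can be built as a Python dictionary and serialized with the standard json module, which always emits strictly valid JSON. This is a sketch under stated assumptions: build_request_body is a hypothetical helper, and the argument values in the usage lines are placeholders taken from the examples on this page.

```python
import json

# Harm categories used in the request body on this page.
_SAFETY_CATEGORIES = (
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_HARASSMENT",
)


def build_request_body(project_number, location, cache_id, prompt_text):
    """Build the generateContent request body that references a context cache."""
    cached_content = (
        f"projects/{project_number}/locations/{location}"
        f"/cachedContents/{cache_id}"
    )
    return {
        "cachedContent": cached_content,
        "contents": [{"role": "user", "parts": [{"text": prompt_text}]}],
        "generationConfig": {
            "maxOutputTokens": 8192,
            "temperature": 1,
            "topP": 0.95,
        },
        "safetySettings": [
            {"category": category, "threshold": "BLOCK_MEDIUM_AND_ABOVE"}
            for category in _SAFETY_CATEGORIES
        ],
    }


body = build_request_body(
    "123456789012", "us-central1", "123456789012345678", "What are the papers about?"
)
request_json = json.dumps(body, indent=2)  # text suitable for request.json
```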

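To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: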

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-1.5-pro-002:generateContent"

PowerShell

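Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: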

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-1.5-pro-002:generateContent" | Select-Object -Expand Content
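
The same POST request can also be prepared from Python with only the standard library. The sketch below is an illustration, not an official client: build_generate_content_request is a hypothetical helper, the access token is assumed to come from gcloud auth print-access-token, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_generate_content_request(project_id, location, model_id, access_token, body):
    """Prepare (but do not send) the generateContent POST request."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:generateContent"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = build_generate_content_request(
    "test-project",
    "us-central1",
    "gemini-1.5-pro-002",
    "ACCESS_TOKEN",  # placeholder; use a real token when sending
    {"contents": [{"role": "user", "parts": [{"text": "PROMPT_TEXT"}]}]},
)
print(request.get_method(), request.full_url)
```

To send the prepared request, pass it to urllib.request.urlopen and read the JSON response from the returned object.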

You should receive a JSON response similar to the following.

Response

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "MODEL_RESPONSE"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.21866937,
          "severity": "HARM_SEVERITY_NEGLIGIBLE",
          "severityScore": 0.19946389
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "MEDIUM",
          "probabilityScore": 0.6880493,
          "severity": "HARM_SEVERITY_MEDIUM",
          "severityScore": 0.43374163
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.4442634,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.37903354
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.10502681,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.28170192
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 55927,
    "candidatesTokenCount": 105,
    "totalTokenCount": 56032
  }
}
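
A response of this shape can be reduced to the fields most callers need. The following sketch assumes the response has already been parsed into a dictionary, for example with json.loads, and uses only the field names shown in the sample response. The helper summarize_response is hypothetical.

```python
def summarize_response(response: dict) -> dict:
    """Extract the text, finish reason, flagged categories, and token usage."""
    candidate = response["candidates"][0]
    # Join the text of all parts of the first candidate.
    text = "".join(part.get("text", "") for part in candidate["content"]["parts"])
    # Categories whose probability is anything other than NEGLIGIBLE.
    flagged = [
        rating["category"]
        for rating in candidate.get("safetyRatings", [])
        if rating.get("probability") != "NEGLIGIBLE"
    ]
    usage = response.get("usageMetadata", {})
    return {
        "text": text,
        "finish_reason": candidate.get("finishReason"),
        "flagged_categories": flagged,
        "prompt_tokens": usage.get("promptTokenCount"),
        "output_tokens": usage.get("candidatesTokenCount"),
        "total_tokens": usage.get("totalTokenCount"),
    }
```

In the sample response, promptTokenCount (55927) plus candidatesTokenCount (105) equals totalTokenCount (56032).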

Example curl command

LOCATION="us-central1"
MODEL_ID="gemini-1.5-pro-002"
PROJECT_ID="test-project"
# Example values; replace with your own project number and cache ID.
PROJECT_NUMBER="123456789012"
CACHE_ID="123456789012345678"

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent" -d \
'{
  "cachedContent": "projects/'"${PROJECT_NUMBER}"'/locations/'"${LOCATION}"'/cachedContents/'"${CACHE_ID}"'",
  "contents": [
      {"role":"user","parts":[{"text":"What are the benefits of exercise?"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95
  },
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  ]
}'