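Create a context cache

You must create a context cache before you can use it. The context cache that you create contains a large amount of data that you can use in multiple requests to a Gemini model. The cached content is stored in the region where you make the request to create the cache.

Cached content can be any of the MIME types supported by Gemini multimodal models. For example, you can cache a large amount of text, audio, or video. You can specify more than one file to cache. For more information, see the media requirements for each content type.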

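You can specify the content to cache by using a blob, text, or a path to a file that's stored in a Cloud Storage bucket. If the content that you're caching is larger than 10 MB, you must specify it by using the URI of a file that's stored in a Cloud Storage bucket.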
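As a rough illustration of the size rule above, the following sketch picks between inline data and a Cloud Storage URI. The helper name `pick_content_source` is hypothetical, and treating 10 MB as 10 * 1024 * 1024 bytes is an assumption rather than a documented definition.

```python
# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_INLINE_BYTES = 10 * 1024 * 1024


def pick_content_source(size_bytes):
    """Return "inline" if the content may be sent as a blob or text, else "gcs_uri"."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Only content larger than the limit has to go through Cloud Storage.
    return "inline" if size_bytes <= MAX_INLINE_BYTES else "gcs_uri"


print(pick_content_source(2 * 1024 * 1024))   # a small text file
print(pick_content_source(50 * 1024 * 1024))  # a large video file
```

Content at or below the limit can still be referenced by a Cloud Storage URI; the sketch only flags when the URI becomes mandatory.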

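Cached content has a finite lifespan. By default, a context cache expires 60 minutes after it's created. If you want a different expiration time, specify it by using the ttl or the expire_time property when you create the context cache. You can also update the expiration time of a context cache that hasn't expired. To learn how to specify ttl and expire_time, see Update the expiration time.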
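The relationship between a relative ttl and an absolute expire_time can be sketched with the standard library alone. The helper names below are hypothetical; the "86400s" string form matches the ttl value used in the SDK samples on this page.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(minutes=60)  # default lifespan stated above


def ttl_string(ttl):
    """Format a duration as whole seconds followed by "s", for example "3600s"."""
    return f"{int(ttl.total_seconds())}s"


def expire_time_from(created, ttl=DEFAULT_TTL):
    """Compute the absolute expiration time that corresponds to a TTL."""
    return created + ttl


created = datetime(2024, 6, 4, 1, 11, 50, tzinfo=timezone.utc)
print(ttl_string(timedelta(hours=24)))        # 86400s
print(expire_time_from(created).isoformat())  # 2024-06-04T02:11:50+00:00
```

A ttl is convenient when the lifespan is relative to creation; an expire_time fits better when the cache must end at a fixed wall-clock moment.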

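After a context cache expires, it's no longer available. To reference the content of an expired context cache in future prompt requests, you must recreate the context cache.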
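A client can check locally whether a cache has likely expired before deciding to recreate it. This is a minimal sketch that assumes the expireTime value is an RFC 3339 UTC timestamp with six fractional digits, as in the sample responses on this page; `is_expired` is a hypothetical helper, not an SDK function.

```python
from datetime import datetime, timezone


def is_expired(expire_time, now):
    """Return True if `now` is at or past the cache's expireTime string."""
    # Replace the trailing "Z" so that datetime.fromisoformat accepts the value
    # on Python versions earlier than 3.11.
    expires = datetime.fromisoformat(expire_time.replace("Z", "+00:00"))
    return now >= expires


expire_time = "2024-06-04T02:11:50.794542Z"
print(is_expired(expire_time, datetime(2024, 6, 4, 2, 0, tzinfo=timezone.utc)))
print(is_expired(expire_time, datetime(2024, 6, 4, 3, 0, tzinfo=timezone.utc)))
```

When the check returns True, the cache has to be created again because an expired cache can't be revived.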

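Limits

The content that you cache must adhere to the limits shown in the following table:

Context caching limits
Minimum cache token count	2,048 (Gemini 2.5 Pro); 1,024 (Gemini 2.5 Flash); 1,024 (Gemini 2.0 Flash); 1,024 (Gemini 2.0 Flash-Lite)
Maximum size of content that you can cache by using a blob or text	10 MB
Minimum time before a cache expires after it's created	1 minute
Maximum time before a cache expires after it's created	There isn't a maximum cache duration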
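The limits in the table can be checked on the client before a create request is sent. The following is a sketch under stated assumptions: the dictionary keys are hypothetical labels chosen for this example (not official model IDs), and `check_cache_request` is not part of any SDK.

```python
# Minimum token counts from the limits table. The keys are hypothetical labels.
MIN_CACHE_TOKENS = {
    "gemini-2.5-pro": 2048,
    "gemini-2.5-flash": 1024,
    "gemini-2.0-flash": 1024,
    "gemini-2.0-flash-lite": 1024,
}
MIN_TTL_SECONDS = 60  # minimum time before expiry: 1 minute


def check_cache_request(model_label, token_count, ttl_seconds):
    """Return a list of problems; an empty list means the request passes."""
    problems = []
    minimum = MIN_CACHE_TOKENS.get(model_label)
    if minimum is None:
        problems.append(f"unknown model label: {model_label}")
    elif token_count < minimum:
        problems.append(f"{token_count} tokens is below the minimum of {minimum}")
    if ttl_seconds < MIN_TTL_SECONDS:
        problems.append(f"ttl of {ttl_seconds}s is below the minimum of {MIN_TTL_SECONDS}s")
    return problems


print(check_cache_request("gemini-2.5-pro", 1500, 30))
print(check_cache_request("gemini-2.5-flash", 43130, 86400))
```

No upper bound on the TTL is checked because the table states that there isn't a maximum cache duration.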

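Location support

Context caching isn't supported in the Sydney, Australia (australia-southeast1) region.

Encryption key support

Context caching supports customer-managed encryption keys (CMEK), which let you control the encryption of your cached data and protect sensitive information with encryption keys that you manage and own. This provides an additional layer of security and compliance.

For more information, see the CMEK example.

Access Transparency support

Context caching supports Access Transparency.

Create a context cache example

The following examples show how to create a context cache.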

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))

system_instruction = """
You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
Now look at these research papers, and answer the following questions.
"""

contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                mime_type="application/pdf",
            ),
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                mime_type="application/pdf",
            ),
        ],
    )
]

content_cache = client.caches.create(
    model="gemini-2.5-flash",
    config=CreateCachedContentConfig(
        contents=contents,
        system_instruction=system_instruction,
        # (Optional) For enhanced security, the content cache can be encrypted using a Cloud KMS key
        # kms_key_name = "projects/.../locations/us-central1/keyRings/.../cryptoKeys/..."
        display_name="example-cache",
        ttl="86400s",
    ),
)

print(content_cache.name)
print(content_cache.usage_metadata)
# Example response:
#   projects/111111111111/locations/us-central1/cachedContents/1111111111111111111
#   CachedContentUsageMetadata(audio_duration_seconds=None, image_count=167,
#       text_count=153, total_token_count=43130, video_duration_seconds=None)

Go

Learn how to install or update the Gen AI SDK for Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"time"

	genai "google.golang.org/genai"
)

// createContentCache shows how to create a content cache with an expiration parameter.
func createContentCache(w io.Writer) (string, error) {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return "", fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.0-flash-001"

	systemInstruction := "You are an expert researcher. You always stick to the facts " +
		"in the sources provided, and never make up new facts. " +
		"Now look at these research papers, and answer the following questions."

	cacheContents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
					MIMEType: "application/pdf",
				}},
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
					MIMEType: "application/pdf",
				}},
			},
			Role: "user",
		},
	}
	config := &genai.CreateCachedContentConfig{
		Contents: cacheContents,
		SystemInstruction: &genai.Content{
			Parts: []*genai.Part{
				{Text: systemInstruction},
			},
		},
		DisplayName: "example-cache",
		TTL:         86400 * time.Second, // 24 hours
	}

	res, err := client.Caches.Create(ctx, modelName, config)
	if err != nil {
		return "", fmt.Errorf("failed to create content cache: %w", err)
	}

	cachedContent, err := json.MarshalIndent(res, "", "  ")
	if err != nil {
		return "", fmt.Errorf("failed to marshal cache info: %w", err)
	}

	// See the documentation: https://pkg.go.dev/google.golang.org/genai#CachedContent
	fmt.Fprintln(w, string(cachedContent))

	// Example response:
	// {
	//   "name": "projects/111111111111/locations/us-central1/cachedContents/1111111111111111111",
	//   "displayName": "example-cache",
	//   "model": "projects/111111111111/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",
	//   "createTime": "2025-02-18T15:05:08.29468Z",
	//   "updateTime": "2025-02-18T15:05:08.29468Z",
	//   "expireTime": "2025-02-19T15:05:08.280828Z",
	//   "usageMetadata": {
	//     "imageCount": 167,
	//     "textCount": 153,
	//     "totalTokenCount": 43125
	//   }
	// }

	return res.Name, nil
}

REST

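You can use REST to create a context cache by using the Vertex AI API to send a POST request to the publisher model endpoint. The following example shows how to create a context cache from a file stored in a Cloud Storage bucket.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region where the request is processed and where the cached content is stored. For a list of supported regions, see Available regions.
CACHE_DISPLAY_NAME: A meaningful display name that describes each context cache and helps you identify it.
MIME_TYPE: The MIME type of the content to cache.
CONTENT_TO_CACHE_URI: The Cloud Storage URI of the content to cache.
MODEL_ID: The model to use for caching.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents

Request JSON body: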

{
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID",
  "displayName": "CACHE_DISPLAY_NAME",
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "MIME_TYPE",
            "fileUri": "CONTENT_TO_CACHE_URI"
          }
        }
      ]
    },
    {
      "role": "model",
      "parts": [
        {
          "text": "This is sample text to demonstrate explicit caching."
        }
      ]
    }
  ]
}
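Filling in the placeholders by hand is error-prone, so the URL and body can also be generated. The following sketch uses only the Python standard library; `build_create_cache_request` is a hypothetical helper, the sample values are the illustrative ones used elsewhere on this page, and the body is simplified to a single user turn.

```python
import json


def build_create_cache_request(project_id, location, model_id,
                               file_uri, mime_type, display_name):
    """Return the (url, body) pair for a cachedContents create request."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/cachedContents"
    )
    body = {
        "model": (
            f"projects/{project_id}/locations/{location}/"
            f"publishers/google/models/{model_id}"
        ),
        "displayName": display_name,
        "contents": [
            {
                "role": "user",
                "parts": [{"fileData": {"mimeType": mime_type, "fileUri": file_uri}}],
            }
        ],
    }
    return url, body


url, body = build_create_cache_request(
    "test-project", "us-central1", "gemini-2.0-flash-001",
    "gs://path-to-bucket/video-file-name.mp4", "video/mp4", "example-cache",
)
print(url)
print(json.dumps(body, indent=2))  # this output can be saved as request.json
```

The sketch only builds the request; sending it still requires an access token, for example through the curl or PowerShell commands in this section.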

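To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: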

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents"

PowerShell

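Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: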

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

Response

{
  "name": "projects/PROJECT_NUMBER/locations/us-central1/cachedContents/CACHE_ID",
  "model": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",
  "createTime": "2024-06-04T01:11:50.808236Z",
  "updateTime": "2024-06-04T01:11:50.808236Z",
  "expireTime": "2024-06-04T02:11:50.794542Z"
}
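The name field in the response is the resource name that later requests use to reference the cache. As a sketch, it can be split into its components as follows; `parse_cache_name` is a hypothetical helper, and the expected six-segment layout is inferred from the sample responses on this page.

```python
def parse_cache_name(name):
    """Split projects/P/locations/L/cachedContents/ID into its components."""
    parts = name.split("/")
    expected = ("projects", "locations", "cachedContents")
    # The even positions hold fixed collection names; the odd positions hold IDs.
    if len(parts) != 6 or tuple(parts[0::2]) != expected or not all(parts[1::2]):
        raise ValueError(f"unexpected cache resource name: {name!r}")
    return {"project": parts[1], "location": parts[3], "cache_id": parts[5]}


info = parse_cache_name(
    "projects/111111111111/locations/us-central1/cachedContents/1111111111111111111"
)
print(info["location"], info["cache_id"])
```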

Example curl command

LOCATION="us-central1"
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="test-project"
MIME_TYPE="video/mp4"
CACHED_CONTENT_URI="gs://path-to-bucket/video-file-name.mp4"

# The request body is passed through an unquoted here-document so that
# the shell expands the ${...} variables defined above.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents" \
-d @- <<EOF
{
  "model": "projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}",
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "${MIME_TYPE}",
            "fileUri": "${CACHED_CONTENT_URI}"
          }
        }
      ]
    }
  ]
}
EOF

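Create a context cache that uses CMEK

To implement context caching with CMEK, create a CMEK by following the instructions, and make sure that the Vertex AI per-product, per-project service account (P4SA) has the required Cloud KMS CryptoKey Encrypter/Decrypter permissions on the key. This lets you securely create and manage cached content, and make other calls such as {List, Update, Delete, Get} CachedContent without repeatedly specifying a KMS key.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region where the request is processed and where the cached content is stored. For a list of supported regions, see Available regions.
MODEL_ID: gemini-2.0-flash-001.
CACHE_DISPLAY_NAME: A meaningful display name that describes each context cache and helps you identify it.
MIME_TYPE: The MIME type of the content to cache.
CONTENT_TO_CACHE_URI: The Cloud Storage URI of the content to cache.
KMS_KEY_NAME: The Cloud KMS key name.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents

Request JSON body: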

{
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001",
  "displayName": "CACHE_DISPLAY_NAME",
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "MIME_TYPE",
            "fileUri": "CONTENT_TO_CACHE_URI"
          }
        }
      ]
    }
  ],
  "encryptionSpec": {
    "kmsKeyName": "KMS_KEY_NAME"
  }
}
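Compared with a request that doesn't use CMEK, the only difference is the encryptionSpec object. The following sketch assembles a Cloud KMS key name and attaches it to a request body. Both helper names are hypothetical, and the key ring and key values are placeholders.

```python
def kms_key_name(project_id, location, key_ring, key):
    """Build a Cloud KMS key resource name from its components."""
    return (
        f"projects/{project_id}/locations/{location}/"
        f"keyRings/{key_ring}/cryptoKeys/{key}"
    )


def with_cmek(body, key_name):
    """Return a copy of a create-cache request body with encryptionSpec added."""
    return {**body, "encryptionSpec": {"kmsKeyName": key_name}}


base_body = {"displayName": "example-cache", "contents": []}
key_name = kms_key_name("test-project", "us-central1", "your-key-ring", "your-key")
cmek_body = with_cmek(base_body, key_name)
print(cmek_body["encryptionSpec"]["kmsKeyName"])
```

`with_cmek` returns a new dictionary, so the same base body can be reused for requests that don't specify a key.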

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents"

PowerShell

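Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: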

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

Response

{
  "name": "projects/PROJECT_NUMBER/locations/us-central1/cachedContents/CACHE_ID",
  "model": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",
  "createTime": "2024-06-04T01:11:50.808236Z",
  "updateTime": "2024-06-04T01:11:50.808236Z",
  "expireTime": "2024-06-04T02:11:50.794542Z"
}

Example curl command

LOCATION="us-central1"
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="test-project"
MIME_TYPE="video/mp4"
CACHED_CONTENT_URI="gs://path-to-bucket/video-file-name.mp4"
KMS_KEY_NAME="projects/${PROJECT_ID}/locations/${LOCATION}/keyRings/your-key-ring/cryptoKeys/your-key"

# The request body is passed through an unquoted here-document so that
# the shell expands the ${...} variables defined above.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents" \
-d @- <<EOF
{
  "model": "projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}",
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "${MIME_TYPE}",
            "fileUri": "${CACHED_CONTENT_URI}"
          }
        }
      ]
    }
  ],
  "encryptionSpec": {
    "kmsKeyName": "${KMS_KEY_NAME}"
  }
}
EOF

GenAI SDK for Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

import os
from google import genai
from google.genai.types import Content, CreateCachedContentConfig, HttpOptions, Part

os.environ['GOOGLE_CLOUD_PROJECT'] = 'vertexsdk'
os.environ['GOOGLE_CLOUD_LOCATION'] = 'us-central1'
os.environ['GOOGLE_GENAI_USE_VERTEXAI'] = 'True'
  
client = genai.Client(http_options=HttpOptions(api_version="v1"))

system_instruction = """
You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
Now look at these research papers, and answer the following questions.
"""

contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                mime_type="application/pdf",
            ),
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                mime_type="application/pdf",
            ),
        ],
    )
]

content_cache = client.caches.create(
    model="gemini-2.0-flash-001",
    config=CreateCachedContentConfig(
        contents=contents,
        system_instruction=system_instruction,
        display_name="example-cache",
        kms_key_name="projects/vertexsdk/locations/us-central1/keyRings/your-project/cryptoKeys/your-key",
        ttl="86400s",
    ),
)

print(content_cache.name)
print(content_cache.usage_metadata)

GenAI SDK for Go

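Learn how to install or update the Gen AI SDK for Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True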

import (
    "context"
    "encoding/json"
    "fmt"
    "io"
    "time"

    genai "google.golang.org/genai"
)

// createContentCache shows how to create a CMEK-encrypted content cache with an expiration parameter.
func createContentCache(w io.Writer) (string, error) {
    ctx := context.Background()

    client, err := genai.NewClient(ctx, &genai.ClientConfig{
        HTTPOptions: genai.HTTPOptions{APIVersion: "v1beta1"},
    })
    if err != nil {
        return "", fmt.Errorf("failed to create genai client: %w", err)
    }

    modelName := "gemini-2.0-flash-001"

    systemInstruction := "You are an expert researcher. You always stick to the facts " +
        "in the sources provided, and never make up new facts. " +
        "Now look at these research papers, and answer the following questions."

    cacheContents := []*genai.Content{
        {
            Parts: []*genai.Part{
                {FileData: &genai.FileData{
                    FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                    MIMEType: "application/pdf",
                }},
                {FileData: &genai.FileData{
                    FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                    MIMEType: "application/pdf",
                }},
            },
            Role: "user",
        },
    }
    config := &genai.CreateCachedContentConfig{
        Contents: cacheContents,
        SystemInstruction: &genai.Content{
            Parts: []*genai.Part{
                {Text: systemInstruction},
            },
        },
        DisplayName: "example-cache",
        KmsKeyName:  "projects/vertexsdk/locations/us-central1/keyRings/your-project/cryptoKeys/your-key",
        // TTL is a time.Duration, as in the first Go sample on this page.
        TTL: 86400 * time.Second,
    }

    res, err := client.Caches.Create(ctx, modelName, config)
    if err != nil {
        return "", fmt.Errorf("failed to create content cache: %w", err)
    }

    cachedContent, err := json.MarshalIndent(res, "", "  ")
    if err != nil {
        return "", fmt.Errorf("failed to marshal cache info: %w", err)
    }

    // See the documentation: https://pkg.go.dev/google.golang.org/genai#CachedContent
    fmt.Fprintln(w, string(cachedContent))

    return res.Name, nil
}

What's next

Learn how to use a context cache.
Learn how to update the expiration time of a context cache.
