このページは Cloud Translation API によって翻訳されました。

コンテキストキャッシュを使用する

REST API または Python SDK を使用して、生成 AI アプリケーションのコンテキストキャッシュに保存されているコンテンツを参照できます。使用するには、まずコンテキストキャッシュを作成する必要があります。

コードで使用するコンテキストキャッシュオブジェクトには、次のプロパティが含まれます。

name - コンテキストキャッシュのリソース名。形式は　projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID　です。コンテキストキャッシュを作成すると、レスポンスにはそのリソース名が含まれます。プロジェクト番号はプロジェクトの固有識別子です。キャッシュ ID はキャッシュの ID です。コード内でコンテキストキャッシュを指定する場合、完全なコンテキストキャッシュリソース名を使用する必要があります。以下に示すのは、キャッシュに保存されたコンテンツのリソース名をリクエスト本文で指定する方法です。
```
"cached_content": "projects/123456789012/locations/us-central1/123456789012345678"
```
model - キャッシュの作成に使用されるモデルのリソース名。形式は　projects/PROJECT_NUMBER/locations/LOCATION/publishers/PUBLISHER_NAME/models/MODEL_ID　です。
createTime - コンテキストキャッシュの作成日時を指定する Timestamp。
updateTime - コンテキストキャッシュの最新更新時間を指定する Timestamp。コンテキストキャッシュが作成されてから更新されるまでの間、createTime と updateTime は同じです。
expireTime - コンテキストキャッシュの有効期限を指定する Timestamp。デフォルトの expireTime は createTime の 60 分後です。キャッシュに新しい有効期限を設定できます。詳細については、コンテキストキャッシュを更新するをご覧ください。有効期限が切れたキャッシュは削除対象としてマークされます。期限切れのキャッシュは使用または更新できません。期限切れのコンテキストキャッシュを使用する必要がある場合は、適切な有効期限を設定して再作成する必要があります。

コンテキストキャッシュの使用制限

コンテキストキャッシュの作成時に、次の機能を指定できます。リクエストで再度指定しないでください。

GenerativeModel.system_instructions プロパティ。このプロパティは、モデルがユーザーから指示を受け取る前に、モデルへの指示を指定するために使用されます。詳細については、システム指示をご覧ください。
GenerativeModel.tool_config プロパティ。tool_config プロパティは、関数呼び出し機能で使用されるツールなど、Gemini モデルで使用されるツールを指定するために使用されます。
GenerativeModel.tools プロパティ。GenerativeModel.tools プロパティは、関数呼び出しアプリケーションを作成する関数を指定するために使用されます。詳細については、関数呼び出しをご覧ください。

コンテキストキャッシュの使用例

コンテキストキャッシュを使用するには、以下の方法があります。コンテキストキャッシュを使用する場合は、これらのプロパティを指定できません。

GenerativeModel.system_instructions
GenerativeModel.tool_config
GenerativeModel.tools

Python

インストール

pip install --upgrade google-genai

詳しくは、SDK リファレンスドキュメントをご覧ください。

Vertex AI で Gen AI SDK を使用するための環境変数を設定します。

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
# Use content cache to generate text response
# E.g cache_name = 'projects/.../locations/.../cachedContents/1111111111111111111'
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the pdfs",
    config=GenerateContentConfig(
        cached_content=cache_name,
    ),
)
print(response.text)
# Example response
#   The Gemini family of multimodal models from Google DeepMind demonstrates remarkable capabilities across various
#   modalities, including image, audio, video, and text....

Go

Go をインストールまたは更新する方法について学びます。

詳しくは、SDK リファレンスドキュメントをご覧ください。

Vertex AI で Gen AI SDK を使用するための環境変数を設定します。

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// useContentCacheWithTxt shows how to use content cache to generate text content.
func useContentCacheWithTxt(w io.Writer, cacheName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.5-flash",
		genai.Text("Summarize the pdfs"),
		&genai.GenerateContentConfig{
			CachedContent: cacheName,
		},
	)
	if err != nil {
		return fmt.Errorf("failed to use content cache to generate content: %w", err)
	}

	respText := resp.Text()

	fmt.Fprintln(w, respText)

	// Example response:
	// The provided research paper introduces Gemini 1.5 Pro, a multimodal model capable of recalling
	// and reasoning over information from very long contexts (up to 10 million tokens).  Key findings include:
	//
	// * **Long Context Performance:**
	// ...

	return nil
}

Java

Java をインストールまたは更新します。

詳しくは、SDK リファレンスドキュメントをご覧ください。

Vertex AI で Gen AI SDK を使用するための環境変数を設定します。

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.HttpOptions;

public class ContentCacheUseWithText {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    // E.g cacheName = "projects/111111111111/locations/global/cachedContents/1111111111111111111"
    String cacheName = "your-cache-name";
    contentCacheUseWithText(modelId, cacheName);
  }

  // Shows how to generate text using cached content
  public static String contentCacheUseWithText(String modelId, String cacheName) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              "Summarize the pdfs",
              GenerateContentConfig.builder().cachedContent(cacheName).build());

      System.out.println(response.text());
      // Example response
      // The Gemini family of multimodal models from Google DeepMind demonstrates remarkable
      // capabilities across various
      // modalities, including image, audio, video, and text....
      return response.text();
    }
  }
}

Node.js

インストール

npm install @google/genai

詳しくは、SDK リファレンスドキュメントをご覧ください。

Vertex AI で Gen AI SDK を使用するための環境変数を設定します。

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function useContentCache(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION,
  cacheName = 'example-cache'
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
    httpOptions: {
      apiVersion: 'v1',
    },
  });

  const response = await client.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: 'Summarize the pdfs',
    config: {
      cachedContent: cacheName,
    },
  });

  console.log(response.text);

  return response.text;
}
// Example response
//    The Gemini family of multimodal models from Google DeepMind demonstrates remarkable capabilities across various
//    modalities, including image, audio, video, and text....

REST

REST を使用してプロンプトでコンテキストキャッシュを使用するには、Vertex AI API を使用してパブリッシャーモデルエンドポイントに POST リクエストを送信します。

リクエストのデータを使用する前に、次のように置き換えます。

PROJECT_ID: 実際のプロジェクト ID。
LOCATION: コンテキストキャッシュの作成のリクエストが処理されたリージョン。
MIME_TYPE: モデルに送信するテキストプロンプト。

HTTP メソッドと URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent

リクエストの本文（JSON）:

{
  "cachedContent": "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID",
  "contents": [
      {"role":"user","parts":[{"text":"PROMPT_TEXT"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95,
  },
  "safetySettings": [
      {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_HARASSMENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      }
  ],
}

リクエストを送信するには、次のいずれかのオプションを選択します。

curl

注: 次のコマンドは、gcloud init または gcloud auth login を実行して、ユーザーアカウントで gcloud CLI にログインしているか、Cloud Shell を使用して自動的に gcloud CLI にログインしていることを前提としています。gcloud auth list を実行すると、現在アクティブなアカウントを確認できます。

リクエスト本文を request.json という名前のファイルに保存して、次のコマンドを実行します。

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent"

PowerShell

注: 次のコマンドは、gcloud init または gcloud auth login を実行して、ご自分のユーザーアカウントで gcloud CLI にログインしていることを前提としています。gcloud auth list を実行すると、現在アクティブなアカウントを確認できます。

リクエスト本文を request.json という名前のファイルに保存して、次のコマンドを実行します。

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent" | Select-Object -Expand Content

次のような JSON レスポンスが返されます。

レスポンス

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "MODEL_RESPONSE"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.21866937,
          "severity": "HARM_SEVERITY_NEGLIGIBLE",
          "severityScore": 0.19946389
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "MEDIUM",
          "probabilityScore": 0.6880493,
          "severity": "HARM_SEVERITY_MEDIUM",
          "severityScore": 0.43374163
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.4442634,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.37903354
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.10502681,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.28170192
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 55927,
    "candidatesTokenCount": 105,
    "totalTokenCount": 56032
  }
}

curl コマンドの例

LOCATION="us-central1"
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="test-project"

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent" -d \
'{
  "cachedContent": "projects/${PROJECT_NUMBER}/locations/${LOCATION}/cachedContents/${CACHE_ID}",
  "contents": [
      {"role":"user","parts":[{"text":"What are the benefits of exercise?"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95,
  },
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  ],
}'

コンテキストキャッシュの有効期限を更新する方法を確認する。
新しいコンテキストキャッシュを作成する方法を確認する。
Google Cloud プロジェクトに関連付けられているすべてのコンテキストキャッシュに関する情報を取得する方法を確認する。
コンテキストキャッシュを削除する方法を確認する。

コンテキスト キャッシュを使用する

コンテキスト キャッシュの使用制限

コンテキスト キャッシュの使用例

Python

インストール

Go

Java

Node.js

インストール

REST

curl

PowerShell

レスポンス

curl コマンドの例

コンテキストキャッシュを使用する

コンテキストキャッシュの使用制限

コンテキストキャッシュの使用例