Get information about a context cache

This guide shows you how to get information about one or more context caches, covering the following topics:

The following diagram summarizes the workflow:

You can retrieve details about a context cache, such as its creation time, last update time, and expiration time. To get information for all context caches in a Google Cloud project, including their IDs, you can list them. If you already have a cache's ID, you can retrieve its information directly.

Get a list of context caches

To get a list of the context caches in your Google Cloud project, you need the project ID and the region where the caches were created.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))

content_cache_list = client.caches.list()

# Access individual properties of a ContentCache object(s)
for content_cache in content_cache_list:
    print(f"Cache `{content_cache.name}` for model `{content_cache.model}`")
    print(f"Last updated at: {content_cache.update_time}")
    print(f"Expires at: {content_cache.expire_time}")

# Example response:
# * Cache `projects/111111111111/locations/us-central1/cachedContents/1111111111111111111` for
#       model `projects/111111111111/locations/us-central1/publishers/google/models/gemini-XXX-pro-XXX`
# * Last updated at: 2025-02-13 14:46:42.620490+00:00
# * CachedContentUsageMetadata(audio_duration_seconds=None, image_count=167, text_count=153, total_token_count=43130, video_duration_seconds=None)
# ...

REST

The following shows how to use REST to list the context caches associated with a Google Cloud project by sending a GET request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

Example curl command

LOCATION="us-central1"
PROJECT_ID="PROJECT_ID"

curl \
-X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents

Get information about a specific context cache

To get information about a specific context cache, you need its cache ID, the associated Google Cloud project ID, and the region where it was created. A cache ID is returned when you create a context cache and can also be found by listing the context caches in your project.

Go

Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart. For more information, see the Vertex AI Go SDK for Gemini reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.

Streaming and non-streaming responses

You can choose whether the model generates streaming responses or non-streaming responses. For streaming responses, you receive each response as soon as its output token is generated. For non-streaming responses, you receive all responses after all of the output tokens are generated.

For a streaming response, use the GenerateContentStream method.

  iter := model.GenerateContentStream(ctx, genai.Text("Tell me a story about a lumberjack and his giant ox. Keep it very short."))
  

For a non-streaming response, use the GenerateContent method.

  resp, err := model.GenerateContent(ctx, genai.Text("What is the average size of a swallow?"))
  

Sample code

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
)

// getContextCache shows how to retrieve the metadata of a cached content
// contentName is the ID of the cached content to retrieve
func getContextCache(w io.Writer, contentName string, projectID, location string) error {
	// location := "us-central1"
	ctx := context.Background()

	client, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		return fmt.Errorf("unable to create client: %w", err)
	}
	defer client.Close()

	cachedContent, err := client.GetCachedContent(ctx, contentName)
	if err != nil {
		return fmt.Errorf("GetCachedContent: %w", err)
	}
	fmt.Fprintf(w, "Retrieved cached content %q", cachedContent.Name)
	return nil
}

REST

The following shows how to use REST to list the context caches associated with a Google Cloud project by sending a GET request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: .
  • LOCATION: The region where the request to create the context cache was processed.
  • CACHE_ID: The ID of the context cache. The context cache ID is returned when you create the context cache. You can also find context cache IDs by listing the context caches for a Google Cloud project using. For more information, see create a context cache and list context caches.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents/CACHE_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

Example curl command

LOCATION="us-central1"
PROJECT_ID="PROJECT_ID"
CACHE_ID="CACHE_ID"

curl \
-X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/${CACHE_ID}

What's next