Try Gemini 1.5 Pro, our most advanced multimodal model in Vertex AI, and see what you can build with a 1M token context window. Try Gemini 1.5 Pro, our most advanced multimodal model in Vertex AI, and see what you can build with a 1M token context window.

Get token count and billable characters

This page shows you how to get the token count and the number of billable characters for a prompt.

Supported models

The following multimodal models support getting prompt token counts:

gemini-1.5-pro-preview-0409 (Preview)
gemini-1.0-pro-002
gemini-1.0-pro-vision-001

To learn more about model versions, see Gemini model versions and lifecycle.

Get the token count for a prompt

You can get the token count and the number of billable characters for a prompt by using the Vertex AI API.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

import vertexai
from vertexai.generative_models import GenerativeModel

# TODO(developer): Update and un-comment below line
# project_id = "PROJECT_ID"

vertexai.init(project=project_id, location="us-central1")

model = GenerativeModel(model_name="gemini-1.0-pro-002")

prompt = "Why is the sky blue?"

# Prompt tokens count
response = model.count_tokens(prompt)
print(f"Prompt Token Count: {response.total_tokens}")
print(f"Prompt Character Count: {response.total_billable_characters}")

# Send text to Gemini
response = model.generate_content(prompt)

# Response tokens count
usage_metadata = response.usage_metadata
print(f"Prompt Token Count: {usage_metadata.prompt_token_count}")
print(f"Candidates Token Count: {usage_metadata.candidates_token_count}")
print(f"Total Token Count: {usage_metadata.total_token_count}")

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.CountTokensResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;
  public static int getTokenCount(String projectId, String location, String modelName,
                                  String textPrompt)
      throws IOException {
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);
      CountTokensResponse response = model.countTokens(textPrompt);

      int tokenCount = response.getTotalTokens();
      System.out.println("There are " + tokenCount + " tokens in the prompt.");

      return tokenCount;
    }
  }

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
async function countTokens(
  projectId = 'PROJECT_ID',
  location = 'us-central1',
  model = 'gemini-1.0-pro-002'
) {
  // Initialize Vertex with your Cloud project and location
  const vertexAI = new VertexAI({project: projectId, location: location});

  // Instantiate the model
  const generativeModel = vertexAI.getGenerativeModel({
    model: model,
  });

  const req = {
    contents: [{role: 'user', parts: [{text: 'How are you doing today?'}]}],
  };

  const countTokensResp = await generativeModel.countTokens(req);
  console.log('count tokens response: ', countTokensResp);
}

REST

To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

LOCATION: The region to process the request. Available options include the following:
Click to expand available regions
- us-central1
- us-west4
- northamerica-northeast1
- us-east4
- us-west1
- asia-northeast3
- asia-southeast1
- asia-northeast1
PROJECT_ID: Your project ID.
MODEL_ID: The model ID of the multimodal model that you want to use. The options are:
- gemini-1.0-pro-vision
- gemini-1.0-pro
ROLE: The role in a conversation associated with the content. Specifying a role is required even in singleturn use cases. Acceptable values include the following:
- USER: Specifies content that's sent by you.
TEXT: The text instructions to include in the prompt.
IMAGE_BYTES: A sequence of bytes instead of characters.
FILE_URI: The Cloud Storage URI of the image or video to include in the prompt. The bucket that stores the file must be in the same Google Cloud project that's sending the request. You must also specify MIMETYPE.
MIME_TYPE: The media type of the image, PDF, or video specified in the data or fileUri fields. Acceptable values include the following:
Click to expand MIME types
- application/pdf
- audio/mpeg
- audio/mp3
- audio/wav
- image/png
- image/jpeg
- text/plain
- video/mov
- video/mpeg
- video/mp4
- video/mpg
- video/avi
- video/wmv
- video/mpegps
- video/flv

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens

Request JSON body:

{
  "contents": {
    "role": "ROLE",
    "parts": [
      {
        "inlineData": {
          "mimeType": "MIME_TYPE",
          "data": "IMAGE_BYTES"
        }
      },
      {
        "fileData": {
          "mimeType": "MIME_TYPE",
          "fileUri": "FILE_URI"
        }
      },
      {
        "text": "TEXT"
      }
    ]
  }
}

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login , or by using Cloud Shell, which automatically logs you into the gcloud CLI . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Response

{ "totalTokens": 43 }

Example curl command for text with image or video:

MODEL_ID="gemini-1.0-pro-vision"
PROJECT_ID="my-project"
TEXT="Provide a summary with about two sentences for the following article."
REGION="us-central1"

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:countTokens -d \
$'{
    "contents": [{
      "role": "user",
      "parts": [
        {
          "file_data": {
            "file_uri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
            "mime_type": "video/mp4"
          }
        },
        {
          "text": "'"$TEXT"'"
        }]
    }]
 }'

Example curl command for text only:

MODEL_ID="gemini-1.0-pro-vision"
PROJECT_ID="my-project"
TEXT="Provide a summary with about two sentences for the following article."
REGION="us-central1"

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:countTokens -d \
$'{
  "contents": [{
      "role": "user",
      "parts": [{
        "text": "'"$TEXT"'"
      }]
    }]
 }'

Pricing and quota

There is no charge or quota restriction for using the CountTokens API. The maximum quota for the CountTokens API is 3000 requests per minute.

What's next

Learn how to test chat prompts.
Learn how to test text prompts.