Get token count and billable characters

Generative AI models break down text data into units called tokens for processing. The way text data is converted into tokens depends on the tokenizer used. A token can be characters, words, or phrases. Each model has a maximum number of tokens that it can handle in a prompt and response. This page shows you how to get the token count and the number of billable characters for a prompt.

Supported models

The following foundation models support getting prompt token counts:

  • text-bison
  • chat-bison
  • code-bison
  • codechat-bison
  • code-gecko
  • textembedding-gecko

To learn how to get the prompt token count for Gemini models, see the get token count instructions for the Vertex AI Gemini API.

Get the token count for a prompt

You can get the token count and the number of billable characters for a prompt by using the countTokens API. The input format for countTokens depends on the model you use. Each input format is the same as the predict input format.

text-bison

REST

To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
  • PROJECT_ID: Your project ID.
  • PROMPT: The prompt to get the token count and billable characters for. (Don't add quotes around the prompt here.)

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/text-bison:countTokens

Request JSON body:

{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

cat > request.json << 'EOF'
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
EOF

Then execute the following command to send your REST request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/text-bison:countTokens"

PowerShell

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

@'
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
'@  | Out-File -FilePath request.json -Encoding utf8

Then execute the following command to send your REST request:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/text-bison:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

{
  "totalTokens": 12,
  "totalBillableCharacters": 54
}

Example curl command

PROJECT_ID="PROJECT_ID"

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/text-bison:countTokens -d \
$'{
  "instances": [
    { "prompt": "Give me ten interview questions for the role of program manager." },
    { "prompt": "List some good qualities for a program manager." }
  ]
}'

chat-bison

REST

To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
  • PROJECT_ID: Your project ID.
  • CONTEXT: Optional. Context can be instructions that you give to the model on how it should respond or information that it uses or references to generate a response. Add contextual information in your prompt when you need to give information to the model, or restrict the boundaries of the responses to only what's within the context.
  • EXAMPLE_AUTHOR_1: The author of EXAMPLE_INPUT (the user).
  • EXAMPLE_INPUT: Example of a message.
  • EXAMPLE_AUTHOR_2: The author of EXAMPLE_OUTPUT (the bot).
  • EXAMPLE_OUTPUT: Example of the ideal response.
  • AUTHOR_1: The author of the message (the user).
  • CONTENT: The content of the message.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/chat-bison:countTokens

Request JSON body:

{
  "instances": [
    {
      "context": "CONTEXT",
      "examples": [
        {
          "input": {
            "author": "EXAMPLE_AUTHOR_1",
            "content": "EXAMPLE_INPUT"
          },
          "output": {
            "author": "EXAMPLE_AUTHOR_2",
            "content": "EXAMPLE_OUTPUT"
          }
        }
      ],
      "messages": [
        {
          "author": "AUTHOR_1",
          "content": "CONTENT"
        }
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

cat > request.json << 'EOF'
{
  "instances": [
    {
      "context": "CONTEXT",
      "examples": [
        {
          "input": {
            "author": "EXAMPLE_AUTHOR_1",
            "content": "EXAMPLE_INPUT"
          },
          "output": {
            "author": "EXAMPLE_AUTHOR_2",
            "content": "EXAMPLE_OUTPUT"
          }
        }
      ],
      "messages": [
        {
          "author": "AUTHOR_1",
          "content": "CONTENT"
        }
      ]
    }
  ]
}
EOF

Then execute the following command to send your REST request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/chat-bison:countTokens"

PowerShell

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

@'
{
  "instances": [
    {
      "context": "CONTEXT",
      "examples": [
        {
          "input": {
            "author": "EXAMPLE_AUTHOR_1",
            "content": "EXAMPLE_INPUT"
          },
          "output": {
            "author": "EXAMPLE_AUTHOR_2",
            "content": "EXAMPLE_OUTPUT"
          }
        }
      ],
      "messages": [
        {
          "author": "AUTHOR_1",
          "content": "CONTENT"
        }
      ]
    }
  ]
}
'@  | Out-File -FilePath request.json -Encoding utf8

Then execute the following command to send your REST request:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/chat-bison:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

{
  "totalTokens": 43,
  "totalBillableCharacters": 182
}

Example curl command

PROJECT_ID="PROJECT_ID"

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/chat-bison:countTokens -d \
$'{
  "instances": [
    {
      "context": "You are Captain Bartholomew, the most feared pirate dog of the seven seas.",
      "examples": [
        {
          "input": {
            "author": "User",
            "content": "Hello!"
          },
          "output": {
            "author": "Captain Barktholomew",
            "content": "Argh! What brings ye to my ship?"
          }
        },
        {
          "input": {
            "author": "User",
            "content": "Who are you?"
          },
          "output": {
            "author": "Captain Barktholomew",
            "content": "I be Captain Barktholomew, the most feared pirate dog of the seven seas."
          }
        }
      ],
      "messages": [
        {
          "author": "User",
          "content": "Hello!"
        },
        {
          "author": "Captain Barktholomew",
          "content": "Ahoy there, landlubber! What brings ye to me ship?"
        },
        {
          "author": "User",
          "content": "Can you tell me a tale of your most recent adventure?"
        },
        {
          "author": "Captain Barktholomew",
          "content": "Aye, I\'ll spin ye a tale of me latest adventure....",
        },
        {
          "author": "User",
          "content": "I\'m listening."
        }
      ]
    }
  ]
}'

code-bison

REST

To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
  • PROJECT_ID: Your project ID.
  • PREFIX: For code models, prefix represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-bison:countTokens

Request JSON body:

{
  "instances": [
    {
      "prefix": "PREFIX"
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

cat > request.json << 'EOF'
{
  "instances": [
    {
      "prefix": "PREFIX"
    }
  ]
}
EOF

Then execute the following command to send your REST request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-bison:countTokens"

PowerShell

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

@'
{
  "instances": [
    {
      "prefix": "PREFIX"
    }
  ]
}
'@  | Out-File -FilePath request.json -Encoding utf8

Then execute the following command to send your REST request:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-bison:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

{
  "totalTokens": 43,
  "totalBillableCharacters": 182
}

Example curl command

PROJECT_ID=PROJECT_ID

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/code-bison:countTokens -d \
$'{
  "instances": [
    {
      "prefix": "Write a function that checks if a year is a leap year."
    }
  ]
}'

codechat-bison

REST

To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
  • PROJECT_ID: Your project ID.
  • AUTHOR: The author of the message.
  • CONTENT: The content of the message.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/codechat-bison:countTokens

Request JSON body:

{
  "instances": {
    "messages": [
      {
         "author": "AUTHOR",
         "content": "CONTENT"
      }
    ]
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

cat > request.json << 'EOF'
{
  "instances": {
    "messages": [
      {
         "author": "AUTHOR",
         "content": "CONTENT"
      }
    ]
  }
}
EOF

Then execute the following command to send your REST request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/codechat-bison:countTokens"

PowerShell

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

@'
{
  "instances": {
    "messages": [
      {
         "author": "AUTHOR",
         "content": "CONTENT"
      }
    ]
  }
}
'@  | Out-File -FilePath request.json -Encoding utf8

Then execute the following command to send your REST request:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/codechat-bison:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

{
  "totalTokens": 43,
  "totalBillableCharacters": 182
}

Example curl command

PROJECT_ID=PROJECT_ID

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/codechat-bison:countTokens -d \
$'{
  "instances": {
    "messages": [
      {
        "author": "user",
        "content": "Hi, how are you?"
      },
      {
        "author": "system",
        "content": "I am doing good. What Can I help you with in the coding world?"
      },
      {
        "author": "user",
        "content": "Please help write a function to calculate the min of two numbers"
      }
    ]
  }
}'

code-gecko

REST

To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
  • PROJECT_ID: Your project ID.
  • PREFIX: For code models, prefix represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated. The model attempts to fill in the code in between the prefix and suffix.
  • SUFFIX: For code completion, suffix represents the end of a piece of meaningful programming code. The model attempts to fill in the code in between the prefix and suffix.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-gecko:countTokens

Request JSON body:

{
  "instances": [
    {
      "prefix": "PREFIX",
      "suffix": "SUFFIX"
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

cat > request.json << 'EOF'
{
  "instances": [
    {
      "prefix": "PREFIX",
      "suffix": "SUFFIX"
    }
  ]
}
EOF

Then execute the following command to send your REST request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-gecko:countTokens"

PowerShell

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

@'
{
  "instances": [
    {
      "prefix": "PREFIX",
      "suffix": "SUFFIX"
    }
  ]
}
'@  | Out-File -FilePath request.json -Encoding utf8

Then execute the following command to send your REST request:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-gecko:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

{
  "totalTokens": 43,
  "totalBillableCharacters": 182
}

Example curl command

PROJECT_ID=PROJECT_ID

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/code-gecko:countTokens -d \
$'{
  "instances": [
    {
      "prefix": "def reverse_string(s):",
      "suffix": ""
    }
  ]
}'

textembedding-gecko

REST

To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
  • PROJECT_ID: Your project ID.
  • TEXT: The text that you want to generate embeddings for.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/textembedding-gecko:countTokens

Request JSON body:

{
  "instances": [
    {
      "content": "TEXT"
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

cat > request.json << 'EOF'
{
  "instances": [
    {
      "content": "TEXT"
    }
  ]
}
EOF

Then execute the following command to send your REST request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/textembedding-gecko:countTokens"

PowerShell

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

@'
{
  "instances": [
    {
      "content": "TEXT"
    }
  ]
}
'@  | Out-File -FilePath request.json -Encoding utf8

Then execute the following command to send your REST request:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/textembedding-gecko:countTokens" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

{
  "totalTokens": 43,
  "totalBillableCharacters": 182
}

Example curl command

PROJECT_ID=PROJECT_ID

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/textembedding-gecko:countTokens -d \
$'{
  "instances": [
    {
      "content": "What is life?"
    }
  ]
}'

Pricing and quota

There is no charge for using the CountTokens API. The maximum quota for the CountTokens API and the ComputeTokens API is 3000 requests per minute.

What's next