Generative AI models break down text data into units called tokens for processing. The way text data is converted into tokens depends on the tokenizer used. A token can be characters, words, or phrases. Each model has a maximum number of tokens that it can handle in a prompt and response. This page shows you how to get the token count and the number of billable characters for a prompt.
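As a rough illustration of why the count depends on the tokenizer, the following sketch compares two toy tokenizers written for this example. They are not the tokenizers that the foundation models use; they only show that the same text yields a different number of tokens under different rules.

```python
# Toy tokenizers for illustration only. Real models use learned subword
# tokenizers, so their counts will differ from both of these.

def char_tokens(text: str) -> list[str]:
    """Treat every non-whitespace character as one token."""
    return [ch for ch in text if not ch.isspace()]


def word_tokens(text: str) -> list[str]:
    """Treat every whitespace-separated word as one token."""
    return text.split()


text = "What is life?"
print(len(char_tokens(text)))  # 11
print(len(word_tokens(text)))  # 3
```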
Supported models
The following foundation models support getting prompt token counts:
- text-bison
- chat-bison
- code-bison
- codechat-bison
- code-gecko
- textembedding-gecko
To learn how to get the prompt token count for Gemini models, see the get token count instructions for the Vertex AI Gemini API.
Get the token count for a prompt
You can get the token count and the number of billable characters for a prompt by using the countTokens API. The input format for countTokens depends on the model you use. Each input format is the same as the predict input format.
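The request bodies on this page use a different set of instance fields for each model. A minimal sketch of that mapping follows. The field names are copied from the request bodies on this page; build_instance is a hypothetical helper written for this example and is not part of any Google library.

```python
# Instance fields per model, copied from the request bodies on this page.
INSTANCE_FIELDS = {
    "text-bison": {"prompt"},
    "chat-bison": {"context", "examples", "messages"},
    "code-bison": {"prefix"},
    "codechat-bison": {"messages"},
    "code-gecko": {"prefix", "suffix"},
    "textembedding-gecko": {"content"},
}


def build_instance(model: str, **fields) -> dict:
    """Return an instance dict, refusing fields the model's format lacks."""
    if model not in INSTANCE_FIELDS:
        raise ValueError(f"unsupported model: {model}")
    unknown = set(fields) - INSTANCE_FIELDS[model]
    if unknown:
        raise ValueError(f"{model} does not use fields: {sorted(unknown)}")
    return dict(fields)


print(build_instance("code-gecko", prefix="def reverse_string(s):", suffix=""))
```

Catching a mismatched field locally avoids a round trip that would only return an error.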
text-bison
To get the token count and the number of billable characters for a text-bison prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
- PROJECT_ID: Your project ID.
- PROMPT: The prompt to get the token count and billable characters for. (Don't add quotes around the prompt here.)
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/text-bison:countTokens
Request JSON body:
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
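The URL and body above can also be assembled programmatically. The sketch below uses only the Python standard library; count_tokens_url and text_bison_body are hypothetical helper names, and the region and project ID in the usage lines are placeholders.

```python
import json


def count_tokens_url(location: str, project_id: str, model: str) -> str:
    """Build the countTokens URL for a Google publisher model."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model}:countTokens"
    )


def text_bison_body(prompt: str) -> str:
    """Serialize a single-prompt request body; json.dumps escapes quotes."""
    return json.dumps({"instances": [{"prompt": prompt}]})


print(count_tokens_url("us-central1", "my-project", "text-bison"))
print(text_bison_body('Say "hello".'))
```

Serializing with json.dumps, rather than pasting the prompt into a template, keeps the body valid when the prompt itself contains quotes or newlines.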
To send your request, choose one of these options:
curl
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/text-bison:countTokens"
PowerShell
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/text-bison:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
{ "totalTokens": 12, "totalBillableCharacters": 54 }
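A short sketch of reading the two fields from such a response. The field names come from the response above; the TokenCount class and parse_count_tokens_response function are hypothetical names defined for this example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenCount:
    total_tokens: int
    total_billable_characters: int

    @property
    def characters_per_token(self) -> float:
        """Average billable characters per token (0.0 for an empty prompt)."""
        if self.total_tokens == 0:
            return 0.0
        return self.total_billable_characters / self.total_tokens


def parse_count_tokens_response(payload: str) -> TokenCount:
    """Parse the JSON text returned by countTokens."""
    data = json.loads(payload)
    return TokenCount(
        total_tokens=int(data["totalTokens"]),
        total_billable_characters=int(data["totalBillableCharacters"]),
    )


count = parse_count_tokens_response(
    '{ "totalTokens": 12, "totalBillableCharacters": 54 }'
)
print(count.total_tokens, count.total_billable_characters)  # 12 54
print(count.characters_per_token)  # 4.5
```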
Example curl command
PROJECT_ID="PROJECT_ID"
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/text-bison:countTokens" -d \
$'{
  "instances": [
    { "prompt": "Give me ten interview questions for the role of program manager." },
    { "prompt": "List some good qualities for a program manager." }
  ]
}'
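The example sends two prompts as two instances in one request. A sketch of preparing the same request in Python with urllib from the standard library follows. build_request is a hypothetical helper, the project ID is a placeholder, and the access token is supplied by the caller (for example, the output of gcloud auth print-access-token). The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_request(url: str, access_token: str,
                  prompts: list[str]) -> urllib.request.Request:
    """Prepare a POST request with one instance per prompt."""
    body = json.dumps({"instances": [{"prompt": p} for p in prompts]})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


request = build_request(
    "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project"
    "/locations/us-central1/publishers/google/models/text-bison:countTokens",
    "ACCESS_TOKEN",  # placeholder, not a real credential
    [
        "Give me ten interview questions for the role of program manager.",
        "List some good qualities for a program manager.",
    ],
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).read()
```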
chat-bison
To get the token count and the number of billable characters for a chat-bison prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
- PROJECT_ID: Your project ID.
- CONTEXT: Optional. Context can be instructions that you give to the model on how it should respond or information that it uses or references to generate a response. Add contextual information in your prompt when you need to give information to the model, or restrict the boundaries of the responses to only what's within the context.
- EXAMPLE_AUTHOR_1: The author of EXAMPLE_INPUT (the user).
- EXAMPLE_INPUT: Example of a message.
- EXAMPLE_AUTHOR_2: The author of EXAMPLE_OUTPUT (the bot).
- EXAMPLE_OUTPUT: Example of the ideal response.
- AUTHOR_1: The author of the message (the user).
- CONTENT: The content of the message.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/chat-bison:countTokens
Request JSON body:
{
  "instances": [
    {
      "context": "CONTEXT",
      "examples": [
        {
          "input": { "author": "EXAMPLE_AUTHOR_1", "content": "EXAMPLE_INPUT" },
          "output": { "author": "EXAMPLE_AUTHOR_2", "content": "EXAMPLE_OUTPUT" }
        }
      ],
      "messages": [
        { "author": "AUTHOR_1", "content": "CONTENT" }
      ]
    }
  ]
}
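As a sketch of how the chat body might be assembled from a conversation, the hypothetical helper below takes the context, example pairs, and message history and produces the structure shown above. The default author labels "user" and "bot" are assumptions chosen for this example.

```python
import json


def chat_bison_body(context, examples, messages,
                    user="user", bot="bot") -> dict:
    """Build a chat-bison countTokens body.

    examples: list of (input_text, output_text) pairs.
    messages: list of (author, content) pairs in conversation order.
    """
    return {
        "instances": [
            {
                "context": context,
                "examples": [
                    {
                        "input": {"author": user, "content": inp},
                        "output": {"author": bot, "content": out},
                    }
                    for inp, out in examples
                ],
                "messages": [
                    {"author": author, "content": content}
                    for author, content in messages
                ],
            }
        ]
    }


body = chat_bison_body(
    context="Answer in one sentence.",
    examples=[("Hello!", "Hi, how can I help?")],
    messages=[("user", "What is a token?")],
)
print(json.dumps(body, indent=2))
```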
To send your request, choose one of these options:
curl
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "instances": [
    {
      "context": "CONTEXT",
      "examples": [
        {
          "input": { "author": "EXAMPLE_AUTHOR_1", "content": "EXAMPLE_INPUT" },
          "output": { "author": "EXAMPLE_AUTHOR_2", "content": "EXAMPLE_OUTPUT" }
        }
      ],
      "messages": [
        { "author": "AUTHOR_1", "content": "CONTENT" }
      ]
    }
  ]
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/chat-bison:countTokens"
PowerShell
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "instances": [
    {
      "context": "CONTEXT",
      "examples": [
        {
          "input": { "author": "EXAMPLE_AUTHOR_1", "content": "EXAMPLE_INPUT" },
          "output": { "author": "EXAMPLE_AUTHOR_2", "content": "EXAMPLE_OUTPUT" }
        }
      ],
      "messages": [
        { "author": "AUTHOR_1", "content": "CONTENT" }
      ]
    }
  ]
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/chat-bison:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
{ "totalTokens": 43, "totalBillableCharacters": 182 }
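One use of the returned count is to check a prompt against a model's token limit before sending it. A minimal sketch follows, where fits_in_budget is a hypothetical helper and the limit values are made-up numbers for illustration. Look up the real limit for the model version you use.

```python
def fits_in_budget(prompt_tokens: int, token_limit: int,
                   reserved_for_response: int = 0) -> bool:
    """Return True if the prompt leaves the reserved room for a response."""
    if min(prompt_tokens, token_limit, reserved_for_response) < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_for_response <= token_limit


# 43 is the totalTokens value from the sample response above.
# 4096, 1024, and 40 are illustrative limits, not documented values.
print(fits_in_budget(43, token_limit=4096, reserved_for_response=1024))  # True
print(fits_in_budget(43, token_limit=40))  # False
```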
Example curl command
PROJECT_ID="PROJECT_ID"
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/chat-bison:countTokens" -d \
$'{
  "instances": [
    {
      "context": "You are Captain Barktholomew, the most feared pirate dog of the seven seas.",
      "examples": [
        {
          "input": { "author": "User", "content": "Hello!" },
          "output": { "author": "Captain Barktholomew", "content": "Argh! What brings ye to my ship?" }
        },
        {
          "input": { "author": "User", "content": "Who are you?" },
          "output": { "author": "Captain Barktholomew", "content": "I be Captain Barktholomew, the most feared pirate dog of the seven seas." }
        }
      ],
      "messages": [
        { "author": "User", "content": "Hello!" },
        { "author": "Captain Barktholomew", "content": "Ahoy there, landlubber! What brings ye to me ship?" },
        { "author": "User", "content": "Can you tell me a tale of your most recent adventure?" },
        { "author": "Captain Barktholomew", "content": "Aye, I\'ll spin ye a tale of me latest adventure...." },
        { "author": "User", "content": "I\'m listening." }
      ]
    }
  ]
}'
code-bison
To get the token count and the number of billable characters for a code-bison prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
- PROJECT_ID: Your project ID.
- PREFIX: For code models, prefix represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-bison:countTokens
Request JSON body:
{
  "instances": [
    { "prefix": "PREFIX" }
  ]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "instances": [
    { "prefix": "PREFIX" }
  ]
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-bison:countTokens"
PowerShell
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "instances": [
    { "prefix": "PREFIX" }
  ]
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-bison:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
{ "totalTokens": 43, "totalBillableCharacters": 182 }
Example curl command
PROJECT_ID=PROJECT_ID
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/code-bison:countTokens" -d \
$'{
  "instances": [
    { "prefix": "Write a function that checks if a year is a leap year." }
  ]
}'
codechat-bison
To get the token count and the number of billable characters for a codechat-bison prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
- PROJECT_ID: Your project ID.
- AUTHOR: The author of the message.
- CONTENT: The content of the message.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/codechat-bison:countTokens
Request JSON body:
{
  "instances": {
    "messages": [
      { "author": "AUTHOR", "content": "CONTENT" }
    ]
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "instances": {
    "messages": [
      { "author": "AUTHOR", "content": "CONTENT" }
    ]
  }
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/codechat-bison:countTokens"
PowerShell
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "instances": {
    "messages": [
      { "author": "AUTHOR", "content": "CONTENT" }
    ]
  }
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/codechat-bison:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
{ "totalTokens": 43, "totalBillableCharacters": 182 }
Example curl command
PROJECT_ID=PROJECT_ID
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/codechat-bison:countTokens" -d \
$'{
  "instances": {
    "messages": [
      { "author": "user", "content": "Hi, how are you?" },
      { "author": "system", "content": "I am doing good. What can I help you with in the coding world?" },
      { "author": "user", "content": "Please help write a function to calculate the min of two numbers" }
    ]
  }
}'
code-gecko
To get the token count and the number of billable characters for a code-gecko prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
- PROJECT_ID: Your project ID.
- PREFIX: For code models, prefix represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated. The model attempts to fill in the code in between the prefix and suffix.
- SUFFIX: For code completion, suffix represents the end of a piece of meaningful programming code. The model attempts to fill in the code in between the prefix and suffix.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-gecko:countTokens
Request JSON body:
{
  "instances": [
    { "prefix": "PREFIX", "suffix": "SUFFIX" }
  ]
}
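Because the model fills in the code between the prefix and the suffix, one way to produce the two values is to split a source string at the position where the completion should go. The sketch below does that and builds the body shown above; split_at_cursor and code_gecko_body are hypothetical helpers written for this example.

```python
import json


def split_at_cursor(source: str, cursor: int) -> tuple[str, str]:
    """Split source into the text before and after the cursor offset."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor is outside the source text")
    return source[:cursor], source[cursor:]


def code_gecko_body(source: str, cursor: int) -> dict:
    """Build a code-gecko countTokens body from a source string and offset."""
    prefix, suffix = split_at_cursor(source, cursor)
    return {"instances": [{"prefix": prefix, "suffix": suffix}]}


source = "def reverse_string(s):\n    return \n"
# Place the cursor right after "return ", where the completion belongs.
cursor = source.index("return ") + len("return ")
print(json.dumps(code_gecko_body(source, cursor)))
```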
To send your request, choose one of these options:
curl
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "instances": [
    { "prefix": "PREFIX", "suffix": "SUFFIX" }
  ]
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-gecko:countTokens"
PowerShell
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "instances": [
    { "prefix": "PREFIX", "suffix": "SUFFIX" }
  ]
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/code-gecko:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
{ "totalTokens": 43, "totalBillableCharacters": 182 }
Example curl command
PROJECT_ID=PROJECT_ID
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/code-gecko:countTokens" -d \
$'{
  "instances": [
    { "prefix": "def reverse_string(s):", "suffix": "" }
  ]
}'
textembedding-gecko
To get the token count and the number of billable characters for a textembedding-gecko input by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: Enter a supported region. For the full list of supported regions, see Available locations.
- PROJECT_ID: Your project ID.
- TEXT: The text that you want to generate embeddings for.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/textembedding-gecko:countTokens
Request JSON body:
{
  "instances": [
    { "content": "TEXT" }
  ]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "instances": [
    { "content": "TEXT" }
  ]
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/textembedding-gecko:countTokens"
PowerShell
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "instances": [
    { "content": "TEXT" }
  ]
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/textembedding-gecko:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
{ "totalTokens": 43, "totalBillableCharacters": 182 }
Example curl command
PROJECT_ID=PROJECT_ID
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/textembedding-gecko:countTokens" -d \
$'{
  "instances": [
    { "content": "What is life?" }
  ]
}'
Pricing and quota
There is no charge for using the CountTokens API. The maximum quota for the CountTokens API and the ComputeTokens API is 3000 requests per minute.
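A quota of 3000 requests per minute works out to one request every 20 milliseconds on average. The sketch below is a simple client-side throttle that spaces calls at least that far apart. The MinIntervalThrottle class is hypothetical, and its clock and sleep functions are injectable so that the behavior can be checked without waiting. It doesn't account for other clients that share the same quota.

```python
import time


class MinIntervalThrottle:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: int, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self) -> float:
        """Block until the next call is allowed; return the time slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


throttle = MinIntervalThrottle(per_minute=3000)
print(throttle.interval)  # 0.02
for _ in range(3):
    throttle.wait()  # send one countTokens request after each wait
```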
What's next
- Learn how to compute tokens.
- Learn how to test chat prompts.
- Learn how to test text prompts.
- Learn how to get text embeddings.