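Count tokens for Claude models

The count-tokens endpoint lets you determine the number of tokens in a message before you send it to Claude, which helps you make informed decisions about your prompts and usage.

There is no charge for using the count-tokens endpoint.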

Supported Claude models

The following models support token counting:

Claude 3.5 Sonnet v2: claude-3-5-sonnet-v2@20241022
Claude 3.5 Haiku: claude-3-5-haiku@20241022
Claude 3 Opus: claude-3-opus@20240229
Claude 3.5 Sonnet: claude-3-5-sonnet@20240620
Claude 3 Haiku: claude-3-haiku@20240307

Supported regions

The following regions support token counting:

us-east5
europe-west1
asia-southeast1
us-central1
europe-west4
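A client can check the two lists above before it sends a request. The following is a minimal sketch; the names SUPPORTED_MODELS, SUPPORTED_REGIONS, and check_supported are hypothetical. It assumes every listed model is available in every listed region, which this page does not state.

```python
# Model IDs and regions copied from the lists on this page.
SUPPORTED_MODELS = {
    "claude-3-5-sonnet-v2@20241022",
    "claude-3-5-haiku@20241022",
    "claude-3-opus@20240229",
    "claude-3-5-sonnet@20240620",
    "claude-3-haiku@20240307",
}
SUPPORTED_REGIONS = {
    "us-east5",
    "europe-west1",
    "asia-southeast1",
    "us-central1",
    "europe-west4",
}


def check_supported(model: str, region: str) -> None:
    """Raise ValueError if the model or region is not in the lists above.

    Assumption: any listed model can be used in any listed region.
    """
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"model does not support token counting: {model}")
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"region does not support token counting: {region}")
```

Because the lists are hard-coded, they need to be updated by hand whenever the supported models or regions change.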

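Count tokens in basic messages

To count tokens, send a rawPredict request to the count-tokens endpoint. The request body must contain the model ID of the model that you want to count tokens against.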

REST

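Before using any of the request data, make the following replacements:

LOCATION: A supported region.
PROJECT_ID: Your project ID.
MODEL: The model to count tokens against.
ROLE: The role associated with a message. You can specify user or assistant. The first message must use the user role. Claude models operate with alternating user and assistant turns. If the final message uses the assistant role, the response continues immediately from the content of that message. You can use this to constrain part of the model's response.
CONTENT: The content of the user or assistant message, such as text.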
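The role rules above can be checked on the client before a request is sent. The following is a minimal sketch; validate_messages is a hypothetical helper. It treats alternation as a strict rule and requires at least one message, and both are assumptions that might be stricter than what the service enforces.

```python
def validate_messages(messages: list[dict]) -> None:
    """Check the role rules described above; raise ValueError on a violation.

    Assumptions: at least one message is required, and user and assistant
    turns must strictly alternate.
    """
    if not messages:
        raise ValueError("at least one message is required")
    if messages[0]["role"] != "user":
        raise ValueError("the first message must use the user role")
    for previous, current in zip(messages, messages[1:]):
        if current["role"] not in ("user", "assistant"):
            raise ValueError(f"unknown role: {current['role']}")
        if current["role"] == previous["role"]:
            raise ValueError("user and assistant turns must alternate")


# A final assistant message is allowed; the response would continue from it.
validate_messages([
    {"role": "user", "content": "List three colors as JSON."},
    {"role": "assistant", "content": "["},
])
```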

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict

Request JSON body:

{
  "model": "claude-3-haiku@20240307",
  "messages": [
    {
      "role": "user",
      "content": "how many tokens are in this request?"
    }
  ]
}
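Strict JSON parsers reject small hand-editing mistakes such as a trailing comma, so it can be safer to generate the request body than to type it. The following is a minimal sketch; build_count_tokens_body and save_body are hypothetical helper names, and the body shape is copied from the example above.

```python
import json


def build_count_tokens_body(model: str, content: str, role: str = "user") -> dict:
    """Return a single-message request body in the shape shown above."""
    return {
        "model": model,
        "messages": [{"role": role, "content": content}],
    }


def save_body(body: dict, path: str) -> None:
    """Serialize the body as JSON text and write it to path."""
    with open(path, "w", encoding="utf-8") as f:
        # json.dump always emits valid JSON (no trailing commas).
        json.dump(body, f, indent=2)
```

For example, save_body(build_count_tokens_body("claude-3-haiku@20240307", "how many tokens are in this request?"), "request.json") writes a file that can be passed to the commands on this page.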

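To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: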

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict" | Select-Object -Expand Content
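The same request can also be sent from Python using only the standard library. The following is a sketch, not an official client; build_request and count_tokens are hypothetical names. It assumes gcloud is installed, is on PATH as an executable, and is already logged in, as in the notes on this page.

```python
import json
import subprocess
import urllib.request


def build_request(project_id: str, location: str, body: dict, token: str) -> urllib.request.Request:
    """Build the POST request for the count-tokens endpoint without sending it."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/anthropic/models/count-tokens:rawPredict"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def count_tokens(project_id: str, location: str, body: dict) -> int:
    """Send the request and return the input_tokens value from the response."""
    # Same token source as the curl and PowerShell commands.
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    request = build_request(project_id, location, body, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["input_tokens"]
```

Building the request is kept separate from sending it so that the URL and headers can be inspected without network access.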

You should receive a JSON response similar to the following:

Response

{ "input_tokens": 14 }
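One way to use the response is to compare the count against a token budget before you send the real prompt. The following is a minimal sketch; input_tokens_from_response and fits_budget are hypothetical helpers, and the budget is a value that you choose, not a limit documented on this page.

```python
import json


def input_tokens_from_response(raw: str) -> int:
    """Extract input_tokens from the JSON text returned by count-tokens."""
    count = json.loads(raw)["input_tokens"]
    if not isinstance(count, int) or count < 0:
        raise ValueError(f"unexpected input_tokens value: {count!r}")
    return count


def fits_budget(raw: str, max_input_tokens: int) -> bool:
    """Return True if the counted tokens do not exceed the caller's budget."""
    return input_tokens_from_response(raw) <= max_input_tokens


print(input_tokens_from_response('{ "input_tokens": 14 }'))  # prints 14
```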

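To learn how to count tokens in messages that include tools, images, and PDFs, see Anthropic's documentation.

Quotas

By default, the quota for the count-tokens endpoint is 2,000 requests per minute.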
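A client that counts tokens in bulk might throttle itself to stay under this quota. The following is a minimal sketch of a sliding-window limiter; RequestLimiter is a hypothetical class. How the service measures the one-minute window is not described on this page, so the window logic here is an assumption.

```python
import time
from collections import deque


class RequestLimiter:
    """Allow at most max_requests in any window_seconds span (sliding window)."""

    def __init__(self, max_requests: int = 2000, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of requests inside the current window

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if the window is full."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True
```

Call try_acquire() before each count-tokens request and wait briefly when it returns False. The limiter only tracks requests made by one process, so several clients that share a quota would need a lower per-client limit.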