Try the Gemini 1.5 models, the latest multimodal models in Vertex AI, and see what you can build with a context window of up to 2 million tokens.

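Code chat

Codey for Code Chat (codechat-bison) is the name of the model that supports code chat. It is a foundation model that supports multi-turn conversations specialized for code. The model lets developers chat with a chatbot to get help with code-related questions. The code chat API is used to interface with the Codey for Code Chat model.

Codey for Code Chat is well suited to code tasks that are completed through back-and-forth interaction, so that you can hold a continuous conversation. For code tasks that need only a single interaction, use the API for code completion or the API for code generation.

To explore this model in the console, see the Codey for Code Chat model card in Model Garden.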

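Use cases

Some common use cases for code chat are:

Get help with code: Get help with questions you have about code, such as questions about an API, the syntax of a supported programming language, or which version of a library the code you're writing requires.
Debugging: Get help with code that doesn't compile or that contains a bug.
Documentation: Get help understanding code so that you can document it accurately.
Learn about code: Get help with code that you're not familiar with.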

HTTP request

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict

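Model versions

To use the latest model version, specify the model name without a version number, for example codechat-bison.

To use a stable model version, specify the model version number, for example codechat-bison@002. Each stable version remains available for six months after the release date of the subsequent stable version.

The following table lists the available stable model versions:

codechat-bison model	Release date	Discontinuation date
codechat-bison@002	December 6, 2023	October 9, 2024

For more information, see Model versions and lifecycle.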
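
The naming rule above (no suffix for the latest version, `@` plus a version number for a stable one) can be sketched as a small URL helper. This is only an illustration: `predict_url` is a hypothetical name, not part of any Google library, and it assumes the `us-central1` host and location shown in the HTTP request above.

```python
from typing import Optional

# Host and location as shown in the HTTP request on this page.
API_HOST = "us-central1-aiplatform.googleapis.com"
LOCATION = "us-central1"


def predict_url(project_id: str, version: Optional[str] = None) -> str:
    """Build the :predict URL for codechat-bison (hypothetical helper).

    `version` is the part after "@", for example "002".
    None selects the latest version, that is, no version suffix.
    """
    model = "codechat-bison" if version is None else f"codechat-bison@{version}"
    return (
        f"https://{API_HOST}/v1/projects/{project_id}"
        f"/locations/{LOCATION}/publishers/google/models/{model}:predict"
    )


print(predict_url("PROJECT_ID"))         # latest version
print(predict_url("PROJECT_ID", "002"))  # pinned stable version
```

Pinning a version keeps behavior stable until that version's discontinuation date, at the cost of having to update the suffix later.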

Request body

{
  "instances": [
    {
      "context": string,
      "messages": [
        {
          "content": string,
          "author": string
        }
      ]
    }
  ],
  "parameters":{
    "temperature": number,
    "maxOutputTokens": integer,
    "candidateCount": integer,
    "logprobs": integer,
    "presencePenalty": float,
    "frequencyPenalty": float,
    "seed": integer
  }
}
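
As a rough illustration of the schema above, the following sketch assembles a request body as a Python dictionary and rejects values outside the ranges given in the parameter table in this section. `build_request` and `PARAMETER_RANGES` are hypothetical names introduced here; the service performs its own validation, so this is only a client-side convenience.

```python
import json
from typing import Any, Dict, List, Optional

# (min, max) ranges copied from the parameter table in this section.
PARAMETER_RANGES = {
    "temperature": (0.0, 1.0),
    "maxOutputTokens": (1, 2048),
    "candidateCount": (1, 4),
    "logprobs": (0, 5),
    "presencePenalty": (-2.0, 2.0),
    "frequencyPenalty": (-2.0, 2.0),
}


def build_request(
    messages: List[Dict[str, str]],
    context: Optional[str] = None,
    **parameters: Any,
) -> Dict[str, Any]:
    """Return a request body shaped like the schema above (hypothetical helper)."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    for name, value in parameters.items():
        if name == "seed":
            continue  # the table gives no range for seed
        if name not in PARAMETER_RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = PARAMETER_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
    instance: Dict[str, Any] = {"messages": messages}
    if context is not None:
        instance["context"] = context
    return {"instances": [instance], "parameters": parameters}


body = build_request(
    [{"author": "user", "content": "How do I reverse a list in Python?"}],
    temperature=0.2,
    maxOutputTokens=1024,
)
print(json.dumps(body, indent=2))
```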

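The following are the parameters for the code chat model codechat-bison, which is one of the models in Codey. You can use these parameters to help optimize your prompt for a chatbot conversation about code. For more information, see the code models overview and Create prompts to chat about code.

Parameter	Description	Acceptable values
`context`	Text that should be provided to the model first to ground the response.	Text
`messages` (required)	Conversation history provided to the model in a structured alternate-author form. Messages appear in chronological order: oldest first, newest last. When the message history causes the input to exceed the maximum length, the oldest messages are removed until the entire prompt is within the allowed limit.	List[Structured Message] "author": "user", "content": "user message"
`temperature` (optional)	The temperature is used for sampling during response generation. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of `0` means that the highest-probability tokens are always selected. In this case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible.	`0.0–1.0` `Default: 0.2`
`maxOutputTokens` (optional)	Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses.	`1–2048` `Default: 1024`
`candidateCount` (optional)	The number of response variations to return.	`1-4` `Default: 1`
`logprobs` (optional)	Returns the top `logprobs` most likely candidate tokens with their log probabilities at each generation step. The chosen token and its log probability are always returned at each step. The chosen token is not necessarily among the top `logprobs` most likely candidates.	`0-5`
`frequencyPenalty` (optional)	Positive values penalize tokens that repeatedly appear in the generated text, decreasing the probability of repeating content. Acceptable values are `-2.0` to `2.0`.	`Minimum value: -2.0 Maximum value: 2.0`
`presencePenalty` (optional)	Positive values penalize tokens that already appear in the generated text, increasing the probability of generating more diverse content. Acceptable values are `-2.0` to `2.0`.	`Minimum value: -2.0 Maximum value: 2.0`
`seed`	The decoder generates random noise with a pseudorandom number generator, and temperature * noise is added to the logits before sampling. The pseudorandom number generator (PRNG) takes a seed as input and generates the same output for the same seed. If the seed is not set, the seed used in the decoder is nondeterministic, so the generated random noise is nondeterministic. If the seed is set, the generated random noise is deterministic.	`Optional`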
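
The interaction between `temperature` and `seed` described in the table can be illustrated with a toy sampler. This is a sketch under stated assumptions, not the model's actual decoder: the table does not say which noise distribution is used, so Gumbel noise is assumed here (with it, adding temperature * noise to the logits and taking the maximum is equivalent to sampling from a softmax over logits / temperature). `sample_token` is a hypothetical name.

```python
import math
import random
from typing import List, Optional


def sample_token(
    logits: List[float], temperature: float, seed: Optional[int] = None
) -> int:
    """Return the index of the selected token (toy illustration only)."""
    # With seed=None the generator is seeded from system entropy,
    # so the noise differs between runs.
    rng = random.Random(seed)
    scores = []
    for logit in logits:
        # Assumption: Gumbel noise. Clamp u away from 0 to keep log() finite.
        u = max(rng.random(), 1e-12)
        noise = -math.log(-math.log(u))
        scores.append(logit + temperature * noise)
    # Pick the token with the highest noisy score.
    return max(range(len(scores)), key=scores.__getitem__)


logits = [1.0, 3.0, 2.0]
# Temperature 0 removes the noise term, so the highest logit always wins.
print(sample_token(logits, temperature=0.0))
# The same seed reproduces the same noise and therefore the same choice.
print(sample_token(logits, 0.8, seed=7) == sample_token(logits, 0.8, seed=7))
```

In this toy version a fixed seed makes the result fully reproducible; for the real service, the table above only states that the generated noise is deterministic when a seed is set.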

Sample request

REST

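To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.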

For the other fields, see the parameter table under Request body.

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict

Request JSON body:

{
  "instances": [
    {
      "messages": [
        {
          "author": "AUTHOR",
          "content": "CONTENT"
        }
      ]
    }
  ],
  "parameters": {
    "temperature": TEMPERATURE,
    "maxOutputTokens": MAX_OUTPUT_TOKENS,
    "candidateCount": CANDIDATE_COUNT
  }
}

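To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: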

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict"

PowerShell

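Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: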

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict" | Select-Object -Expand Content

You should receive a JSON response similar to the sample response.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from vertexai.language_models import CodeChatModel

# TODO developer - override these parameters as needed:
parameters = {
    "temperature": 0.5,  # Temperature controls the degree of randomness in token selection.
    "max_output_tokens": 1024,  # Token limit determines the maximum amount of text output.
}

code_chat_model = CodeChatModel.from_pretrained("codechat-bison@002")
chat = code_chat_model.start_chat()

response = chat.send_message(
    "Please help write a function to calculate the min of two numbers", **parameters
)
print(f"Response from Model: {response.text}")

Node.js

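Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.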

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};
const publisher = 'google';
const model = 'codechat-bison@002';

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function callPredict() {
  // Configure the parent resource
  const endpoint = `projects/${project}/locations/${location}/publishers/${publisher}/models/${model}`;

  // Learn more about creating prompts to work with a code chat model at:
  // https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-chat-prompts
  const prompt = {
    messages: [
      {
        author: 'user',
        content: 'Hi, how are you?',
      },
      {
        author: 'system',
        content: 'I am doing good. What can I help you in the coding world?',
      },
      {
        author: 'user',
        content:
          'Please help write a function to calculate the min of two numbers',
      },
    ],
  };
  const instanceValue = helpers.toValue(prompt);
  const instances = [instanceValue];

  const parameter = {
    temperature: 0.5,
    maxOutputTokens: 1024,
  };
  const parameters = helpers.toValue(parameter);

  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);
  console.log('Get code chat response');
  const predictions = response.predictions;
  console.log('\tPredictions :');
  for (const prediction of predictions) {
    console.log(`\t\tPrediction : ${JSON.stringify(prediction)}`);
  }
}

callPredict();

Java

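Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.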


import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PredictCodeChatSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace this variable before running the sample.
    String project = "YOUR_PROJECT_ID";

    // Learn more about creating prompts to work with a code chat model at:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-chat-prompts
    String instance =
        "{ \"messages\": [\n"
            + "{\n"
            + "  \"author\": \"user\",\n"
            + "  \"content\": \"Hi, how are you?\"\n"
            + "},\n"
            + "{\n"
            + "  \"author\": \"system\",\n"
            + "  \"content\": \"I am doing good. What can I help you in the coding world?\"\n"
            + " },\n"
            + "{\n"
            + "  \"author\": \"user\",\n"
            + "  \"content\":\n"
            + "     \"Please help write a function to calculate the min of two numbers.\"\n"
            + "}\n"
            + "]}";
    String parameters = "{\n" + "  \"temperature\": 0.5,\n" + "  \"maxOutputTokens\": 1024\n" + "}";
    String location = "us-central1";
    String publisher = "google";
    String model = "codechat-bison@002";

    predictCodeChat(instance, parameters, project, location, publisher, model);
  }

  // Use a code chat model to generate a code function
  public static void predictCodeChat(
      String instance,
      String parameters,
      String project,
      String location,
      String publisher,
      String model)
      throws IOException {
    final String endpoint = String.format("%s-aiplatform.googleapis.com:443", location);
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      final EndpointName endpointName =
          EndpointName.ofProjectLocationPublisherModelName(project, location, publisher, model);

      Value instanceValue = stringToValue(instance);
      List<Value> instances = new ArrayList<>();
      instances.add(instanceValue);

      Value parameterValue = stringToValue(parameters);

      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instances, parameterValue);
      System.out.println("Predict Response");
      System.out.println(predictResponse);
    }
  }

  // Convert a Json string to a protobuf.Value
  static Value stringToValue(String value) throws InvalidProtocolBufferException {
    Value.Builder builder = Value.newBuilder();
    JsonFormat.parser().merge(value, builder);
    return builder.build();
  }
}

Response body

{
  "predictions": [
    {
      "candidates": [
        {
          "author": string,
          "content": string
        }
      ],
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "url": string,
            "title": string,
            "license": string,
            "publicationDate": string
          }
        ]
      },
      "logprobs": {
        "tokenLogProbs": [ float ],
        "tokens": [ string ],
        "topLogProbs": [ { map<string, float> } ]
      },
      "safetyAttributes":{
        "categories": [ string ],
        "blocked": false,
        "scores": [ float ]
      },
      "score": float
    }
  ]
}

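Response element	Description
`author`	A `string` that indicates the author of a chat response.
`blocked`	A `boolean` flag associated with a safety attribute that indicates whether the model's input or output was blocked. If `blocked` is `true`, the `errors` field in the response contains one or more error codes. If `blocked` is `false`, the response doesn't include the `errors` field.
`categories`	A list of the safety attribute category names associated with the generated content. The order of the scores in the `scores` parameter matches the order of the categories. For example, the first score in the `scores` parameter indicates the likelihood that the response violates the first category in the `categories` list.
`content`	The content of a chat response.
`endIndex`	An integer that specifies where a citation ends in `content`.
`errors`	An array of error codes. The `errors` response field is included in the response only when the `blocked` field in the response is `true`. For information about understanding error codes, see Safety errors.
`license`	The license associated with a citation.
`publicationDate`	The date a citation was published. Its valid formats are `YYYY`, `YYYY-MM`, and `YYYY-MM-DD`.
`safetyAttributes`	An array of safety attributes. The array contains one safety attribute for each response candidate.
`score`	A `float` value that's less than zero. The higher the value of `score`, the greater the model's confidence in its response.
`scores`	An array of `float` values. Each value is a score that indicates the likelihood that the response violates the safety category it's checked against. The lower the value, the safer the model considers the response. The order of the scores in the array corresponds to the order of the safety attributes in the `categories` response element.
`startIndex`	An integer that specifies where a citation starts in `content`.
`title`	The title of a citation source. Examples of source titles might be that of a news article or a book.
`url`	The URL of a citation source. Examples of a URL source might be a news website or a GitHub repository.
`tokens`	The sampled tokens.
`tokenLogProbs`	The log probabilities of the sampled tokens.
`topLogProbs`	The most likely candidate tokens and their log probabilities at each step.
`logprobs`	Results of the `logprobs` parameter. Maps 1-1 to `candidates`.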
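
To make the `logprobs` fields more concrete, the following sketch converts each sampled token's log probability into a probability and checks whether the sampled token was the most likely candidate at that step. `summarize_logprobs` is a hypothetical helper, and the `example` values are invented for illustration only; they are not output from the model.

```python
import math
from typing import Any, Dict, List


def summarize_logprobs(logprobs: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each sampled token with its probability (hypothetical helper)."""
    rows = []
    steps = zip(
        logprobs["tokens"], logprobs["tokenLogProbs"], logprobs["topLogProbs"]
    )
    for token, logprob, top in steps:
        # The candidate with the highest log probability at this step, if any.
        best = max(top, key=top.get) if top else None
        rows.append(
            {
                "token": token,
                "probability": math.exp(logprob),  # log probability -> probability
                "was_top_candidate": token == best,
            }
        )
    return rows


# Invented values shaped like the `logprobs` element of the response body.
example = {
    "tokens": ["def", " min"],
    "tokenLogProbs": [-0.1, -1.2],
    "topLogProbs": [
        {"def": -0.1, "function": -2.5},
        {" max": -0.9, " min": -1.2},
    ],
}
for row in summarize_logprobs(example):
    print(row)
```

In the second step of the invented data the sampled token is not the top candidate, which is the situation the `logprobs` parameter description allows for.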

Sample response

{
  "predictions": [
    {
      "citationMetadata": [
        {
          "citations": []
        }
      ],
      "candidates": [
        {
          "author": "AUTHOR",
          "content": "RESPONSE"
        }
      ],
      "safetyAttributes": {
        "categories": [],
        "blocked": false,
        "scores": []
      },
      "score": -1.1161688566207886
    }
  ]
}
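
A response shaped like the sample above could be read with a few lines of Python. This is a minimal sketch: `extract_replies` is a hypothetical name, and the choice to skip a whole prediction when any of its safety attributes is blocked is an assumption made here, not documented behavior. Because `safetyAttributes` is described as an array but appears as a single object in the sample, the helper accepts both forms.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """Normalize a missing value, a single object, or a list to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def extract_replies(response: Dict[str, Any]) -> List[str]:
    """Collect candidate contents from predictions that were not blocked."""
    replies = []
    for prediction in response.get("predictions", []):
        safety = _as_list(prediction.get("safetyAttributes"))
        # Assumption: drop the prediction if any safety attribute is blocked.
        if any(attr.get("blocked", False) for attr in safety):
            continue
        for candidate in prediction.get("candidates", []):
            replies.append(candidate["content"])
    return replies


# The sample response above, written as a Python dictionary.
sample = {
    "predictions": [
        {
            "citationMetadata": [{"citations": []}],
            "candidates": [{"author": "AUTHOR", "content": "RESPONSE"}],
            "safetyAttributes": {"categories": [], "blocked": False, "scores": []},
            "score": -1.1161688566207886,
        }
    ]
}
print(extract_replies(sample))
```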

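Stream responses from generative AI models

The parameters are the same for streaming and non-streaming requests to the API.

To view sample code requests and responses that use the REST API, see Examples using the streaming REST API.

To view sample code requests and responses that use the Vertex AI SDK for Python, see Examples using the Vertex AI SDK for Python for streaming.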