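Code chat

Codey for Code Chat (codechat-bison) is the name of the model that supports code chat. It is a foundation model that supports multi-turn conversations specialized for code. The model lets developers chat with a chatbot to ask code-related questions. The Code Chat API is the interface to the Codey for Code Chat model.

Codey for Code Chat is ideal for code tasks that need back-and-forth interaction, because it can carry on a continuous conversation. For code tasks that can be completed in a single interaction, use the code completion API or the code generation API.

To explore this model in the console, go to Model Garden and use the Codey for Code Chat model card.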

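Use cases

Some common use cases for code chat are:

Help with code: Ask questions about code, such as questions about an API, the syntax of a supported programming language, or which version of a library your code requires.
Debugging: Get help debugging code that fails to compile or that contains a bug.
Documentation: Get help understanding code so that you can document it accurately.
Learning about code: Get help learning about code you're not familiar with.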

HTTP request

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict

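Model versions

To use the latest model version, specify the model name without a version number, for example codechat-bison.

To use a stable model version, specify the model version number, for example codechat-bison@002. Each stable version remains available for six months after the release date of the subsequent stable version.

The following table lists the available stable model versions:

codechat-bison model	Release date	Discontinuation date
codechat-bison@002	December 6, 2023	October 9, 2024
codechat-bison@001	June 29, 2023	July 6, 2024

For more information, see Model versions and lifecycle.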
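The versioning rule above can be sketched as a small helper that builds the predict URL with or without a version suffix. This is a minimal illustration only: the function name `predict_url` and its arguments are hypothetical, and the URL layout is copied from the HTTP request shown earlier on this page.

```python
from typing import Optional


def predict_url(
    project_id: str,
    model: str = "codechat-bison",
    version: Optional[str] = None,
    location: str = "us-central1",
) -> str:
    """Build the predict URL (hypothetical helper, not part of any SDK)."""
    # No version selects the latest model; "@<version>" pins a stable version.
    model_id = f"{model}@{version}" if version else model
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:predict"
    )


latest_url = predict_url("PROJECT_ID")
pinned_url = predict_url("PROJECT_ID", version="002")
print(latest_url)
print(pinned_url)
```

Pinning a version keeps behavior stable until that version's discontinuation date, while the unversioned name follows whatever the latest version is.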

Request body

{
  "instances": [
    {
      "context": string,
      "messages": [
        {
          "content": string,
          "author": string
        }
      ]
    }
  ],
  "parameters":{
    "temperature": number,
    "maxOutputTokens": integer,
    "candidateCount": integer,
    "logprobs": integer,
    "presencePenalty": float,
    "frequencyPenalty": float,
    "seed": integer
  }
}

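The following are the parameters for the codechat-bison code chat model, which is one of the Codey models. You can use these parameters to optimize your prompts for a chatbot conversation about code. For more information, see Code models overview and Create prompts to chat about code.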

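Parameter	Description	Acceptable values
`context`	Text that should be provided to the model first to ground the response.	Text
`messages` (required)	Conversation history provided to the model in a structured format. Messages appear in chronological order: oldest first, newest last. When the message history causes the input to exceed the maximum length, the oldest messages are removed until the entire prompt fits within the limit.	List[Structured Message] "author": "user", "content": "user message"
`temperature` (optional)	The temperature is used for sampling during response generation. It controls the degree of randomness in token selection. Lower temperatures suit prompts that call for a more deterministic, less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of `0` means that the highest-probability token is always selected. In that case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible.	`0.0–1.0` `Default: 0.2`
`maxOutputTokens` (optional)	Maximum number of tokens that can be generated in the response. A token is approximately four characters, and 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for longer responses.	`1–2048` `Default: 1024`
`candidateCount` (optional)	The number of response variations to return.	`1-4` `Default: 1`
`logprobs` (optional)	Returns the top `logprobs` most likely candidate tokens with their log probabilities at each generation step. The chosen token and its log probability are always returned at each step. The chosen token may or may not be among the top `logprobs` most likely candidates.	`0-5`
`frequencyPenalty` (optional)	Positive values penalize tokens that appear repeatedly in the generated text, decreasing the probability of repeated content. Valid values range from `-2.0` to `2.0`.	`Minimum value: -2.0 Maximum value: 2.0`
`presencePenalty` (optional)	Positive values penalize tokens that already appear in the generated text, increasing the probability of generating more diverse content. Valid values range from `-2.0` to `2.0`.	`Minimum value: -2.0 Maximum value: 2.0`
`seed` (optional)	The decoder generates random noise with a pseudorandom number generator (PRNG). Temperature * noise is added to the logits before sampling. The PRNG takes a seed as input and generates the same output for the same seed. If the seed isn't set, the seed used by the decoder is not deterministic, so the generated random noise can vary. If the seed is set, the generated random noise is deterministic.	Integer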
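The interaction between `temperature` and `seed` described in the table can be illustrated with a toy sampler. This is only a sketch under stated assumptions, not the decoder Vertex AI actually runs: the function name `sample_token` is hypothetical, and the noise is assumed to be Gumbel-distributed (the table only says "random noise"). With Gumbel noise, taking the largest value of logits + temperature * noise is equivalent to sampling from a softmax over logits / temperature.

```python
import math
import random
from typing import List, Optional


def sample_token(logits: List[float], temperature: float, seed: Optional[int] = None) -> int:
    """Return the index of the chosen token (toy model of seeded sampling)."""
    # A fixed seed makes the noise reproducible; seed=None does not.
    rng = random.Random(seed)
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # assumed noise distribution
        noisy.append(logit + temperature * gumbel_noise)
    # Pick the token whose noisy logit is largest.
    return max(range(len(noisy)), key=noisy.__getitem__)


logits = [2.0, 1.0, 0.5]

# Temperature 0 removes the noise term, so the top logit always wins.
greedy_choice = sample_token(logits, temperature=0.0, seed=7)

# The same seed reproduces the same noise and therefore the same choice.
repeat_a = sample_token(logits, temperature=1.0, seed=42)
repeat_b = sample_token(logits, temperature=1.0, seed=42)
print(greedy_choice, repeat_a, repeat_b)
```

In this toy model a temperature of `0` is fully deterministic. The table notes that real responses at temperature `0` can still vary slightly.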

Sample request

REST

To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.

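For the other placeholders (AUTHOR, CONTENT, TEMPERATURE, MAX_OUTPUT_TOKENS, and CANDIDATE_COUNT), see the parameter table under Request body.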

HTTP method and URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict

Request JSON body:

{
  "instances": [
    {
      "messages": [
        {
          "author": "AUTHOR",
          "content": "CONTENT"
        }
      ]
    }
  ],
  "parameters": {
    "temperature": TEMPERATURE,
    "maxOutputTokens": MAX_OUTPUT_TOKENS,
    "candidateCount": CANDIDATE_COUNT
  }
}
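The placeholders in the JSON body above can also be filled in programmatically. The following is a minimal sketch: the helper name `build_request_body` is hypothetical, and the range checks simply mirror the acceptable values listed in the parameter table under Request body. The resulting JSON text is what you would save as request.json.

```python
import json
from typing import List, Optional, Tuple


def build_request_body(
    messages: List[Tuple[str, str]],
    context: Optional[str] = None,
    temperature: float = 0.2,
    max_output_tokens: int = 1024,
    candidate_count: int = 1,
) -> dict:
    """Assemble a predict request body (hypothetical helper)."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if not 1 <= max_output_tokens <= 2048:
        raise ValueError("maxOutputTokens must be between 1 and 2048")
    if not 1 <= candidate_count <= 4:
        raise ValueError("candidateCount must be between 1 and 4")

    # Messages go in chronological order: oldest first, newest last.
    instance = {
        "messages": [
            {"author": author, "content": content} for author, content in messages
        ]
    }
    if context is not None:
        instance["context"] = context

    return {
        "instances": [instance],
        "parameters": {
            "temperature": temperature,
            "maxOutputTokens": max_output_tokens,
            "candidateCount": candidate_count,
        },
    }


body = build_request_body(
    [("user", "Please help write a function to calculate the min of two numbers")],
    temperature=0.5,
)
request_json = json.dumps(body, indent=2)
print(request_json)
```

Validating on the client is optional. It only surfaces out-of-range values before the request is sent.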

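To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: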

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/codechat-bison:predict" | Select-Object -Expand Content

You should receive a JSON response similar to the sample response.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from vertexai.language_models import CodeChatModel

def write_a_function(temperature: float = 0.5) -> object:
    """Example of using Codey for Code Chat Model to write a function."""

    # TODO developer - override these parameters as needed:
    parameters = {
        "temperature": temperature,  # Temperature controls the degree of randomness in token selection.
        "max_output_tokens": 1024,  # Token limit determines the maximum amount of text output.
    }

    code_chat_model = CodeChatModel.from_pretrained("codechat-bison@001")
    chat = code_chat_model.start_chat()

    response = chat.send_message(
        "Please help write a function to calculate the min of two numbers", **parameters
    )
    print(f"Response from Model: {response.text}")

    return response

Node.js

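Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.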

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};
const publisher = 'google';
const model = 'codechat-bison@001';

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function callPredict() {
  // Configure the parent resource
  const endpoint = `projects/${project}/locations/${location}/publishers/${publisher}/models/${model}`;

  // Learn more about creating prompts to work with a code chat model at:
  // https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-chat-prompts
  const prompt = {
    messages: [
      {
        author: 'user',
        content: 'Hi, how are you?',
      },
      {
        author: 'system',
        content: 'I am doing good. What can I help you in the coding world?',
      },
      {
        author: 'user',
        content:
          'Please help write a function to calculate the min of two numbers',
      },
    ],
  };
  const instanceValue = helpers.toValue(prompt);
  const instances = [instanceValue];

  const parameter = {
    temperature: 0.5,
    maxOutputTokens: 1024,
  };
  const parameters = helpers.toValue(parameter);

  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);
  console.log('Get code chat response');
  const predictions = response.predictions;
  console.log('\tPredictions :');
  for (const prediction of predictions) {
    console.log(`\t\tPrediction : ${JSON.stringify(prediction)}`);
  }
}

callPredict();

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.


import com.google.cloud.aiplatform.v1beta1.EndpointName;
import com.google.cloud.aiplatform.v1beta1.PredictResponse;
import com.google.cloud.aiplatform.v1beta1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1beta1.PredictionServiceSettings;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PredictCodeChatSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace this variable before running the sample.
    String project = "YOUR_PROJECT_ID";

    // Learn more about creating prompts to work with a code chat model at:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-chat-prompts
    String instance =
        "{ \"messages\": [\n"
            + "{\n"
            + "  \"author\": \"user\",\n"
            + "  \"content\": \"Hi, how are you?\"\n"
            + "},\n"
            + "{\n"
            + "  \"author\": \"system\",\n"
            + "  \"content\": \"I am doing good. What can I help you in the coding world?\"\n"
            + " },\n"
            + "{\n"
            + "  \"author\": \"user\",\n"
            + "  \"content\":\n"
            + "     \"Please help write a function to calculate the min of two numbers.\"\n"
            + "}\n"
            + "]}";
    String parameters = "{\n" + "  \"temperature\": 0.5,\n" + "  \"maxOutputTokens\": 1024\n" + "}";
    String location = "us-central1";
    String publisher = "google";
    String model = "codechat-bison@001";

    predictCodeChat(instance, parameters, project, location, publisher, model);
  }

  // Use a code chat model to generate a code function
  public static void predictCodeChat(
      String instance,
      String parameters,
      String project,
      String location,
      String publisher,
      String model)
      throws IOException {
    final String endpoint = String.format("%s-aiplatform.googleapis.com:443", location);
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      final EndpointName endpointName =
          EndpointName.ofProjectLocationPublisherModelName(project, location, publisher, model);

      Value instanceValue = stringToValue(instance);
      List<Value> instances = new ArrayList<>();
      instances.add(instanceValue);

      Value parameterValue = stringToValue(parameters);

      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instances, parameterValue);
      System.out.println("Predict Response");
      System.out.println(predictResponse);
    }
  }

  // Convert a Json string to a protobuf.Value
  static Value stringToValue(String value) throws InvalidProtocolBufferException {
    Value.Builder builder = Value.newBuilder();
    JsonFormat.parser().merge(value, builder);
    return builder.build();
  }
}

Response body

{
  "predictions": [
    {
      "candidates": [
        {
          "author": string,
          "content": string
        }
      ],
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "url": string,
            "title": string,
            "license": string,
            "publicationDate": string
          }
        ]
      },
      "logprobs": {
        "tokenLogProbs": [ float ],
        "tokens": [ string ],
        "topLogProbs": [ { map<string, float> } ]
      },
      "safetyAttributes":{
        "categories": [ string ],
        "blocked": false,
        "scores": [ float ]
      },
      "score": float
    }
  ]
}

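Response element	Description
`author`	A `string` that indicates the author of a chat response.
`blocked`	A `boolean` flag associated with a safety attribute that indicates whether the model's input or output was blocked. If `blocked` is `true`, the `errors` field in the response contains one or more error codes. If `blocked` is `false`, the response doesn't include the `errors` field.
`categories`	A list of the safety attribute category names associated with the generated content. The order of the scores in the `scores` parameter matches the order of the categories. For example, the first score in the `scores` parameter indicates the likelihood that the response violates the first category in the `categories` list.
`content`	The content of a chat response.
`endIndex`	An integer that specifies where a citation ends in the `content`.
`errors`	An array of error codes. The `errors` response field is included in the response only when the `blocked` field in the response is `true`. For information about the error codes, see Safety errors.
`license`	The license associated with a citation.
`publicationDate`	The date a citation was published. Valid formats are `YYYY`, `YYYY-MM`, and `YYYY-MM-DD`.
`safetyAttributes`	An array of safety attributes. The array contains one safety attribute for each response candidate.
`score`	A `float` value that's less than zero. The higher the value of `score`, the greater the model's confidence in its response.
`scores`	An array of `float` values. Each value is a score that indicates the likelihood that the response violates the safety category it's checked against. The lower the value, the safer the model considers the response. The order of the scores in the array corresponds to the order of the safety attributes in the `categories` response element.
`startIndex`	An integer that specifies where a citation starts in the content.
`title`	The title of a citation source. Examples of source titles include a news article or a book.
`url`	The URL of a citation source. Examples of URL sources include a news website or a GitHub repository.
`tokens`	The sampled tokens.
`tokenLogProbs`	The log probabilities of the sampled tokens.
`topLogProbs`	The most likely candidate tokens and their log probabilities at each step.
`logprobs`	The results of the `logprobs` parameter. Maps one-to-one to `candidates`.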
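The `logprobs` fields in the table can be read step by step, as in the sketch below. It assumes that `tokens`, `tokenLogProbs`, and `topLogProbs` are parallel lists with one entry per generation step, and that each `topLogProbs` entry is a plain token-to-log-probability mapping. The helper name `describe_logprobs` and the sample values are made up for illustration.

```python
import math
from typing import Dict, List


def describe_logprobs(logprobs: Dict) -> List[Dict]:
    """Summarize each generation step (hypothetical helper)."""
    rows = []
    steps = zip(logprobs["tokens"], logprobs["tokenLogProbs"], logprobs["topLogProbs"])
    for token, log_prob, top_candidates in steps:
        rows.append(
            {
                "token": token,
                # A log probability converts back to a probability with exp().
                "probability": math.exp(log_prob),
                # The chosen token is not guaranteed to be among the top candidates.
                "in_top_candidates": token in top_candidates,
            }
        )
    return rows


# Made-up values for illustration only; not real model output.
example_logprobs = {
    "tokens": ["def", " min"],
    "tokenLogProbs": [-0.1, -1.2],
    "topLogProbs": [
        {"def": -0.1, "function": -2.5},
        {" max": -0.9, " minimum": -1.0},
    ],
}

rows = describe_logprobs(example_logprobs)
for row in rows:
    print(row)
```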

Sample response

{
  "predictions": [
    {
      "citationMetadata": [
        {
          "citations": []
        }
      ],
      "candidates": [
        {
          "author": "AUTHOR",
          "content": "RESPONSE"
        }
      ],
      "safetyAttributes": {
        "categories": [],
        "blocked": false,
        "scores": []
      },
      "score": -1.1161688566207886
    }
  ]
}
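A response shaped like the sample above can be unpacked as follows. This is a minimal sketch: the helper name `summarize_prediction` is hypothetical, it treats `safetyAttributes` as a single object as in the sample, and it pairs `categories` with `scores` by position as the response table describes. The category names in the second call are placeholders, not real category values.

```python
from typing import Dict


def summarize_prediction(prediction: Dict) -> Dict:
    """Extract candidates and safety information (hypothetical helper)."""
    safety = prediction.get("safetyAttributes", {})
    return {
        "candidates": [
            (candidate.get("author"), candidate.get("content"))
            for candidate in prediction.get("candidates", [])
        ],
        "blocked": bool(safety.get("blocked", False)),
        # The n-th score belongs to the n-th category.
        "category_scores": dict(
            zip(safety.get("categories", []), safety.get("scores", []))
        ),
        "score": prediction.get("score"),
    }


# The prediction from the sample response, copied as-is.
sample_prediction = {
    "citationMetadata": [{"citations": []}],
    "candidates": [{"author": "AUTHOR", "content": "RESPONSE"}],
    "safetyAttributes": {"categories": [], "blocked": False, "scores": []},
    "score": -1.1161688566207886,
}
summary = summarize_prediction(sample_prediction)
print(summary)

# Placeholder categories and scores, only to show the positional pairing.
paired = summarize_prediction(
    {
        "candidates": [],
        "safetyAttributes": {
            "categories": ["CATEGORY_A", "CATEGORY_B"],
            "blocked": False,
            "scores": [0.1, 0.3],
        },
    }
)
print(paired["category_scores"])
```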

Stream response from generative AI models

The parameters are the same for streaming and non-streaming requests to the APIs.

To view sample code requests and responses using the REST API, see Examples using the streaming REST API.

To view sample code requests and responses using the Vertex AI SDK for Python, see Examples using Vertex AI SDK for Python for streaming.
