テキストプロンプトリクエストを送信する

Vertex AI では、Google Cloud コンソールの Vertex AI Studio、Vertex AI API、Vertex AI SDK for Python を使用してプロンプトをテストできます。このページでは、これらのインターフェースを使用してテキストプロンプトをテストする方法について説明します。

テキストプロンプトの設計について詳しくは、テキストプロンプトを設計するをご覧ください。

テキストプロンプトをテストする

テキストプロンプトをテストするには、次のいずれかの方法を選択します。

REST

Vertex AI API を使用してテキストプロンプトをテストするには、パブリッシャーモデルエンドポイントに POST リクエストを送信します。

リクエストのデータを使用する前に、次のように置き換えます。

PROJECT_ID: 実際のプロジェクト ID。
PROMPT: プロンプトとは、レスポンスを受け取るために言語モデルに送信される自然言語リクエストのことです。プロンプトには、モデルを完了または続行するための質問、手順、コンテキスト情報、例、テキストを含めることができます（ここでは、プロンプトを引用符で囲まないでください）。
TEMPERATURE: 温度は、topP と topK が適用された場合に発生するレスポンス生成時のサンプリングに使用されます。温度は、トークン選択のランダム性の度合いを制御します。温度が低いほど、確定的で自由度や創造性を抑えたレスポンスが求められるプロンプトに適しています。一方、温度が高いと、より多様で創造的な結果を導くことができます。温度が 0 の場合、確率が最も高いトークンが常に選択されます。この場合、特定のプロンプトに対するレスポンスはほとんど確定的ですが、わずかに変動する可能性は残ります。
モデルが返すレスポンスが一般的すぎる、短すぎる、あるいはフォールバック（代替）レスポンスが返ってくる場合は、温度を高く設定してみてください。
MAX_OUTPUT_TOKENS: レスポンスで生成できるトークンの最大数。1 トークンは約 4 文字です。100 トークンは約 60～80 語に相当します。
レスポンスを短くしたい場合は小さい値を、長くしたい場合は大きい値を指定します。
TOP_P: Top-P は、モデルが出力用にトークンを選択する方法を変更します。トークンは、確率の合計が Top-P 値に等しくなるまで、確率の高いもの（Top-K を参照）から低いものへと選択されます。たとえば、トークン A、B、C の確率が 0.3、0.2、0.1 であり、Top-P 値が 0.5 であるとします。この場合、モデルは温度を使用して A または B を次のトークンとして選択し、C は候補から除外します。
ランダムなレスポンスを減らしたい場合は小さい値を、ランダムなレスポンスを増やしたい場合は大きい値を指定します。
TOP_K: Top-K は、モデルが出力用にトークンを選択する方法を変更します。Top-K が 1 の場合、次に選択されるトークンは、モデルの語彙内のすべてのトークンで最も確率の高いものであることになります（グリーディデコードとも呼ばれます）。Top-K が 3 の場合は、最も確率が高い上位 3 つのトークンから次のトークン選択されることになります（温度を使用します）。
トークン選択のそれぞれのステップで、最も高い確率を持つ Top-K のトークンがサンプリングされます。その後、トークンは Top-P に基づいてさらにフィルタリングされ、最終的なトークンは温度サンプリングを用いて選択されます。

ランダムなレスポンスを減らしたい場合は小さい値を、ランダムなレスポンスを増やしたい場合は大きい値を指定します。

HTTP メソッドと URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison:predict

リクエストの本文（JSON）:

{
  "instances": [
    { "prompt": "PROMPT"}
  ],
  "parameters": {
    "temperature": TEMPERATURE,
    "maxOutputTokens": MAX_OUTPUT_TOKENS,
    "topP": TOP_P,
    "topK": TOP_K
  }
}

リクエストを送信するには、次のいずれかのオプションを選択します。

curl

注: 次のコマンドは、gcloud init または gcloud auth login を実行して、ユーザーアカウントで gcloud CLI にログインしているか、Cloud Shell を使用して自動的に gcloud CLI にログインしていることを前提としています。gcloud auth list を実行すると、現在アクティブなアカウントを確認できます。

リクエスト本文を request.json という名前のファイルに保存して、次のコマンドを実行します。

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison:predict"

PowerShell

注: 次のコマンドは、gcloud init または gcloud auth login を実行して、ご自分のユーザーアカウントで gcloud CLI にログインしていることを前提としています。gcloud auth list を実行すると、現在アクティブなアカウントを確認できます。

リクエスト本文を request.json という名前のファイルに保存して、次のコマンドを実行します。

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison:predict" | Select-Object -Expand Content

次のような JSON レスポンスが返されます。

レスポンス

{
  "predictions": [
    {
      "content": "RESPONSE"
      "citationMetadata": {
        "citations": []
      },
      "safetyAttributes": {
        "scores": [],
        "blocked": false,
        "categories": []
      }
    }
  ],
  "metadata": {
    "tokenMetadata": {
      "inputTokenCount": {
        "totalBillableCharacters": ,
        "totalTokens": 
      },
      "outputTokenCount": {
        "totalBillableCharacters": ,
        "totalTokens": 
      }
    }
  }
}

text-bison curl コマンドの例

MODEL_ID="text-bison"
PROJECT_ID=PROJECT_ID

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:predict -d \
$'{
  "instances": [
    { "prompt": "Give me ten interview questions for the role of program manager." }
  ],
  "parameters": {
    "temperature": 0.2,
    "maxOutputTokens": 256,
    "topK": 40,
    "topP": 0.95
  }
}'

Python

Vertex AI SDK for Python のインストールまたは更新の方法については、Vertex AI SDK for Python をインストールするをご覧ください。詳細については、Python API リファレンスドキュメントをご覧ください。

import vertexai

from vertexai.language_models import TextGenerationModel

# TODO(developer): Update project_id and location
vertexai.init(project=PROJECT_ID, location="us-central1")
parameters = {
    "temperature": 0.2,  # Temperature controls the degree of randomness in token selection.
    "max_output_tokens": 256,  # Token limit determines the maximum amount of text output.
    "top_p": 0.8,  # Tokens are selected from most probable to least until the sum of their probabilities equals the top_p value.
    "top_k": 40,  # A top_k of 1 means the selected token is the most probable among all tokens.
}

model = TextGenerationModel.from_pretrained("text-bison@002")
response = model.predict(
    "Give me ten interview questions for the role of program manager.",
    **parameters,
)
print(f"Response from Model: {response.text}")

Go

このサンプルを試す前に、Vertex AI クイックスタート: クライアントライブラリの使用にある Go の設定手順を完了してください。詳細については、Vertex AI Go API のリファレンスドキュメントをご覧ください。

Vertex AI に対する認証を行うには、アプリケーションのデフォルト認証情報を設定します。詳細については、ローカル開発環境の認証を設定するをご覧ください。


import (
	"context"
	"fmt"
	"io"

	aiplatform "cloud.google.com/go/aiplatform/apiv1beta1"
	"cloud.google.com/go/aiplatform/apiv1beta1/aiplatformpb"
	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

// textPredict generates text with certain prompt and configurations.
func textPredict(w io.Writer, projectID, location, model string) error {
	ctx := context.Background()

	prompt := "Hello, say something nice."
	publisher := "google"
	parameters := map[string]interface{}{
		"temperature":     0.8,
		"maxOutputTokens": 256,
		"topP":            0.4,
		"topK":            40,
	}

	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)

	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		fmt.Fprintf(w, "unable to create prediction client: %v", err)
		return err
	}
	defer client.Close()

	// PredictRequest requires an endpoint, instances, and parameters
	// Endpoint
	base := fmt.Sprintf("projects/%s/locations/%s/publishers/%s/models", projectID, location, publisher)
	url := fmt.Sprintf("%s/%s", base, model)

	// Instances: the prompt to use with the text model
	promptValue, err := structpb.NewValue(map[string]interface{}{
		"prompt": prompt,
	})
	if err != nil {
		fmt.Fprintf(w, "unable to convert prompt to Value: %v", err)
		return err
	}

	// Parameters: the model configuration parameters
	parametersValue, err := structpb.NewValue(parameters)
	if err != nil {
		fmt.Fprintf(w, "unable to convert parameters to Value: %v", err)
		return err
	}

	// PredictRequest: create the model prediction request
	req := &aiplatformpb.PredictRequest{
		Endpoint:   url,
		Instances:  []*structpb.Value{promptValue},
		Parameters: parametersValue,
	}

	// PredictResponse: receive the response from the model
	resp, err := client.Predict(ctx, req)
	if err != nil {
		fmt.Fprintf(w, "error in prediction: %v", err)
		return err
	}

	fmt.Fprintf(w, "text-prediction response: %v", resp.Predictions[0])
	return nil
}

Java

このサンプルを試す前に、Vertex AI クイックスタート: クライアントライブラリの使用にある Java の設定手順を完了してください。詳細については、Vertex AI Java API のリファレンスドキュメントをご覧ください。


import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PredictTextPromptSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // Details of designing text prompts for supported large language models:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/text/text-overview
    String instance =
        "{ \"prompt\": " + "\"Give me ten interview questions for the role of program manager.\"}";
    String parameters =
        "{\n"
            + "  \"temperature\": 0.2,\n"
            + "  \"maxOutputTokens\": 256,\n"
            + "  \"topP\": 0.95,\n"
            + "  \"topK\": 40\n"
            + "}";
    String project = "YOUR_PROJECT_ID";
    String location = "us-central1";
    String publisher = "google";
    String model = "text-bison@001";

    predictTextPrompt(instance, parameters, project, location, publisher, model);
  }

  // Get a text prompt from a supported text model
  public static void predictTextPrompt(
      String instance,
      String parameters,
      String project,
      String location,
      String publisher,
      String model)
      throws IOException {
    String endpoint = String.format("%s-aiplatform.googleapis.com:443", location);
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      final EndpointName endpointName =
          EndpointName.ofProjectLocationPublisherModelName(project, location, publisher, model);

      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      Value.Builder instanceValue = Value.newBuilder();
      JsonFormat.parser().merge(instance, instanceValue);
      List<Value> instances = new ArrayList<>();
      instances.add(instanceValue.build());

      // Use Value.Builder to convert instance to a dynamically typed value that can be
      // processed by the service.
      Value.Builder parameterValueBuilder = Value.newBuilder();
      JsonFormat.parser().merge(parameters, parameterValueBuilder);
      Value parameterValue = parameterValueBuilder.build();

      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instances, parameterValue);
      System.out.println("Predict Response");
      System.out.println(predictResponse);
    }
  }
}

Node.js

このサンプルを試す前に、Vertex AI クイックスタート: クライアントライブラリの使用にある Node.js の設定手順を完了してください。詳細については、Vertex AI Node.js API のリファレンスドキュメントをご覧ください。

/**
 * TODO(developer): Update these variables before running the sample.
 */
const PROJECT_ID = process.env.CAIP_PROJECT_ID;
const LOCATION = 'us-central1';
const PUBLISHER = 'google';
const MODEL = 'text-bison@001';
const aiplatform = require('@google-cloud/aiplatform');

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function callPredict() {
  // Configure the parent resource
  const endpoint = `projects/${PROJECT_ID}/locations/${LOCATION}/publishers/${PUBLISHER}/models/${MODEL}`;

  const prompt = {
    prompt:
      'Give me ten interview questions for the role of program manager.',
  };
  const instanceValue = helpers.toValue(prompt);
  const instances = [instanceValue];

  const parameter = {
    temperature: 0.2,
    maxOutputTokens: 256,
    topP: 0.95,
    topK: 40,
  };
  const parameters = helpers.toValue(parameter);

  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const response = await predictionServiceClient.predict(request);
  console.log('Get text prompt response');
  console.log(response);
}

callPredict();

C#

このサンプルを試す前に、Vertex AI クイックスタート: クライアントライブラリの使用にある C# の設定手順を完了してください。詳細については、Vertex AI C# API のリファレンスドキュメントをご覧ください。


using Google.Cloud.AIPlatform.V1;
using System;
using System.Collections.Generic;
using System.Linq;
using Value = Google.Protobuf.WellKnownTypes.Value;

public class PredictTextPromptSample
{
    public string PredictTextPrompt(
        string projectId = "your-project-id",
        string locationId = "us-central1",
        string publisher = "google",
        string model = "text-bison@001"
    )
    {
        // Initialize client that will be used to send requests.
        // This client only needs to be created
        // once, and can be reused for multiple requests.
        var client = new PredictionServiceClientBuilder
        {
            Endpoint = $"{locationId}-aiplatform.googleapis.com"
        }.Build();

        // Configure the parent resource
        var endpoint = EndpointName.FromProjectLocationPublisherModel(projectId, locationId, publisher, model);

        // Initialize request argument(s)
        var prompt = "Give me ten interview questions for the role of program manager.";

        var instanceValue = Value.ForStruct(new()
        {
            Fields =
            {
                ["prompt"] = Value.ForString(prompt)
            }
        });

        var instances = new List<Value>
        {
            instanceValue
        };

        var parameters = Value.ForStruct(new()
        {
            Fields =
            {
                { "temperature", new Value { NumberValue = 0.2 } },
                { "maxOutputTokens", new Value { NumberValue = 256 } },
                { "topP", new Value { NumberValue = 0.95 } },
                { "topK", new Value { NumberValue = 40 } }
            }
        });

        // Make the request
        var response = client.Predict(endpoint, instances, parameters);

        // Parse and return the content.
        var content = response.Predictions.First().StructValue.Fields["content"].StringValue;
        Console.WriteLine($"Content: {content}");
        return content;
    }
}

Ruby

このサンプルを試す前に、Vertex AI クイックスタート: クライアントライブラリの使用にある Ruby の設定手順を完了してください。詳細については、Vertex AI Ruby API のリファレンスドキュメントをご覧ください。

require "google/cloud/ai_platform/v1"

##
# Vertex AI Predict Text Prompt
#
# @param project_id [String] Your Google Cloud project (e.g. "my-project")
# @param location_id [String] Your Processor Location (e.g. "us-central1")
# @param publisher [String] The Model Publisher (e.g. "google")
# @param model [String] The Model Identifier (e.g. "text-bison@001")
#
def predict_text_prompt project_id:, location_id:, publisher:, model:
  # Create the Vertex AI client.
  client = ::Google::Cloud::AIPlatform::V1::PredictionService::Client.new do |config|
    config.endpoint = "#{location_id}-aiplatform.googleapis.com"
  end

  # Build the resource name from the project.
  endpoint = client.endpoint_path(
    project: project_id,
    location: location_id,
    publisher: publisher,
    model: model
  )

  prompt = "Give me ten interview questions for the role of program manager."

  # Initialize the request arguments
  instance = Google::Protobuf::Value.new(
    struct_value: Google::Protobuf::Struct.new(
      fields: {
        "prompt" => Google::Protobuf::Value.new(
          string_value: prompt
        )
      }
    )
  )

  instances = [instance]

  parameters = Google::Protobuf::Value.new(
    struct_value: Google::Protobuf::Struct.new(
      fields: {
        "temperature" => Google::Protobuf::Value.new(number_value: 0.2),
        "maxOutputTokens" => Google::Protobuf::Value.new(number_value: 256),
        "topP" => Google::Protobuf::Value.new(number_value: 0.95),
        "topK" => Google::Protobuf::Value.new(number_value: 40)
      }
    )
  )

  # Make the prediction request
  response = client.predict endpoint: endpoint, instances: instances, parameters: parameters

  # Handle the prediction response
  puts "Predict Response"
  puts response
end

コンソール

Google Cloud コンソールの Vertex AI Studio を使用してテキストプロンプトをテストするには、以下の操作を行います。

Google Cloud コンソールの [Vertex AI] セクションで、[Vertex AI Studio] ページに移動します。
Vertex AI Studio に移動
[開始] タブをクリックします。
[TEXT PROMPT] をクリックします。
プロンプトの入力方法を以下から選択します。
- 自由形式: ゼロショットプロンプトまたはコピーペーストのフューショットプロンプトにおすすめです。
- 構造化: Vertex AI Studio で少数ショットプロンプトを設計する場合におすすめします。
自由形式

[Prompt] テキストフィールドにプロンプトを入力します。
構造化

プロンプトの入力方法に [構造化] を選択した場合、プロンプトのコンポーネントが以下のようにさまざまなフィールドに分割されます。
- コンテキスト: モデルが実行するタスクの手順を入力し、モデルが参照するコンテキスト情報を含めます。
- 例: 少数ショットプロンプトの場合は、モデルが模倣する動作パターンを示す入出力の例を追加します。入力と出力の例で接頭辞の追加は任意です。接頭辞を追加する場合は、すべての例で一貫している必要があります。
- テスト: [入力] フィールドに、レスポンスの取得対象となるプロンプトの入力内容を入力します。テストの入出力に接頭辞を追加するかどうかは任意です。[例] に接頭辞を追加した場合、テストでも同じ接頭辞を使用する必要があります。
モデルとパラメータを構成します。
- モデル: text-bison モデルまたは gemini-1.0-pro モデルを選択します。
- 温度: スライダーまたはテキストボックスを使用して、温度の値を入力します。
  温度は、レスポンス生成時のサンプリングに使用されます。レスポンス生成は、topP と topK が適用された場合に発生します。温度は、トークン選択のランダム性の度合いを制御します。温度が低いほど、確定的で自由度や創造性を抑えたレスポンスが求められるプロンプトに適しています。一方、温度が高いと、より多様で創造的な結果を導くことができます。温度が 0 の場合、確率が最も高いトークンが常に選択されます。この場合、特定のプロンプトに対するレスポンスはほとんど確定的ですが、わずかに変動する可能性は残ります。
  モデルが返すレスポンスが一般的すぎる、短すぎる、あるいはフォールバック（代替）レスポンスが返ってくる場合は、温度を高く設定してみてください。
- トークンの上限: スライダーまたはテキストボックスを使用して、最大出力の上限値を入力します。
  レスポンスで生成できるトークンの最大数。1 トークンは約 4 文字です。100 トークンは約 60～80 語に相当します。
  レスポンスを短くしたい場合は小さい値を、長くしたい場合は大きい値を指定します。
- Top-K: スライダーまたはテキストボックスを使用して、Top-K の値を入力します。
  Top-K は、モデルが出力用にトークンを選択する方法を変更します。Top-K が 1 の場合、次に選択されるトークンは、モデルの語彙内のすべてのトークンで最も確率の高いものであることになります（グリーディデコードとも呼ばれます）。Top-K が 3 の場合は、最も確率が高い上位 3 つのトークンから次のトークン選択されることになります（温度を使用します）。
  トークン選択のそれぞれのステップで、最も高い確率を持つ Top-K のトークンがサンプリングされます。その後、トークンは Top-P に基づいてさらにフィルタリングされ、最終的なトークンは温度サンプリングを用いて選択されます。
  
  ランダムなレスポンスを減らしたい場合は小さい値を、ランダムなレスポンスを増やしたい場合は大きい値を指定します。
- Top-P: スライダーまたはテキストボックスを使用して、Top-P の値を入力します。確率の合計が Top-P の値と等しくなるまで、最も確率が高いものから最も確率が低いものの順に、トークンが選択されます。結果を最小にするには、Top-P を 0 に設定します。
[送信] をクリックします。
省略可: プロンプトを [My prompts] に保存するには、[Save] をクリックします。
省略可: プロンプトの Python コードまたは curl コマンドを取得するには、 [View code] をクリックします。

テキストモデルからのレスポンスをストリーミングする

REST API を使用してサンプルコードのリクエストとレスポンスを表示するには、REST API の使用例をご覧ください。

Vertex AI SDK for Python を使用してサンプルコードのリクエストとレスポンスを表示するには、Vertex AI SDK for Python の使用例をご覧ください。

次のステップ

Gemini チャットプロンプトリクエストを送信する方法を学習する。
チャットプロンプトのテスト方法を確認する。
基盤モデルのチューニング方法を学習する。
責任ある AI のベストプラクティスと Vertex AI の安全フィルタについて学習する。

テキスト プロンプト リクエストを送信する

テキスト プロンプトをテストする

REST

curl

PowerShell

レスポンス

text-bison curl コマンドの例

Python

Go

Java

Node.js

C#

Ruby

コンソール

自由形式

構造化

テキストモデルからのレスポンスをストリーミングする

次のステップ

テキストプロンプトリクエストを送信する

テキストプロンプトをテストする