マルチモーダル プロンプトからテキストを生成する
コレクションでコンテンツを整理
必要に応じて、コンテンツの保存と分類を行います。
このサンプルでは、Gemini モデルを使用してマルチモーダル プロンプトからテキストを生成する方法を示します。プロンプトは、3 つの画像と 2 つのテキスト プロンプトで構成されています。このモデルは、画像とテキスト プロンプトを説明するテキスト レスポンスを生成します。
コードサンプル
特に記載のない限り、このページのコンテンツはクリエイティブ・コモンズの表示 4.0 ライセンスにより使用許諾されます。コードサンプルは Apache 2.0 ライセンスにより使用許諾されます。詳しくは、Google Developers サイトのポリシーをご覧ください。Java は Oracle および関連会社の登録商標です。
[[["わかりやすい","easyToUnderstand","thumb-up"],["問題の解決に役立った","solvedMyProblem","thumb-up"],["その他","otherUp","thumb-up"]],[["わかりにくい","hardToUnderstand","thumb-down"],["情報またはサンプルコードが不正確","incorrectInformationOrSampleCode","thumb-down"],["必要な情報 / サンプルがない","missingTheInformationSamplesINeed","thumb-down"],["翻訳に関する問題","translationIssue","thumb-down"],["その他","otherDown","thumb-down"]],[],[],[],null,["# Generate text from multimodal prompt\n\nThis sample demonstrates how to generate text from a multimodal prompt using the Gemini model. The prompt consists of three images and two text prompts. The model generates a text response that describes the images and the text prompts.\n\nCode sample\n-----------\n\n### C#\n\n\nBefore trying this sample, follow the C# setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI C# API\nreference documentation](/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n\n using https://cloud.google.com/dotnet/docs/reference/Google.Api.Gax/latest/Google.Api.Gax.Grpc.html;\n using https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.html;\n using https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.html;\n using System.Net.Http;\n using System.Text;\n using System.Threading.Tasks;\n\n public class MultimodalMultiImage\n {\n public async Task\u003cstring\u003e GenerateContent(\n string projectId = \"your-project-id\",\n string location = \"us-central1\",\n string publisher = \"google\",\n string model = \"gemini-2.0-flash-001\"\n )\n {\n var predictionServiceClient = new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.PredictionServiceClientBuilder.html\n {\n Endpoint = $\"{location}-aiplatform.googleapis.com\"\n }.Build();\n\n https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html colosseum = await ReadImageFileAsync(\n \"https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark1.png\");\n\n https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html forbiddenCity = await ReadImageFileAsync(\n \"https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark2.png\");\n\n https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html christRedeemer = await ReadImageFileAsync(\n \"https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark3.png\");\n\n var generateContentRequest = new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.GenerateContentRequest.html\n {\n Model = $\"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}\",\n Contents =\n {\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Content.html\n {\n Role = \"USER\",\n Parts =\n {\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { InlineData = new() { MimeType = \"image/png\", Data = colosseum }},\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { Text = \"city: Rome, Landmark: the Colosseum\" },\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { InlineData = new() { MimeType = \"image/png\", Data = forbiddenCity }},\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { Text = \"city: Beijing, Landmark: Forbidden City\"},\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { InlineData = new() { MimeType = \"image/png\", Data = christRedeemer }}\n }\n }\n }\n };\n\n using PredictionServiceClient.StreamGenerateContentStream response = predictionServiceClient.StreamGenerateContent(generateContentRequest);\n\n StringBuilder fullText = new();\n\n AsyncResponseStream\u003cGenerateContentResponse\u003e responseStream = response.GetResponseStream();\n await foreach (https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.GenerateContentResponse.html responseItem in responseStream)\n {\n fullText.Append(responseItem.Candidates[0].https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Content.html.Parts[0].Text);\n }\n return fullText.ToString();\n }\n\n private static async Task\u003cByteString\u003e ReadImageFileAsync(string url)\n {\n using HttpClient client = new();\n using var response = await client.GetAsync(url);\n byte[] imageBytes = await response.https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Content.html.ReadAsByteArrayAsync();\n return https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html.https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html#Google_Protobuf_ByteString_CopyFrom_System_Byte___(imageBytes);\n }\n }\n\n### Node.js\n\n\nBefore trying this sample, follow the Node.js setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI Node.js API\nreference documentation](/nodejs/docs/reference/aiplatform/latest).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n const {VertexAI} = require('https://cloud.google.com/nodejs/docs/reference/vertexai/latest/overview.html');\n const axios = require('axios');\n\n async function getBase64(url) {\n const image = await axios.get(url, {responseType: 'arraybuffer'});\n return Buffer.from(image.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/generativecontentblob.html).toString('base64');\n }\n\n /**\n * TODO(developer): Update these variables before running the sample.\n */\n async function sendMultiModalPromptWithImage(\n projectId = 'PROJECT_ID',\n location = 'us-central1',\n model = 'gemini-2.0-flash-001'\n ) {\n // For images, the SDK supports base64 strings\n const landmarkImage1 = await getBase64(\n 'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark1.png'\n );\n const landmarkImage2 = await getBase64(\n 'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark2.png'\n );\n const landmarkImage3 = await getBase64(\n 'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark3.png'\n );\n\n // Initialize Vertex with your Cloud project and location\n const vertexAI = new https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/vertexai.html({project: projectId, location: location});\n\n const generativeVisionModel = vertexAI.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/vertexai.html({\n model: model,\n });\n\n // Pass multimodal prompt\n const request = {\n contents: [\n {\n role: 'user',\n parts: [\n {\n inlineData: {\n data: landmarkImage1,\n mimeType: 'image/png',\n },\n },\n {\n text: 'city: Rome, Landmark: the Colosseum',\n },\n\n {\n inlineData: {\n data: landmarkImage2,\n mimeType: 'image/png',\n },\n },\n {\n text: 'city: Beijing, Landmark: Forbidden City',\n },\n {\n inlineData: {\n data: landmarkImage3,\n mimeType: 'image/png',\n },\n },\n ],\n },\n ],\n };\n\n // Create the response\n const response = await generativeVisionModel.generateContent(request);\n // Wait for the response to complete\n const aggregatedResponse = await response.response;\n // Select the text from the response\n const fullTextResponse =\n aggregatedResponse.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/generatecontentresponse.html[0].https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/generatecontentcandidate.html.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/content.html[0].text;\n\n console.log(fullTextResponse);\n }\n\nWhat's next\n-----------\n\n\nTo search and filter code samples for other Google Cloud products, see the\n[Google Cloud sample browser](/docs/samples?product=generativeaionvertexai)."]]