Générer du texte à partir d'une requête multimodale
Restez organisé à l'aide des collections
Enregistrez et classez les contenus selon vos préférences.
Cet exemple montre comment générer du texte à partir d'une requête multimodale à l'aide du modèle Gemini. La requête se compose de trois images et de deux requêtes textuelles. Le modèle génère une réponse textuelle décrivant les images et les requêtes textuelles.
Exemple de code
Sauf indication contraire, le contenu de cette page est régi par une licence Creative Commons Attribution 4.0, et les échantillons de code sont régis par une licence Apache 2.0. Pour en savoir plus, consultez les Règles du site Google Developers. Java est une marque déposée d'Oracle et/ou de ses sociétés affiliées.
[[["Facile à comprendre","easyToUnderstand","thumb-up"],["J'ai pu résoudre mon problème","solvedMyProblem","thumb-up"],["Autre","otherUp","thumb-up"]],[["Difficile à comprendre","hardToUnderstand","thumb-down"],["Informations ou exemple de code incorrects","incorrectInformationOrSampleCode","thumb-down"],["Il n'y a pas l'information/les exemples dont j'ai besoin","missingTheInformationSamplesINeed","thumb-down"],["Problème de traduction","translationIssue","thumb-down"],["Autre","otherDown","thumb-down"]],[],[],[],null,["# Generate text from multimodal prompt\n\nThis sample demonstrates how to generate text from a multimodal prompt using the Gemini model. The prompt consists of three images and two text prompts. The model generates a text response that describes the images and the text prompts.\n\nCode sample\n-----------\n\n### C#\n\n\nBefore trying this sample, follow the C# setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI C# API\nreference documentation](/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n\n using https://cloud.google.com/dotnet/docs/reference/Google.Api.Gax/latest/Google.Api.Gax.Grpc.html;\n using https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.html;\n using https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.html;\n using System.Net.Http;\n using System.Text;\n using System.Threading.Tasks;\n\n public class MultimodalMultiImage\n {\n public async Task\u003cstring\u003e GenerateContent(\n string projectId = \"your-project-id\",\n string location = \"us-central1\",\n string publisher = \"google\",\n string model = \"gemini-2.0-flash-001\"\n )\n {\n var predictionServiceClient = new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.PredictionServiceClientBuilder.html\n {\n Endpoint = $\"{location}-aiplatform.googleapis.com\"\n }.Build();\n\n https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html colosseum = await ReadImageFileAsync(\n \"https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark1.png\");\n\n https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html forbiddenCity = await ReadImageFileAsync(\n \"https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark2.png\");\n\n https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html christRedeemer = await ReadImageFileAsync(\n \"https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark3.png\");\n\n var generateContentRequest = new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.GenerateContentRequest.html\n {\n Model = $\"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}\",\n Contents =\n {\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Content.html\n {\n Role = \"USER\",\n Parts =\n {\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { InlineData = new() { MimeType = \"image/png\", Data = colosseum }},\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { Text = \"city: Rome, Landmark: the Colosseum\" },\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { InlineData = new() { MimeType = \"image/png\", Data = forbiddenCity }},\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { Text = \"city: Beijing, Landmark: Forbidden City\"},\n new https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Part.html { InlineData = new() { MimeType = \"image/png\", Data = christRedeemer }}\n }\n }\n }\n };\n\n using PredictionServiceClient.StreamGenerateContentStream response = predictionServiceClient.StreamGenerateContent(generateContentRequest);\n\n StringBuilder fullText = new();\n\n AsyncResponseStream\u003cGenerateContentResponse\u003e responseStream = response.GetResponseStream();\n await foreach (https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.GenerateContentResponse.html responseItem in responseStream)\n {\n fullText.Append(responseItem.Candidates[0].https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Content.html.Parts[0].Text);\n }\n return fullText.ToString();\n }\n\n private static async Task\u003cByteString\u003e ReadImageFileAsync(string url)\n {\n using HttpClient client = new();\n using var response = await client.GetAsync(url);\n byte[] imageBytes = await response.https://cloud.google.com/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest/Google.Cloud.AIPlatform.V1.Content.html.ReadAsByteArrayAsync();\n return https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html.https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.ByteString.html#Google_Protobuf_ByteString_CopyFrom_System_Byte___(imageBytes);\n }\n }\n\n### Node.js\n\n\nBefore trying this sample, follow the Node.js setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI Node.js API\nreference documentation](/nodejs/docs/reference/aiplatform/latest).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n const {VertexAI} = require('https://cloud.google.com/nodejs/docs/reference/vertexai/latest/overview.html');\n const axios = require('axios');\n\n async function getBase64(url) {\n const image = await axios.get(url, {responseType: 'arraybuffer'});\n return Buffer.from(image.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/generativecontentblob.html).toString('base64');\n }\n\n /**\n * TODO(developer): Update these variables before running the sample.\n */\n async function sendMultiModalPromptWithImage(\n projectId = 'PROJECT_ID',\n location = 'us-central1',\n model = 'gemini-2.0-flash-001'\n ) {\n // For images, the SDK supports base64 strings\n const landmarkImage1 = await getBase64(\n 'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark1.png'\n );\n const landmarkImage2 = await getBase64(\n 'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark2.png'\n );\n const landmarkImage3 = await getBase64(\n 'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark3.png'\n );\n\n // Initialize Vertex with your Cloud project and location\n const vertexAI = new https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/vertexai.html({project: projectId, location: location});\n\n const generativeVisionModel = vertexAI.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/vertexai.html({\n model: model,\n });\n\n // Pass multimodal prompt\n const request = {\n contents: [\n {\n role: 'user',\n parts: [\n {\n inlineData: {\n data: landmarkImage1,\n mimeType: 'image/png',\n },\n },\n {\n text: 'city: Rome, Landmark: the Colosseum',\n },\n\n {\n inlineData: {\n data: landmarkImage2,\n mimeType: 'image/png',\n },\n },\n {\n text: 'city: Beijing, Landmark: Forbidden City',\n },\n {\n inlineData: {\n data: landmarkImage3,\n mimeType: 'image/png',\n },\n },\n ],\n },\n ],\n };\n\n // Create the response\n const response = await generativeVisionModel.generateContent(request);\n // Wait for the response to complete\n const aggregatedResponse = await response.response;\n // Select the text from the response\n const fullTextResponse =\n aggregatedResponse.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/generatecontentresponse.html[0].https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/generatecontentcandidate.html.https://cloud.google.com/nodejs/docs/reference/vertexai/latest/vertexai/content.html[0].text;\n\n console.log(fullTextResponse);\n }\n\nWhat's next\n-----------\n\n\nTo search and filter code samples for other Google Cloud products, see the\n[Google Cloud sample browser](/docs/samples?product=generativeaionvertexai)."]]