Generate text from a multimodal prompt
This sample demonstrates how to generate text from a multimodal prompt by using the Gemini model. The prompt consists of two images and a text prompt, and the model generates a text response listing the objects that appear in both images.
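The samples that follow read the images from local files and send the bytes inline. If your images are already stored in Cloud Storage, the same request can reference them by URI instead. The following Python sketch shows that variant; it assumes the google-genai SDK's `Part.from_uri` helper, and the `gs://` paths are placeholders, so treat it as an illustration rather than part of the official sample.

    from google import genai
    from google.genai.types import HttpOptions, Part

    client = genai.Client(http_options=HttpOptions(api_version="v1"))

    # Placeholder Cloud Storage URIs -- replace with your own objects.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=[
            "Generate a list of all the objects contained in both images.",
            Part.from_uri(file_uri="gs://your-bucket/image1.jpg", mime_type="image/jpeg"),
            Part.from_uri(file_uri="gs://your-bucket/image2.jpg", mime_type="image/jpeg"),
        ],
    )
    print(response.text)

Referencing objects by URI avoids base64-encoding large files into the request payload.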
Code sample
[[["Fácil de comprender","easyToUnderstand","thumb-up"],["Resolvió mi problema","solvedMyProblem","thumb-up"],["Otro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Información o código de muestra incorrectos","incorrectInformationOrSampleCode","thumb-down"],["Faltan la información o los ejemplos que necesito","missingTheInformationSamplesINeed","thumb-down"],["Problema de traducción","translationIssue","thumb-down"],["Otro","otherDown","thumb-down"]],[],[],[],null,["This sample demonstrates how to generate text from a multimodal prompt using the Gemini model. The prompt consists of three images and two text prompts. The model generates a text response that describes the images and the text prompts.\n\nCode sample \n\nJava\n\n\nBefore trying this sample, follow the Java setup instructions in the\n[Vertex AI quickstart using\nclient libraries](/vertex-ai/docs/start/client-libraries).\n\n\nFor more information, see the\n[Vertex AI Java API\nreference documentation](/java/docs/reference/google-cloud-aiplatform/latest/com.google.cloud.aiplatform.v1).\n\n\nTo authenticate to Vertex AI, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n\n import com.google.genai.Client;\n import com.google.genai.types.Content;\n import com.google.genai.types.GenerateContentResponse;\n import com.google.genai.types.HttpOptions;\n import com.google.genai.types.Part;\n import java.io.IOException;\n import java.nio.file.Files;\n import java.nio.file.Paths;\n\n public class TextGenerationWithMultiLocalImage {\n\n public static void main(String[] args) throws IOException {\n // TODO(developer): Replace these variables before running the sample.\n String modelId = \"gemini-2.5-flash\";\n String localImageFilePath1 = \"your/local/img1.jpg\";\n String localImageFilePath2 = \"your/local/img2.jpg\";\n generateContent(modelId, localImageFilePath1, localImageFilePath2);\n }\n\n // Generates text using multiple local images\n public static String generateContent(\n String modelId, String localImageFilePath1, String localImageFilePath2) throws IOException {\n // Initialize client that will be used to send requests. This client only needs to be created\n // once, and can be reused for multiple requests.\n try (Client client =\n Client.builder()\n .location(\"global\")\n .vertexAI(true)\n .httpOptions(HttpOptions.builder().apiVersion(\"v1\").build())\n .build()) {\n\n // Read content from local files.\n byte[] localFileImg1Bytes = Files.readAllBytes(Paths.get(localImageFilePath1));\n byte[] localFileImg2Bytes = Files.readAllBytes(Paths.get(localImageFilePath2));\n\n GenerateContentResponse response =\n client.models.generateContent(\n modelId,\n Content.fromParts(\n Part.fromBytes(localFileImg1Bytes, \"image/jpeg\"),\n Part.fromBytes(localFileImg2Bytes, \"image/jpeg\"),\n Part.fromText(\"Generate a list of all the objects contained in both images\")),\n null);\n\n System.out.print(response.text());\n // Example response:\n // Based on both images, here are the objects contained in both:\n //\n // 1. **Coffee cups (or mugs)**: Both images feature one or more cups containing a beverage.\n // 2. **Coffee (or a similar beverage)**: Both images contain a liquid beverage in the cups,\n // appearing to be coffee or a coffee-like drink.\n // 3. 
What's next

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=googlegenaisdk).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.