# Generate text from multimodal prompt
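This sample demonstrates how to use a Gemini model to generate text from a multimodal prompt. The prompt in the code sample combines one text instruction with two JPEG images, and the model returns a text response based on the instruction and both images.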
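The sample hard-codes `image/jpeg` for each image. When the prompt mixes image formats, the parts can be assembled in a loop instead. The following is a minimal sketch, assuming the image type can be inferred from the file extension. The helper names `build_image_inputs` and `build_contents` and the `make_part` parameter are hypothetical and are not part of the Google Gen AI SDK.

```python
import mimetypes
from pathlib import Path


def build_image_inputs(paths):
    """Read each image file and pair its bytes with a guessed MIME type."""
    inputs = []
    for path in map(Path, paths):
        # Guess the MIME type from the file extension only (assumption:
        # the extension matches the actual file contents).
        mime_type, _ = mimetypes.guess_type(path.name)
        if mime_type is None or not mime_type.startswith("image/"):
            raise ValueError(f"Not a recognized image file: {path}")
        inputs.append((path.read_bytes(), mime_type))
    return inputs


def build_contents(instruction, image_inputs, make_part):
    """Place the text instruction first, followed by one part per image.

    make_part is a caller-supplied function (hypothetical parameter) that
    turns (data, mime_type) into whatever part object the client expects.
    """
    return [instruction] + [make_part(data, mime) for data, mime in image_inputs]
```

With the Google Gen AI SDK, `make_part` could be `lambda data, mime: Part.from_bytes(data=data, mime_type=mime)`, and the returned list could be passed as the `contents` argument of `client.models.generate_content`.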
Code sample
-----------

### Python

Before trying this sample, follow the Python setup instructions in the [Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the [Vertex AI Python API reference documentation](/python/docs/reference/aiplatform/latest).

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google import genai
    from google.genai.types import HttpOptions, Part

    client = genai.Client(http_options=HttpOptions(api_version="v1"))

    # TODO(Developer): Update the below file paths to your images
    image_path_1 = "path/to/your/image1.jpg"
    image_path_2 = "path/to/your/image2.jpg"

    # Read both images as raw bytes.
    with open(image_path_1, "rb") as f:
        image_1_bytes = f.read()
    with open(image_path_2, "rb") as f:
        image_2_bytes = f.read()

    # Send the text instruction and both images in a single request.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=[
            "Generate a list of all the objects contained in both images.",
            Part.from_bytes(data=image_1_bytes, mime_type="image/jpeg"),
            Part.from_bytes(data=image_2_bytes, mime_type="image/jpeg"),
        ],
    )
    print(response.text)
    # Example response:
    # Okay, here's a jingle combining the elements of both sets of images, focusing on ...
    # ...

What's next
-----------

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=googlegenaisdk).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.