根据多模态提示生成文本

此示例演示了如何使用 Gemini 模型根据多模态提示生成文本。提示由三张图片和两个文本提示组成。模型生成描述图片和文本提示的文本回复。

代码示例

Python

在尝试此示例之前,请按照《Vertex AI 快速入门:使用客户端库》中的 Python 设置说明执行操作。 如需了解详情,请参阅 Vertex AI Python API 参考文档

如需向 Vertex AI 进行身份验证,请设置应用默认凭证。如需了解详情,请参阅为本地开发环境设置身份验证

from google import genai
from google.genai.types import HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))
# TODO(Developer): Update the below file paths to your images
# image_path_1 = "path/to/your/image1.jpg"
# image_path_2 = "path/to/your/image2.jpg"
with open(image_path_1, "rb") as f:
    image_1_bytes = f.read()
with open(image_path_2, "rb") as f:
    image_2_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        "Generate a list of all the objects contained in both images.",
        Part.from_bytes(data=image_1_bytes, mime_type="image/jpeg"),
        Part.from_bytes(data=image_2_bytes, mime_type="image/jpeg"),
    ],
)
print(response.text)
# Example response:
# Okay, here's a jingle combining the elements of both sets of images, focusing on ...
# ...

后续步骤

如需搜索和过滤其他 Google Cloud 产品的代码示例,请参阅Google Cloud 示例浏览器