此页面由 Cloud Translation API 翻译。

使用 Gemini 生成图片

注意：Gemini 2.0 Flash 图片生成功能将于 2025 年 9 月 26 日弃用。gemini-2.0-flash-preview-image-generation 将于 2025 年 9 月 26 日移除。将所有工作流迁移到 gemini-2.5-flash-image-preview。

Gemini 2.5 Flash 图片预览版支持多种模态的回答生成，包括文本和图片。

图片生成

Gemini Flash 预览版图片生成功能 (gemini-2.5-flash-image-preview) 的公开预览版支持生成图片和文本。这可扩展 Gemini 的功能，使其能够执行以下操作：

通过自然语言对话迭代生成图片，在调整图片的同时保持一致性和上下文。
生成具有高质量长文本渲染效果的图片。
生成图文交织的输出。例如，在单个对话轮次中包含文本和图片的博文。以前，这需要将多个模型串联在一起。
利用 Gemini 的世界知识和推理能力生成图片。

在此公开实验版中，Gemini 2.5 Flash Image 预览版可以生成 1,024 像素的图片，支持生成人物图片，并包含更新的安全过滤条件，可提供更灵活、限制更少的用户体验。

它支持以下模态和功能：

文本到图像
- 提示示例：“生成一张以烟花为背景的埃菲尔铁塔图片。”
文生图（文本渲染）
- 提示示例：“生成一张电影效果照片，照片中有一栋大型建筑，建筑正面投影着巨大的文字：‘Gemini 2.5 现在可以生成长篇文本了’”
文生图和文本（交织）
- 提示示例：“生成一份图文并茂的海鲜饭食谱。在生成食谱时，与文本一起创建图片。”
- 提示示例：“生成一个关于狗狗的故事，采用 3D 卡通动画风格。为每个场景生成一张图片”
图片和文生图及文本（交织）
- 提示示例：（附带一张带家具的房间的照片）“我的空间还适合放置哪些颜色的沙发？您可以更新图片吗？”
支持根据语言区域生成图片
- 提示示例：“生成一张早餐图片。”

限制：

为获得最佳性能，请使用以下语言：英语、西班牙语（墨西哥）、日语（日本）、中文（中国）、印地语（印度）。
图片生成不支持音频或视频输入。
图片生成可能不会始终触发：
- 模型可能只能输出文本。尝试明确要求生成图片输出。例如，“在您操作过程中提供图片”。
- 模型可能会以图片形式生成文本。尝试明确要求文本输出。例如，“生成叙事文本及插图”。
- 模型可能会中途停止生成。请重试或尝试使用其他提示。

生成图片

以下部分介绍了如何使用 Vertex AI Studio 或 API 生成图片。

如需了解提示方面的指南和最佳实践，请参阅设计多模态提示。

控制台

如需使用图片生成，请执行以下操作：

依次打开 Vertex AI Studio > 创建提示。
点击切换模型，然后从菜单中选择 gemini-2.5-flash-image-preview。
在输出面板中，从下拉菜单中选择图片和文本。
在编写提示文本区域中，撰写要生成的图片的说明。
点击提示 () 按钮。

Gemini 将根据您的说明生成图片。此过程应需要几秒钟，但可能会相对较慢，具体取决于容量。

Python

安装

pip install --upgrade google-genai

如需了解详情，请参阅 SDK 参考文档。

设置环境变量以将 Gen AI SDK 与 Vertex AI 搭配使用：

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=("Generate an image of the Eiffel tower with fireworks in the background."),
    config=GenerateContentConfig(
        response_modalities=[Modality.TEXT, Modality.IMAGE],
        candidate_count=1,
        safety_settings=[
            {"method": "PROBABILITY"},
            {"category": "HARM_CATEGORY_DANGEROUS_CONTENT"},
            {"threshold": "BLOCK_MEDIUM_AND_ABOVE"},
        ],
    ),
)
for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO((part.inline_data.data)))
        image.save("output_folder/example-image-eiffel-tower.png")
# Example response:
#   I will generate an image of the Eiffel Tower at night, with a vibrant display of
#   colorful fireworks exploding in the dark sky behind it. The tower will be
#   illuminated, standing tall as the focal point of the scene, with the bursts of
#   light from the fireworks creating a festive atmosphere.

Node.js

安装

npm install @google/genai

如需了解详情，请参阅 SDK 参考文档。

设置环境变量以将 Gen AI SDK 与 Vertex AI 搭配使用：

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION =
  process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

async function generateContent(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const ai = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await ai.models.generateContentStream({
    model: 'gemini-2.0-flash-exp',
    contents:
      'Generate an image of the Eiffel tower with fireworks in the background.',
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });

  const generatedFileNames = [];
  let imageIndex = 0;
  for await (const chunk of response) {
    const text = chunk.text;
    const data = chunk.data;
    if (text) {
      console.debug(text);
    } else if (data) {
      const fileName = `generate_content_streaming_image_${imageIndex++}.png`;
      console.debug(`Writing response image to file: ${fileName}.`);
      try {
        fs.writeFileSync(fileName, data);
        generatedFileNames.push(fileName);
      } catch (error) {
        console.error(`Failed to write image file ${fileName}:`, error);
      }
    }
  }

  return generatedFileNames;
}

REST

在终端中运行以下命令，在当前目录中创建或覆盖此文件：

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps."},
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"],
     },
     "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
  }' 2>/dev/null >response.json

注意：您必须在配置中添加 responseModalities: ["TEXT", "IMAGE"]。这些模型不支持仅图片输出。

Gemini 将根据您的说明生成图片。此过程应需要几秒钟，但可能会相对较慢，具体取决于容量。

生成图文交织的内容

Gemini 2.5 Flash 图片预览版可以生成图文交织的回答。例如，您可以生成所生成食谱中每个步骤的图片，以便与该步骤的文本搭配使用，而无需单独向模型发出请求。

控制台

如需生成图文交织的回答，请执行以下操作：

依次打开 Vertex AI Studio > 创建提示。
点击切换模型，然后从菜单中选择 gemini-2.5-flash-image-preview。
在输出面板中，从下拉菜单中选择图片和文本。
在编写提示文本区域中，撰写要生成的图片的说明。例如，“创建一个教程，说明如何通过三个简单步骤制作花生酱和果酱三明治。对于每个步骤，提供一个包含步骤编号的标题、一段说明，并生成一张图片，每张图片的宽高比为 1:1。”
点击提示 () 按钮。

Gemini 将根据您的说明生成回答。此过程应需要几秒钟，但可能会相对较慢，具体取决于容量。

Python

安装

pip install --upgrade google-genai

如需了解详情，请参阅 SDK 参考文档。

设置环境变量以将 Gen AI SDK 与 Vertex AI 搭配使用：

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=(
        "Generate an illustrated recipe for a paella."
        "Create images to go alongside the text as you generate the recipe"
    ),
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)
with open("output_folder/paella-recipe.md", "w") as fp:
    for i, part in enumerate(response.candidates[0].content.parts):
        if part.text is not None:
            fp.write(part.text)
        elif part.inline_data is not None:
            image = Image.open(BytesIO((part.inline_data.data)))
            image.save(f"output_folder/example-image-{i+1}.png")
            fp.write(f"![image](example-image-{i+1}.png)")
# Example response:
#  A markdown page for a Paella recipe(`paella-recipe.md`) has been generated.
#   It includes detailed steps and several images illustrating the cooking process.

REST

在终端中运行以下命令，在当前目录中创建或覆盖此文件：

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio."},
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"],
     },
     "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
  }' 2>/dev/null >response.json

注意：您必须在配置中添加 responseModalities: ["TEXT", "IMAGE"]。这些模型不支持仅图片输出。

Gemini 将根据您的说明生成图片。此过程应需要几秒钟，但可能会相对较慢，具体取决于容量。

支持根据语言区域生成图片

Gemini 2.5 Flash Image 预览版还可以在提供文本或图片回答时包含有关您位置的信息。例如，您可以生成考虑到您当前位置的地点或体验类型的图片，而无需向模型指定您的位置。

控制台

如需使用支持根据语言区域生成图片的模型，请执行以下操作：

依次打开 Vertex AI Studio > 创建提示。
点击切换模型，然后从菜单中选择 gemini-2.5-flash-image-preview。
在输出面板中，从下拉菜单中选择图片和文本。
在编写提示文本区域中，撰写要生成的图片的说明。例如，“生成一张典型早餐的照片。”
点击提示 () 按钮。

Gemini 将根据您的说明生成回答。此过程应需要几秒钟，但可能会相对较慢，具体取决于容量。

Python

安装

pip install --upgrade google-genai

如需了解详情，请参阅 SDK 参考文档。

设置环境变量以将 Gen AI SDK 与 Vertex AI 搭配使用：

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=("Generate a photo of a breakfast meal."),
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)
for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO((part.inline_data.data)))
        image.save("output_folder/example-breakfast-meal.png")
# Example response:
#   Generates a photo of a vibrant and appetizing breakfast meal.
#   The scene will feature a white plate with golden-brown pancakes
#   stacked neatly, drizzled with rich maple syrup and ...

REST

在终端中运行以下命令，在当前目录中创建或覆盖此文件：

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Generate a photo of a typical breakfast."},
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"],
     },
     "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
  }' 2>/dev/null >response.json

注意：您必须在配置中添加 responseModalities: ["TEXT", "IMAGE"]。这些模型不支持仅图片输出。

Gemini 将根据您的说明生成图片。此过程应需要几秒钟，但可能会相对较慢，具体取决于容量。

使用 Gemini 生成图片 使用集合让一切井井有条 根据您的偏好保存内容并对其进行分类。

图片生成

生成图片

控制台

Python

安装

Node.js

安装

REST

生成图文交织的内容

控制台

Python

安装

REST

支持根据语言区域生成图片

控制台

Python

安装

REST

使用 Gemini 生成图片