This page shows you how to use Gemini to generate and edit images. You can perform the following tasks:

- Generate images
- Edit an image
- Generate interleaved images and text

Gemini 2.0 Flash supports response generation in multiple modalities, including text and images. The Gemini model offers several ways to work with images. The following table compares these functionalities to help you choose the best one for your use case.

Image generation capabilities

| Functionality | Description | Use Case |
| --- | --- | --- |
| Generate images | Create a new image from only a text prompt. | Creating a completely new visual from an idea or concept. |
| Edit an image | Modify an existing image based on a text prompt. | Altering, restyling, or adding or removing elements from an image you already have. |
| Generate interleaved images and text | Produce a single response that contains both generated text and relevant, newly created images. | Creating tutorials, recipes, or stories where visuals are needed to illustrate steps or concepts described in the text. |

Gemini 2.0 Flash's public preview for image generation (gemini-2.0-flash-preview-image-generation) supports the ability to generate images in addition to text.

With this public experimental release, Gemini 2.0 Flash can generate images at 1024px, supports generating and editing images of people, and contains updated safety filters that provide a more flexible and less restrictive user experience. It supports the following modalities and capabilities:

- Text to image
- Text to image (text rendering)
- Text to image(s) and text (interleaved)
- Image(s) and text to image(s) and text (interleaved)
- Image editing (text and image to image)
- Multi-turn image editing (chat)

Generate images

You can generate images using either Vertex AI Studio or the API. For guidance and best practices for prompting, see Design multimodal prompts.

| Method | Description | Pros | Cons |
| --- | --- | --- | --- |
| Vertex AI Studio | A web-based UI for building and experimenting with generative AI models. | Easy to use, no coding required, good for rapid prototyping. | Less suitable for automation or integration into applications. |
| API (REST & Python SDK) | A programmatic interface to integrate Gemini features into your applications. | Full control, enables automation and deep integration. | Requires coding and environment setup. |

Console

To generate an image:

- Go to the Vertex AI Studio > Create prompt page.
- Click Switch model and select gemini-2.0-flash-preview-image-generation from the menu.
- In the Outputs panel, select Image and text from the drop-down menu.
- In the Write a prompt text area, write a description of the image you want to generate.
- Click Submit.

Gemini generates an image based on your description. This process usually takes a few seconds but might be slower depending on the current capacity.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

REST

To generate an image, send a POST request to the generateContent method using the following cURL command:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${API_ENDPOINT}:generateContent" \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps." }
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"]
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

Gemini generates an image based on your description. This process usually takes a few seconds but might be slower depending on the current capacity.
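The cURL command above writes the model's reply to response.json. As a minimal sketch of reading that file, the following assumes the reply uses the camelCase REST layout in which each candidate holds a list of parts and an image part carries base64 data under `inlineData` (with a `mimeType`). Those field names are an assumption rather than something stated on this page, and the helper names `load_response` and `save_inline_images` are hypothetical.

```python
import base64
import json
import mimetypes
from pathlib import Path


def load_response(path: str = "response.json") -> dict:
    """Read the JSON file written by the cURL command."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def save_inline_images(response: dict, out_dir: str = ".") -> list[str]:
    """Write every inline image found in the response to out_dir.

    Assumed layout: candidates[].content.parts[].inlineData.{mimeType, data}
    """
    saved = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            blob = part.get("inlineData")
            if not blob:
                continue  # text parts carry no inlineData
            # Pick a file extension from the MIME type, falling back to .bin
            ext = mimetypes.guess_extension(blob.get("mimeType", "")) or ".bin"
            path = Path(out_dir) / f"image-{len(saved) + 1}{ext}"
            path.write_bytes(base64.b64decode(blob["data"]))
            saved.append(str(path))
    return saved
```

To use it after running the cURL command, call `save_inline_images(load_response())`; the function returns the paths of the files it wrote.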
Edit an image
Console
To edit an image:
- Go to the Vertex AI Studio > Create prompt page.
- Click Switch model and select gemini-2.0-flash-preview-image-generation from the menu.
- In the Outputs panel, select Image and text from the drop-down menu.
- Click Insert media, select a source from the menu, and then follow the dialog's instructions.
- In the Write a prompt text area, describe the edits that you want to make to the image.
- Click Submit.
Gemini generates an edited version of the image based on your description. This process usually takes a few seconds but might be slower depending on the current capacity.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
REST
To edit an image, send a POST request to the generateContent method using the following cURL command:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${API_ENDPOINT}:generateContent" \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "file_data": {
            "mime_type": "image/jpeg",
            "file_uri": "FILE_NAME"
          }
        },
        { "text": "Convert this photo to black and white, in a cartoonish style." }
      ]
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"]
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json
Gemini generates an edited version of the image based on your description. This process usually takes a few seconds but might be slower depending on the current capacity.
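Hand-written JSON inside a shell string is easy to get subtly wrong. A minimal sketch of building the same edit request body in Python follows, so that `json.dumps` guarantees valid JSON and the MIME type is inferred from the file name. The field names mirror the cURL request above. The helper name `build_edit_request` and the example Cloud Storage URI are hypothetical.

```python
import json
import mimetypes


def build_edit_request(file_uri: str, prompt: str) -> dict:
    """Return a generateContent body that pairs one image with an edit prompt."""
    # Infer the MIME type from the file name at the end of the URI.
    mime_type, _ = mimetypes.guess_type(file_uri.rsplit("/", 1)[-1])
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {file_uri!r}")
    return {
        "contents": {
            "role": "USER",
            "parts": [
                {"file_data": {"mime_type": mime_type, "file_uri": file_uri}},
                {"text": prompt},
            ],
        },
        "generation_config": {"response_modalities": ["TEXT", "IMAGE"]},
        "safetySettings": {
            "method": "PROBABILITY",
            "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
            "threshold": "BLOCK_MEDIUM_AND_ABOVE",
        },
    }


body = build_edit_request(
    "gs://example-bucket/photo.jpg",  # hypothetical URI, replace with your own
    "Convert this photo to black and white, in a cartoonish style.",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be saved to a file and passed to cURL with `-d @request.json` instead of an inline string.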
Generate interleaved images and text
Gemini 2.0 Flash can generate interleaved images with its text responses. For example, you can ask the model to create a recipe and also generate an image for each step. This avoids making separate requests for the text and each image.
Console
To generate interleaved images and text:
- Go to the Vertex AI Studio > Create prompt page.
- Click Switch model and select gemini-2.0-flash-preview-image-generation from the menu.
- In the Outputs panel, select Image and text from the drop-down menu.
- In the Write a prompt text area, write a description of the response you want to generate. For example: "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio."
- Click Submit.
Gemini generates a response that includes text and images based on your description. This process usually takes a few seconds but might be slower depending on the current capacity.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
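With the SDK installed and the environment variables set, a request for interleaved output might look like the following sketch. It assumes that the google-genai package provides `genai.Client`, `client.models.generate_content`, and `GenerateContentConfig` with a `response_modalities` field, and that each returned part exposes either `text` or `inline_data`; check the SDK reference documentation before relying on it. The helper name `part_kinds` is hypothetical.

```python
MODEL_ID = "gemini-2.0-flash-preview-image-generation"

PROMPT = (
    "Create a tutorial explaining how to make a peanut butter and jelly "
    "sandwich in three easy steps. For each step, provide a title with the "
    "number of the step, an explanation, and also generate an image."
)


def part_kinds(response) -> list[str]:
    """Label each part of the first candidate as "text" or "image", in order."""
    kinds = []
    for part in response.candidates[0].content.parts:
        if getattr(part, "inline_data", None):
            kinds.append("image")
        elif getattr(part, "text", None):
            kinds.append("text")
    return kinds


def main() -> None:
    # Imported here so that part_kinds can be used without the SDK installed.
    from google import genai
    from google.genai.types import GenerateContentConfig, Modality

    client = genai.Client()  # assumed to read the GOOGLE_* environment variables
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=PROMPT,
        config=GenerateContentConfig(
            response_modalities=[Modality.TEXT, Modality.IMAGE]
        ),
    )
    # Show how text and images alternate in the reply.
    print(part_kinds(response))
```

Call `main()` to send the request. Keeping the SDK import inside `main` is a deliberate choice: it lets you exercise the response-handling helper against a stub object without network access or credentials.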
REST
To generate interleaved images and text, send a POST request to the generateContent method using the following cURL command:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${API_ENDPOINT}:generateContent" \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio." }
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"]
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json
Gemini generates a response that includes text and images based on your description. This process usually takes a few seconds but might be slower depending on the current capacity.
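Because an interleaved reply mixes text and images in one list of parts, the order of the parts is what ties each image to its step. The sketch below turns a saved response.json into a Markdown document that preserves that order. It assumes the camelCase REST layout (`candidates[0].content.parts`, with images under `inlineData.data` as base64) and assumes PNG output for the file names; the helper name `response_to_markdown` is hypothetical.

```python
import base64
from pathlib import Path


def response_to_markdown(response: dict, out_dir: str = ".") -> str:
    """Render the first candidate's parts as Markdown, keeping their order."""
    blocks = []
    image_count = 0
    for part in response["candidates"][0]["content"]["parts"]:
        if "text" in part:
            blocks.append(part["text"].strip())
        elif "inlineData" in part:
            image_count += 1
            # Assumes PNG; inspect inlineData.mimeType if the format matters.
            name = f"step-{image_count}.png"
            image_bytes = base64.b64decode(part["inlineData"]["data"])
            (Path(out_dir) / name).write_bytes(image_bytes)
            blocks.append(f"![Step {image_count}]({name})")
    return "\n\n".join(blocks)
```

Writing the returned string to a file in the same directory as the images gives a tutorial page in which each picture appears directly after the text that the model placed before it.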