The Imagen API lets you create high-quality images in seconds, using text prompts and reference images to guide subject or style generation.
View Imagen for Editing and Customization model card
Supported Models
Model | Code |
---|---|
Customization using reference images (few-shot) | imagen-3.0-capability-001 |
For more information about the features that each model supports, see Imagen models.
HTTP method and URL
POST https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/imagen-3.0-capability-001:predict
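As a minimal sketch, the endpoint above can be assembled in Python from the two placeholders. The helper name `build_predict_url` and the example project ID are assumptions made for illustration; they are not part of any Google SDK.

```python
MODEL_ID = "imagen-3.0-capability-001"


def build_predict_url(project_id: str, location: str, model_id: str = MODEL_ID) -> str:
    """Return the regional :predict endpoint for a project and location."""
    if not project_id or not location:
        raise ValueError("project_id and location must be non-empty")
    # The location appears twice: once in the host name and once in the path.
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model_id}:predict"
    )


# "my-project" is a hypothetical project ID used only for this example.
print(build_predict_url("my-project", "us-central1"))
```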
Example syntax
Syntax to customize an image from a text prompt and reference images.
REST
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/imagen-3.0-capability-001:predict \
-d '{
  "instances": [
    {
      // Use [1] to refer to the reference images with referenceId=1,
      // [2] to refer to the reference images with referenceId=2,
      // following the same format for all reference IDs that you provide.
      "prompt": "${TEXT_PROMPT}",
      "referenceImages": [
        // A list of at most 4 reference image objects.
        [...]
      ]
    }
  ],
  "parameters": {
    [...]
  }
}'
Sample request body:
This request is for person customization with a face mesh control image and three reference images.
{
  "instances": [
    {
      "prompt": "Create an image about a man with short hair [1] in the pose of control image [2] to match the description: A pencil style sketch of a full-body portrait of a man with short hair [1] with hatch-cross drawing, hatch drawing of portrait with 6B and graphite pencils, white background, pencil drawing, high quality, pencil stroke, looking at camera, natural human eyes",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_CONTROL",
          "referenceId": 2,
          "referenceImage": {
            "bytesBase64Encoded": "${IMAGE_BYTES_1}"
          },
          "controlImageConfig": {
            "controlType": "CONTROL_TYPE_FACE_MESH",
            "enableControlImageComputation": true
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "${IMAGE_BYTES_2}"
          },
          "subjectImageConfig": {
            "subjectDescription": "a man with short hair",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "${IMAGE_BYTES_3}"
          },
          "subjectImageConfig": {
            "subjectDescription": "a man with short hair",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "${IMAGE_BYTES_4}"
          },
          "subjectImageConfig": {
            "subjectDescription": "a man with short hair",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        }
      ]
    }
  ],
  "parameters": {
    "negativePrompt": "wrinkles, noise, Low quality, dirty, low res, multi face, rough texture, messy, messy background, color background, photo realistic, photo, super realistic, signature, autograph, sign, text, characters, alphabet, letter",
    "seed": 1,
    "language": "en",
    "sampleCount": 4
  }
}
Parameter list
See examples for implementation details.
Customize images
REST
Parameters | |
---|---|
referenceType | Required enumeration. |
referenceId | Required integer. The reference ID. Use this reference ID in the prompt. For example, use [1] to refer to the reference images with referenceId=1, and [2] to refer to the reference images with referenceId=2. |
referenceImage.bytesBase64Encoded | Required string. A Base64 string for the encoded reference image. |
maskImageConfig.maskMode | Optional enumeration. Specified when referenceType is set as REFERENCE_TYPE_MASK. |
maskImageConfig.dilation | Optional float. Range: [0, 1]. The percentage of image width to dilate this mask by. Specified when referenceType is set as REFERENCE_TYPE_MASK. |
maskImageConfig.maskClasses | Optional list[Integer]. Mask classes for MASK_MODE_SEMANTIC mode. Specified when referenceType is set as REFERENCE_TYPE_MASK. |
controlImageConfig.controlType | Required enumeration. Specified when referenceType is set as REFERENCE_TYPE_CONTROL. |
controlImageConfig.enableControlImageComputation | Optional bool. Default: false. Specified when referenceType is set as REFERENCE_TYPE_CONTROL. |
language | Optional. The language code that corresponds to your text prompt language. The following values are supported: en: English (the default value if omitted). |
subjectImageConfig.subjectDescription | Required string. A short description of the subject in the image. For example, a woman with short brown hair. Specified when referenceType is set as REFERENCE_TYPE_SUBJECT. |
subjectImageConfig.subjectType | Required enumeration. Specified when referenceType is set as REFERENCE_TYPE_SUBJECT. |
styleImageConfig.styleDescription | Optional string. A short description for the style. Specified when referenceType is set as REFERENCE_TYPE_STYLE. |
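The rules in this table can be checked on the client before a request is sent. The following is a sketch only: the function name `check_references` is hypothetical, the limit of four reference images is taken from the syntax example above, and the checks cover just a few of the rows (prompt markers such as [1], and the fields marked Required for subject and control references). The service performs its own validation, which may differ.

```python
import re

MAX_REFERENCE_IMAGES = 4  # "at most 4 reference image objects" in the syntax example


def check_references(prompt: str, reference_images: list) -> list:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"{len(reference_images)} reference images exceed the limit of {MAX_REFERENCE_IMAGES}"
        )

    # Every [n] marker in the prompt should match a declared referenceId.
    declared = {ref.get("referenceId") for ref in reference_images}
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", prompt)}
    for ref_id in sorted(cited - declared):
        problems.append(f"prompt cites [{ref_id}] but no reference image has referenceId={ref_id}")

    for ref in reference_images:
        ref_id = ref.get("referenceId")
        if ref.get("referenceType") == "REFERENCE_TYPE_SUBJECT":
            config = ref.get("subjectImageConfig", {})
            for field in ("subjectDescription", "subjectType"):
                if not config.get(field):
                    problems.append(f"referenceId={ref_id}: subjectImageConfig.{field} is required")
        elif ref.get("referenceType") == "REFERENCE_TYPE_CONTROL":
            if not ref.get("controlImageConfig", {}).get("controlType"):
                problems.append(f"referenceId={ref_id}: controlImageConfig.controlType is required")
    return problems
```

For example, a prompt that mentions [2] while only referenceId 1 is declared produces a single problem entry, which can be surfaced before any network call is made.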
Response
The response body from the REST request.
Parameter | |
---|---|
predictions | An array of vision generative model result objects, described in the following section. |
Vision generative model result object
Information about the model result.
Parameter | |
---|---|
bytesBase64Encoded | The base64 encoded generated image. Not present if the output image did not pass responsible AI filters. |
mimeType | The type of the generated image. Not present if the output image did not pass responsible AI filters. |
Examples
The following examples show how to use the Imagen model to customize images.
Customize images
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- TEXT_PROMPT: The text prompt guides what images the model generates. To use Imagen 3 Customization, include the referenceId of the reference image or images you provide in the format [$referenceId]. For example:
  - The following text prompt is for a request that has two reference images with "referenceId": 1. Both images have an optional description of "subjectDescription": "man with short hair": Create an image about a man with short hair to match the description: A pencil style sketch of a full-body portrait of a man with short hair [1] with hatch-cross drawing, hatch drawing of portrait with 6B and graphite pencils, white background, pencil drawing, high quality, pencil stroke, looking at camera, natural human eyes
- "referenceId": The ID of the reference image, or the ID for a series of reference images that correspond to the same subject or style. In this example the two reference images are of the same person, so they share the same referenceId (1).
- BASE64_REFERENCE_IMAGE: A reference image to guide image generation. The image must be specified as a base64-encoded byte string.
- SUBJECT_DESCRIPTION: Optional. A text description of the reference image you can then use in the prompt field. For example: "prompt": "a full-body portrait of a man with short hair [1] with hatch-cross drawing", [...], "subjectDescription": "man with short hair"
- IMAGE_COUNT: The number of generated images. Accepted integer values: 1-4. Default value: 4.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict
Request JSON body:
{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "BASE64_REFERENCE_IMAGE"
          },
          "subjectImageConfig": {
            "subjectDescription": "SUBJECT_DESCRIPTION",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "BASE64_REFERENCE_IMAGE"
          },
          "subjectImageConfig": {
            "subjectDescription": "SUBJECT_DESCRIPTION",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        }
      ]
    }
  ],
  "parameters": {
    "sampleCount": IMAGE_COUNT
  }
}
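The request body above can also be generated programmatically. The sketch below, in Python, base64-encodes raw image bytes and fills in the same fields. The function names `subject_reference` and `build_request_body` are hypothetical, and the example assumes that every image shows the same person, so all references share referenceId 1.

```python
import base64
import json


def subject_reference(image_bytes: bytes, reference_id: int, description: str) -> dict:
    """Build one REFERENCE_TYPE_SUBJECT entry from raw image bytes."""
    return {
        "referenceType": "REFERENCE_TYPE_SUBJECT",
        "referenceId": reference_id,
        "referenceImage": {
            # The API expects the image as a base64-encoded string.
            "bytesBase64Encoded": base64.b64encode(image_bytes).decode("ascii")
        },
        "subjectImageConfig": {
            "subjectDescription": description,
            "subjectType": "SUBJECT_TYPE_PERSON",
        },
    }


def build_request_body(prompt: str, images: list, description: str, image_count: int = 4) -> dict:
    """Assemble the JSON body; all images are treated as the same subject (referenceId 1)."""
    if not 1 <= image_count <= 4:
        raise ValueError("image_count must be between 1 and 4")
    references = [subject_reference(data, 1, description) for data in images]
    return {
        "instances": [{"prompt": prompt, "referenceImages": references}],
        "parameters": {"sampleCount": image_count},
    }


# Placeholder bytes stand in for real image files in this example.
body = build_request_body(
    "A pencil sketch of a man with short hair [1]",
    [b"first-image-bytes", b"second-image-bytes"],
    "man with short hair",
    image_count=2,
)
print(json.dumps(body["parameters"]))
```

Writing the result with `json.dump` to a file named request.json yields a body that matches the shape shown above.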
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content
The following sample response is for a request with "sampleCount": 2. The response returns two prediction objects, with the generated image bytes base64-encoded.
{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    }
  ]
}
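For readers who prefer Python to curl or PowerShell, the same round trip can be sketched with the standard library alone. This is an illustration under stated assumptions, not an official client: the helper names are hypothetical, the token is obtained by calling the gcloud CLI as in the commands above, and predictions that lack bytesBase64Encoded are skipped, because that field is absent when an image does not pass the responsible AI filters.

```python
import base64
import json
import subprocess
import urllib.request


def get_access_token() -> str:
    """Ask the gcloud CLI for a token, as the curl command does."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def prepare_request(url: str, body: dict, token: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def decode_predictions(response: dict) -> list:
    """Return (mime_type, image_bytes) pairs for predictions that contain image data."""
    images = []
    for prediction in response.get("predictions", []):
        encoded = prediction.get("bytesBase64Encoded")
        if encoded is None:
            continue  # No image data, for example because the output was filtered.
        images.append((prediction.get("mimeType"), base64.b64decode(encoded)))
    return images
```

A typical flow would call `prepare_request` with the endpoint URL, the parsed contents of request.json, and the result of `get_access_token()`, pass the result to `send`, and then write each decoded image to disk.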
Class IDs
Use the following object class IDs to automatically create an image mask based on specific objects.
Class ID (class_ ) | Object |
---|---|
0 | backpack |
1 | umbrella |
2 | bag |
3 | tie |
4 | suitcase |
5 | case |
6 | bird |
7 | cat |
8 | dog |
9 | horse |
10 | sheep |
11 | cow |
12 | elephant |
13 | bear |
14 | zebra |
15 | giraffe |
16 | animal (other) |
17 | microwave |
18 | radiator |
19 | oven |
20 | toaster |
21 | storage tank |
22 | conveyor belt |
23 | sink |
24 | refrigerator |
25 | washer dryer |
26 | fan |
27 | dishwasher |
28 | toilet |
29 | bathtub |
30 | shower |
31 | tunnel |
32 | bridge |
33 | pier wharf |
34 | tent |
35 | building |
36 | ceiling |
37 | laptop |
38 | keyboard |
39 | mouse |
40 | remote |
41 | cell phone |
42 | television |
43 | floor |
44 | stage |
45 | banana |
46 | apple |
47 | sandwich |
48 | orange |
49 | broccoli |
50 | carrot |
51 | hot dog |
52 | pizza |
53 | donut |
54 | cake |
55 | fruit (other) |
56 | food (other) |
57 | chair (other) |
58 | armchair |
59 | swivel chair |
60 | stool |
61 | seat |
62 | couch |
63 | trash can |
64 | potted plant |
65 | nightstand |
66 | bed |
67 | table |
68 | pool table |
69 | barrel |
70 | desk |
71 | ottoman |
72 | wardrobe |
73 | crib |
74 | basket |
75 | chest of drawers |
76 | bookshelf |
77 | counter (other) |
78 | bathroom counter |
79 | kitchen island |
80 | door |
81 | light (other) |
82 | lamp |
83 | sconce |
84 | chandelier |
85 | mirror |
86 | whiteboard |
87 | shelf |
88 | stairs |
89 | escalator |
90 | cabinet |
91 | fireplace |
92 | stove |
93 | arcade machine |
94 | gravel |
95 | platform |
96 | playingfield |
97 | railroad |
98 | road |
99 | snow |
100 | sidewalk pavement |
101 | runway |
102 | terrain |
103 | book |
104 | box |
105 | clock |
106 | vase |
107 | scissors |
108 | plaything (other) |
109 | teddy bear |
110 | hair dryer |
111 | toothbrush |
112 | painting |
113 | poster |
114 | bulletin board |
115 | bottle |
116 | cup |
117 | wine glass |
118 | knife |
119 | fork |
120 | spoon |
121 | bowl |
122 | tray |
123 | range hood |
124 | plate |
125 | person |
126 | rider (other) |
127 | bicyclist |
128 | motorcyclist |
129 | paper |
130 | streetlight |
131 | road barrier |
132 | mailbox |
133 | cctv camera |
134 | junction box |
135 | traffic sign |
136 | traffic light |
137 | fire hydrant |
138 | parking meter |
139 | bench |
140 | bike rack |
141 | billboard |
142 | sky |
143 | pole |
144 | fence |
145 | railing banister |
146 | guard rail |
147 | mountain hill |
148 | rock |
149 | frisbee |
150 | skis |
151 | snowboard |
152 | sports ball |
153 | kite |
154 | baseball bat |
155 | baseball glove |
156 | skateboard |
157 | surfboard |
158 | tennis racket |
159 | net |
160 | base |
161 | sculpture |
162 | column |
163 | fountain |
164 | awning |
165 | apparel |
166 | banner |
167 | flag |
168 | blanket |
169 | curtain (other) |
170 | shower curtain |
171 | pillow |
172 | towel |
173 | rug floormat |
174 | vegetation |
175 | bicycle |
176 | car |
177 | autorickshaw |
178 | motorcycle |
179 | airplane |
180 | bus |
181 | train |
182 | truck |
183 | trailer |
184 | boat ship |
185 | slow wheeled object |
186 | river lake |
187 | sea |
188 | water (other) |
189 | swimming pool |
190 | waterfall |
191 | wall |
192 | window |
193 | window blind |
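As an illustration of how these class IDs might be used, the sketch below builds a REFERENCE_TYPE_MASK entry whose maskImageConfig uses MASK_MODE_SEMANTIC with a list of class IDs. The function name `semantic_mask_reference` and the small `CLASS_IDS` lookup (an excerpt of the table above) are assumptions for this example. The referenceImage field is omitted for brevity, although the parameter table lists it as required, so add it before sending a real request.

```python
# Excerpt of the class ID table above; extend as needed.
CLASS_IDS = {"cat": 7, "dog": 8, "person": 125, "sky": 142, "car": 176}


def semantic_mask_reference(reference_id: int, objects: list, dilation: float = None) -> dict:
    """Build a mask reference that selects objects by semantic class ID."""
    unknown = [name for name in objects if name not in CLASS_IDS]
    if unknown:
        raise KeyError(f"no class ID known for: {', '.join(unknown)}")

    config = {
        "maskMode": "MASK_MODE_SEMANTIC",
        "maskClasses": [CLASS_IDS[name] for name in objects],
    }
    if dilation is not None:
        # The parameter table gives the valid range as [0, 1].
        if not 0 <= dilation <= 1:
            raise ValueError("dilation must be in the range [0, 1]")
        config["dilation"] = dilation

    return {
        "referenceType": "REFERENCE_TYPE_MASK",
        "referenceId": reference_id,
        "maskImageConfig": config,
    }


print(semantic_mask_reference(1, ["cat", "dog"], dilation=0.01))
```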
What's next
- For more information, see Imagen on Vertex AI.