Use the Inference API to send prompts to Gemini models and generate responses.
The Gemini model family includes models that work with multimodal prompt requests. The term multimodal indicates that you can use more than one modality, or type of input, in a prompt. Models that aren't multimodal accept prompts only with text. Modalities can include text, audio, video, and more.
Supported Models:
Model | Version |
---|---|
Gemini 1.5 Flash (Preview) | gemini-1.5-flash-preview-0514 |
Gemini 1.5 Pro (Preview) | gemini-1.5-pro-preview-0514 |
Gemini 1.0 Pro Vision | gemini-1.0-pro-vision gemini-1.0-pro-vision-001 |
Gemini 1.0 Pro | gemini-1.0-pro gemini-1.0-pro-001 gemini-1.0-pro-002 |
Limitations:
If you provide a lot of images, then latency might be high.
Example syntax
Syntax to generate a model response.
Non-Streaming
curl
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent \
  -d '{
    "contents": [{
      ...
    }],
    "generation_config": {
      ...
    },
    "safety_settings": {
      ...
    }
    ...
  }'
Python
gemini_model = GenerativeModel(MODEL_ID)
generation_config = GenerationConfig(...)
model_response = gemini_model.generate_content([...], generation_config, safety_settings={...})
Streaming
curl
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:streamGenerateContent \
  -d '{
    "contents": [{
      ...
    }],
    "generation_config": {
      ...
    },
    "safety_settings": {
      ...
    }
    ...
  }'
Python
gemini_model = GenerativeModel(MODEL_ID)
model_response = gemini_model.generate_content([...], generation_config, safety_settings={...}, stream=True)
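All of the calls above share one regional endpoint URL pattern. As an illustrative sketch, it can be assembled in Python; the project, location, and model values below are placeholders:

```python
# Build the regional Vertex AI endpoint URL used by the curl examples above.
# PROJECT_ID, LOCATION, and MODEL_ID are placeholder values.
PROJECT_ID = "my-project"        # placeholder
LOCATION = "us-central1"         # placeholder
MODEL_ID = "gemini-1.0-pro"      # any supported model version

def build_endpoint(project_id: str, location: str, model_id: str,
                   streaming: bool = False) -> str:
    """Return the generateContent (or streamGenerateContent) URL for a model."""
    method = "streamGenerateContent" if streaming else "generateContent"
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model_id}:{method}"
    )

print(build_endpoint(PROJECT_ID, LOCATION, MODEL_ID))
```

The same helper covers both the non-streaming and streaming variants, since they differ only in the final method segment.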
Parameter list
See examples for implementation details.
Request body
The request body contains data with the following parameters:
Parameter | Description |
---|---|
`contents` | Required: The content of the current conversation with the model. For single-turn queries, this is a single instance. For multi-turn queries, this is a repeated field that contains the conversation history and the latest request. |
`system_instruction` | Optional: The user-provided system instructions for the model. Note: Only text should be used in the parts. |
`tools` | Optional: See the Function Calling API. |
`tool_config` | Optional: See the Function Calling API. |
`safety_settings` | Optional: Per-request settings for blocking unsafe content. Enforced on `GenerateContentResponse.candidates`. |
`generation_config` | Optional: Generation configuration settings. |
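As a sketch of how these fields fit together, the following snippet builds a request body as a plain Python dictionary and serializes it to JSON. The prompt text and configuration values are illustrative only, not recommended defaults:

```python
import json

# Illustrative request body combining contents and generation settings.
request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Why is the sky blue?"}]}
    ],
    "generation_config": {
        "temperature": 0.4,
        "max_output_tokens": 256,
    },
}

payload = json.dumps(request_body, indent=2)  # the JSON you would POST
print(payload)
```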
Content
The base structured data type containing multi-part content of a message.
This class consists of two main properties: `role` and `parts`. The `role` property denotes the individual producing the content, while the `parts` property contains multiple elements, each representing a segment of data within a message.
Parameter | Description |
---|---|
`role` | Optional: The identity of the entity that creates the message. The following values are supported: `user` and `model`. For non-multi-turn conversations, this field can be left blank or unset. |
`parts` | A list of ordered parts that make up a single message. Different parts may have different IANA MIME types. |
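For a multi-turn query, `contents` holds the conversation history as an ordered list with alternating roles. A minimal illustrative sketch:

```python
# A two-turn history followed by the latest user request. Roles alternate
# between "user" and "model"; the text values are illustrative.
contents = [
    {"role": "user",  "parts": [{"text": "Name a primary color."}]},
    {"role": "model", "parts": [{"text": "Red."}]},
    {"role": "user",  "parts": [{"text": "Name another one."}]},
]

roles = [turn["role"] for turn in contents]
```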
Part
A data type containing media that is part of a multi-part Content
message.
Parameter | Description |
---|---|
`text` | Optional: A text prompt or code snippet. |
`inline_data` | Optional: Inline data in raw bytes. |
`file_data` | Optional: Data stored in a file. |
`function_call` | Optional: Contains a string representing the `FunctionDeclaration.name` field and a structured JSON object containing the parameters for the function call predicted by the model. See the Function Calling API. |
`function_response` | Optional: The result output of a `FunctionCall` that contains a string representing the `FunctionDeclaration.name` field and a structured JSON object containing the output from the function call. See the Function Calling API. |
`video_metadata` | Optional: Video metadata. The metadata should only be specified while the video data is presented in `inline_data` or `file_data`. |
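A single message can mix part types, for example a text instruction plus an image referenced from Cloud Storage. In this sketch, the URI and MIME type are placeholder values:

```python
# One user message whose parts combine a text prompt with an image file.
# The gs:// URI and MIME type are placeholders.
parts = [
    {"text": "Describe this image."},
    {
        "file_data": {
            "mime_type": "image/jpeg",
            "file_uri": "gs://my-bucket/photo.jpg",  # placeholder URI
        }
    },
]

content = {"role": "user", "parts": parts}
```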
Blob
Content blob. If possible, send as text rather than raw bytes.
Parameter | Description |
---|---|
`mime_type` | The IANA MIME type of the data. |
`data` | The raw bytes of the data. |
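When raw bytes are sent inline over REST, the `data` field carries them base64-encoded. A minimal sketch; the bytes below stand in for real image data:

```python
import base64

# Encode raw bytes for an inline_data part. In practice you would read
# these bytes from an actual image file.
raw_bytes = b"\x89PNG\r\n\x1a\n"  # stand-in for real image bytes

inline_part = {
    "inline_data": {
        "mime_type": "image/png",
        "data": base64.b64encode(raw_bytes).decode("ascii"),
    }
}
```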
FileData
URI based data.
Parameter | Description |
---|---|
`mime_type` | The IANA MIME type of the data. |
`file_uri` | string. The Cloud Storage URI to the file storing the data. |
FunctionCall
A predicted FunctionCall
returned from the model that contains a string
representing the FunctionDeclaration.name
and a structured JSON object
containing the parameters and their values.
Parameter | Description |
---|---|
`name` | The name of the function to call. |
`args` | The function parameters and values in JSON object format. See the Function Calling API for parameter details. |
FunctionResponse
The resulting output from a FunctionCall
that contains a string representing the
FunctionDeclaration.name
. Also contains a structured JSON object with the
output from the function (and uses it as context for the model). This should contain the
result of a FunctionCall
made based on model prediction.
Parameter | Description |
---|---|
`name` | The name of the function that was called. |
`response` | The function response in JSON object format. |
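A typical function-calling round trip pairs the model's predicted `FunctionCall` with a `FunctionResponse` part that you send back as context. In this sketch, the function name and its fields are hypothetical:

```python
# The model predicts a call to a hypothetical get_weather function.
function_call = {
    "name": "get_weather",                  # hypothetical function
    "args": {"location": "Boston, MA"},
}

# Your code executes the function, then returns its output to the model
# as a function_response part in the next request.
function_response_part = {
    "function_response": {
        "name": function_call["name"],
        "response": {"temperature_c": 21, "conditions": "sunny"},  # example output
    }
}
```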
VideoMetadata
Metadata describing the input video content.
Parameter | Description |
---|---|
`start_offset` | Optional: The start offset of the video. |
`end_offset` | Optional: The end offset of the video. |
SafetySetting
Safety settings.
Parameter | Description |
---|---|
`category` | Optional: The harm category. |
`threshold` | Optional: The harm block threshold. |
`max_influential_terms` | Optional: The maximum number of influential terms that contribute the most to the safety scores, which might cause potential blocking. |
`method` | Optional: Specify whether the threshold is applied to the probability score or the severity score. If not specified, the threshold is applied to the probability score. |
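Safety settings are passed per request as a list with one entry per harm category. The category and threshold pairings below are illustrative, not recommendations:

```python
# Per-request safety settings: one dict per harm category you want to
# configure. These pairings are examples only.
safety_settings = [
    {"category": "HARM_CATEGORY_HATE_SPEECH",
     "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
     "threshold": "BLOCK_ONLY_HIGH"},
]

categories = {s["category"] for s in safety_settings}
```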
HarmCategory
Harm categories that block content.
Value | Description |
---|---|
`HARM_CATEGORY_UNSPECIFIED` | The harm category is unspecified. |
`HARM_CATEGORY_HATE_SPEECH` | The harm category is hate speech. |
`HARM_CATEGORY_DANGEROUS_CONTENT` | The harm category is dangerous content. |
`HARM_CATEGORY_HARASSMENT` | The harm category is harassment. |
`HARM_CATEGORY_SEXUALLY_EXPLICIT` | The harm category is sexually explicit content. |
HarmBlockThreshold
Probability threshold levels used to block a response.
Value | Description |
---|---|
`HARM_BLOCK_THRESHOLD_UNSPECIFIED` | Unspecified harm block threshold. |
`BLOCK_LOW_AND_ABOVE` | Block low threshold and higher (that is, block more). |
`BLOCK_MEDIUM_AND_ABOVE` | Block medium threshold and higher. |
`BLOCK_ONLY_HIGH` | Block only high threshold (that is, block less). |
`BLOCK_NONE` | Block none. |
HarmBlockMethod
A probability threshold that blocks a response based on a combination of probability and severity.
Value | Description |
---|---|
`HARM_BLOCK_METHOD_UNSPECIFIED` | The harm block method is unspecified. |
`SEVERITY` | The harm block method uses both probability and severity scores. |
`PROBABILITY` | The harm block method uses the probability score. |
GenerationConfig
Configuration settings used when generating the prompt.
Parameter | Description |
---|---|
`temperature` | Optional: Controls the randomness of predictions. |
`top_p` | Optional: If specified, nucleus sampling is used. |
`top_k` | Optional: If specified, top-k sampling is used. |
`candidate_count` | Optional: The number of candidates to generate. |
`max_output_tokens` | Optional: int. The maximum number of output tokens to generate per message. |
`stop_sequences` | Optional: Stop sequences. |
`presence_penalty` | Optional: Positive penalties. |
`frequency_penalty` | Optional: Frequency penalties. |
`response_mime_type` | Optional: The output response MIME type of the generated candidate text. Supported MIME type: `application/json` (JSON response in the candidates). This is a preview feature. |
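A `generation_config` that combines several of the fields above might look like the following; the values are illustrative, not tuned defaults:

```python
# Example generation configuration; every value here is illustrative.
generation_config = {
    "temperature": 0.2,          # lower values = less random output
    "top_p": 0.95,               # nucleus sampling cutoff
    "top_k": 40,                 # top-k sampling cutoff
    "candidate_count": 1,        # number of response candidates
    "max_output_tokens": 512,    # cap on generated tokens per message
    "stop_sequences": ["###"],   # stop generation at this string
}
```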
Examples
Non-streaming text response
Generate a non-streaming model response from a text input.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: The model ID of the model that you want to use.
- TEXT: The text instructions to include in the prompt.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent
Request JSON body:
{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent" | Select-Object -Expand Content
Non-streaming multi-modal response
Generate a non-streaming model response from a multi-modal input, such as text and an image.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: The model ID of the multimodal model that you want to use.
- TEXT: The text instructions to include in the prompt.
- FILE_URI: The Cloud Storage URI to the file storing the data.
- MIME_TYPE: The IANA MIME type of the data.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent
Request JSON body:
{
  "contents": [{
    "role": "user",
    "parts": [
      {
        "text": "TEXT"
      },
      {
        "file_data": {
          "mime_type": "MIME_TYPE",
          "file_uri": "FILE_URI"
        }
      },
      {
        "file_data": {
          "mime_type": "MIME_TYPE",
          "file_uri": "FILE_URI"
        }
      }
    ]
  }]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent" | Select-Object -Expand Content
Streaming text response
Generate a streaming model response from a text input.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: The model ID of the model that you want to use.
- TEXT: The text instructions to include in the prompt.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent
Request JSON body:
{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent" | Select-Object -Expand Content
Streaming multi-modal response
Generate a streaming model response from a multi-modal input, such as text and an image.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: The model ID of the multimodal model that you want to use.
- TEXT: The text instructions to include in the prompt.
- FILE_URI: The Cloud Storage URI to the file storing the data.
- MIME_TYPE: The IANA MIME type of the data.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent
Request JSON body:
{
  "contents": [{
    "role": "user",
    "parts": [
      {
        "text": "TEXT"
      },
      {
        "file_data": {
          "mime_type": "MIME_TYPE",
          "file_uri": "FILE_URI"
        }
      }
    ]
  }]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent" | Select-Object -Expand Content
What's next
- Learn more about the Gemini API.
- Learn more about Function calling.
- Learn more about Grounding responses for Gemini models.