Batch predictions let you send a large number of multimodal prompts in a
single batch request. For more information about the batch workflow and
how to format your input data, see Get batch predictions for Gemini.

Batch requests for multimodal models accept Cloud Storage and BigQuery
input sources.
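For Cloud Storage input, each line of the JSONL file holds one request. As a minimal sketch (the exact request schema is documented in Get batch predictions for Gemini; the GenerateContent-style field names below are taken from that guide and should be verified against it):

```python
import json

def build_jsonl_lines(prompts):
    """Build JSONL lines for a Gemini batch input file.

    Each line wraps one GenerateContent-style request; the exact
    schema is described in "Get batch predictions for Gemini".
    """
    lines = []
    for prompt in prompts:
        request = {
            "request": {
                "contents": [
                    {"role": "user", "parts": [{"text": prompt}]}
                ]
            }
        }
        lines.append(json.dumps(request))
    return lines

# Write a small input file, e.g. to upload later to
# gs://bucketname/path/to/file.jsonl
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(build_jsonl_lines(["What is JSONL?", "Name two planets."])))
```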
Supported models

Example syntax

The following example shows how to send a batch prediction API request
using the curl command. This example is specific to BigQuery storage.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/batchPredictionJobs \
-d '{
"displayName": "...",
"model": "publishers/google/models/${MODEL_ID}",
"inputConfig": {
"instancesFormat": "bigquery",
"bigquerySource": {
"inputUri" : "..."
}
},
"outputConfig": {
"predictionsFormat": "bigquery",
"bigqueryDestination": {
"outputUri": "..."
}
}
}'
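The same request can be composed without curl; a minimal Python sketch using only the standard library, with the access token obtained out of band (for example from gcloud auth print-access-token — the token handling and example values here are placeholders, not part of the official sample):

```python
import json
import urllib.request

def build_batch_job_request(location, project_id, model_id,
                            input_table, output_table):
    """Return (url, body) for a BigQuery-backed batch prediction job."""
    url = (f"https://{location}-aiplatform.googleapis.com/v1/"
           f"projects/{project_id}/locations/{location}/batchPredictionJobs")
    body = {
        "displayName": "my-batch-job",
        "model": f"publishers/google/models/{model_id}",
        "inputConfig": {
            "instancesFormat": "bigquery",
            "bigquerySource": {"inputUri": input_table},
        },
        "outputConfig": {
            "predictionsFormat": "bigquery",
            "bigqueryDestination": {"outputUri": output_table},
        },
    }
    return url, body

def submit(url, body, access_token):
    """POST the job request (requires a valid OAuth2 access token)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {access_token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```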
Parameters

See examples for implementation details.

Body request

Parameters

displayName: A name you choose for your job.
model: The model to use for batch prediction.
inputConfig: The data format. For Gemini batch prediction, Cloud Storage
and BigQuery input sources are supported.
outputConfig: The output configuration, which determines the model output
location. Cloud Storage and BigQuery output locations are supported.

inputConfig

Parameters

instancesFormat: The prompt input format. Use jsonl for Cloud Storage or
bigquery for BigQuery.
gcsSource.uris: The input source URI. This is a Cloud Storage location of
the JSONL file in the form gs://bucketname/path/to/file.jsonl.
bigquerySource.inputUri: The input source URI. This is a BigQuery table
URI in the form bq://project_id.dataset.table. The region of the input
BigQuery dataset must be the same as the Vertex AI batch prediction job.

outputConfig

Parameters

predictionsFormat: The output format of the prediction. Use bigquery.
gcsDestination.outputUriPrefix: The Cloud Storage bucket and directory
location, in the form gs://mybucket/path/to/output.
bigqueryDestination.outputUri: The BigQuery URI of the target output
table, in the form bq://project_id.dataset.table. If the table doesn't
already exist, then it is created for you. The region of the output
BigQuery dataset must be the same as the Vertex AI batch prediction job.
Examples

Request a batch response

Depending on the number of input items that you submitted, a batch
generation task can take some time to complete.

REST

To create a batch prediction job, use the
projects.locations.batchPredictionJobs.create method.

Cloud Storage input
Before using any of the request data, make the following replacements:

MODEL_PATH: The publisher model name, for example,
publishers/google/models/gemini-2.5-flash; or the tuned endpoint name,
for example, projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID,
where MODEL_ID is the model ID of the tuned model.
INPUT_URI: The Cloud Storage location of your JSONL input, such as
gs://bucketname/path/to/file.jsonl.
OUTPUT_FORMAT: To output to a BigQuery table, specify bigquery. To
output to a Cloud Storage bucket, specify jsonl.
DESTINATION: For BigQuery, specify bigqueryDestination. For
Cloud Storage, specify gcsDestination.
OUTPUT_URI_FIELD_NAME: For BigQuery, specify outputUri. For
Cloud Storage, specify outputUriPrefix.
OUTPUT_URI: For BigQuery, specify the table location such as
bq://myproject.mydataset.output_result. The region of the output
BigQuery dataset must be the same as the Vertex AI batch prediction
job. For Cloud Storage, specify the bucket and directory location such
as gs://mybucket/path/to/output.
Request JSON body:

{
"displayName": "my-cloud-storage-batch-prediction-job",
"model": "MODEL_PATH",
"inputConfig": {
"instancesFormat": "jsonl",
"gcsSource": {
"uris" : "INPUT_URI"
}
},
"outputConfig": {
"predictionsFormat": "OUTPUT_FORMAT",
"DESTINATION": {
"OUTPUT_URI_FIELD_NAME": "OUTPUT_URI"
}
}
}
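The capitalized placeholders in the body above must be filled in before the request is sent. A sketch with hypothetical values, building the body as a dict rather than editing the template by hand:

```python
import json

# Hypothetical values for the placeholders in the request body above.
request_body = {
    "displayName": "my-cloud-storage-batch-prediction-job",
    "model": "publishers/google/models/gemini-2.5-flash",            # MODEL_PATH
    "inputConfig": {
        "instancesFormat": "jsonl",
        "gcsSource": {"uris": "gs://bucketname/path/to/file.jsonl"}  # INPUT_URI
    },
    "outputConfig": {
        "predictionsFormat": "jsonl",                                # OUTPUT_FORMAT
        "gcsDestination": {                                          # DESTINATION
            "outputUriPrefix": "gs://mybucket/path/to/output"        # OUTPUT_URI
        }
    }
}

# Write the file that the curl and PowerShell commands below expect.
with open("request.json", "w") as f:
    json.dump(request_body, f, indent=2)
```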
To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the
following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs"

PowerShell

Save the request body in a file named request.json, and execute the
following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

BigQuery input
Before using any of the request data, make the following replacements:

MODEL_PATH: The publisher model name, for example,
publishers/google/models/gemini-2.0-flash-001; or the tuned endpoint
name, for example,
projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID, where MODEL_ID
is the model ID of the tuned model.
INPUT_URI: The BigQuery source table, such as
bq://myproject.mydataset.input_table. The dataset must be located in
the same region as the batch prediction job. Multi-region datasets are
not supported.
OUTPUT_FORMAT: To output to a BigQuery table, specify bigquery. To
output to a Cloud Storage bucket, specify jsonl.
DESTINATION: For BigQuery, specify bigqueryDestination. For
Cloud Storage, specify gcsDestination.
OUTPUT_URI_FIELD_NAME: For BigQuery, specify outputUri. For
Cloud Storage, specify outputUriPrefix.
OUTPUT_URI: For BigQuery, specify the table location such as
bq://myproject.mydataset.output_result. The region of the output
BigQuery dataset must be the same as the Vertex AI batch prediction
job. For Cloud Storage, specify the bucket and directory location such
as gs://mybucket/path/to/output.
Request JSON body:

{
"displayName": "my-bigquery-batch-prediction-job",
"model": "MODEL_PATH",
"inputConfig": {
"instancesFormat": "bigquery",
"bigquerySource":{
"inputUri" : "INPUT_URI"
}
},
"outputConfig": {
"predictionsFormat": "OUTPUT_FORMAT",
"DESTINATION": {
"OUTPUT_URI_FIELD_NAME": "OUTPUT_URI"
}
}
}
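For BigQuery input, each row of the table referenced by INPUT_URI carries one prompt as a JSON-encoded request. A sketch of building such rows (the request-column schema is an assumption taken from Get batch predictions for Gemini and should be verified there):

```python
import json

def make_bq_row(prompt):
    """Build one input-table row.

    The request column holds the JSON-encoded GenerateContent-style
    request; schema per "Get batch predictions for Gemini".
    """
    return {
        "request": json.dumps({
            "contents": [{"role": "user", "parts": [{"text": prompt}]}]
        })
    }

rows = [make_bq_row(p) for p in ["Summarize JSONL.", "Define BigQuery."]]
```

The rows can then be loaded into the input table with your usual BigQuery ingestion path.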
To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the
following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs"

PowerShell

Save the request body in a file named request.json, and execute the
following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Cloud Storage input
BigQuery input
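A batch job created by any of these samples runs asynchronously, so callers typically poll until a terminal state. A minimal sketch with the state lookup injected as a callable (for example wrapping client.batches.get from the Gen AI SDK; the state names are the standard Vertex AI job states):

```python
import time

TERMINAL_STATES = {
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
}

def wait_for_job(get_state, poll_seconds=30, timeout_seconds=3600):
    """Poll get_state() until the job reaches a terminal state.

    get_state is any callable returning the current job state string,
    e.g. lambda: client.batches.get(name=job.name).state when using
    the Gen AI SDK.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job still in {state} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```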
Node.js

Before trying this sample, follow the Node.js setup instructions in the
Vertex AI quickstart using client libraries. For more information, see
the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials.
For more information, see Set up authentication for a local development
environment.

Cloud Storage input

BigQuery input

Java

Before trying this sample, follow the Java setup instructions in the
Vertex AI quickstart using client libraries. For more information, see
the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials.
For more information, see Set up authentication for a local development
environment.

Cloud Storage input

BigQuery input

Go

Before trying this sample, follow the Go setup instructions in the
Vertex AI quickstart using client libraries. For more information, see
the Vertex AI Go API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials.
For more information, see Set up authentication for a local development
environment.

Cloud Storage input

BigQuery input
Retrieve batch output

When a batch prediction task completes, the output is stored in the
Cloud Storage bucket or the BigQuery table that you specified in your
request.
What's next
Get batch predictions for Gemini
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-21 UTC.