This guide shows you how to get batch predictions for Gemini models, covering the following topics:
- Supported models: Learn which models support batch prediction.
- Example syntax: See the basic structure of a batch prediction API request.
- Parameters: Understand the request body parameters for a batch job.
- Request a batch response: View and use code samples for submitting batch jobs.
- Retrieve batch output: Find out where your prediction results are stored.
The following diagram summarizes the workflow for getting batch predictions:
Batch prediction lets you send many multimodal prompts in a single request.
For more information about the batch workflow and how to format your input data, see Get batch predictions for Gemini.
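As a hedged illustration of that input format (the `request` wrapper and `contents` fields are assumed from the GenerateContentRequest schema; see the linked page for the authoritative format), one JSONL input line can be built like this:

```python
import json

def make_input_line(prompt: str) -> str:
    # Build one line of a JSONL batch input file. Each line wraps a
    # GenerateContentRequest-style payload under a "request" key.
    # The prompt text here is purely illustrative.
    request = {
        "request": {
            "contents": [
                {"role": "user", "parts": [{"text": prompt}]}
            ]
        }
    }
    return json.dumps(request)

line = make_input_line("List three uses of batch prediction.")
```

Each such line becomes one prediction in the batch output.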
Supported models
Example syntax
The following example shows how to use the curl command to send a batch prediction API request with a BigQuery data source.

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/batchPredictionJobs" \
  -d '{
    "displayName": "...",
    "model": "publishers/google/models/${MODEL_ID}",
    "inputConfig": {
      "instancesFormat": "bigquery",
      "bigquerySource": {
        "inputUri": "..."
      }
    },
    "outputConfig": {
      "predictionsFormat": "bigquery",
      "bigqueryDestination": {
        "outputUri": "..."
      }
    }
  }'
Parameters
See examples for implementation details.
Request body
Parameter | Description
---|---
`displayName` | A name you choose for your job.
`model` | The model to use for batch prediction.
`inputConfig` | The data format. For Gemini batch prediction, Cloud Storage and BigQuery input sources are supported.
`outputConfig` | The output configuration, which determines the model output location. Cloud Storage and BigQuery output locations are supported.
inputConfig

Parameter | Description
---|---
`instancesFormat` | The prompt input format. Use `jsonl` for Cloud Storage input or `bigquery` for BigQuery input.
`gcsSource` | The input source URI. This is a Cloud Storage location of the JSONL file in the form `gs://bucketname/path/to/file.jsonl`.
`bigquerySource` | The input source URI. This is a BigQuery table URI in the form `bq://project_id.dataset.table`.
outputConfig

Parameter | Description
---|---
`predictionsFormat` | The output format of the prediction. Use `bigquery` for BigQuery output or `jsonl` for Cloud Storage output.
`gcsDestination` | The Cloud Storage bucket and directory location, in the form `gs://mybucket/path/to/output`.
`bigqueryDestination` | The BigQuery URI of the target output table, in the form `bq://project_id.dataset.table`.
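Combining the request body parameters above, a complete body for a BigQuery-to-BigQuery job can be sketched in Python (a minimal sketch; the helper function and the example URIs are illustrative, not part of the API):

```python
def build_batch_request(display_name, model, input_config, output_config):
    # Assemble a batchPredictionJobs.create request body from the
    # parameters documented above.
    return {
        "displayName": display_name,
        "model": model,
        "inputConfig": input_config,
        "outputConfig": output_config,
    }

# BigQuery input and output (table URIs are placeholders):
body = build_batch_request(
    display_name="my-batch-job",
    model="publishers/google/models/gemini-2.5-flash",
    input_config={
        "instancesFormat": "bigquery",
        "bigquerySource": {"inputUri": "bq://myproject.mydataset.input_table"},
    },
    output_config={
        "predictionsFormat": "bigquery",
        "bigqueryDestination": {"outputUri": "bq://myproject.mydataset.output_result"},
    },
)
```

For Cloud Storage, swap in `"instancesFormat": "jsonl"` with a `gcsSource`, and `"predictionsFormat": "jsonl"` with a `gcsDestination`.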
Request a batch response
To learn how to format your input data, see Get batch predictions for Gemini.
Depending on the number of input items you submit, the batch generation task can take some time to complete.
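Because the job runs asynchronously, a client typically polls the job resource until it reaches a terminal state. A minimal sketch (the `get_job` callable stands in for a projects.locations.batchPredictionJobs.get request; the state names follow the Vertex AI JobState enum):

```python
import time

TERMINAL_STATES = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}

def wait_for_job(get_job, poll_seconds=30, sleep=time.sleep):
    # get_job() is any callable that returns the current job resource as a
    # dict, for example a wrapper around batchPredictionJobs.get. Poll until
    # the job reaches a terminal state, then return the final resource.
    while True:
        job = get_job()
        if job.get("state") in TERMINAL_STATES:
            return job
        sleep(poll_seconds)

# Demonstration with a stubbed job that succeeds on the third poll:
states = iter(["JOB_STATE_PENDING", "JOB_STATE_RUNNING", "JOB_STATE_SUCCEEDED"])
final = wait_for_job(lambda: {"state": next(states)}, sleep=lambda s: None)
```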
REST
To create a batch prediction job, use the projects.locations.batchPredictionJobs.create method.
Cloud Storage input
Before using any of the request data, make the following replacements:
- LOCATION: A region that supports Gemini models.
- PROJECT_ID: Your project ID.
- MODEL_PATH: The publisher model name, for example, publishers/google/models/gemini-2.5-flash; or the tuned endpoint name, for example, projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID, where MODEL_ID is the model ID of the tuned model.
- INPUT_URI: The Cloud Storage location of your JSONL batch prediction input, such as gs://bucketname/path/to/file.jsonl.
- OUTPUT_FORMAT: To output to a Cloud Storage bucket, specify jsonl.
- DESTINATION: For BigQuery, specify bigqueryDestination. For Cloud Storage, specify gcsDestination.
- OUTPUT_URI_FIELD_NAME: For BigQuery, specify outputUri. For Cloud Storage, specify outputUriPrefix.
- OUTPUT_URI: For BigQuery, specify the table location, such as bq://myproject.mydataset.output_result. The region of the output BigQuery dataset must be the same as the Vertex AI batch prediction job. For Cloud Storage, specify the bucket and directory location, such as gs://mybucket/path/to/output.
Request JSON body:
{
  "displayName": "my-cloud-storage-batch-prediction-job",
  "model": "MODEL_PATH",
  "inputConfig": {
    "instancesFormat": "jsonl",
    "gcsSource": {
      "uris": "INPUT_URI"
    }
  },
  "outputConfig": {
    "predictionsFormat": "OUTPUT_FORMAT",
    "DESTINATION": {
      "OUTPUT_URI_FIELD_NAME": "OUTPUT_URI"
    }
  }
}
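As a quick sanity check before sending the request, you can substitute the placeholders and confirm that the result parses as JSON (the replacement values below are illustrative; note that OUTPUT_URI_FIELD_NAME must be replaced before OUTPUT_URI, since the latter is a substring of the former):

```python
import json

template = """{
  "displayName": "my-cloud-storage-batch-prediction-job",
  "model": "MODEL_PATH",
  "inputConfig": {
    "instancesFormat": "jsonl",
    "gcsSource": {
      "uris": "INPUT_URI"
    }
  },
  "outputConfig": {
    "predictionsFormat": "OUTPUT_FORMAT",
    "DESTINATION": {
      "OUTPUT_URI_FIELD_NAME": "OUTPUT_URI"
    }
  }
}"""

# Illustrative values for Cloud Storage output. Insertion order matters:
# OUTPUT_URI_FIELD_NAME is replaced before OUTPUT_URI.
replacements = {
    "MODEL_PATH": "publishers/google/models/gemini-2.5-flash",
    "INPUT_URI": "gs://mybucket/path/to/file.jsonl",
    "OUTPUT_FORMAT": "jsonl",
    "DESTINATION": "gcsDestination",
    "OUTPUT_URI_FIELD_NAME": "outputUriPrefix",
    "OUTPUT_URI": "gs://mybucket/path/to/output",
}
body = template
for placeholder, value in replacements.items():
    body = body.replace(placeholder, value)

request = json.loads(body)  # raises ValueError if the substituted body is not valid JSON
```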
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs" | Select-Object -Expand Content
You should receive a JSON response that describes the new batch prediction job, including its resource name and state.
BigQuery input
Before using any of the request data, make the following replacements:
- LOCATION: A region that supports Gemini models.
- PROJECT_ID: Your project ID.
- MODEL_PATH: The publisher model name, for example, publishers/google/models/gemini-2.0-flash-001; or the tuned endpoint name, for example, projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID, where MODEL_ID is the model ID of the tuned model.
- INPUT_URI: The BigQuery table where your batch prediction input is located, such as bq://myproject.mydataset.input_table. The dataset must be located in the same region as the batch prediction job. Multi-region datasets are not supported.
- OUTPUT_FORMAT: To output to a BigQuery table, specify bigquery. To output to a Cloud Storage bucket, specify jsonl.
- DESTINATION: For BigQuery, specify bigqueryDestination. For Cloud Storage, specify gcsDestination.
- OUTPUT_URI_FIELD_NAME: For BigQuery, specify outputUri. For Cloud Storage, specify outputUriPrefix.
- OUTPUT_URI: For BigQuery, specify the table location, such as bq://myproject.mydataset.output_result. The region of the output BigQuery dataset must be the same as the Vertex AI batch prediction job. For Cloud Storage, specify the bucket and directory location, such as gs://mybucket/path/to/output.
Request JSON body:
{
  "displayName": "my-bigquery-batch-prediction-job",
  "model": "MODEL_PATH",
  "inputConfig": {
    "instancesFormat": "bigquery",
    "bigquerySource": {
      "inputUri": "INPUT_URI"
    }
  },
  "outputConfig": {
    "predictionsFormat": "OUTPUT_FORMAT",
    "DESTINATION": {
      "OUTPUT_URI_FIELD_NAME": "OUTPUT_URI"
    }
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/batchPredictionJobs" | Select-Object -Expand Content
You should receive a JSON response that describes the new batch prediction job, including its resource name and state.
Python
Install the Google Gen AI SDK:

pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Cloud Storage input
BigQuery input
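The official samples for these tabs are not reproduced here. As a rough sketch of the Gen AI SDK flow (this assumes the SDK's `client.batches.create(model=, src=, config=)` call; a stub client stands in for a real `genai.Client` so the snippet runs offline):

```python
# With the real SDK you would create the client as:
#   from google import genai
#   client = genai.Client(vertexai=True, project="my-project", location="us-central1")
# and pass it to submit_batch_job below. The stub keeps this example
# self-contained; field names in its return value are illustrative.

def submit_batch_job(client, model, src, dest):
    # src:  "gs://.../input.jsonl" or "bq://project.dataset.table"
    # dest: Cloud Storage prefix or BigQuery table for the predictions.
    return client.batches.create(model=model, src=src, config={"dest": dest})

class _StubBatches:
    def create(self, model, src, config):
        return {
            "name": "projects/p/locations/l/batchPredictionJobs/123",
            "model": model,
            "src": src,
            "config": config,
        }

class _StubClient:
    batches = _StubBatches()

job = submit_batch_job(
    _StubClient(),
    model="gemini-2.5-flash",
    src="gs://mybucket/path/to/file.jsonl",
    dest="gs://mybucket/path/to/output",
)
```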
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Cloud Storage input
BigQuery input
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Cloud Storage input
BigQuery input
Go
Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Cloud Storage input
BigQuery input
Retrieve batch output
After a batch prediction task completes, Vertex AI stores the output in the Cloud Storage bucket or the BigQuery table that you specified in your request.
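For JSONL output in Cloud Storage, each result line pairs the original request with the model response. A hedged sketch of extracting the generated text, assuming the response follows the GenerateContentResponse shape (failed items carry a status field instead of a response and are skipped here):

```python
import json

def extract_texts(jsonl_output: str):
    # Parse batch output lines and collect the first candidate's text from
    # each response. Lines without a "response" field are skipped.
    texts = []
    for line in jsonl_output.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        response = record.get("response")
        if not response:
            continue
        parts = response["candidates"][0]["content"]["parts"]
        texts.append("".join(p.get("text", "") for p in parts))
    return texts

sample = '{"request": {}, "response": {"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}}'
```

For BigQuery output, the same response payload appears in a column of the destination table instead of a JSONL line.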
What's next
- To learn how to tune a Gemini model, see Overview of model tuning for Gemini.
- To learn more about the batch prediction workflow, see Get batch predictions for Gemini.