Batch predictions let you send a large number of multimodal prompts in a single batch request.
For more information about the batch workflow and how to format your input data, see Get batch predictions for Gemini.
Supported models:

Model | Version
---|---
Gemini 2.0 Flash | gemini-2.0-flash-001
Gemini 1.5 Flash | gemini-1.5-flash-002, gemini-1.5-flash-001
Gemini 1.5 Pro | gemini-1.5-pro-002, gemini-1.5-pro-001
Gemini 1.0 Pro | gemini-1.0-pro-001, gemini-1.0-pro-002
Example syntax
The following syntax shows how to send a batch prediction API request using the curl command. This example is specific to BigQuery storage.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/batchPredictionJobs \
  -d '{
    "displayName": "...",
    "model": "publishers/google/models/${MODEL_ID}",
    "inputConfig": {
      "instancesFormat": "bigquery",
      "bigquerySource": {
        "inputUri": "..."
      }
    },
    "outputConfig": {
      "predictionsFormat": "bigquery",
      "bigqueryDestination": {
        "outputUri": "..."
      }
    }
  }'
Parameters
See the examples for implementation details.
Body request

Parameter | Description
---|---
displayName | A name you choose for your job.
model | The model to use for batch prediction.
inputConfig | The data format. For Gemini batch prediction, Cloud Storage and BigQuery input sources are supported.
outputConfig | The output configuration, which determines the model output location. Cloud Storage and BigQuery output locations are supported.
inputConfig

Parameter | Description
---|---
instancesFormat | The prompt input format. Use jsonl for Cloud Storage or bigquery for BigQuery.
gcsSource.uris | The input source URI. This is a Cloud Storage location of the JSONL file in the form gs://bucketname/path/to/file.jsonl.
bigquerySource.inputUri | The input source URI. This is a BigQuery table URI in the form bq://project_id.dataset.table.
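For reference, each line of the JSONL input file holds one request. A minimal sketch of a single line, assuming a text-only prompt (see Get batch predictions for Gemini for the authoritative request schema):

{"request": {"contents": [{"role": "user", "parts": [{"text": "What is the capital of France?"}]}]}}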
outputConfig

Parameter | Description
---|---
predictionsFormat | The output format of the prediction. Use jsonl for Cloud Storage or bigquery for BigQuery.
gcsDestination.outputUriPrefix | The Cloud Storage bucket and directory location, in the form gs://mybucket/path/to/output.
bigqueryDestination.outputUri | The BigQuery URI of the target output table, in the form bq://project_id.dataset.table.
Examples
Request a batch response
Batch requests for multimodal models accept Cloud Storage and BigQuery sources. To learn more, see Get batch predictions for Gemini.
Depending on the number of input items that you submitted, a batch generation task can take some time to complete.
To create a batch prediction job, use the projects.locations.batchPredictionJobs.create method.
Before using any of the request data, make the following replacements:
- LOCATION: A region that supports Gemini models.
- PROJECT_ID: Your project ID.
- INPUT_URI: The Cloud Storage location of your JSONL batch prediction input, such as gs://bucketname/path/to/file.jsonl.
- OUTPUT_FORMAT: To output to a BigQuery table, specify bigquery. To output to a Cloud Storage bucket, specify jsonl.
- DESTINATION: For BigQuery, specify bigqueryDestination. For Cloud Storage, specify gcsDestination.
- OUTPUT_URI_FIELD_NAME: For BigQuery, specify outputUri. For Cloud Storage, specify outputUriPrefix.
- OUTPUT_URI: For BigQuery, specify the table location, such as bq://myproject.mydataset.output_result. The region of the output BigQuery dataset must be the same as the Vertex AI batch prediction job. For Cloud Storage, specify the bucket and directory location, such as gs://mybucket/path/to/output.
Request JSON body:
{ "displayName": "my-cloud-storage-batch-prediction-job", "model": "publishers/google/models/gemini-1.5-flash-002", "inputConfig": { "instancesFormat": "jsonl", "gcsSource": { "uris" : "INPUT_URI " } }, "outputConfig": { "predictionsFormat": "OUTPUT_FORMAT ", "DESTINATION ": { "OUTPUT_URI_FIELD_NAME ": "OUTPUT_URI " } } }
To send your request, choose one of these options:
Using curl, save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION -aiplatform.googleapis.com/v1/projects/PROJECT_ID /locations/LOCATION /batchPredictionJobs"
Using PowerShell, save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION -aiplatform.googleapis.com/v1/projects/PROJECT_ID /locations/LOCATION /batchPredictionJobs" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
Response
{ "name": "projects/PROJECT_ID /locations/LOCATION /batchPredictionJobs/BATCH_JOB_ID ", "displayName": "my-cloud-storage-batch-prediction-job", "model": "publishers/google/models/gemini-1.5-flash-002", "inputConfig": { "instancesFormat": "jsonl", "gcsSource": { "uris": [ "INPUT_URI " ] } }, "outputConfig": { "predictionsFormat": "OUTPUT_FORMAT ", "DESTINATION ": { "OUTPUT_URI_FIELD_NAME ": "OUTPUT_URI " } }, "state": "JOB_STATE_PENDING", "createTime": "2024-10-16T19:33:59.153782Z", "updateTime": "2024-10-16T19:33:59.153782Z", "modelVersionId": "1" }
Before using any of the request data, make the following replacements:
- LOCATION: A region that supports Gemini models.
- PROJECT_ID: Your project ID.
- INPUT_URI: The BigQuery table where your batch prediction input is located, such as bq://myproject.mydataset.input_table. Multi-region datasets are not supported.
- OUTPUT_FORMAT: To output to a BigQuery table, specify bigquery. To output to a Cloud Storage bucket, specify jsonl.
- DESTINATION: For BigQuery, specify bigqueryDestination. For Cloud Storage, specify gcsDestination.
- OUTPUT_URI_FIELD_NAME: For BigQuery, specify outputUri. For Cloud Storage, specify outputUriPrefix.
- OUTPUT_URI: For BigQuery, specify the table location, such as bq://myproject.mydataset.output_result. The region of the output BigQuery dataset must be the same as the Vertex AI batch prediction job. For Cloud Storage, specify the bucket and directory location, such as gs://mybucket/path/to/output.
Request JSON body:
{ "displayName": "my-bigquery-batch-prediction-job", "model": "publishers/google/models/gemini-1.5-flash-002", "inputConfig": { "instancesFormat": "bigquery", "bigquerySource":{ "inputUri" : "INPUT_URI " } }, "outputConfig": { "predictionsFormat": "OUTPUT_FORMAT ", "DESTINATION ": { "OUTPUT_URI_FIELD_NAME ": "OUTPUT_URI " } } }
To send your request, choose one of these options:
Using curl, save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION -aiplatform.googleapis.com/v1/projects/PROJECT_ID /locations/LOCATION /batchPredictionJobs"
Using PowerShell, save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION -aiplatform.googleapis.com/v1/projects/PROJECT_ID /locations/LOCATION /batchPredictionJobs" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
Response
{ "name": "projects/PROJECT_ID /locations/LOCATION /batchPredictionJobs/BATCH_JOB_ID ", "displayName": "my-bigquery-batch-prediction-job", "model": "publishers/google/models/gemini-1.5-flash-002", "inputConfig": { "instancesFormat": "bigquery", "bigquerySource": { "inputUri" : "INPUT_URI " } }, "outputConfig": { "predictionsFormat": "OUTPUT_FORMAT ", "DESTINATION ": { "OUTPUT_URI_FIELD_NAME ": "OUTPUT_URI " } }, "state": "JOB_STATE_PENDING", "createTime": "2024-10-16T19:33:59.153782Z", "updateTime": "2024-10-16T19:33:59.153782Z", "modelVersionId": "1" }
The response includes a unique identifier for the batch job. You can poll for the status of the batch job using the BATCH_JOB_ID until the job state is JOB_STATE_SUCCEEDED. For example:
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs/BATCH_JOB_ID
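If you would rather poll programmatically than rerun curl by hand, the following Python sketch does the same GET with the google-auth library. The location and job ID are placeholder values, and the 30-second interval is an arbitrary choice:

import time

import google.auth
from google.auth.transport.requests import AuthorizedSession

# Application Default Credentials with the Cloud Platform scope.
credentials, project_id = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
session = AuthorizedSession(credentials)

location = "us-central1"       # placeholder: your job's region
batch_job_id = "BATCH_JOB_ID"  # placeholder: from the create response
url = (
    f"https://{location}-aiplatform.googleapis.com/v1/projects/"
    f"{project_id}/locations/{location}/batchPredictionJobs/{batch_job_id}"
)

# Poll until the job reaches a terminal state.
while True:
    state = session.get(url).json()["state"]
    print(state)
    if state in ("JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"):
        break
    time.sleep(30)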
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.
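As a minimal sketch of the SDK flow, assuming the vertexai.batch_prediction module from a recent SDK release and placeholder project, model, and Cloud Storage values, a job submission might look like this:

import time

import vertexai
from vertexai.batch_prediction import BatchPredictionJob

vertexai.init(project="PROJECT_ID", location="us-central1")  # placeholders

# Submit the job; input_dataset can also be a bq:// table URI.
job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-002",
    input_dataset="gs://mybucket/path/to/file.jsonl",
    output_uri_prefix="gs://mybucket/path/to/output",
)

# Poll until the job reaches a terminal state.
while not job.has_ended:
    time.sleep(30)
    job.refresh()

if job.has_succeeded:
    print("Job succeeded:", job.output_location)
else:
    print("Job failed:", job.error)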
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Before trying this sample, follow the Go setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Go API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Retrieve batch output
When a batch prediction task completes, the output is stored in the Cloud Storage bucket or the BigQuery table that you specified in your request.
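For example, if you wrote output to Cloud Storage, a sketch along the following lines could read the predictions back with the google-cloud-storage client library. The bucket, the prefix, and the assumption that results land in *.jsonl files whose lines carry a response field are all placeholders here; inspect your output location for the exact layout. A similar pattern applies to BigQuery output with the google-cloud-bigquery client.

import json

from google.cloud import storage

# Placeholder bucket and prefix; use the OUTPUT_URI from your request.
client = storage.Client(project="PROJECT_ID")
bucket = client.bucket("mybucket")

for blob in bucket.list_blobs(prefix="path/to/output"):
    if not blob.name.endswith(".jsonl"):
        continue
    for line in blob.download_as_text().splitlines():
        record = json.loads(line)
        # Each output line pairs the original request with the model response.
        print(record.get("response"))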
What's next
- Learn how to tune a Gemini model in Overview of model tuning for Gemini.
- Learn more about how to Get batch predictions for Gemini.