Method: models.predict

Full name: projects.locations.publishers.models.predict

Runs inference on Google's generative AI models on Vertex AI. You can use this method to perform tasks like image generation, image editing, virtual try-on, visual question answering, video generation, and generating text and multimodal embeddings.

To run inference on a base (non-tuned) Gemini model, see models.generateContent.

Endpoint

post https://aiplatform.googleapis.com/v1beta1/{endpoint}:predict

Path parameters

endpoint string

Required. The resource name of the publisher model or endpoint requested to serve the prediction. For Google models like Embedding, Imagen, or Veo, use the publisher model format. For tuned models or other models deployed to a Vertex AI endpoint, use the endpoint format.

  • Publisher model format:

    projects/{project}/locations/{location}/publishers/google/models/{model}

  • Endpoint format:

    projects/{project}/locations/{location}/endpoints/{endpoint}
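As a sketch, the two resource name formats above can be built with simple helpers. The project, location, and model values below are placeholders, not real resources:

```python
def publisher_model_name(project: str, location: str, model: str) -> str:
    """Resource name for a Google publisher model (e.g. Embedding, Imagen, Veo)."""
    return f"projects/{project}/locations/{location}/publishers/google/models/{model}"


def endpoint_name(project: str, location: str, endpoint: str) -> str:
    """Resource name for a tuned or otherwise deployed model's Vertex AI endpoint."""
    return f"projects/{project}/locations/{location}/endpoints/{endpoint}"


# Placeholder values for illustration only.
print(publisher_model_name("my-project", "us-central1", "imagegeneration@006"))
# projects/my-project/locations/us-central1/publishers/google/models/imagegeneration@006
```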

Request body

The request body contains data with the following structure:

Fields
instances[] value (Value format)

Required. The format of each instance is model-dependent. For Vertex AI Generative AI models, the instance schema can be one of the following types:

  • Text Embedding: TextEmbeddingPredictionInstance
  • Multimodal Embedding: VisionEmbeddingModelInstance
  • Imagen for image generation and editing: VisionGenerativeModelInstance
  • Imagen for virtual try-on: VirtualTryOnModelInstance
  • Imagen for visual question answering (VQA): VisionReasoningModelInstance
  • Veo for video generation: VideoGenerationModelInstance

parameters value (Value format)

The format of parameters is model-dependent. For Vertex AI Generative AI models, the parameters schema can be one of the following types:

  • Text Embedding: TextEmbeddingPredictionParams
  • Multimodal Embedding: VisionEmbeddingModelParams
  • Imagen for image generation and editing: VisionGenerativeModelParams
  • Imagen for virtual try-on: VirtualTryOnModelParams
  • Imagen for visual question answering (VQA): VisionReasoningModelParams
  • Veo for video generation: VideoGenerationModelParams
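As a minimal sketch, a text embedding request body pairs a TextEmbeddingPredictionInstance-style instance with matching parameters. The field names below (`content`, `autoTruncate`) are assumptions based on the text embedding schema, not guaranteed by this page; other models use different instance and parameter fields:

```python
import json

# Hypothetical request body for a text embedding model; the instance and
# parameter field names are assumptions, since the schema is model-dependent.
body = {
    "instances": [
        {"content": "What is Vertex AI?"},
    ],
    "parameters": {
        "autoTruncate": True,
    },
}

print(json.dumps(body, indent=2))
```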

labels map (key: string, value: string)

Optional. User labels, used for Imagen billing usage only. Only Imagen supports labels; for other models, this field is ignored.

Response body

If successful, the response body contains an instance of PredictResponse.
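Putting the pieces together, a predict call is a POST to the `:predict` endpoint with the request body described above. The sketch below only assembles the URL and JSON payload; it does not send a request, and the model name, instance schema, and parameter names are placeholders:

```python
import json


def build_predict_request(project, location, model,
                          instances, parameters=None, labels=None):
    """Assemble the URL and JSON body for models.predict (nothing is sent)."""
    name = (f"projects/{project}/locations/{location}"
            f"/publishers/google/models/{model}")
    url = f"https://aiplatform.googleapis.com/v1beta1/{name}:predict"
    body = {"instances": instances}
    if parameters is not None:
        body["parameters"] = parameters
    if labels is not None:
        body["labels"] = labels  # Imagen billing only; ignored by other models
    return url, json.dumps(body)


# Placeholder model and instance fields; real schemas are model-dependent.
url, payload = build_predict_request(
    "my-project", "us-central1", "imagegeneration@006",
    instances=[{"prompt": "a watercolor of a lighthouse"}],
    parameters={"sampleCount": 1},
)
print(url)
```

Sending the request additionally requires an `Authorization: Bearer <access-token>` header and `Content-Type: application/json`.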