Overview of getting predictions on Vertex AI

This page gives an overview of how to get online and batch predictions from AutoML and custom trained models on Vertex AI.

Online predictions

Online predictions are synchronous requests made to a model endpoint. Use online predictions when you are making requests in response to application input or in situations that require timely inference.
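As a rough illustration of what a synchronous request looks like, the following sketch builds the URL and JSON body for a REST `predict` call against an endpoint. It only constructs the request and does not send it. The project, region, endpoint ID, and instance fields are placeholder assumptions, and the instance schema depends entirely on the deployed model.

```python
import json


def build_predict_request(project, location, endpoint_id, instances, parameters=None):
    """Return the (url, body) pair for a synchronous predict call.

    Nothing is sent over the network; authentication is out of scope here.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/endpoints/{endpoint_id}:predict"
    )
    body = {"instances": instances}
    if parameters is not None:
        # Optional, model-specific prediction parameters.
        body["parameters"] = parameters
    return url, json.dumps(body)


# All identifiers below are hypothetical placeholders.
url, body = build_predict_request(
    project="my-project",
    location="us-central1",
    endpoint_id="1234567890",
    instances=[{"feature_a": 1.0, "feature_b": "x"}],  # schema depends on the model
)
print(url)
print(body)
```

The caller blocks until the endpoint returns a response to this request, which is what makes online prediction suitable for serving application input directly.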

Model deployment

You must deploy a model to an endpoint before that model can be used to serve online predictions. Deploying a model associates physical resources with the model so it can serve online predictions with low latency.

You can deploy more than one model to an endpoint, and you can deploy a model to more than one endpoint. For more information about options and use cases for deploying models, see Considerations for deploying models.
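When several models share one endpoint, each deployment request states how incoming traffic is divided among them. The sketch below assembles a request body in the shape of the REST `deployModel` method, assuming a model served on dedicated resources; the model name, machine type, and deployed model ID `987` are hypothetical, and some AutoML model types use a different resources field. The key `"0"` stands for the model being deployed by this request, and the percentages are expected to add up to 100.

```python
def build_deploy_request(model_name, machine_type, traffic_split, min_replicas=1):
    """Return a deployModel-style request body as a plain dict.

    traffic_split maps deployed model IDs to integer percentages;
    the key "0" refers to the model being deployed by this request.
    """
    total = sum(traffic_split.values())
    if total != 100:
        raise ValueError(f"traffic split must sum to 100, got {total}")
    return {
        "deployedModel": {
            "model": model_name,
            # The compute resources associated with the model at deploy time.
            "dedicatedResources": {
                "machineSpec": {"machineType": machine_type},
                "minReplicaCount": min_replicas,
            },
        },
        "trafficSplit": traffic_split,
    }


# Hypothetical values: send 20% of traffic to the new model and keep 80%
# on an already deployed model whose ID is assumed to be "987".
request = build_deploy_request(
    model_name="projects/my-project/locations/us-central1/models/111",
    machine_type="n1-standard-4",
    traffic_split={"0": 20, "987": 80},
)
print(request["trafficSplit"])
```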

To learn how to deploy an AutoML model, see the Get predictions from AutoML models section of this page and select the page that's relevant to your model.

To learn how to deploy a custom trained model, see Get predictions from a custom trained model.

Batch predictions

Batch predictions are asynchronous requests. You request batch predictions directly from the model resource without needing to deploy the model to an endpoint. Use batch predictions when you don't require an immediate response and want to process accumulated data by using a single request.
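To make the contrast with online prediction concrete, this sketch assembles a batch prediction job body in the shape used by the REST `batchPredictionJobs` resource. Note that it references the model resource directly and no endpoint appears anywhere. The bucket paths, model name, and display name are placeholder assumptions, and JSON Lines in Cloud Storage is only one of the possible input and output formats.

```python
def build_batch_prediction_job(display_name, model_name, input_uris, output_prefix,
                               instances_format="jsonl", predictions_format="jsonl"):
    """Return a batch prediction job body as a plain dict (nothing is submitted)."""
    return {
        "displayName": display_name,
        # The job points at the model resource itself, not at an endpoint.
        "model": model_name,
        "inputConfig": {
            "instancesFormat": instances_format,
            "gcsSource": {"uris": list(input_uris)},
        },
        "outputConfig": {
            "predictionsFormat": predictions_format,
            "gcsDestination": {"outputUriPrefix": output_prefix},
        },
    }


# All names and paths below are hypothetical placeholders.
job = build_batch_prediction_job(
    display_name="nightly-scoring",
    model_name="projects/my-project/locations/us-central1/models/111",
    input_uris=["gs://my-bucket/input/part-*.jsonl"],
    output_prefix="gs://my-bucket/output/",
)
print(job["inputConfig"]["gcsSource"]["uris"])
```

Because the job is asynchronous, the caller would submit this body once and check the job's state later, then read the results from the output location.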

Get predictions from AutoML models

You can get online or batch predictions from AutoML models by using the Google Cloud console or the Vertex AI API. The instructions vary slightly based on your data type and model objective:

Image

Learn how to get predictions from the following types of image AutoML models:

Tabular

Learn how to get predictions from the following types of tabular AutoML models:

Text

Learn how to get predictions from the following types of text AutoML models:

Video

Learn how to get predictions from the following types of video AutoML models:

Get predictions from custom trained models

The instructions on how to get online and batch predictions from your custom trained model are the same, regardless of your data type or model objective.

For details, see Get predictions from a custom trained model.