Overview of self-deployed models

This document describes the different types of self-deployed models available in Model Garden and covers the following topics:

  • Choose a self-deployment option: Compare the available self-deployment options.
  • Self-deploy open models: Learn about open weight and open source models.
  • Self-deployed partner models: Learn about proprietary partner models that you purchase through Cloud Marketplace.
  • Deploy models with custom weights: Deploy your own fine-tuned versions of supported base models.

Choose a self-deployment option

The following table compares the self-deployment options available on Vertex AI.

Self-deploy open models
  • Description: Freely available models with public weights. You manage the deployment infrastructure.
  • Pros: High transparency, no model licensing cost, and portability.
  • Cons: You're responsible for all infrastructure costs and management.

Self-deployed partner models
  • Description: Proprietary models from third-party partners that you purchase and deploy through Cloud Marketplace.
  • Pros: Access to specialized, commercial-grade models, with partner support available.
  • Cons: Incurs model usage costs, weights can't be exported, and some platform limitations apply (for example, no VPC Service Controls support).

Deploy models with custom weights
  • Description: Your own fine-tuned versions of supported base models, deployed by providing custom model weights.
  • Pros: Maximum customization for your specific use case on your preferred infrastructure.
  • Cons: Requires you to prepare model files in a specific format and doesn't support quantized models during import.

In Model Garden, you can deploy and serve open, partner, and custom models on Vertex AI. Unlike serverless model-as-a-service (MaaS) offerings, self-deployed models run securely within your Google Cloud project and VPC network.

Self-deploy open models

Open models provide pretrained capabilities for various AI tasks, including Gemma models that excel in multimodal processing. An open model is freely available: you can publish its outputs and use the model anywhere, as long as you adhere to its licensing terms. Vertex AI offers both open (also known as open weight) and open source models.

When you use an open model with Vertex AI, your deployment runs on Vertex AI infrastructure. You can also run open models on other infrastructure by using frameworks such as PyTorch or JAX.

Open weight models

Many open models are open weight large language models (LLMs). A model's weights are the numerical values stored in its neural network architecture that represent the patterns and relationships learned from the training data. With open weight models, these pretrained parameters are released, which provides more transparency than models whose weights aren't public, and you can use the model for inference and tuning. However, details such as the original dataset, model architecture, and training code aren't always provided.

Open source models

Open weight models differ from open source AI models. Open weight models expose the weights, which are the core numerical representation of learned patterns, but they don't necessarily provide the full source code or training details. Publishing the weights offers a level of transparency that lets you understand a model's capabilities without having to build it yourself.

Self-deployed partner models

Model Garden helps you purchase and manage model licenses from partners who offer proprietary models as a self-deploy option. After you purchase access to a model from Cloud Marketplace, you can choose to deploy on on-demand hardware or use your Compute Engine reservations and committed use discounts to meet your budget requirements. You are charged for model usage and for the Vertex AI infrastructure that you use.

To request usage of a self-deployed partner model, find the relevant model in the Model Garden console, click Contact sales, and then complete the form. This action initiates contact with a Google Cloud sales representative.

For more information about deploying and using partner models, see Deploy a partner model and make prediction requests.

Considerations

Consider the following limitations when using self-deployed partner models:

  • Unlike with open models, you cannot export weights.
  • If you have VPC Service Controls set up for your project, you can't upload models, which prevents you from deploying partner models.
  • For endpoints, only the shared public endpoint type is supported.

Support for model-specific issues is provided by the partner. To contact a partner about model performance issues, use the contact details in the Support section of their Model Garden model card.

Deploy models with custom weights

You can fine-tune a supported base model from a predefined set and deploy the customized model in Vertex AI Model Garden. To deploy your custom model, you import custom weights by uploading the model artifacts to a Cloud Storage bucket in your project, as shown in the sketch that follows.
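
For example, here's a minimal sketch of staging the model artifacts in Cloud Storage with the gcloud CLI. The bucket name and local directory are illustrative assumptions; substitute your own values.

# Create a bucket for the model artifacts.
# "my-custom-weights" and "us-central1" are illustrative values.
gcloud storage buckets create gs://my-custom-weights --location=us-central1

# Upload the model directory (configuration, weights, and tokenizer files).
gcloud storage cp -r ./my-tuned-model gs://my-custom-weights/my-tuned-model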

Supported models

The public preview of deploying models with custom weights supports the following base models, listed by model family:

Llama
  • Llama-2: 7B, 13B
  • Llama-3.1: 8B, 70B
  • Llama-3.2: 1B, 3B
  • Llama-4: Scout-17B, Maverick-17B
  • CodeLlama-13B
Gemma
  • Gemma-2: 27B
  • Gemma-3: 1B, 4B, 12B, 27B
  • MedGemma: 4B, 27B-text
Qwen
  • Qwen2: 1.5B
  • Qwen2.5: 0.5B, 1.5B, 7B, 32B
  • Qwen3: 0.6B, 1.7B, 8B, 32B
DeepSeek
  • DeepSeek-R1
  • DeepSeek-V3
Mistral and Mixtral
  • Mistral-7B-v0.1
  • Mixtral-8x7B-v0.1
  • Mistral-Nemo-Base-2407
Phi-4
  • Phi-4-reasoning

Limitations

Custom weights don't support the import of quantized models.

Model files

You must supply the model files in the Hugging Face weights format. For more information, see Use Hugging Face Models.

If the required files aren't provided, the model deployment might fail.

The following list shows the required model files, grouped by content type. The exact files depend on the model's architecture:

Model configuration
  • config.json
Model weights
  • *.safetensors
  • *.bin
Weights index
  • *.index.json
Tokenizer file(s)
  • tokenizer.model
  • tokenizer.json
  • tokenizer_config.json
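
For example, a sharded safetensors checkpoint staged in Cloud Storage typically contains files like the following. The bucket path and shard file names are illustrative assumptions:

gs://my-custom-weights/my-tuned-model/
  config.json
  model-00001-of-00002.safetensors
  model-00002-of-00002.safetensors
  model.safetensors.index.json
  tokenizer.json
  tokenizer_config.json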

Locations

You can deploy custom models in all regions where Model Garden is available.

Prerequisites

Before you deploy your custom model, complete the following initial setup steps.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Vertex AI API.

    Enable the API

  5. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

The following instructions use Cloud Shell. If you use a local development environment, you must authenticate to Google Cloud:

  1. Install the Google Cloud CLI.

  2. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  3. To initialize the gcloud CLI, run the following command:

    gcloud init
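
If you run the Python sample locally, the Vertex AI SDK typically authenticates through Application Default Credentials (ADC). As a sketch, you can set up ADC with the following command:

    gcloud auth application-default login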

Deploy the custom model

If you're using the gcloud CLI, Python, or curl, replace the following variables in the code samples. An example of setting them as shell environment variables follows the list.

  • REGION: Your region (for example, us-central1).
  • MODEL_GCS: The Cloud Storage path to your model (for example, gs://custom-weights-fishfooding/meta-llama/Llama-3.2-1B-Instruct).
  • PROJECT_ID: Your project ID.
  • MODEL_ID: Your model ID.
  • MACHINE_TYPE: Your machine type (for example, g2-standard-12).
  • ACCELERATOR_TYPE: Your accelerator type (for example, NVIDIA_L4).
  • ACCELERATOR_COUNT: Your accelerator count.
  • PROMPT: Your text prompt.
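
For example, here's a minimal sketch of exporting these variables in Cloud Shell before you run the gcloud CLI or curl samples. All values are illustrative assumptions:

# Illustrative values; replace each with your own.
export REGION="us-central1"
export PROJECT_ID="my-project"
export MODEL_GCS="gs://my-custom-weights/my-tuned-model"
export MODEL_ID="my-tuned-model"
export MACHINE_TYPE="g2-standard-12"
export ACCELERATOR_TYPE="NVIDIA_L4"
export ACCELERATOR_COUNT=1
export PROMPT="What is Vertex AI?"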

Console

The following steps show you how to use the Google Cloud console to deploy your model with custom weights.

  1. In the Google Cloud console, go to the Model Garden page.

    Go to Model Garden

  2. Click Deploy model with custom weights. The Deploy a model with custom weights on Vertex AI pane appears.

  3. In the Model source section, do the following:

    1. Click Browse, select the bucket where your model is stored, and click Select.

    2. Optional: In the Model name field, enter a name for your model.

  4. In the Deployment settings section, do the following:

    1. From the Region list, select your region.

    2. In the Machine Spec field, select the machine specification to use for deploying your model.

    3. Optional: In the Endpoint name field, you can change the default endpoint name.

  5. Click Deploy model with custom weights.

gcloud CLI

This command demonstrates how to deploy the model to a specific region.

gcloud ai model-garden models deploy --model=${MODEL_GCS} --region ${REGION}

This command demonstrates how to deploy the model to a specific region with its machine type, accelerator type, and accelerator count. To select a specific machine configuration, you must set all three fields.

gcloud ai model-garden models deploy --model=${MODEL_GCS} --machine-type=${MACHINE_TYPE} --accelerator-type=${ACCELERATOR_TYPE} --accelerator-count=${ACCELERATOR_COUNT} --region ${REGION}
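
After the deployment completes, one way to confirm it is to list the Vertex AI endpoints in the region. This sketch uses the standard gcloud ai endpoints list command:

gcloud ai endpoints list --region=${REGION} --project=${PROJECT_ID}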

Python

import vertexai
from vertexai.preview import model_garden

# Replace PROJECT_ID, REGION, MODEL_GCS, and PROMPT with your values.
vertexai.init(project=PROJECT_ID, location=REGION)

# Reference the custom weights that you uploaded to Cloud Storage.
custom_model = model_garden.CustomModel(
    gcs_uri=MODEL_GCS,
)

# Deploy to an endpoint with an explicit machine configuration.
endpoint = custom_model.deploy(
    machine_type=MACHINE_TYPE,
    accelerator_type=ACCELERATOR_TYPE,
    accelerator_count=ACCELERATOR_COUNT,
    model_display_name="custom-model",
    endpoint_display_name="custom-model-endpoint",
)

# Send a test prompt to the deployed model.
response = endpoint.predict(
    instances=[{"prompt": PROMPT}], use_dedicated_endpoint=True
)
print(response.predictions)

Alternatively, you can call the custom_model.deploy() method without arguments to use default settings.

import vertexai
from vertexai.preview import model_garden

vertexai.init(project=PROJECT_ID, location=REGION)

custom_model = model_garden.CustomModel(
    gcs_uri=MODEL_GCS,
)

# Deploy with default machine settings chosen by Model Garden.
endpoint = custom_model.deploy()

response = endpoint.predict(
    instances=[{"prompt": PROMPT}], use_dedicated_endpoint=True
)
print(response.predictions)

curl


curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${REGION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${REGION}:deploy" \
  -d '{
    "custom_model": {
      "gcs_uri": "'"${MODEL_GCS}"'"
    },
    "destination": "projects/'"${PROJECT_ID}"'/locations/'"${REGION}"'",
    "model_config": {
      "model_user_id": "'"${MODEL_ID}"'"
    }
  }'
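
The deploy call returns a long-running operation. As a sketch, you can check its status with a GET request to the operation name from the response; OPERATION_NAME here is a placeholder of the form projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID:

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://${REGION}-aiplatform.googleapis.com/v1beta1/OPERATION_NAME"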

Alternatively, you can use the API to explicitly set the machine type.


curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${REGION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${REGION}:deploy" \
  -d '{
    "custom_model": {
      "gcs_uri": "'"${MODEL_GCS}"'"
    },
    "destination": "projects/'"${PROJECT_ID}"'/locations/'"${REGION}"'",
    "model_config": {
      "model_user_id": "'"${MODEL_ID}"'"
    },
    "deploy_config": {
      "dedicated_resources": {
        "machine_spec": {
          "machine_type": "'"${MACHINE_TYPE}"'",
          "accelerator_type": "'"${ACCELERATOR_TYPE}"'",
          "accelerator_count": '"${ACCELERATOR_COUNT}"'
        },
        "min_replica_count": 1
      }
    }
  }'
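
After the deployment completes, you can send a prediction request with the PROMPT value. This is a sketch that uses the standard Vertex AI predict method; ENDPOINT_ID is a placeholder for the endpoint that the deployment created, and if you deployed to a dedicated endpoint, the request must target the endpoint's dedicated DNS name instead:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/endpoints/ENDPOINT_ID:predict" \
  -d '{
    "instances": [{"prompt": "'"${PROMPT}"'"}]
  }'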

What's next