Model tuning is a crucial process in adapting Gemini to perform specific tasks with greater precision and accuracy. Model tuning works by providing a model with a training dataset that contains a set of examples of specific downstream tasks.
This page provides an overview of model tuning for Gemini, describes the tuning options available for Gemini, and helps you determine when each tuning option should be used.
Benefits of model tuning
Model tuning is an effective way to customize large models to your tasks. It's a key step to improve the model's quality and efficiency. Model tuning provides the following benefits:
- Higher quality for your specific tasks.
- Increased model robustness.
- Lower inference latency and cost due to shorter prompts.
Tuning compared to prompt design
Tuning provides the following benefits over prompt design:
- Allows deep customization of the model and results in better performance on specific tasks.
- Offers more consistent and reliable results.
- Can learn from far more examples than fit in a prompt.
Tuning approaches
Parameter-efficient tuning and full fine-tuning are two approaches to customizing large models. Both methods have their advantages and implications in terms of model quality and resource efficiency.
Parameter-efficient tuning
Parameter-efficient tuning, also called adapter tuning, enables efficient adaptation of large models to your specific tasks or domain. Parameter-efficient tuning updates a relatively small subset of the model's parameters during the tuning process.
To learn how Vertex AI supports adapter tuning and serving, see the whitepaper Adaptation of Large Foundation Models.
Full fine-tuning
Full fine-tuning updates all parameters of the model, making it suitable for adapting the model to highly complex tasks, with the potential of achieving higher quality. However, full fine-tuning demands more computational resources for both tuning and serving, leading to higher overall costs.
Parameter-efficient tuning compared to full fine-tuning
Parameter-efficient tuning is more resource efficient and cost effective than full fine-tuning. It requires significantly less computation to train and can adapt the model faster with a smaller dataset. The flexibility of parameter-efficient tuning also offers a solution for multi-task learning without the need for extensive retraining.
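The multi-task benefit can be illustrated by pairing one shared frozen base with a tiny adapter per task, instead of keeping a full fine-tuned copy of the model for each task. The task names below are made up for illustration.

```python
import numpy as np

# Sketch of multi-task serving with adapters: one frozen base weight
# matrix is shared, and each task gets its own small (A, B) adapter pair.
d, r = 512, 4
rng = np.random.default_rng(1)
W = rng.standard_normal((d, d))          # shared frozen base weights

# One small adapter pair per task (hypothetical task names).
adapters = {
    task: (rng.standard_normal((d, r)) * 0.01, np.zeros((r, d)))
    for task in ("summarize", "classify", "extract")
}

def forward(x, task):
    A, B = adapters[task]
    return x @ W + x @ A @ B             # base output + task-specific delta

adapter_storage = sum(A.size + B.size for A, B in adapters.values())
full_copy_storage = W.size * len(adapters)
print(f"adapters: {adapter_storage:,} params vs "
      f"{full_copy_storage:,} params for three full copies")
```

Storing three adapters costs a small fraction of storing three fully fine-tuned model copies, which is what makes serving many tuned variants of one base model practical.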
Tuning Gemini models
Gemini models (gemini-1.0-pro-002) support the following tuning methods:
Supervised fine-tuning (parameter-efficient)
Supervised fine-tuning for Gemini models improves the performance of the model by teaching it a new skill. Data that contains hundreds of labeled examples is used to teach the model to mimic a desired behavior or task. Each labeled example demonstrates what you want the model to output during inference.
Supervised fine-tuning is ideal when you have a well-defined task with available labeled data. Supervised fine-tuning adapts model behavior with a labeled dataset, adjusting the model's weights to minimize the difference between its predictions and the labels.
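A labeled dataset for supervised fine-tuning is typically written as JSONL: one example per line, each pairing a user prompt with the output you want the model to produce. The `messages` schema below is an assumption for illustration; confirm the exact field names in the data-preparation guide before uploading a dataset.

```python
import json

# Illustrative sketch of building a supervised fine-tuning dataset as
# JSONL. Each line is one labeled example: a user prompt and the model
# response it should learn to produce. The "messages" field layout is
# an assumed schema -- check the official data-preparation guide.
examples = [
    ("Classify the sentiment: 'The update fixed everything.'", "positive"),
    ("Classify the sentiment: 'Support never replied.'", "negative"),
]

with open("tuning_data.jsonl", "w") as f:
    for prompt, label in examples:
        record = {
            "messages": [
                {"role": "user", "content": prompt},
                {"role": "model", "content": label},
            ]
        }
        f.write(json.dumps(record) + "\n")
```

Each example demonstrates the desired behavior directly; a few hundred such lines covering the task's variations is a common starting point.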
Quota
Quota is enforced on the number of concurrent tuning jobs. Every project comes with a default quota to run at least one tuning job. This is a global quota, shared across all available regions. If you want to run more jobs concurrently, you need to request additional quota for Global concurrent tuning jobs.
Pricing
Supervised fine-tuning for gemini-1.0-pro-002 is in Preview.
- While tuning is in Preview, there is no charge to tune a model.
- After tuning a model, inference costs for the tuned model still apply. Inference pricing is the same for each stable version of Gemini 1.0 Pro.
For more information, see Vertex AI pricing and Available Gemini stable model versions.
What's next
To learn how to prepare tuning data, see Prepare supervised fine-tuning data.
To learn how supervised fine-tuning can be used in a solution that builds a generative AI knowledge base, see Jump Start Solution: Generative AI knowledge base.