Introduction to tuning

Model tuning is the process of adapting Gemini to perform specific downstream tasks with greater precision and accuracy. It works by providing the model with a training dataset that contains examples of those tasks.

This page provides an overview of model tuning for Gemini, covering the following topics:

  • Benefits of model tuning
  • Tuning compared to prompt design
  • Tuning approaches
  • Supported tuning methods

Benefits of model tuning

Model tuning is an effective way to customize large models for your tasks, and it's a key step in improving a model's quality and efficiency. Tuning provides the following benefits:

  • Higher quality: Improves model performance on your specific tasks.
  • Increased model robustness: Makes the model more resilient to variations in input.
  • Lower inference cost and latency: Reduces costs and response times by allowing for shorter prompts.

Tuning compared to prompt design

The following table compares prompt design and fine-tuning:

| Method | Description | Best for |
| --- | --- | --- |
| Prompt design | Crafting effective instructions to guide the model's output without changing the model itself. | Rapid prototyping, tasks with limited labeled data, or when you need baseline performance quickly. |
| Fine-tuning | Retraining the base model on a custom labeled dataset to adapt its weights to a specific task. | Complex or unique tasks, achieving higher quality, and when you have a sizable dataset (100+ examples). |
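
For example, a prompt-design baseline requires no training step. The following is a minimal sketch using the Vertex AI Python SDK; the project ID, region, model version, and prompt are placeholder assumptions:

```python
# A prompt-design baseline: no training step, just a carefully written prompt.
# The project ID, region, model version, and prompt are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash-002")

prompt = (
    "Classify the sentiment of the review as positive, negative, or neutral.\n"
    "Review: The battery lasts two days on a single charge.\n"
    "Sentiment:"
)
print(model.generate_content(prompt).text)
```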

When deciding between prompt design and fine-tuning, consider the following recommendations:

  • Start with prompt design to find the optimal prompt. If needed, use fine-tuning to further boost performance or fix recurring errors.
  • Before adding more data, evaluate where the model makes mistakes.
  • Prioritize high-quality, well-labeled data over quantity.
  • Make sure that the data used for fine-tuning reflects the prompt distribution, format, and context the model will encounter in production.
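
For Gemini supervised tuning on Vertex AI, training data is a JSON Lines file in which each line is one labeled example in the `contents` conversation format. The following sketch writes a two-example dataset; the example texts and file name are placeholders:

```python
# Build a small supervised tuning dataset in the Gemini JSONL format:
# each line is one labeled example with a user turn and the desired model turn.
# Prompts here should mirror the phrasing the model will see in production.
# The example texts and file name are placeholders.
import json

examples = [
    {"contents": [
        {"role": "user", "parts": [{"text": "Classify: The battery died after an hour."}]},
        {"role": "model", "parts": [{"text": "negative"}]},
    ]},
    {"contents": [
        {"role": "user", "parts": [{"text": "Classify: Setup took less than a minute."}]},
        {"role": "model", "parts": [{"text": "positive"}]},
    ]},
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```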

Tuning provides the following benefits over prompt design:

  • Deeper customization: Provides deeper customization of the model, resulting in better performance on specific tasks.
  • Better alignment: Aligns the model with custom syntax, instructions, and domain-specific semantic rules.
  • More consistent results: Offers more consistent and reliable outputs.
  • Handles more examples: Lets the model learn from more examples than can fit in a single prompt.
  • Reduced inference cost: Saves costs at inference by eliminating the need for few-shot examples and long instructions in prompts.
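
As a concrete illustration of the last two points, compare the prompt a base model needs against the prompt a tuned model needs. Both prompts here are hypothetical:

```python
# Hypothetical prompts showing why a tuned model can lower inference cost:
# the instructions and few-shot examples move into the model's weights.
few_shot_prompt = """Classify the sentiment of the review as positive, negative, or neutral.

Review: The screen cracked on day one.
Sentiment: negative

Review: Shipping was fast and the fit is perfect.
Sentiment: positive

Review: The battery died after an hour.
Sentiment:"""

tuned_prompt = "The battery died after an hour."

# Fewer input characters (roughly, fewer tokens) means lower cost and latency.
print(len(few_shot_prompt), "vs", len(tuned_prompt), "characters per request")
```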

Tuning approaches

Parameter-efficient tuning and full fine-tuning are two approaches to customizing large models. They differ in how many parameters they update, which affects both model quality and resource efficiency.

| Tuning approach | Description | Pros | Cons |
| --- | --- | --- | --- |
| Parameter-efficient tuning (adapter tuning) | Updates only a small subset of the model's parameters. | Resource-efficient, cost-effective, faster training with smaller datasets, flexible for multi-task learning. | May not achieve the same peak quality as full fine-tuning for highly complex tasks. |
| Full fine-tuning | Updates all of the model's parameters. | Potential for higher quality on highly complex tasks. | Requires significant computational resources, higher costs for tuning and serving. |

Parameter-efficient tuning (PET), also called adapter tuning, updates a small subset of the model's parameters. This approach is more resource-efficient and cost-effective than full fine-tuning. It adapts the model faster with a smaller dataset and offers a flexible solution for multi-task learning without extensive retraining. To understand how Vertex AI supports adapter tuning and serving, see the whitepaper, Adaptation of Large Foundation Models.
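
Vertex AI doesn't document the exact adapter architecture it uses, but low-rank adaptation (LoRA) is a common form of parameter-efficient tuning. The following numpy sketch illustrates the idea; the dimensions and rank are arbitrary:

```python
# Conceptual LoRA-style adapter, not Vertex AI's actual implementation.
# Instead of updating the full d x d weight matrix W, train two small
# matrices A (r x d) and B (d x r) with rank r much smaller than d.
import numpy as np

d, r = 4096, 8                          # hidden size and adapter rank (arbitrary)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))         # frozen pretrained weights
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection, starts at zero

def adapted_forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + B @ A; W itself is never modified.
    return x @ W.T + (x @ A.T) @ B.T

print(f"trainable: {2 * d * r:,} of {d * d:,} parameters "
      f"({2 * r / d:.2%} of full fine-tuning)")
```

With these illustrative numbers, the adapter trains well under 1% of the parameters that full fine-tuning would update.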

Full fine-tuning updates all of the model's parameters. This method is suitable for adapting a model to highly complex tasks and can achieve higher quality. However, it requires significant computational resources for both tuning and serving, leading to higher overall costs.

Supported tuning methods

Vertex AI supports supervised fine-tuning to customize foundation models.

Supervised fine-tuning

Supervised fine-tuning improves the performance of a model by teaching it a new skill using a dataset with hundreds of labeled examples. Each labeled example demonstrates the desired output you want the model to produce during inference.

When you run a supervised fine-tuning job, the model learns additional parameters that encode the information necessary to perform the desired task or learn the desired behavior. These parameters are used during inference. The output of the tuning job is a new model that combines the newly learned parameters with the original model.
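
With the Vertex AI Python SDK, launching such a job looks roughly like the following sketch. The project ID, source model version, dataset path, and display name are assumptions; check the current SDK documentation for supported values:

```python
# Launch a supervised fine-tuning job with the Vertex AI Python SDK.
# Project ID, model version, dataset path, and display name are placeholders.
import time

import vertexai
from vertexai.tuning import sft

vertexai.init(project="my-project", location="us-central1")

tuning_job = sft.train(
    source_model="gemini-1.5-pro-002",
    train_dataset="gs://my-bucket/train.jsonl",
    tuned_model_display_name="domain-classifier",
)

# Tuning runs asynchronously; poll until the new model is ready.
while not tuning_job.has_ended:
    time.sleep(60)
    tuning_job.refresh()

print(tuning_job.tuned_model_endpoint_name)
```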

Supervised fine-tuning of a text model is a good option when your model's output isn't complex and is relatively easy to define. It's a good choice for tasks like classification, sentiment analysis, entity extraction, summarization of straightforward content, and writing domain-specific queries. For code models, supervised tuning is the only option.

Models that support supervised fine-tuning

The following Gemini models support supervised tuning:

For more information about supervised fine-tuning with different data types, see how to tune models using text, image, audio, and document data.

What's next