Announcing Public Preview of Vertex AI Prompt Optimizer
George Lee
Product Manager, Cloud AI Research
Ivan Nardini
Developer Relations Engineer
Prompt design and engineering stands out as one of the most approachable methods to drive meaningful output from a Large Language Model (LLM). However, prompting large language models can feel like navigating a complex maze. You must experiment with various combinations of instructions and examples to achieve the desired output. Moreover, even if you find the ideal prompt template, there is no guarantee that it will continue to deliver optimal results for a different LLM.
Migrating or translating prompts from one LLM to another is challenging because different language models behave differently. Simply reusing prompts is ineffective, so users need an intelligent prompt optimizer to generate useful outputs.
To help mitigate the "prompt fatigue" experienced by users while they build LLM-based applications, we are announcing Vertex AI Prompt Optimizer in Public Preview.
What is Vertex AI Prompt Optimizer?
Vertex AI Prompt Optimizer helps you find the best prompt (instruction and demonstrations) for any preferred model on Vertex AI. It is based on Google Research’s publication (accepted by NeurIPS 2024) on automatic prompt optimization (APO) methods, and employs an iterative LLM-based optimization algorithm where the optimizer model [responsible for generating paraphrased instructions] and evaluator model [responsible for evaluating the selected instruction and demonstration] work together to generate and evaluate candidate prompts. Prompt Optimizer subsequently selects the best instructions and demonstrations based on the evaluation metrics the user wants to optimize against. Instructions include the system instruction, context, and task of your prompt template. Demonstrations are the few-shot examples you provide in your prompt to elicit a specific style or tone from the model response.
With just a few labeled examples and configured optimization settings, Vertex AI Prompt Optimizer finds the best prompt (instruction and demonstrations) for the target model and removes the need for manually optimizing existing prompts every time for a new LLM. You can now easily craft a new prompt for a particular task or translate a prompt from one model to another model on Vertex AI. Here are the key characteristics:
-
Easy optimization: Quickly optimize prompts for any target Google model, including migration and translation of prompts from any source model.
-
Versatile task handling: Accommodates any text-based task (such as question and answering, summarization, classification, and entity extraction) and expand support for multimodal tasks is coming soon.
-
Comprehensive evaluation: Supports a wide array of evaluation metrics, including model-based, computation-based, and custom metrics, to ensure optimal prompt performance against the metrics you care about.
-
Flexible and customizable: Tailor the optimization process and latency with advanced settings and utilize various notebook versions to suit your expertise level and needs.
Why use Vertex AI Prompt Optimizer?
Data-driven optimization: Many existing prompt optimization tools focus on tailoring your prompts to your preferred style and tone and oftentimes still require human verification. However, Vertex AI Prompt Optimizer goes beyond this by optimizing your prompts based on specific evaluation metrics, ensuring the best possible performance for your target model.
Built for Gemini: If you’re using Gemini, Vertex AI Prompt Optimizer is designed to keep Gemini’s underlying characteristics in mind. It is specifically designed to adapt to the unique attributes of the Gemini and other Google models. This tailored approach allows you to unlock the full potential of Gemini and achieve superior results.
Getting started with Vertex AI Prompt Optimizer
To start using Vertex AI Prompt Optimizer, you can use the Colab notebook available in the Google Cloud Generative AI repository on Github which contains sample code and notebooks for Generative AI on Google Cloud. Refer to the UI version for basic settings and the SDK version for more advanced settings. More versions of the notebook to support custom metrics and multimodal input will be added in the coming weeks. You can also access it via the Vertex AI Studio console. Look for entry points in the console that indicate “prompt optimizer” or “optimizer your prompt further” (refer to screencasts below).
To either optimize or translate prompts using Vertex AI Prompt Optimizer, follow these steps:
-
Configure your prompt template
-
Input your data (labeled examples)
-
Configure your optimization settings (target model, evaluation metrics, etc.)
-
Run the optimization job
-
Inspect the results
Vertex AI Prompt Optimizer supports any Google models and evaluation metrics supported by the Generative AI Evaluation Service.
Entry points from Vertex AI Studio to Vertex AI Prompt Optimizer Colab Enterprise Notebook
A. The Saved prompts page will include a new Prompt optimizer button.
B. The Prompt assist dialog pop-up will include a new Optimize your prompt further button.
How AdVon Commerce and Augmedix enhanced Gemini prompts using Vertex AI Prompt Optimizer
AdVon Commerce - a digital commerce platform - partnered with Google Cloud to create quality content at scale for retailers using tailored AI solutions. AdVon Commerce utilizes LLMs to generate accurate and engaging product page content at scale, which incorporates the right keywords and represents the products accurately. When optimizing retail pages, there’s a lot of missing or incorrect data to work through. Creating shopper-first content means accurately completing missing product attributes that are essential for product searchability and customer journey.
Vertex AI Prompt Optimizer streamlined the creation and refinement of AI prompts, resulting in higher accuracy and relevance. AdVon Commerce observed a 10% increase in attribute accuracy and are able to maintain their commitment to high-quality content while significantly reducing the time they spend on human verification, saving substantial costs. Coupled with Gemini Flash, they have received impressive results with a reduction in incorrect specs and better quality in product page content. For example, AdVon Commerce has recently helped one of the largest global retailers by using Vertex AI Prompt Optimizer and Gemini 1.5 Flash to automate the process of creating populating product attributes for hundreds of millions of items. This resulted in a 100x increase in productivity for the retailer, as it would have taken 100 times longer if they had tried to do that manually.
Vlad Barshai, Chief Technology Officer at AdVon Commerce, stated “Vertex AI Prompt Optimizer allows us to optimize our prompts for Gemini Flash with 10% incremental improvements for problematic Product Attributes and PDP (Product Detail Page) for retailer listings, significantly surpassing results from all other leading AI models on the market. With Vertex AI Prompt Optimizer, we save time on human verification, allowing us to enrich millions of products in a loop where we optimize prompts and generate AI attributes and PDP content at scale. Coupled with a solid human-in-the-loop process, Vertex AI Prompt Optimizer will help us produce high quality enrichment every time.”
Augmedix – a leader in ambient AI medical documentation and data solutions that has generated over 10 million medical notes to date – partnered with Google Cloud to enhance medical documentation for healthcare providers. Augmedix utilizes LLMs to improve efficiency and accuracy in capturing patient interactions, reduce clinician administrative burdens, and ultimately, improve patient care. Augmedix adopted a hybrid approach – models are fine-tuned and inputs are prompt-tuned. For many parts of note generation, fine-tuning is best, and basic prompts work well. In other parts of their system, where there may be hundreds of rules that instruct the LLM, prompt-tuning these rules is optimal.
Augmedix employed Vertex AI Prompt Optimizer to enhance medical note generation from doctor-patient conversations. The feature improved LLM output quality scores from 66% to 86%. In addition, with Vertex AI Prompt Optimizer, Augmedix can test prompt variations quickly, allowing for faster iteration and optimization. The optimized prompts run in 6 seconds, compared to 20 seconds for prompts without Vertex AI Prompt Optimizer.
Ian Shakil, Founder, Director, and Chief Strategy Officer at Augmedix, stated, “Our partnership with Google Cloud AI enabled us to pioneer the frontier of the LLM wave. With MedLM and Gemini, we have achieved revolutionary advancements, driving cutting-edge innovation in the digital health space. This collaboration empowers us to deliver higher quality outputs, reduced turnaround times, and a richer feature set.”
What’s next
If you want to know more about Vertex AI Prompt Optimizer, join the Vertex AI Google Cloud community to share your experiences, ask questions, and collaborate on new projects. Also check out the following resources:
Documentation
Github samples
- Getting started with Vertex AI Prompt Optimizer UI
- Getting started with Vertex AI Prompt Optimizer SDK
Google for Developers blog post
Medium Google Cloud Community blog post