Google Cloud expands access to Gemini models for Vertex AI customers
Burak Gokturk
VP & GM, Cloud AI & Industry Solutions, Google Cloud
In December, Google announced Gemini, our most capable and general model yet. Since then, select customers like Samsung and Palo Alto Networks have been building sophisticated AI agents with Gemini models in Vertex AI, unlocking new levels of productivity, personalized learning, and more for their users. Today, we’re bringing more Gemini models to our customers with new updates and expanded availability:
- Gemini 1.0 Pro, our best model for scaling across AI tasks, is now generally available to all Vertex AI customers. Starting today, any developer can build with Gemini Pro in production. 1.0 Pro offers the best balance of quality, performance, and cost for most AI tasks, like content generation, editing, summarization, and classification.
- Gemini 1.0 Ultra, our most sophisticated and capable model for complex tasks, is now generally available on Vertex AI for customers via allowlist. 1.0 Ultra is designed for complex tasks, showing especially strong performance in areas such as complex instruction, code, reasoning, and multilinguality, and is optimized for high quality output.
In addition, we're excited to introduce a new generation of Gemini models with Gemini 1.5, which delivers improved performance on a more efficient architecture.
The first Gemini 1.5 model we’re releasing for early testing is Gemini 1.5 Pro, which is now in private preview on Vertex AI. It’s a mid-size multimodal model, optimized for scaling across a wide range of tasks, and performs at a similar level to 1.0 Ultra, our largest model to date. 1.5 Pro introduces a breakthrough experimental feature in long-context understanding — the longest context window of any large-scale foundation model yet. Developers can now run prompts of up to 1 million tokens in production. This means 1.5 Pro can process vast amounts of information in one go — including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code, or over 700,000 words.
Larger context windows enable models to reference more information, grasp narrative flow, maintain coherence over longer passages, and generate more contextually rich responses. For example, with 1.5 Pro, enterprises can:
- Accurately analyze an entire code library in a single prompt, without the need to fine-tune the model, including understanding and reasoning over small details that a developer might easily miss, such as errors, inefficiencies, and inconsistencies in code.
- Reason across very long documents, from comparing details across contracts to synthesizing and analyzing themes and opinions across analyst reports, research studies, or even a series of books.
- Analyze and compare content across hours of video, such as finding specific details in sports footage or getting caught up on detailed information from video meeting summaries that support precise question answering.
- Enable chatbots to hold long conversations without forgetting details, even over complex tasks or many follow-up interactions.
- Enable hyper-personalized experiences by pulling relevant user information into the prompt, without the complexity of fine-tuning a model.
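To make the scale of a 1-million-token window concrete, a minimal sketch of budgeting a long-context prompt is shown below. The 4-characters-per-token ratio is a rough heuristic, not Gemini’s actual tokenizer, and in practice you would count tokens with the Vertex AI SDK before sending a request:

```python
# Rough sketch: checking whether a corpus fits a 1M-token context window.
# The chars-per-token ratio is a crude approximation, not Gemini's real
# tokenizer; in production, count tokens with the Vertex AI SDK instead.

CONTEXT_WINDOW = 1_000_000  # Gemini 1.5 Pro preview limit (tokens)
CHARS_PER_TOKEN = 4         # heuristic approximation only

def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(documents: list[str], prompt: str) -> bool:
    """True if the combined documents and prompt fit the window."""
    total = estimate_tokens(prompt) + sum(estimate_tokens(d) for d in documents)
    return total <= CONTEXT_WINDOW

# A corpus on the order of 700,000 words fits in a single prompt:
docs = ["word " * 700_000]
print(fits_in_context(docs, "Summarize the themes across these documents."))
```

Budgeting like this is useful when deciding whether an entire codebase or document set can go into one prompt, or whether it still needs chunking or retrieval.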
How customers are innovating with Gemini models
Vertex AI has seen strong adoption, with API requests increasing nearly 6X from H1 to H2 last year. We are impressed by what customers are already doing with Gemini models, particularly given their multimodal capabilities and strong performance on complex reasoning.
Samsung: Samsung recently announced that their Galaxy S24 series is the first smartphone equipped with Gemini models. Starting with Samsung-native applications, customers can take advantage of summarization features across Notes and Voice Recorder. Samsung is confident their end users are protected with built-in security, safety, and privacy in Vertex AI.
Palo Alto Networks: Palo Alto Networks is testing Gemini models across a variety of use cases including intelligent product agents that let its customers interact with their product portfolio in a more intuitive way and reduce the time spent with customer support.
Jasper: Jasper, an AI offering that helps enterprise marketing teams create and repackage content, is using Gemini models to automatically generate blog content and product descriptions for their customers. Teams can now move faster while maintaining a high quality bar for content, ensuring it adheres to brand voice and marketing guidelines.
Quora: Quora, the popular question and answer platform, is using Gemini to help power creator monetization on their AI chat platform, Poe, where users can explore a wide variety of AI-powered bots. Gemini is enabling Poe creators to build custom bots across a variety of use cases including writing assistance, generating code, personalized learning, and more.
Build production-ready applications with the Gemini API in Vertex AI
The Gemini API in Vertex AI empowers developers to build the next generation of AI agents and apps — ones that can simultaneously process information across modalities like text, code, images, and video. To harness the power of the Gemini models, organizations and developers need to be able to build enterprise-grade applications and take them to production. Vertex AI is the only cloud AI platform to offer a single, integrated platform for models, tooling and infrastructure, ensuring that once applications are built with Gemini models, they can be easily deployed and maintained. With Vertex AI, customers can:
Customize Gemini models for specific business needs. The Gemini API in Vertex AI now supports adapter-based tuning such as Low-Rank Adaptation (LoRA), which allows developers to customize the model in an efficient, lower-cost way. Additional customization techniques like reinforcement learning from human feedback (RLHF) and distillation are coming to the Gemini API in the coming months.
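For readers unfamiliar with why adapter-based tuning is cheaper, here is a conceptual sketch of the LoRA idea itself. All dimensions and hyperparameters are illustrative and say nothing about Vertex AI’s actual tuning internals; the point is only that training touches two small matrices instead of the full weight matrix:

```python
import numpy as np

# Conceptual LoRA sketch: rather than updating a full weight matrix W
# (d_out x d_in), training learns two small matrices B and A whose product
# forms a low-rank update. Dimensions here are illustrative only.

d_out, d_in, rank = 512, 512, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weights
A = rng.standard_normal((rank, d_in))    # trainable down-projection
B = np.zeros((d_out, rank))              # trainable up-projection, zero-init

def adapted_forward(x: np.ndarray, alpha: float = 16.0) -> np.ndarray:
    """Forward pass with the low-rank update W + (alpha / rank) * B @ A."""
    return (W + (alpha / rank) * B @ A) @ x

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} (LoRA) vs {full_params} (full tuning)")
```

With these toy dimensions the adapter trains roughly 3% of the parameters of full fine-tuning, which is why adapter-based approaches are the efficient, lower-cost option.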
Augment Gemini model responses with up-to-the-minute information and enable the model to take action in the real world. With support for fully managed grounding, developers can improve the accuracy and relevance of the Gemini model’s answers using their enterprise’s own data. With function calling, now generally available, developers can connect the Gemini model to external APIs for transactions and other actions.
Manage and scale Gemini in production with purpose-built tools to help ensure that once applications are built, they can be easily deployed and maintained. Vertex AI offers an automated evaluation tool for generative AI models: Automatic Side by Side. This feature compares model responses against a standard set of criteria, which helps developers understand Gemini’s performance and adjust prompts and tuning based on that feedback.
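To illustrate the bookkeeping behind side-by-side evaluation, here is a toy sketch that scores two responses per prompt with a judge function and tallies the preferred model. The keyword-overlap judge is a deliberate placeholder; Automatic Side by Side uses an autorater model and a standard set of criteria, not word counting:

```python
# Toy sketch of side-by-side evaluation: score two candidate responses per
# prompt with a judge function and tally wins. The keyword-overlap judge is
# a placeholder; the real feature uses a model-based autorater.

def toy_judge(prompt: str, response: str) -> int:
    """Placeholder criterion: count prompt words the response covers."""
    prompt_words = set(prompt.lower().split())
    return sum(1 for w in response.lower().split() if w in prompt_words)

def side_by_side(cases: list[dict]) -> dict:
    """Tally which model the judge prefers on each prompt."""
    tally = {"model_a": 0, "model_b": 0, "tie": 0}
    for case in cases:
        a = toy_judge(case["prompt"], case["response_a"])
        b = toy_judge(case["prompt"], case["response_b"])
        winner = "model_a" if a > b else "model_b" if b > a else "tie"
        tally[winner] += 1
    return tally

cases = [{
    "prompt": "summarize the quarterly revenue report",
    "response_a": "The quarterly revenue report shows growth.",
    "response_b": "Sales went up.",
}]
print(side_by_side(cases))
```

Swapping a judge function over a fixed set of cases is what lets developers re-run the comparison after each prompt or tuning change and see whether the preferred model shifts.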
Build search and conversational agents with Gemini models with minimal coding expertise required, in hours and days instead of weeks and months:
- Vertex AI Search provides developers with an out-of-the-box, Google Search-quality information retrieval and answer generation system. With support for Gemini models, developers can build search applications with even more robust grounding, accurate citations, and satisfying answers.
- Vertex AI Conversation now offers developers the ability to build sophisticated gen AI powered conversational chatbots using Gemini models. With the advanced reasoning and multimodal capabilities of Gemini, developers can drive more personalized, informative and engaging conversational AI experiences in their applications.
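The grounding-with-citations pattern behind search applications can be sketched minimally: retrieve the most relevant passages for a query and return them with source indices so the generated answer can cite its evidence. The overlap scoring below is a toy stand-in, not Vertex AI Search’s retrieval, and the corpus is invented for illustration:

```python
# Toy sketch of grounding with citations: rank passages by relevance to a
# query and return the top results with their source indices as citations.
# The word-overlap score is a stand-in for real retrieval quality.

def score(query: str, passage: str) -> int:
    """Count query words appearing in the passage (toy relevance score)."""
    q = set(query.lower().split())
    return sum(1 for w in passage.lower().split() if w.strip(".,:;") in q)

def retrieve_with_citations(query: str, corpus: list[str], k: int = 2):
    """Return the top-k passages with their corpus indices as citations."""
    ranked = sorted(range(len(corpus)),
                    key=lambda i: score(query, corpus[i]), reverse=True)
    return [{"citation": i, "passage": corpus[i]} for i in ranked[:k]]

corpus = [
    "Refund policy: refunds are allowed within 30 days.",
    "The office is closed on public holidays.",
    "Each refund is issued to the original payment method.",
]
print(retrieve_with_citations("refund policy", corpus))
```

Passing the cited passages to the model alongside the user’s question is what grounds the answer in enterprise data and lets it point back to its sources.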
The Gemini era is just beginning — stay on the cutting edge
Developers can build production-grade applications on Vertex AI, which offers enterprise-grade model augmentation, testing, deployment, and management tools. In addition, developers can experience the Gemini models with the API in Google AI Studio, a free, web-based developer tool to prototype and launch apps quickly with an API key. With all of our new Gemini models now in our customers’ hands, we can’t wait to see the new generation of intelligent apps and agents they’ll create. The Gemini era is just beginning, however. If your organization wants to stay on the cutting edge, work with your account team to ensure you’re signed up to be a trusted tester of upcoming Gemini models. Be sure to join us in Las Vegas in April at Google Cloud Next ‘24 for our latest gen AI news and to explore our upcoming events for deep dives into products and strategies.