Gemini Live API Now GA on Vertex AI
Fabien Blanc-paques
Group Product Manager, Vertex AI, Google Cloud
Today, we are excited to announce that Gemini Live API, powered by the latest Gemini 2.5 Flash Native Audio model, is generally available on Vertex AI.
Pioneering organizations have been using Gemini Live API to build the next generation of multimodal conversational AI that blends voice, vision, and text to deliver fluid, human-like, and highly contextual interactions. For Google Cloud customers, this means you can deploy low-latency voice and video agents with the stability and performance required for your most demanding workflows.
A new standard with real-time multimodal AI agents
Gemini Live API represents a new standard for bringing AI to life. Imagine an agent that doesn't just listen, but instantly understands the user's intent, the context of their screen, captures the emotion in their voice, and responds with a human-like voice — all in real time.
The power behind this dynamic capability is the Gemini 2.5 Flash Native Audio model. Our approach is based on a simple commitment: to bring the same high-quality conversational intelligence found in advanced experiences across Google directly to your enterprise applications.
In a real-time interaction, precision and speed are non-negotiable. Gemini Live API is natively multimodal and is designed to handle the instantaneous complexity of human dialogue:
- It can process interruptions mid-sentence without missing a beat, ensuring natural turn-taking.
- It understands acoustic cues like pitch and pace, deciphering intent and tone.
- It can see and discuss complex visual data (charts, live video, diagrams) shared by a user, providing immediate, contextual assistance.
The confidence to deploy on Vertex AI
Gemini Live API is engineered for enterprise success. Vertex AI provides the security and stability your mission-critical agents need for production.
The Gemini 2.5 Flash Native Audio model is optimized to process a high volume of concurrent interactions with consistent, low-latency performance. Deploying on Vertex AI allows you to leverage our expanding global infrastructure across multiple regions, delivering reliability for your users. Additionally, enterprise-grade data residency features allow you to manage where your data is processed, helping you meet critical regulatory and compliance standards.
Building real-world impact with Gemini Live API
The true power of Gemini Live API is demonstrated by the companies who are using it today to redefine their customer experiences.
Shopify, the leading global commerce platform, developed Sidekick, a multimodal AI assistant powered by Gemini Live API on Vertex AI. It provides personalized, robust support away from a desk, enabling real-time problem solving that eliminates traditional ticketing workflows.
“Users often forget they’re talking to AI within a minute of using Sidekick, and in some cases have thanked the bot after a long chat. This is an exciting time to be an entrepreneur. New AI capabilities offered through Gemini empower our merchants to win.” – David Wurtz, VP of Product, Shopify

United Wholesale Mortgage (UWM) transformed its business process by using its AI Loan Officer Assistant, Mia, to dramatically increase business efficiency for its broker partners.
“By integrating the Gemini 2.5 Flash Native Audio model and harnessing the Gemini Live API capabilities on the Vertex AI platform, we've significantly enhanced Mia's capabilities since launching in May 2025. This powerful combination has enabled us to generate over 14,000 loans for our broker partners, proving that AI is much more than just a buzzword at UWM.” – Jason Bressler, Chief Technology Officer, UWM

SightCall provides remote video support and AI-driven visual assistance, helping customer service and field teams solve problems faster.
“What makes this partnership so exciting is that the Gemini 2.5 Flash Native Audio model isn’t just fast — it’s seamlessly human. When combined with SightCall Xpert Knowledge™, it becomes a real-time expert that knows what your best technicians know... This is the future of visual support.” – Thomas Cottereau, CEO, SightCall

Napster uses the Gemini Live API’s vision and audio capabilities so their users can co-create and receive live guidance from specialized AI companions.
“By utilizing the Gemini 2.5 Flash Native Audio model on Vertex AI, we've built something we couldn't before: AI Companions that see you, see your screen, and respond like real experts in real-time conversation. This combination of vision and audio enables genuine collaboration — no prompting, no engineering — just natural dialogue where AI understands your full context and unlocks creativity and expertise for everyone.” – Edo Segal, CTO, Napster

Lumeris is deploying their health AI assistant, Tom, in high-stakes environments where nuance and emotional sensitivity are non-negotiable.
“The transition to the Gemini Live API on Vertex AI is a strategic investment in more intuitive and efficient patient conversations. The result is a more responsive and personalized voice experience. For Lumeris, our goal is elevating the quality of every interaction between patients and Tom, our agentic primary care team member. This helps us set a new standard for patient care.” – Jean-Claude Saghbini, President and Chief Technology Officer, Lumeris

Newo deploys versatile AI Receptionists that achieve a conversational quality that is truly lifelike and emotionally intuitive, handling tasks from general inquiries to sales.
“Working with the Gemini 2.5 Flash Native Audio model through Vertex AI allows Newo.ai AI Receptionists to achieve unmatched conversational intelligence — combining ultra-low latency with advanced reasoning. They can identify the main speaker even in noisy settings, switch languages mid-conversation, and sound remarkably natural and emotionally expressive. Our Gemini Live API-powered outbound AI Sales Agents can laugh, joke, and truly connect — making every call feel human.” – David Yang, co-founder, Newo.ai

11Sight is redefining customer interactions with AI-powered conversational agents that book appointments and close sales.
“The Gemini 2.5 Flash Native Audio model on Vertex AI gave us the enterprise-grade platform required to rapidly develop our voice AI agents with very low latency. Integrating this solution with our Sentinel AI Agents pushed our call resolution rates from 40% in February to 60% in November.” – Dr. Farokh Eskafi, CTO, 11Sight

Start building your next-generation agent today
Here is how you can start building with Gemini Live API on Vertex AI today:
- Try Gemini Live API now in Vertex AI Studio.
- Read the developer blog to dive deeper into creative use cases, code snippets, and a step-by-step guide for implementation.
- Explore our Gemini Live API documentation for API specifics, reference architectures, and even more demos.



