Proactive Audio helps Gemini hold more natural conversations by letting you control when it responds and in which contexts, which reduces unwanted interruptions. For example, you can ask Gemini to respond only when prompted or when specific topics are discussed. To see Proactive Audio in action, check out a demonstration of the features.
This guide covers how Proactive Audio works, how to integrate it into your application, and which tokens you are billed for. It doesn't cover the price list for Proactive Audio; for full pricing details, see Vertex AI pricing. This guide assumes you are either working in Vertex AI Studio or using the Google Gen AI SDK for Python.
Supported models
You can use Proactive Audio with the following models:
Model version | Availability level |
---|---|
gemini-live-2.5-flash-preview-native-audio-09-2025 | Public preview |
gemini-live-2.5-flash-preview-native-audio | Public preview; Discontinuation date: October 17, 2025 |
Use Proactive Audio
Proactive Audio is not enabled by default in gemini-live-2.5-flash-preview-native-audio-09-2025.
To use Proactive Audio, configure the proactivity field in the setup message and set proactive_audio to true:
Python
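```python
from google.genai.types import LiveConnectConfig, ProactivityConfig

# Request audio responses and turn on Proactive Audio for the session.
config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
)
```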
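If your application sends the setup message directly instead of going through the SDK, the same setting can be sketched as a plain dictionary. This is a minimal sketch under stated assumptions: the proactivity and proactive_audio names come from this guide, while the setup, model, and generation_config keys and the build_setup_message helper are hypothetical and used only for illustration. Check the Live API reference for the exact message schema.

```python
import json


def build_setup_message(model: str, proactive_audio: bool = True) -> dict:
    """Build an illustrative setup payload (hypothetical helper).

    Only "proactivity" and "proactive_audio" are taken from this guide;
    the surrounding keys are assumptions about the message shape.
    """
    return {
        "setup": {
            "model": model,
            "generation_config": {"response_modalities": ["AUDIO"]},
            "proactivity": {"proactive_audio": proactive_audio},
        }
    }


message = build_setup_message(
    "gemini-live-2.5-flash-preview-native-audio-09-2025"
)
# Serialize the payload the way it might be sent over the wire.
print(json.dumps(message, indent=2))
```

Keeping the flag as a parameter makes it easy to compare a session with Proactive Audio on against one with it off.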
Have a conversation using Proactive Audio
You can initiate a conversation with Gemini using Proactive Audio and define when Gemini can respond, limiting its responses to relevant topics.
For example, the following is a sample of what a conversation with Gemini about cooking might look like:
Prompt: "You are an AI assistant in Italian cooking; only chime in when the topic is about Italian cooking."
Speaker A: "I really love cooking!" (No response from Gemini.)
Speaker B: "Oh yes, me too! My favorite is French cuisine." (No response from Gemini.)
Speaker A: "I really like Italian food; do you know how to make a pizza?" (The Italian cooking topic triggers a response from Gemini.)
Live API: "I'd be happy to help! Here's a recipe for a pizza."
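This guide doesn't say where the prompt is supplied. One plausible place is the session's system instruction, set alongside the proactivity setting. The sketch below assumes that: the system_instruction key and the build_live_config_kwargs helper are hypothetical, and only proactivity, proactive_audio, and response_modalities appear elsewhere in this guide.

```python
# Prompt text taken from the sample conversation above.
ITALIAN_COOKING_PROMPT = (
    "You are an AI assistant in Italian cooking; only chime in when "
    "the topic is about Italian cooking."
)


def build_live_config_kwargs(prompt: str) -> dict:
    """Collect session settings in one place (hypothetical helper).

    Assumption: the prompt is passed as a system instruction. Verify the
    field name against the SDK reference before relying on it.
    """
    return {
        "response_modalities": ["AUDIO"],
        "system_instruction": prompt,
        "proactivity": {"proactive_audio": True},
    }


config_kwargs = build_live_config_kwargs(ITALIAN_COOKING_PROMPT)
print(config_kwargs["system_instruction"])
```

Narrowing or widening the topics Gemini reacts to then only requires changing the prompt string, not the rest of the session setup.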
Features
When using Proactive Audio, Gemini responds with minimal latency after the user finishes speaking. This reduces interruptions and helps Gemini retain context if an interruption does happen.
Proactive Audio also helps Gemini ignore background noise and external chatter, so Gemini doesn't respond when external chatter is introduced during a conversation.
If the user needs to interrupt while Gemini is responding, Proactive Audio makes it easier for Gemini to back-channel appropriately (meaning genuine interruptions are handled), as opposed to reacting when the user merely uses filler words such as "umm" or "uhh".
Gemini can co-listen to an audio file that isn't the speaker's voice and answer questions about that audio file later in the conversation.
Billing
While Gemini is listening to a conversation, you're charged for input audio tokens.
For output audio tokens, you're charged only when Gemini responds. If Gemini doesn't respond or stays silent, you aren't charged for output audio tokens.
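The billing rule above can be sketched as a small tally. The token counts are made-up numbers used only for illustration, not measurements, and the Turn class and billable_tokens function are hypothetical names rather than part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One stretch of conversation audio (hypothetical structure)."""
    input_audio_tokens: int   # audio Gemini listened to; always billed
    output_audio_tokens: int  # 0 when Gemini stays silent


def billable_tokens(turns: list[Turn]) -> tuple[int, int]:
    """Return (input, output) audio token totals for a session."""
    total_input = sum(t.input_audio_tokens for t in turns)
    total_output = sum(t.output_audio_tokens for t in turns)
    return total_input, total_output


# Illustrative numbers only: Gemini stays silent for two turns,
# then responds on the third.
turns = [Turn(120, 0), Turn(150, 0), Turn(200, 340)]
input_total, output_total = billable_tokens(turns)
print(input_total, output_total)  # 470 340
```

Input tokens accumulate across all three turns, while output tokens come only from the turn in which Gemini actually responded.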
For more information, see Vertex AI pricing.