Speech models

Dialogflow voice agents use Speech-to-Text for speech recognition, which is included in Dialogflow pricing. Dialogflow automatically selects a speech recognition model for you, but you can optionally specify the model.

Available models

All available models are listed at Speech-to-Text models. Select a model that is best suited to your domain and supports your agent language and speech features.

If a model is not explicitly specified, then Dialogflow auto-selects a model based on the audio configuration in API requests and agent settings.

If the enhanced speech model is enabled for the agent but no enhanced version of the specified model exists for the language, then speech is recognized using the standard version of the specified model.

The following models typically have the best performance:

telephony_short (best for telephony Dialogflow)
telephony (best for Agent Assist)
phone_call (good for Agent Assist and telephony Dialogflow)
latest_short (best for non-telephony Dialogflow)
command_and_search (best for languages where other models are not available)
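The recommendations above can be read as an ordered preference per use case. The following is a minimal sketch of that reading, not part of Dialogflow: the use-case labels, the `PREFERRED_MODELS` table, and the `pick_model` helper are all hypothetical, and the set of models supported for a language is assumed to come from the Speech-to-Text models listing.

```python
# Ordered preferences derived from the list above: the "best" model first,
# then the "good" model, then the fallback. The keys are made-up labels.
PREFERRED_MODELS = {
    "telephony_dialogflow": ["telephony_short", "phone_call", "command_and_search"],
    "agent_assist": ["telephony", "phone_call", "command_and_search"],
    "non_telephony_dialogflow": ["latest_short", "command_and_search"],
}


def pick_model(use_case, supported_models):
    """Return the first preferred model available for the language, or None.

    supported_models is a set of model names that the caller has looked up
    for the agent language (assumed input, not fetched here).
    """
    for model in PREFERRED_MODELS[use_case]:
        if model in supported_models:
            return model
    # No preferred model is available: leave the model unset so that
    # Dialogflow auto-selects one.
    return None
```

For example, if only phone_call and command_and_search are available for a language, `pick_model("telephony_dialogflow", {"phone_call", "command_and_search"})` returns phone_call.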

Specify a model

You can supply the model when calling the detectIntent or streamingDetectIntent methods on the Sessions type, or when configuring the ConversationProfile for Agent Assist.
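As an illustration, a detectIntent request body with an explicit model might be assembled as follows. This is a sketch under assumptions: the field layout (queryInput.audio.config.model) follows the Dialogflow CX REST request shape as the editor understands it, so verify it against the API reference for your Dialogflow edition. The helper name `build_detect_intent_body`, the MULAW encoding, and the 8000 Hz sample rate are placeholders, not recommendations from this page.

```python
import base64
import json


def build_detect_intent_body(audio_bytes, language_code, model=None,
                             sample_rate_hertz=8000):
    """Build a JSON-serializable detectIntent request body (hypothetical helper)."""
    config = {
        # Placeholder audio settings; use the values that match your audio.
        "audioEncoding": "AUDIO_ENCODING_MULAW",
        "sampleRateHertz": sample_rate_hertz,
    }
    if model is not None:
        # Omit the field entirely to let Dialogflow auto-select a model.
        config["model"] = model
    return {
        "queryInput": {
            "audio": {
                "config": config,
                # REST requests carry raw audio as a base64 string.
                "audio": base64.b64encode(audio_bytes).decode("ascii"),
            },
            "languageCode": language_code,
        }
    }


if __name__ == "__main__":
    body = build_detect_intent_body(b"\x00\x01", "en-US", model="telephony_short")
    print(json.dumps(body, indent=2))
```

Leaving `model` as `None` keeps the field out of the request, which matches the auto-selection behavior described earlier on this page.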
