Speech models

Dialogflow voice agents use Speech-to-Text for speech recognition, and its use is included in Dialogflow pricing. Dialogflow automatically selects a speech recognition model for you, but you can optionally specify one yourself.

Available models

All available models are listed on the Speech-to-Text models page. Select the model that is best suited to your domain and that supports your agent's language and the speech features you need.

If a model is not explicitly specified, then Dialogflow auto-selects a model based on the audio configuration in API requests and agent settings.

The following models typically have the best performance:

  • telephony_short (best for telephony Dialogflow)
  • telephony (best for Agent Assist; also good for telephony Dialogflow when advanced timeout-based end of speech sensitivity is enabled)
  • phone_call (good for Agent Assist and telephony Dialogflow)
  • latest_short (best for non-telephony Dialogflow)
  • command_and_search (best for languages where other models are not available)
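As a rough illustration of the recommendations above, the following sketch maps a use case to a model name and falls back to command_and_search when the preferred model is not offered for the agent language. The use-case keys, the function name, and the supported_models argument are hypothetical; the set of models actually supported for a language has to come from the Speech-to-Text models list.

```python
# Recommended model per use case, taken from the list above.
# The dictionary keys are hypothetical labels, not Dialogflow identifiers.
RECOMMENDED_MODELS = {
    "telephony_dialogflow": "telephony_short",
    "agent_assist": "telephony",
    "non_telephony_dialogflow": "latest_short",
}

# Listed above as best for languages where other models are not available.
FALLBACK_MODEL = "command_and_search"


def recommend_model(use_case, supported_models):
    """Return a model name for the use case, or None to let Dialogflow auto-select.

    supported_models is the set of model names available for the agent
    language (an assumption: the caller looks this up separately).
    """
    preferred = RECOMMENDED_MODELS.get(use_case)
    if preferred in supported_models:
        return preferred
    if FALLBACK_MODEL in supported_models:
        return FALLBACK_MODEL
    return None


choice = recommend_model("telephony_dialogflow", {"telephony_short", "phone_call"})
fallback = recommend_model("agent_assist", {"command_and_search"})
```

Returning None here stands for leaving the model unspecified, in which case Dialogflow auto-selects one.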

Specify a model

You can supply the model for an agent, flow, or page with the model selection setting.

You can also supply the model when calling the Sessions.detectIntent or Sessions.streamingDetectIntent methods, or when configuring the ConversationProfile for Agent Assist.

Session reference by protocol and version:

Protocol    V3                   V3beta1
REST        Session resource     Session resource
RPC         Session interface    Session interface
C++         SessionsClient       Not available
C#          SessionsClient       Not available
Go          SessionsClient       Not available
Java        SessionsClient       SessionsClient
Node.js     SessionsClient       SessionsClient
PHP         Not available        Not available
Python      SessionsClient       SessionsClient
Ruby        Not available        Not available

Specifying the model in a detect intent or conversation profile API call overrides any model selections applied to the agent, flow, or page, unless you enable the Override request-level speech model setting.
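A request-level model selection might look like the following minimal sketch, which only builds the JSON body for a REST detectIntent call. It assumes the V3 field layout queryInput.audio.config.model and the AUDIO_ENCODING_LINEAR_16 encoding name; check both against the Session reference before relying on them. The helper name, the sample rate, and the audio bytes are placeholders.

```python
import base64
import json


def build_detect_intent_body(audio_bytes, model, language_code="en",
                             sample_rate_hertz=8000):
    """Build a detectIntent request body that names a speech model.

    Field names follow the assumed V3 REST layout; nothing is sent here.
    """
    return {
        "queryInput": {
            "audio": {
                "config": {
                    "audioEncoding": "AUDIO_ENCODING_LINEAR_16",
                    "sampleRateHertz": sample_rate_hertz,
                    # Request-level speech model selection.
                    "model": model,
                },
                # REST carries raw audio bytes as a base64 string.
                "audio": base64.b64encode(audio_bytes).decode("ascii"),
            },
            "languageCode": language_code,
        }
    }


# Placeholder audio; a real call would read bytes from a recording.
body = build_detect_intent_body(b"\x00\x01", model="telephony_short")
print(json.dumps(body, indent=2))
```

Omitting the model key leaves the choice to the agent, flow, or page settings, or to auto-selection.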