This page shows how to stream audio input to a detect intent request using the API. Dialogflow processes the audio and converts it to text before attempting an intent match. This conversion is known as audio input, speech recognition, speech-to-text, or STT.
Before you begin
This feature is only applicable when using the API for end-user interactions. If you are using an integration, you can skip this guide.
You should do the following before reading this guide:
- Read Dialogflow basics.
- Perform setup steps.
Create an agent
If you have not already created an agent, create one now:
- Go to the Dialogflow ES console.
- If requested, sign in to the Dialogflow Console. See Dialogflow console overview for more information.
- Click Create Agent in the left sidebar menu. (If you already have other agents, click the agent name, scroll to the bottom and click Create new agent.)
- Enter your agent's name, default language, and default time zone.
- If you have already created a project, enter that project. If you want to allow the Dialogflow Console to create the project, select Create a new Google project.
- Click the Create button.
Import the example file to your agent
The steps in this guide make assumptions about your agent, so you need to import an agent prepared for this guide. When importing, these steps use the restore option, which overwrites all agent settings, intents, and entities.
To import the file, follow these steps:
- Download the room-booking-agent.zip file.
- Go to the Dialogflow ES console.
- Select your agent.
- Click the settings button next to the agent name.
- Select the Export and Import tab.
- Select Restore From Zip and follow instructions to restore the zip file that you downloaded.
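If you prefer to script this step, the agent zip can also be restored through the API. The following is a minimal sketch, assuming the Python client library (google-cloud-dialogflow) and Application Default Credentials; the function name, project ID, and file path are placeholders rather than part of this guide.
```
# Optional sketch (assumed Python client library: google-cloud-dialogflow).
# Restores the downloaded agent zip through the API instead of the console.
from google.cloud import dialogflow


def restore_agent_from_zip(project_id, zip_path):
    """Overwrites the project's agent with the contents of the zip file."""
    agents_client = dialogflow.AgentsClient()
    with open(zip_path, "rb") as zip_file:
        agent_content = zip_file.read()

    # restore_agent returns a long-running operation; wait for it to complete.
    operation = agents_client.restore_agent(
        request={
            "parent": f"projects/{project_id}",
            "agent_content": agent_content,
        }
    )
    operation.result()


# Example call (hypothetical values):
# restore_agent_from_zip("my-project-id", "room-booking-agent.zip")
```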
Streaming basics
The Session type's streamingDetectIntent method returns a bidirectional gRPC streaming object. The available methods for this object vary by language, so see the reference documentation for your client library for details. The streaming object is used to send and receive data concurrently. Using this object, your client streams audio content to Dialogflow, while concurrently listening for a StreamingDetectIntentResponse.
The streamingDetectIntent method has a query_input.audio_config.single_utterance parameter that affects speech recognition:
- If false (default), speech recognition does not cease until the client closes the stream.
- If true, Dialogflow will detect a single spoken utterance in input audio. When Dialogflow detects that the audio's voice has stopped or paused, it ceases speech recognition and sends a StreamingDetectIntentResponse with a recognition result of END_OF_SINGLE_UTTERANCE to your client. Any audio sent to Dialogflow on the stream after receipt of END_OF_SINGLE_UTTERANCE is ignored by Dialogflow (see the sketch after this list).
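To make the parameter concrete, here is a minimal sketch of the initial configuration request, assuming the Python client library (google-cloud-dialogflow); the encoding, sample rate, language, and session path are illustrative placeholders.
```
# Minimal sketch (assumed Python client library: google-cloud-dialogflow).
# The first streaming request carries only the session and audio configuration,
# including single_utterance; subsequent requests carry raw audio bytes.
from google.cloud import dialogflow

audio_config = dialogflow.InputAudioConfig(
    audio_encoding=dialogflow.AudioEncoding.AUDIO_ENCODING_LINEAR_16,  # illustrative encoding
    sample_rate_hertz=16000,                                           # illustrative sample rate
    language_code="en-US",
    single_utterance=True,  # stop recognition after one spoken utterance
)
query_input = dialogflow.QueryInput(audio_config=audio_config)
config_request = dialogflow.StreamingDetectIntentRequest(
    session="projects/PROJECT_ID/agent/sessions/SESSION_ID",  # placeholder session path
    query_input=query_input,
)
```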
In bidirectional streaming, a client can half-close the stream object to signal to the server that it won't send more data. For example, in Java and Go, this method is called closeSend.
It is important to half-close (but not abort) streams in the following situations:
- Your client has finished sending data.
- Your client is configured with single_utterance set to true, and it receives a StreamingDetectIntentResponse with a recognition result of END_OF_SINGLE_UTTERANCE.
After closing a stream, your client should start a new request with a new stream as needed.
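As a sketch of this pattern in Python, where there is no explicit closeSend and the stream is half-closed when the request generator returns, the generator below stops yielding audio once END_OF_SINGLE_UTTERANCE is observed. The helper name, audio chunk source, and result handling are assumptions for illustration, not part of the official samples.
```
# Sketch of half-closing after END_OF_SINGLE_UTTERANCE (assumed Python client
# library: google-cloud-dialogflow). With this library, the stream is half-closed
# when the request generator returns, so the generator simply stops yielding.
import threading

from google.cloud import dialogflow


def stream_until_end_of_utterance(session_path, query_input, audio_chunks):
    """audio_chunks: any iterable of raw audio byte strings (an assumption)."""
    stop_sending = threading.Event()

    def request_generator():
        # First request: session and audio configuration (single_utterance=True).
        yield dialogflow.StreamingDetectIntentRequest(
            session=session_path, query_input=query_input
        )
        for chunk in audio_chunks:
            if stop_sending.is_set():
                break  # returning here half-closes the stream
            yield dialogflow.StreamingDetectIntentRequest(input_audio=chunk)

    session_client = dialogflow.SessionsClient()
    responses = session_client.streaming_detect_intent(requests=request_generator())

    end_of_utterance = dialogflow.StreamingRecognitionResult.MessageType.END_OF_SINGLE_UTTERANCE
    final_result = None
    for response in responses:
        if response.recognition_result.message_type == end_of_utterance:
            # Dialogflow ignores further audio, so stop sending and half-close.
            stop_sending.set()
        if response.query_result.query_text:
            final_result = response.query_result
    return final_result
```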
Streaming detect intent
The following samples use the Session type's streamingDetectIntent method to stream audio.
To authenticate to Dialogflow, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
C#: Please follow the C# setup instructions on the client libraries page and then visit the Dialogflow reference documentation for .NET.
PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Dialogflow reference documentation for PHP.
Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Dialogflow reference documentation for Ruby.
Go, Java, Node.js, Python: please follow the setup instructions for your language on the client libraries page and then visit the Dialogflow reference documentation for that language.
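As a rough Python illustration of the overall call shape, the following sketch streams a local audio file to streamingDetectIntent and prints the matched intent. It assumes the google-cloud-dialogflow package, Application Default Credentials, and 16 kHz LINEAR16 audio; the function name, IDs, and file path are placeholders rather than the official sample.
```
# Rough sketch (assumed Python client library: google-cloud-dialogflow).
# Streams a local 16 kHz LINEAR16 audio file and prints the matched intent.
from google.cloud import dialogflow


def detect_intent_stream(project_id, session_id, audio_file_path, language_code="en-US"):
    session_client = dialogflow.SessionsClient()
    session_path = session_client.session_path(project_id, session_id)

    audio_config = dialogflow.InputAudioConfig(
        audio_encoding=dialogflow.AudioEncoding.AUDIO_ENCODING_LINEAR_16,
        sample_rate_hertz=16000,
        language_code=language_code,
        single_utterance=True,
    )
    query_input = dialogflow.QueryInput(audio_config=audio_config)

    def request_generator():
        # The first request carries the session and audio configuration.
        yield dialogflow.StreamingDetectIntentRequest(
            session=session_path, query_input=query_input
        )
        # Subsequent requests carry the audio payload in small chunks.
        with open(audio_file_path, "rb") as audio_file:
            while True:
                chunk = audio_file.read(4096)
                if not chunk:
                    break
                yield dialogflow.StreamingDetectIntentRequest(input_audio=chunk)

    responses = session_client.streaming_detect_intent(requests=request_generator())

    query_result = None
    for response in responses:
        # Intermediate responses carry interim speech recognition results;
        # the final response carries the detect intent result.
        if response.recognition_result.transcript:
            print("Interim transcript:", response.recognition_result.transcript)
        if response.query_result.query_text:
            query_result = response.query_result

    if query_result:
        print("Query text:", query_result.query_text)
        print("Matched intent:", query_result.intent.display_name)
        print("Fulfillment text:", query_result.fulfillment_text)


# Example call (hypothetical values):
# detect_intent_stream("my-project-id", "my-session-id", "room_booking_request.wav")
```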
Samples
See the samples page for best practices on streaming from a browser microphone to Dialogflow.