Detecting Intent from a Stream

Here is an example of detecting intent by streaming to a Dialogflow agent.

Import the sample Dialogflow agent

This example uses a sample agent that you must import before you run the example code. To import the sample agent:

  1. Create a Dialogflow Enterprise agent to use with the sample code. Keep in mind that you can have only one Dialogflow Enterprise agent per Google Cloud Platform project.

    For an example of how to create a Dialogflow Enterprise Agent, see the Quickstart.

  2. Download the file that contains the sample agent.
  3. Go to the Dialogflow console and select the settings for your Dialogflow Enterprise agent.
  4. Select the Export and Import tab.

    WARNING: Importing the sample agent adds intents and entities to your Dialogflow agent. You might want to use a different Google Cloud Platform project, or export your existing Dialogflow agent before importing the sample agent so that you keep a copy of your agent from before the import.

  5. Select IMPORT FROM ZIP and import the file that you downloaded.

Detect intent


For more on installing and creating a Dialogflow client for Java, refer to Dialogflow Client Libraries.

import com.google.api.gax.rpc.ApiStreamObserver;
import com.google.cloud.dialogflow.v2.AudioEncoding;
import com.google.cloud.dialogflow.v2.InputAudioConfig;
import com.google.cloud.dialogflow.v2.QueryInput;
import com.google.cloud.dialogflow.v2.QueryResult;
import com.google.cloud.dialogflow.v2.SessionName;
import com.google.cloud.dialogflow.v2.SessionsClient;
import com.google.cloud.dialogflow.v2.StreamingDetectIntentRequest;
import com.google.cloud.dialogflow.v2.StreamingDetectIntentResponse;
import com.google.protobuf.ByteString;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/**
 * Returns the result of detect intent with streaming audio as input.
 *
 * Using the same `session_id` between requests allows continuation of the conversation.
 * @param projectId Project/Agent Id.
 * @param audioFilePath The audio file to be processed.
 * @param sessionId Identifier of the DetectIntent session.
 * @param languageCode Language code of the query.
 */
public static void detectIntentStream(String projectId, String audioFilePath, String sessionId,
    String languageCode) throws Throwable {
  // Start bi-directional StreamingDetectIntent stream.
  final CountDownLatch notification = new CountDownLatch(1);
  final List<Throwable> responseThrowables = new ArrayList<>();
  final List<StreamingDetectIntentResponse> responses = new ArrayList<>();

  // Instantiates a client
  try (SessionsClient sessionsClient = SessionsClient.create()) {
    // Set the session name using the sessionId (UUID) and projectID (my-project-id)
    SessionName session = SessionName.of(projectId, sessionId);
    System.out.println("Session Path: " + session.toString());

    // Note: hard coding audioEncoding and sampleRateHertz for simplicity.
    // Audio encoding of the audio content sent in the query request.
    AudioEncoding audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16;
    int sampleRateHertz = 16000;

    // Instructs the speech recognizer how to process the audio content.
    InputAudioConfig inputAudioConfig = InputAudioConfig.newBuilder()
        .setAudioEncoding(audioEncoding) // audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16
        .setLanguageCode(languageCode) // languageCode = "en-US"
        .setSampleRateHertz(sampleRateHertz) // sampleRateHertz = 16000
        .build();

    ApiStreamObserver<StreamingDetectIntentResponse> responseObserver =
        new ApiStreamObserver<StreamingDetectIntentResponse>() {
          @Override
          public void onNext(StreamingDetectIntentResponse response) {
            // Do something when receive a response
            responses.add(response);
          }

          @Override
          public void onError(Throwable t) {
            // Add error-handling
            responseThrowables.add(t);
            notification.countDown();
          }

          @Override
          public void onCompleted() {
            // Do something when complete.
            notification.countDown();
          }
        };
    // Performs the streaming detect intent callable request
    ApiStreamObserver<StreamingDetectIntentRequest> requestObserver =
        sessionsClient.streamingDetectIntentCallable().bidiStreamingCall(responseObserver);

    // Build the query with the InputAudioConfig
    QueryInput queryInput = QueryInput.newBuilder().setAudioConfig(inputAudioConfig).build();

    try (FileInputStream audioStream = new FileInputStream(audioFilePath)) {
      // The first request contains the configuration
      StreamingDetectIntentRequest request = StreamingDetectIntentRequest.newBuilder()
          .setSession(session.toString())
          .setQueryInput(queryInput)
          .build();

      // Make the first request
      requestObserver.onNext(request);

      // Following messages: audio chunks. We just read the file in fixed-size chunks. In reality
      // you would split the user input by time.
      byte[] buffer = new byte[4096];
      int bytes;
      while ((bytes = audioStream.read(buffer)) != -1) {
        requestObserver.onNext(
            StreamingDetectIntentRequest.newBuilder()
                .setInputAudio(ByteString.copyFrom(buffer, 0, bytes))
                .build());
      }
    } catch (RuntimeException e) {
      // Cancel stream.
      requestObserver.onError(e);
    }
    // Half-close the stream.
    requestObserver.onCompleted();
    // Wait for the final response (without explicit timeout).
    notification.await();
    // Process errors/responses.
    if (!responseThrowables.isEmpty()) {
      throw responseThrowables.get(0);
    }
    if (responses.isEmpty()) {
      throw new RuntimeException("No response from Dialogflow.");
    }
    for (StreamingDetectIntentResponse response : responses) {
      if (response.hasRecognitionResult()) {
        System.out.format(
            "Intermediate transcript: '%s'\n", response.getRecognitionResult().getTranscript());
      }
    }

    // Display the last query result
    QueryResult queryResult = responses.get(responses.size() - 1).getQueryResult();
    System.out.format("Query Text: '%s'\n", queryResult.getQueryText());
    System.out.format("Detected Intent: %s (confidence: %f)\n",
        queryResult.getIntent().getDisplayName(), queryResult.getIntentDetectionConfidence());
    System.out.format("Fulfillment Text: '%s'\n", queryResult.getFulfillmentText());
  }
}
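
For reference, here is a minimal sketch of how the method above might be invoked. The project ID, audio file path, and language code are placeholder values, and java.util.UUID is used only to generate a fresh session ID; reuse the same session ID across calls to continue a conversation.

// Example invocation with placeholder values (not part of the sample above).
String projectId = "my-project-id";                   // your Google Cloud project ID
String audioFilePath = "resources/book_a_room.wav";   // 16 kHz LINEAR16 audio file
String sessionId = java.util.UUID.randomUUID().toString();
String languageCode = "en-US";

detectIntentStream(projectId, audioFilePath, sessionId, languageCode);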


For more on installing and creating a Dialogflow client for Node.js, refer to Dialogflow Client Libraries.

// Imports the Dialogflow library
const dialogflow = require('dialogflow');

// Node.js modules used to read and pipe the audio file into the request stream.
const fs = require('fs');
const pump = require('pump');
const through2 = require('through2');

// Instantiates a session client
const sessionClient = new dialogflow.SessionsClient();

// The path to the local file on which to perform speech recognition, e.g.
// /path/to/audio.raw
// const filename = '/path/to/audio.raw';

// The encoding of the audio file, e.g. 'AUDIO_ENCODING_LINEAR_16'
// const encoding = 'AUDIO_ENCODING_LINEAR_16';

// The sample rate of the audio file in hertz, e.g. 16000
// const sampleRateHertz = 16000;

// The BCP-47 language code to use, e.g. 'en-US'
// const languageCode = 'en-US';
let sessionPath = sessionClient.sessionPath(projectId, sessionId);

const initialStreamRequest = {
  session: sessionPath,
  queryParams: {
    session: sessionClient.sessionPath(projectId, sessionId),
  },
  queryInput: {
    audioConfig: {
      audioEncoding: encoding,
      sampleRateHertz: sampleRateHertz,
      languageCode: languageCode,
    },
    singleUtterance: true,
  },
};

// Create a stream for the streaming request.
const detectStream = sessionClient
  .streamingDetectIntent()
  .on('error', console.error)
  .on('data', data => {
    if (data.recognitionResult) {
      console.log(
        `Intermediate transcript: ${data.recognitionResult.transcript}`
      );
    } else {
      console.log(`Detected intent:`);
      // logQueryResult is a helper defined elsewhere in the sample that
      // prints the matched intent and fulfillment text.
      logQueryResult(sessionClient, data.queryResult);
    }
  });

// Write the initial stream request to config for audio input.
detectStream.write(initialStreamRequest);

// Stream an audio file from disk to the Conversation API, e.g.
// "./resources/audio.raw"
pump(
  fs.createReadStream(filename),
  // Format the audio stream into the request format.
  through2.obj((obj, _, next) => {
    next(null, {inputAudio: obj});
  }),
  detectStream
);
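
The snippet above assumes that projectId, sessionId, filename, encoding, sampleRateHertz, and languageCode have already been declared before it runs, and that logQueryResult is a helper that prints the query result. A minimal sketch of those declarations, using placeholder values, might look like this:

// Placeholder values for the variables the snippet above assumes (adjust to your project).
const projectId = 'my-project-id';            // your Google Cloud project ID
const sessionId = 'example-session-id';       // any unique string; reuse it to continue a conversation
const filename = './resources/audio.raw';     // raw LINEAR16 audio sampled at 16000 Hz
const encoding = 'AUDIO_ENCODING_LINEAR_16';
const sampleRateHertz = 16000;
const languageCode = 'en-US';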


For more on installing and creating a Dialogflow client for Python, refer to Dialogflow Client Libraries.

import dialogflow


def detect_intent_stream(project_id, session_id, audio_file_path,
                         language_code):
    """Returns the result of detect intent with streaming audio as input.

    Using the same `session_id` between requests allows continuation
    of the conversation."""
    session_client = dialogflow.SessionsClient()

    # Note: hard coding audio_encoding and sample_rate_hertz for simplicity.
    audio_encoding = dialogflow.enums.AudioEncoding.AUDIO_ENCODING_LINEAR_16
    sample_rate_hertz = 16000

    session = session_client.session_path(project_id, session_id)
    print('Session path: {}\n'.format(session))

    def request_generator(audio_config, audio_file_path):
        query_input = dialogflow.types.QueryInput(audio_config=audio_config)

        # The first request contains the configuration.
        yield dialogflow.types.StreamingDetectIntentRequest(
            session=session, query_input=query_input)

        # Here we are reading small chunks of audio data from a local
        # audio file.  In practice these chunks should come from
        # an audio input device.
        with open(audio_file_path, 'rb') as audio_file:
            while True:
                chunk = audio_file.read(4096)
                if not chunk:
                    break
                # The later requests contain audio data.
                yield dialogflow.types.StreamingDetectIntentRequest(
                    input_audio=chunk)

    audio_config = dialogflow.types.InputAudioConfig(
        audio_encoding=audio_encoding, language_code=language_code,
        sample_rate_hertz=sample_rate_hertz)

    requests = request_generator(audio_config, audio_file_path)
    responses = session_client.streaming_detect_intent(requests)

    print('=' * 20)
    for response in responses:
        print('Intermediate transcript: "{}".'.format(
            response.recognition_result.transcript))

    # Note: The result from the last response is the final transcript along
    # with the detected content.
    query_result = response.query_result

    print('=' * 20)
    print('Query text: {}'.format(query_result.query_text))
    print('Detected intent: {} (confidence: {})\n'.format(
        query_result.intent.display_name,
        query_result.intent_detection_confidence))
    print('Fulfillment text: {}\n'.format(
        query_result.fulfillment_text))
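
For reference, a minimal sketch of how the function above might be called; the project ID and audio file path are placeholder values, and uuid is used only to generate a fresh session ID.

import uuid

# Example invocation with placeholder values: streams a local 16 kHz
# LINEAR16 audio file to the sample agent.
detect_intent_stream(
    project_id='my-project-id',
    session_id=str(uuid.uuid4()),
    audio_file_path='resources/book_a_room.wav',
    language_code='en-US')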
