AudioInput

Represents the natural language speech audio to be processed.

JSON representation
{
  "config": {
    object (InputAudioConfig)
  },
  "audio": string
}
Fields
config

object (InputAudioConfig)

Required. Instructs the speech recognizer how to process the speech audio.

audio

string (bytes format)

Required. The natural language speech audio to be processed. A single request can contain up to 2 minutes of speech audio data. For virtual agent interactions, the text transcribed from the audio cannot exceed 256 bytes.

A base64-encoded string.
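
Example

The following is a minimal sketch of how a client might assemble an AudioInput JSON body, base64-encoding the raw audio bytes as the audio field requires. The helper name build_audio_input is hypothetical, and the keys inside the example config are assumptions for illustration only; this page does not list the fields of InputAudioConfig, so check that reference before relying on them.

```python
import base64
import json


def build_audio_input(audio_bytes: bytes, config: dict) -> dict:
    """Return a dict shaped like the AudioInput JSON representation.

    Hypothetical helper, not part of any client library.
    """
    # Both fields are marked Required in the reference above.
    if not audio_bytes:
        raise ValueError("audio is required")
    if not config:
        raise ValueError("config is required")
    return {
        "config": config,
        # A bytes-format field travels as base64 text in JSON.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
    }


# Assumed config contents; the real field names come from InputAudioConfig.
example_config = {
    "audioEncoding": "AUDIO_ENCODING_LINEAR_16",
    "sampleRateHertz": 16000,
}

# Four placeholder bytes stand in for real speech audio.
audio_input = build_audio_input(b"\x00\x01\x02\x03", example_config)
print(json.dumps(audio_input, indent=2))
```

The helper only checks that both required fields are present. It does not enforce the 2-minute audio limit, since doing so would depend on the encoding and sample rate described by the config.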