Class AudioInput (1.32.0)

AudioInput(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Represents the natural speech audio to be processed.


config
Required. Instructs the speech recognizer how to process the speech audio.
audio bytes
The natural language speech audio to be processed. A single request can contain up to 2 minutes of speech audio data. The transcribed text cannot contain more than 256 bytes. For non-streaming audio detect intent, both config and audio must be provided. For streaming audio detect intent, config must be provided in the first request and audio must be provided in all following requests.
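
The ordering rules above can be sketched with plain dictionaries standing in for the AudioInput fields. This is a minimal illustration, not the client library's own code: the helper names (`non_streaming_audio_payload`, `streaming_audio_payloads`, `transcript_within_limit`) are hypothetical, the config contents are placeholders, and measuring the 256-byte transcript limit in UTF-8 is an assumption. Only the AudioInput portion of a request is modeled here.

```python
from typing import Iterable, Iterator

# Limit on the transcribed text, as stated in the description above.
MAX_TRANSCRIPT_BYTES = 256


def non_streaming_audio_payload(config: dict, audio: bytes) -> dict:
    """Non-streaming detect intent: config and audio travel together."""
    return {"config": config, "audio": audio}


def streaming_audio_payloads(config: dict, chunks: Iterable[bytes]) -> Iterator[dict]:
    """Streaming detect intent: config first, then audio-only payloads."""
    # First request carries only the config.
    yield {"config": config}
    # Every following request carries only audio.
    for chunk in chunks:
        if chunk:  # skip empty chunks (a choice made for this sketch)
            yield {"audio": chunk}


def transcript_within_limit(text: str) -> bool:
    """Check the 256-byte limit, assuming the text is measured as UTF-8."""
    return len(text.encode("utf-8")) <= MAX_TRANSCRIPT_BYTES


if __name__ == "__main__":
    placeholder_config = {"sample_rate_hertz": 16000}  # illustrative value only
    for payload in streaming_audio_payloads(placeholder_config, [b"\x00\x01", b"\x02\x03"]):
        print(sorted(payload))
```

Because the constructor accepts a `mapping` argument, dictionaries of this shape should be usable as `AudioInput(mapping=payload)`, though that is worth confirming against the installed library version.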