Try Gemini 1.5 Pro, our most advanced multimodal model in Vertex AI, and see what you can build with a 1M token context window.Try Gemini 1.5 Pro, our most advanced multimodal model in Vertex AI, and see what you can build with a 1M token context window.
Required. The natural language speech audio to be processed. A single request can contain up to 1 minute of speech audio data. The transcribed text cannot contain more than 256 bytes for virtual agent interactions.