For information about audio encoding, see Introduction to audio encoding in the Speech-to-Text documentation.
Supported audio encodings
The Media Translation API supports the audio codecs listed in the following table:
Codec | Name | Lossless | Usage Notes |
---|---|---|---|
AMR | Adaptive Multi-Rate Narrowband | No | Sample rate must be 8000 Hz |
AMR_WB | Adaptive Multi-Rate Wideband | No | Sample rate must be 16000 Hz |
FLAC | Free Lossless Audio Codec | Yes | 16-bit or 24-bit required for streams |
LINEAR16 | Linear PCM | Yes | 16-bit linear pulse-code modulation (PCM) encoding |
MP3 | MPEG Audio Layer III | No | Supports all standard MP3 bitrates (32 to 320 kbps). When using this encoding, sample_rate_hertz must match the sample rate of the audio file. |
MULAW | μ-law | No | 8-bit PCM encoding |
OGG_OPUS | Opus encoded audio frames in an Ogg container | No | Sample rate must be one of 8000 Hz, 12000 Hz, 16000 Hz, 24000 Hz, or 48000 Hz |
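
The sample-rate constraints in the table can be checked on the client before audio is sent. The following is a minimal sketch, not part of the Media Translation API: the names `ALLOWED_SAMPLE_RATES` and `check_sample_rate` are hypothetical, and the codec keys are simply the identifiers from the Codec column above. Codecs for which the table states no fixed sample rate are treated as unrestricted.

```python
from typing import Optional

# Allowed sample rates in Hz per codec, transcribed from the table above.
# None means the table states no fixed sample-rate restriction for that codec.
# (Hypothetical helper data; not an official API structure.)
ALLOWED_SAMPLE_RATES = {
    "AMR": {8000},
    "AMR_WB": {16000},
    "FLAC": None,
    "LINEAR16": None,
    "MP3": None,
    "MULAW": None,
    "OGG_OPUS": {8000, 12000, 16000, 24000, 48000},
}


def check_sample_rate(codec: str, sample_rate_hertz: int) -> Optional[str]:
    """Return an error message if the pair violates the table, else None."""
    if codec not in ALLOWED_SAMPLE_RATES:
        return f"unsupported codec: {codec}"
    allowed = ALLOWED_SAMPLE_RATES[codec]
    if allowed is not None and sample_rate_hertz not in allowed:
        rates = ", ".join(str(rate) for rate in sorted(allowed))
        return (
            f"{codec} requires a sample rate of {rates} Hz, "
            f"got {sample_rate_hertz} Hz"
        )
    return None


if __name__ == "__main__":
    for codec, rate in [("AMR", 8000), ("AMR_WB", 8000), ("OGG_OPUS", 44100)]:
        problem = check_sample_rate(codec, rate)
        print(codec, rate, "ok" if problem is None else problem)
```

This sketch does not inspect the audio itself. For MP3 in particular, it cannot confirm that sample_rate_hertz matches the sample rate of the file, so that check would still need to be done against the file's actual header.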