News

Alibaba unveils a new speech recognition model covering 11 languages, noise-robust transcription, and even singing voice ...
Alibaba’s Marco-Voice pairs voice cloning with controllable emotion for more natural and expressive synthetic speech in ...
Abstract: Visual Speech Recognition (lip-reading) has witnessed tremendous improvements, reaching word error rates as low as 12.8 WER in English. However, the ...