- Ananya Misra
- David Qiu
- Dongseong Hwang
- Françoise Beaufays
- Khe Chai Sim
- Nikhil Siddhartha
- Shefali Garg
- Trevor Strohman
- Yanzhang (Ryan) He
- Zhouyuan Huo
Abstract
Self- and semi-supervised learning methods have been actively investigated as a way to reduce the amount of labeled training data required or to improve model performance. However, these approaches mostly focus on in-domain performance on public datasets. In this study, we combine self- and semi-supervised learning methods to address the unseen domain adaptation problem in a large-scale production setting for an online ASR model. We show that using the source domain data together with a small fraction (3%) of the target domain data can recover the performance gap relative to a full-data baseline, yielding a 13.5% relative WER improvement on target domain data.