Large-scale ASR Domain Adaptation by Self- and Semi-supervised Learning

Ananya Misra

David Qiu

Dongseong Hwang

Françoise Beaufays

Khe Chai Sim

Nikhil Siddhartha

Shefali Garg

Trevor Strohman

Yanzhang (Ryan) He

Zhouyuan Huo

ICASSP(2022) (to appear)

Abstract

Self- and semi-supervised learning methods have been actively investigated to reduce the amount of labeled training data required or to improve model performance. However, these approaches mostly focus on in-domain performance on public datasets. In this study, we combine self- and semi-supervised learning methods to solve the unseen domain adaptation problem in a large-scale production setting for an online ASR model. This approach demonstrates that using the source domain data with a small fraction of the target domain data (3%) can recover the performance gap compared to a full data baseline: a 13.5% relative WER improvement for target domain data.
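
For readers unfamiliar with the metric, the sketch below shows how a relative WER improvement is commonly computed. It assumes the usual definition (the WER reduction divided by the reference system's WER). The function name and the numbers are hypothetical illustrations, not values or code from the paper, and the abstract does not spell out which system serves as the reference.

```python
def relative_wer_improvement(reference_wer: float, adapted_wer: float) -> float:
    """Return the relative WER improvement in percent.

    Assumed definition: 100 * (reference - adapted) / reference.
    """
    if reference_wer <= 0:
        raise ValueError("reference_wer must be positive")
    return 100.0 * (reference_wer - adapted_wer) / reference_wer


# Hypothetical WER values, chosen only to illustrate the arithmetic;
# they are not taken from the paper.
improvement = relative_wer_improvement(reference_wer=10.0, adapted_wer=8.65)
print(f"{improvement:.1f}% relative WER improvement")
```

A relative figure depends on the reference WER: the same absolute reduction yields a larger relative improvement when the reference WER is lower.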

Research Areas

Speech Processing
