Silence is Golden: Modeling Non-speech Events in WFST-based Dynamic Network Decoders

David Rybach

Ralf Schlüter

Hermann Ney

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2012), pp. 4205-4208

Abstract

Models for silence are a fundamental part of continuous speech recognition systems. Depending on application requirements, audio data segmentation, and availability of detailed training data annotations, it may be necessary or beneficial to differentiate between other non-speech events, for example breath and background noise. The integration of multiple non-speech models in a WFST-based dynamic network decoder is not straightforward, because these models do not perfectly fit in the transducer framework. This paper describes several options for the transducer construction with multiple non-speech models, shows their considerably different characteristics in memory and runtime efficiency, and analyzes the impact on the recognition performance.
