Robust speech recognition using temporal masking and thresholding algorithm

Chanwoo Kim; Kean Chin; Michiel Bacchiani; R. M. Stern

INTERSPEECH-2014, pp. 2734-2738

Abstract

In this paper, we present a new dereverberation algorithm called
Temporal Masking and Thresholding (TMT) to enhance the
temporal spectra of spectral features for robust speech recognition
in reverberant environments. This algorithm is motivated
by the precedence effect and temporal masking of human
auditory perception. This work is an improvement of our
previous dereverberation work called Suppression of Slowly-varying
components and the falling edge of the power envelope
(SSF). The TMT algorithm uses a different mathematical
model to characterize temporal masking and thresholding compared
to the model that had been used to characterize the SSF
algorithm. Specifically, the nonlinear highpass filtering used
in the SSF algorithm has been replaced by a masking mechanism
based on a combination of peak detection and dynamic
thresholding. Speech recognition results show that the TMT
algorithm provides superior recognition accuracy compared to
other algorithms such as LTLSS, VTS, or SSF in reverberant
environments.
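
To make the idea of masking driven by peak detection and dynamic thresholding more concrete, the following is a minimal sketch only, not the TMT algorithm as published. It assumes a single frequency channel whose frame-level power envelope is already available, and it introduces two hypothetical parameters that do not come from the abstract: `decay`, the per-frame forgetting factor of the tracked peak, and `floor_gain`, the attenuation applied to frames judged to be masked.

```python
def temporal_mask(power, decay=0.85, floor_gain=0.2):
    """Apply a toy peak-tracking temporal mask to one channel's power envelope.

    power: sequence of non-negative frame powers for one frequency channel.
    decay: hypothetical per-frame decay of the tracked peak (0 < decay < 1).
    floor_gain: hypothetical attenuation applied to masked frames.
    """
    out = []
    peak = 0.0  # running peak of the envelope, decaying over time
    for p in power:
        threshold = decay * peak  # dynamic threshold derived from the decayed peak
        if p >= threshold:
            out.append(p)  # onset or sustained energy: pass through unchanged
        else:
            out.append(floor_gain * p)  # falling edge: treat as masked, attenuate
        peak = max(threshold, p)  # peak detection with exponential forgetting
    return out


if __name__ == "__main__":
    envelope = [4.0, 1.0, 3.0, 0.5]
    print(temporal_mask(envelope, decay=0.5, floor_gain=0.25))
```

In this sketch, frames that follow a strong onset and fall below the decaying peak are attenuated, which loosely mirrors how the precedence effect favors the direct sound over later reflections. The actual model, parameter values, and feature pipeline are defined in the paper itself.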
