Robust speech recognition using multiple prior models for speech reconstruction

Xiaojia Zhao
DeLiang Wang
Eric Fosler-Lussier
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE(2011), pp. 4800-4803


Prior models of speech have been used in robust automatic speech recognition to enhance noisy speech. Typically, a single prior model is trained by pooling the entire training data. In this paper we propose to train multiple prior models of speech instead of a single prior model. The prior models can be trained based on distinct characteristics of speech. In this study, they are trained based on voicing characteristics. The trained prior models are then used to reconstruct noisy speech. Significant improvements are obtained on the Aurora-4 robust speech recognition task when multiple priors are used; in conjunction with an uncertainty transform technique, multiple priors yield a 13.7% absolute improvement in the average word error rate over directly recognizing noisy speech.

Research Areas