A CASA-based system for long-term SNR estimation

DeLiang Wang
IEEE Transactions on Audio, Speech, and Language Processing, 20(2012), pp. 2518-2527

Abstract

We present a system for robust signal-to-noise ratio (SNR) estimation based on computational auditory scene analysis (CASA). The proposed algorithm uses an estimate of the ideal binary mask to segregate a time-frequency representation of the noisy signal into speech-dominated and noise-dominated regions. The energy within each of these regions is summed to derive the filtered global SNR. An SNR transform is introduced to convert the estimated filtered SNR to the true broadband SNR of the noisy signal. The algorithm is further extended to estimate subband SNRs. Evaluations use the TIMIT speech corpus and the NOISEX92 noise database. Results indicate that both the global and the subband SNR estimates are superior to those of existing methods, especially at low SNRs.
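The energy-summation step described in the abstract can be illustrated with a minimal sketch. It assumes the time-frequency energies and the binary mask are already available as nested lists (rows are time frames, columns are frequency channels); in the paper the mask is itself estimated, which is not shown here. The function names `filtered_snr_db` and `subband_snrs_db` and the toy values are hypothetical, and the SNR transform that maps the filtered SNR to the broadband SNR is omitted because the abstract does not specify it.

```python
import math

def filtered_snr_db(energy, mask):
    """Ratio (in dB) of energy in speech-dominated units (mask == 1)
    to energy in noise-dominated units (mask == 0)."""
    speech = 0.0
    noise = 0.0
    for row_e, row_m in zip(energy, mask):
        for e, m in zip(row_e, row_m):
            if m == 1:
                speech += e
            else:
                noise += e
    if speech <= 0.0 or noise <= 0.0:
        # The ratio is undefined if either region carries no energy.
        raise ValueError("both regions must contain energy")
    return 10.0 * math.log10(speech / noise)

def subband_snrs_db(energy, mask):
    """Apply the same ratio separately to each frequency channel."""
    n_channels = len(energy[0])
    snrs = []
    for c in range(n_channels):
        col_e = [[row[c]] for row in energy]
        col_m = [[row[c]] for row in mask]
        snrs.append(filtered_snr_db(col_e, col_m))
    return snrs

# Toy example (assumed values): 3 time frames x 2 frequency channels.
energy = [[9.0, 1.0],
          [1.0, 4.0],
          [1.0, 1.0]]
mask = [[1, 0],
        [0, 1],
        [1, 0]]

global_snr = filtered_snr_db(energy, mask)   # 10*log10(14/3)
subband = subband_snrs_db(energy, mask)      # one value per channel
```

In this toy case the speech-dominated units hold 14 units of energy and the noise-dominated units hold 3, so the filtered global SNR is 10 log10(14/3) dB; the per-channel values follow from the same ratio restricted to each column.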

Research Areas