Phase-sensitive Joint Learning Algorithms for Deep Learning-based Speech Enhancement
Abstract
This letter presents a phase-sensitive joint learning
algorithm for single-channel speech enhancement. Although a
deep learning framework that estimates time-frequency (T-F)
domain ideal ratio masks demonstrates strong performance,
it is limited in that the enhancement process is performed
only in the magnitude domain, while the phase spectra remain
unchanged. Thus, recent studies have been conducted to involve
phase spectra in speech enhancement systems. A phase-sensitive
mask (PSM) is a T-F mask that implicitly represents phase-related
information. However, since the PSM has an unbounded
value, the networks are trained to target its truncated values
rather than directly estimating it. To effectively train the PSM,
we first approximate it to have a bounded dynamic range under
the assumption that speech and noise are uncorrelated. We then
propose a joint learning algorithm that trains the approximated
value through its parameterized variables to minimize the
inevitable error caused by the truncation process. Specifically,
we design a network that explicitly targets three parameterized
variables: speech magnitude spectra, noise magnitude spectra,
and the phase difference between the clean and noisy spectra. To further improve
the performance, we also investigate how the dynamic range
of magnitude spectra controlled by a warping function affects
the final performance in joint learning algorithms. Finally, we
examine how the proposed additional constraint that preserves
the sum of the estimated speech and noise power spectra affects
the overall system performance. The experimental results show
that the proposed learning algorithm outperforms the conventional
learning algorithm with the truncated phase-sensitive
approximation.
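As background, the PSM discussed above is commonly defined as the ratio of clean to noisy magnitude spectra scaled by the cosine of their phase difference, and the conventional approach clips it to a bounded range before training. The sketch below is a minimal NumPy illustration of that standard definition and its truncation, not the authors' implementation; the function name, the `eps` guard, and the [0, 1] clipping range are assumptions for illustration.

```python
import numpy as np

def phase_sensitive_mask(clean_stft, noisy_stft, truncate=True):
    """Compute the phase-sensitive mask PSM = |S|/|Y| * cos(theta_S - theta_Y).

    clean_stft, noisy_stft: complex STFT arrays of the same shape.
    truncate: if True, clip to [0, 1] (the conventional, lossy truncation
    that the joint learning algorithm in the letter aims to avoid).
    """
    eps = 1e-8  # guard against division by zero in silent T-F bins (assumed value)
    mag_ratio = np.abs(clean_stft) / (np.abs(noisy_stft) + eps)
    phase_diff = np.angle(clean_stft) - np.angle(noisy_stft)
    psm = mag_ratio * np.cos(phase_diff)
    if truncate:
        psm = np.clip(psm, 0.0, 1.0)
    return psm
```

Because the untruncated PSM is unbounded (the magnitude ratio can exceed 1, and the cosine term can be negative), clipping discards information; this is the "inevitable error caused by the truncation process" that motivates estimating the mask through its parameterized variables instead.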