Google Research

Sound source separation algorithm using phase difference and angle distribution modeling near the target

INTERSPEECH 2015, pp. 751-755

Abstract

In this paper, we present a novel two-microphone sound source separation algorithm that selects the signal from the target direction while suppressing signals from other directions. In this algorithm, referred to as Power Angle Information Near Target (PAINT), we first calculate the phase difference for each time-frequency bin. From the phase difference, the angle of the sound source is estimated. For each frame, we represent the source angle distribution near the expected target location as a mixture of a Gaussian distribution and a uniform distribution, and obtain binary masks using hypothesis testing. Continuous masks are calculated from the binary masks using the Channel Weighting (CW) technique, and the processed speech is synthesized using the IFFT and the OverLap-Add (OLA) method. We demonstrate that the algorithm described in this paper achieves better speech recognition accuracy than conventional approaches and our previous approaches.
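The first two steps named in the abstract (a per-bin phase difference, then an angle estimate) can be illustrated with a minimal sketch. This is not the PAINT implementation: the far-field plane-wave model, the speed-of-sound constant, the function names, and the fixed-width angle threshold are all assumptions made for illustration. In particular, the paper obtains its binary masks by hypothesis testing on a Gaussian/uniform mixture, which the simple threshold below only stands in for.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def estimate_angle(x1, x2, freq_hz, mic_distance_m):
    """Estimate the source angle (radians from broadside) for one bin.

    x1, x2 are the complex STFT coefficients of the two microphones at the
    same time-frequency bin. Assumes a far-field plane wave and a microphone
    spacing small enough that the phase difference does not wrap.
    """
    if freq_hz <= 0.0 or mic_distance_m <= 0.0:
        raise ValueError("freq_hz and mic_distance_m must be positive")
    # Inter-microphone phase difference, in (-pi, pi].
    phase_diff = cmath.phase(x1 * x2.conjugate())
    # phase_diff = 2*pi*f*d*sin(theta)/c, solved for sin(theta).
    sin_theta = SPEED_OF_SOUND * phase_diff / (
        2.0 * math.pi * freq_hz * mic_distance_m
    )
    # Clamp so noise cannot push the value outside the domain of asin.
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.asin(sin_theta)


def threshold_mask(angle_rad, target_rad=0.0, width_rad=math.radians(15.0)):
    """Hypothetical stand-in for the paper's hypothesis test.

    Returns 1 if the estimated angle lies within width_rad of the target
    direction, else 0.
    """
    return 1 if abs(angle_rad - target_rad) <= width_rad else 0


def simulate_bin(theta_rad, freq_hz, mic_distance_m):
    """Build a noiseless pair of bin values for a source at theta_rad."""
    delay = mic_distance_m * math.sin(theta_rad) / SPEED_OF_SOUND
    x1 = complex(1.0, 0.0)
    x2 = x1 * cmath.exp(-1j * 2.0 * math.pi * freq_hz * delay)
    return x1, x2
```

For example, `estimate_angle(*simulate_bin(math.radians(20.0), 1000.0, 0.04), 1000.0, 0.04)` recovers the simulated 20-degree angle up to floating-point error, and `threshold_mask` then rejects that bin when the target is at 0 degrees with a 15-degree half-width.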
