Sound source separation using phase difference and reliable mask selection
Abstract
In this paper, we present an algorithm called Reliable Mask Selection-
Phase Difference Channel Weighting (RMS-PDCW), which selects
the target source masked by a noise source using Angle of Arrival
(AoA) information calculated from phase difference information.
The RMS-PDCW algorithm selects which masks to apply using
information about the localized sound source and speech onset
detection. We demonstrate that this algorithm yields a 5.3 percent
relative improvement over a baseline acoustic model that was
multistyle-trained on 22 million utterances. The evaluation uses a
simulated test set consisting of real-world and interfering-speaker
noise, with reverberation times distributed between 0 ms and 900 ms
and SNRs distributed from 0 dB up to clean.
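
To make the link between phase difference and AoA concrete, the following is a minimal sketch of a generic two-microphone estimate followed by a binary mask. It is not the RMS-PDCW algorithm itself: the reliable mask selection and onset detection steps are omitted, and the function names, the far-field model, the sign convention, and the speed-of-sound constant are all assumptions made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def aoa_from_phase(x_left, x_right, freq_hz, mic_distance_m):
    """Estimate the AoA (radians) for one time-frequency bin.

    x_left and x_right are complex STFT coefficients of the two channels.
    Assumed convention: a positive angle means the sound reaches the left
    microphone first. freq_hz must be positive, and the result is only
    unambiguous while the true phase difference stays within (-pi, pi].
    """
    # Inter-channel phase difference, wrapped to (-pi, pi].
    phase_diff = cmath.phase(x_left * x_right.conjugate())
    # Convert the phase difference to a time delay between the channels.
    delay_s = phase_diff / (2.0 * math.pi * freq_hz)
    # Far-field model: delay = d * sin(theta) / c. Clamp against noise.
    sin_theta = SPEED_OF_SOUND * delay_s / mic_distance_m
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.asin(sin_theta)


def binary_mask(aoas, target_aoa, tolerance):
    """Keep (1.0) bins whose AoA is within tolerance of the target, else 0.0."""
    return [1.0 if abs(a - target_aoa) <= tolerance else 0.0 for a in aoas]


# Usage: simulate a source at 30 degrees, 1 kHz, microphones 4 cm apart.
true_angle = math.radians(30.0)
tau = 0.04 * math.sin(true_angle) / SPEED_OF_SOUND
left = 1.0 + 0.0j
right = left * cmath.exp(-1j * 2.0 * math.pi * 1000.0 * tau)  # right lags
estimate = aoa_from_phase(left, right, 1000.0, 0.04)
mask = binary_mask([estimate, -estimate], true_angle, math.radians(10.0))
```

In this sketch, bins whose estimated angle falls near the assumed target direction are kept and the rest are zeroed. A hard threshold is only one possible choice of mask.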