Monaural Segregation of Voiced Speech using Discriminative Random Fields

Rohit Prabhavalkar; Zhaozhang Jin; Eric Fosler-Lussier

Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), ISCA (2009), pp. 856-859

Abstract

Techniques for separating speech from background noise and other sources of interference have important applications in robust speech recognition and speech enhancement. Many traditional approaches based on computational auditory scene analysis (CASA) decompose the input mixture into a time-frequency (T-F) representation and attempt to identify the T-F units where the target energy dominates that of the interference. This is accomplished using a two-stage process of segmentation and grouping. In this pilot study, we explore the use of Discriminative Random Fields (DRFs) for the task of monaural speech segregation. We find that DRFs allow us to effectively combine multiple auditory features in the system while simultaneously integrating the two CASA stages into one. Our preliminary results suggest that CASA-based approaches may benefit from the DRF framework.
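
To make the notion of T-F units "where the target energy dominates that of the interference" concrete, the following is a minimal sketch of a binary labeling of T-F units computed from known target and interference energies. It is an illustration only and not the DRF method explored in the paper: it assumes the premixed energies are available, which is not the case in actual monaural segregation. The function name `ideal_binary_mask`, the parameter `lc_db` (a local SNR threshold in dB), and the toy energy values are all hypothetical.

```python
import math


def ideal_binary_mask(target_energy, interference_energy, lc_db=0.0):
    """Label each T-F unit 1 if target energy dominates interference, else 0.

    Both inputs are 2-D lists (frequency channels x time frames) of
    non-negative energies. lc_db is an assumed local SNR threshold in dB;
    0 dB means the target must strictly exceed the interference.
    """
    mask = []
    for t_row, i_row in zip(target_energy, interference_energy):
        row = []
        for t, i in zip(t_row, i_row):
            if t <= 0.0:
                # No target energy: the unit cannot be target-dominant.
                row.append(0)
            elif i <= 0.0:
                # Target present and no interference: target-dominant.
                row.append(1)
            else:
                snr_db = 10.0 * math.log10(t / i)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Toy energies (made up for illustration): 2 channels x 3 frames.
target = [[4.0, 0.5, 2.0], [0.1, 3.0, 0.0]]
interference = [[1.0, 1.0, 2.0], [0.2, 0.3, 1.0]]
mask = ideal_binary_mask(target, interference)
print(mask)  # [[1, 0, 0], [0, 1, 0]]
```

A segregation system has to estimate such labels from the mixture alone. The abstract's point is that a DRF can produce them in a single model that combines several auditory features, rather than through separate segmentation and grouping stages.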
