Google Research

Improving DNN Speaker Independence with I-vector Inputs

Proc. ICASSP, IEEE (2014)


We propose providing additional utterance-level features as inputs to a deep neural network (DNN) to facilitate speaker, channel, and background normalization. We develop modifications of the basic algorithm that yield significant reductions in word error rate (WER). The algorithms are shown to combine well with speaker adaptation by backpropagation, resulting in a 9% relative WER reduction. We also address implementation of the algorithm for a streaming task.
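
One plausible reading of "utterance-level features as inputs" is that a single per-utterance vector (an i-vector, going by the title) is appended to every frame's acoustic features before they enter the network. The sketch below illustrates only that reading; the function names, the toy dimensions, and the WER figures are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def append_utterance_vector(
    frames: Sequence[Sequence[float]], utt_vector: Sequence[float]
) -> List[List[float]]:
    """Concatenate the same utterance-level vector onto every frame.

    Assumption: the DNN input for each frame is simply
    [acoustic features | utterance-level vector].
    """
    return [list(frame) + list(utt_vector) for frame in frames]


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative reduction: (baseline - new) / baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Toy data (hypothetical): 3 frames of 4-dim acoustic features
# and a 2-dim utterance-level vector shared by all frames.
frames = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
ivector = [-1.0, 2.0]

augmented = append_utterance_vector(frames, ivector)

# Hypothetical WERs chosen only to illustrate what "relative" means.
reduction = relative_wer_reduction(20.0, 18.2)
```

Each augmented frame here has 4 + 2 = 6 dimensions, and the last two entries are identical across frames because the vector describes the whole utterance rather than any single frame. A relative reduction is measured against the baseline WER, so it is not the same as an absolute drop in percentage points.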
