HYBRID LSTM-FSMN NETWORKS FOR ACOUSTIC MODELING

Asa Oines; Eugene Weinstein; Pedro Moreno

(2018)

Abstract

This paper describes a series of experiments with neural networks containing long short-term memory (LSTM) [1] and feedforward sequential memory network (FSMN) [2, 3, 4] layers, trained with the connectionist temporal classification (CTC) [5] criterion for acoustic modeling. We propose a hybrid LSTM/FSMN (FLMN) architecture as an enhancement to conventional LSTM-only acoustic models. The added FSMN layers allow the network to model a fixed-size representation of future context suitable for online speech recognition. Our experiments show that FLMN acoustic models significantly outperform conventional LSTM acoustic models. We also compare the FLMN architecture with other methods of modeling future context. Finally, we present a modification of the FSMN architecture that improves performance by reducing the width of the FSMN output.
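To make the idea of a fixed-size view of future context concrete, the following is a minimal sketch of an FSMN-style memory block. It assumes the vectorized form from the cited FSMN work, in which each output frame is an element-wise weighted sum of a fixed number of past and future hidden frames. The abstract does not give the exact formulation used in the FLMN models, so the function name `fsmn_memory`, the parameters `lookback` and `lookahead`, and the zero padding at the sequence edges are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def fsmn_memory(hidden: Sequence[Vector],
                lookback: Sequence[Vector],
                lookahead: Sequence[Vector]) -> List[Vector]:
    """Assumed vectorized FSMN memory block (illustrative, not the paper's code).

    hidden:    T frames, each a vector of size D.
    lookback:  N1 + 1 tap vectors; lookback[i] weights frame t - i (i = 0..N1).
    lookahead: N2 tap vectors; lookahead[j - 1] weights frame t + j (j = 1..N2).
    Frames outside [0, T) are treated as zero (an assumption of this sketch).
    """
    num_frames = len(hidden)
    dim = len(hidden[0]) if num_frames else 0
    output = []
    for t in range(num_frames):
        acc = [0.0] * dim
        # Current and past frames: element-wise tap times hidden activation.
        for i, taps in enumerate(lookback):
            src = t - i
            if src >= 0:
                for d in range(dim):
                    acc[d] += taps[d] * hidden[src][d]
        # Future frames: only len(lookahead) frames ahead are ever read,
        # so the latency added by this layer is fixed.
        for j, taps in enumerate(lookahead, start=1):
            src = t + j
            if src < num_frames:
                for d in range(dim):
                    acc[d] += taps[d] * hidden[src][d]
        output.append(acc)
    return output


# Toy input: three 1-dimensional frames, one past tap and one future tap.
frames = [[1.0], [2.0], [3.0]]
memory = fsmn_memory(frames, lookback=[[1.0], [0.5]], lookahead=[[0.25]])
```

Because the number of lookahead taps is fixed, each output frame depends on a bounded window of future input. Under the assumptions above, this is the property that keeps such a layer usable for online recognition, in contrast to a bidirectional recurrent layer that must see the whole utterance.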
