- Bo Li
- Tara Sainath
- Arun Narayanan
- Joe Caroselli
- Michiel Bacchiani
- Ananya Misra
- Izhak Shafran
- Hasim Sak
- Golan Pundak
- Kean Chin
- Khe Chai Sim
- Ron J. Weiss
- Kevin Wilson
- Ehsan Variani
- Chanwoo Kim
- Olivier Siohan
- Mitchel Weintraub
- Erik McDermott
- Rick Rose
- Matt Shannon
Abstract
This paper describes the technical and system-building advances made to the Google Home multichannel speech recognition system, which was launched in November 2016. Technical advances include an adaptive dereverberation frontend, neural network models that perform multichannel processing jointly with acoustic modeling, and Grid LSTMs to model frequency variations. On the system level, improvements include adapting the model using Google Home-specific data. We present results on a variety of multichannel sets. The combination of technical and system advances results in a relative WER reduction of over 18% compared to the current production system.
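
For readers unfamiliar with the metric, a relative WER reduction is the drop in word error rate expressed as a fraction of the baseline WER, not as a difference in percentage points. The sketch below shows that arithmetic. The function name and the numeric values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (in percent), for illustration only.
baseline = 10.0
improved = 8.0

absolute = baseline - improved
relative = relative_wer_reduction(baseline, improved)
print(f"absolute reduction: {absolute:.1f} points")
print(f"relative reduction: {relative:.1%}")
```

With these made-up inputs, the absolute reduction is 2 percentage points, while the relative reduction is 20% of the baseline.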