- Georg Heigold
- Erik McDermott
- Vincent Vanhoucke
- Andrew Senior
- Michiel Bacchiani
This paper explores asynchronous stochastic optimization for sequence training of deep neural networks. Sequence training requires more computation than frame-level training using pre-computed frame data. This leads to several complications for stochastic optimization, arising from significant asynchrony in model updates under massive parallelization, and limited data shuffling due to utterance-chunked processing. We analyze the impact of these two issues on the efficiency and performance of sequence training. In particular, we suggest a framework to formalize the reasoning about the asynchrony and present experimental results on both small and large scale Voice Search tasks to validate the effectiveness and efficiency of asynchronous stochastic optimization.
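The asynchrony the abstract refers to arises when many workers update a shared model concurrently, each computing gradients against a parameter snapshot that other workers may have changed in the meantime. The following is a minimal illustrative sketch of that pattern, not the paper's implementation: all names (`async_sgd`, `worker`) and the toy quadratic objective are hypothetical, chosen only to show stale-gradient updates.

```python
import threading
import random

def async_sgd(target=3.0, workers=4, steps=200, lr=0.05):
    """Minimize (w - target)^2 with asynchronous, uncoordinated updates.

    Each worker reads a possibly stale copy of the shared parameter,
    computes its gradient there, and applies the update without any
    synchronization barrier -- the core pattern of asynchronous SGD.
    """
    state = {"w": 0.0}  # shared model parameter

    def worker(seed):
        rng = random.Random(seed)
        for _ in range(steps):
            w_stale = state["w"]                 # snapshot may already be stale
            grad = 2.0 * (w_stale - target)      # gradient of (w - target)^2
            # The update lands on the *current* value, which other workers
            # may have moved since the snapshot was taken.
            state["w"] = state["w"] - lr * grad

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["w"]
```

Because every stale gradient still points toward the optimum of this convex toy objective, the runs converge despite the asynchrony; the paper's contribution is analyzing when and how such staleness affects the much harder non-convex sequence-training setting.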