- Azalia Mirhoseini
- Hieu Pham
- Quoc Le
- Mohammad Norouzi
- Samy Bengio
- Benoit Steiner
- Yuefeng Zhou
- Naveen Kumar
- Rasmus Larsen
- Jeff Dean
Abstract
The past few years have seen much success in applying neural networks to many practical problems. Together with this success has come growth in the size and computational requirements of training and inference with neural networks. A common approach to addressing these requirements is to use a heterogeneous distributed environment with a mix of hardware devices such as CPUs and GPUs. Importantly, the decision of how to place parts of a neural model across devices is most often made by a human expert relying on heuristics. In this paper, we propose a method that learns to optimize device placement. Key to our method is the use of a recurrent neural network to predict a set of device placements for a target neural computation graph. The execution time under the predicted placements is then used as the reward signal to optimize the parameters of the recurrent neural network. Our main result is that on Inception for ImageNet classification, and on LSTM models for language modeling and neural machine translation, our method finds non-trivial device placements that significantly outperform handcrafted heuristics and traditional algorithmic methods.
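Below is a minimal sketch of the training loop the abstract describes: a stochastic placement policy samples a device for each operation, the measured execution time acts as a (negative) reward, and the policy parameters are updated with a REINFORCE-style gradient. To keep it self-contained, the policy here is a per-operation softmax rather than the paper's recurrent network, and `measure_runtime` is a hypothetical stand-in for actually executing and timing the placed graph.

```python
# Simplified REINFORCE loop for device placement (assumptions noted above).
import numpy as np

NUM_OPS = 6        # operations in a toy computation graph
NUM_DEVICES = 2    # e.g. one CPU and one GPU
LEARNING_RATE = 0.1

rng = np.random.default_rng(0)
logits = np.zeros((NUM_OPS, NUM_DEVICES))  # policy parameters


def measure_runtime(placement):
    """Hypothetical stand-in: execute the graph under `placement` and return
    wall-clock time. Here, a fake cost that penalizes ops placed on device 0."""
    return 1.0 + 0.2 * np.sum(placement == 0)


def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)


baseline = None
for step in range(200):
    probs = softmax(logits)
    # Sample one device per operation from the current policy.
    placement = np.array([rng.choice(NUM_DEVICES, p=p) for p in probs])
    runtime = measure_runtime(placement)
    reward = -runtime  # faster placements receive higher reward

    # Moving-average baseline reduces the variance of the gradient estimate.
    baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
    advantage = reward - baseline

    # REINFORCE gradient of log pi(placement) w.r.t. logits: onehot(a) - probs.
    grad = -probs
    grad[np.arange(NUM_OPS), placement] += 1.0
    logits += LEARNING_RATE * advantage * grad

print("learned placement:", softmax(logits).argmax(axis=1))
```

In this toy setting the policy learns to move every operation onto the cheaper device; in the paper's setting the reward comes from timing real training steps of the placed model, and the policy is conditioned on the structure of the computation graph.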