Rasmus Munk Larsen
Authored Publications
This talk gives an overview of TensorFlow graph optimization technologies. It starts with an introduction to TensorFlow, then takes an in-depth dive into Grappler, a graph optimizer for TensorFlow graphs. The second part of the talk focuses on MLIR, a new compiler infrastructure designed to support optimizations not only on TensorFlow graphs but also on multiple other representations used by components of the TensorFlow software stack (TensorFlow Lite, XLA, etc.).
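As a concrete illustration of the kind of optimization Grappler performs, here is a minimal sketch, assuming TensorFlow 2.x, of toggling individual Grappler passes through the public tf.config.optimizer API before a tf.function is traced. The particular pass selection is illustrative only, not an example from the talk.

```python
import tensorflow as tf

# Explicitly enable a few of Grappler's graph-rewriting passes
# (most are on by default in TensorFlow 2.x).
tf.config.optimizer.set_experimental_options({
    "constant_folding": True,         # pre-compute subgraphs with constant inputs
    "arithmetic_optimization": True,  # simplify and deduplicate arithmetic ops
    "debug_stripper": True,           # strip Assert/Print-style debug nodes
})

@tf.function
def affine(x):
    # Grappler's constant folding collapses (2.0 * 3.0) into a single
    # constant 6.0, so the optimized graph does one multiply and one add.
    return (2.0 * 3.0) * x + 1.0

print(affine(tf.constant([1.0, 2.0])))  # tf.Tensor([ 7. 13.], shape=(2,), ...)
```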
Device Placement Optimization with Reinforcement Learning
Azalia Mirhoseini
Hieu Pham
Mohammad Norouzi
Samy Bengio
Benoit Steiner
Yuefeng Zhou
Naveen Kumar
ICML (2017)
The past few years have seen much success in applying neural networks to many practical problems. Together with this success comes growth in the size and computational requirements of training and inference with neural networks. A common approach to addressing these requirements is to use a heterogeneous distributed environment with a mix of hardware devices such as CPUs and GPUs. Importantly, the decision of how to place parts of the neural models on devices is most often made by a human expert relying on heuristics. In this paper, we propose a method which learns to optimize device placement. Key to our method is the use of a recurrent neural network to predict a set of device placements for a target neural computation graph. The execution time under the predicted placements is then used as the reward signal to optimize the parameters of the recurrent neural network. Our main result is that on Inception for ImageNet classification, and on LSTMs for language modeling and neural translation, our model finds non-trivial device placements that significantly outperform handcrafted heuristics and traditional algorithmic methods.
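To make the training loop the abstract describes concrete, below is a toy NumPy sketch of the underlying policy-gradient (REINFORCE) idea: a stochastic policy samples a device for each op, the execution time of the resulting placement acts as the (negative) reward, and the policy parameters are nudged toward faster placements. This is not the paper's method as implemented: the recurrent network is replaced with a simple per-op softmax policy, and real hardware measurement is replaced with a hypothetical simulated_runtime() cost model. Both are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_OPS, NUM_DEVICES = 6, 2
logits = np.zeros((NUM_OPS, NUM_DEVICES))  # policy parameters, one row per op

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def simulated_runtime(placement):
    # Hypothetical stand-in for measuring a real placement: ops 0-2 are
    # assumed faster on device 0, ops 3-5 on device 1, plus noise.
    best = np.array([0, 0, 0, 1, 1, 1])
    return 1.0 + (placement != best).sum() + 0.1 * rng.standard_normal()

baseline, lr = None, 0.5
for step in range(200):
    probs = softmax(logits)
    # Sample a device for every op from the current policy.
    placement = np.array([rng.choice(NUM_DEVICES, p=p) for p in probs])
    reward = -simulated_runtime(placement)  # faster placement => higher reward
    baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
    advantage = reward - baseline
    # REINFORCE: grad of log-prob of the sampled devices is onehot - probs;
    # scale by the advantage so faster-than-baseline placements are reinforced.
    grad = -probs
    grad[np.arange(NUM_OPS), placement] += 1.0
    logits += lr * advantage * grad

print("learned placement:", softmax(logits).argmax(axis=1))  # typically [0 0 0 1 1 1]
```

The moving-average baseline plays the same variance-reduction role as the baselines commonly used with REINFORCE; without it, the noisy runtime signal makes the updates much less stable.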