DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning

Zhe Zhao
Aakanksha Chowdhery
Maheswaran Sathiamoorthy
Yihua Chen
Rahul Mazumder
Lichan Hong
35th Conference on Neural Information Processing Systems (NeurIPS 2021)

Abstract
