- Pramod Kaushik Mudrakarta
- Mark Sandler
- Andrey Zhmoginov
- Andrew Howard
Abstract
In this paper we introduce a novel method that enables parameter-efficient transfer and multi-task learning.
We show that by reusing more than 95\% of the parameters, we can re-purpose neural networks to solve very
different types of problems, such as going from SSD detection on the COCO dataset to ImageNet classification.
Our approach supports both simultaneous (i.e. multi-task) learning and sequential fine-tuning, in which
an already-trained network is modified to solve a different problem.
We show that our approach leads to a significant increase in accuracy compared to traditional logits-only fine-tuning
while using far fewer parameters. Interestingly, for multi-task learning our approach sometimes acts as a regularizer, often leading
to improved performance compared to models trained on a single task.
Our approach has multiple immediate applications. It can be used to dramatically increase the number of models available in resource-constrained settings, since the marginal cost of a new model is now less than 5\% of the full model. Constrained fine-tuning also enables better generalization when only a limited amount of data is available. We evaluate our approach on multiple datasets and multiple models.
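The abstract does not spell out which parameters make up the reused 95\%, so the following is only a minimal sketch of what constrained fine-tuning of this kind could look like, assuming the trainable "patch" is restricted to batch-normalization scale/shift parameters plus a new classification head. The backbone (torchvision's MobileNetV2), the layer choices, and the resulting parameter budget are illustrative assumptions, not the authors' exact recipe.

```python
# Illustrative sketch (not the authors' exact method): re-purpose a pretrained
# network by freezing nearly all of its parameters and training only a small
# "patch" -- here, BatchNorm affine parameters and a fresh classifier head.
import torch
import torch.nn as nn
from torchvision import models


def build_patched_model(num_classes: int) -> nn.Module:
    model = models.mobilenet_v2(weights="IMAGENET1K_V1")

    # Freeze everything: these weights stay shared with the original task.
    for p in model.parameters():
        p.requires_grad_(False)

    # Re-enable gradients only for BatchNorm scale/shift parameters (the patch).
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            for p in m.parameters():
                p.requires_grad_(True)

    # Replace the classifier head for the new task; it is trained from scratch.
    in_features = model.classifier[-1].in_features
    model.classifier[-1] = nn.Linear(in_features, num_classes)
    return model


model = build_patched_model(num_classes=100)

# Only the patch parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01, momentum=0.9)

# Report the trainable fraction -- in this sketch it is a few percent of the model.
total = sum(p.numel() for p in model.parameters())
patch = sum(p.numel() for p in trainable)
print(f"trainable fraction: {patch / total:.2%}")
```

Under this kind of scheme, serving an additional task only requires storing the small patch and the new head on top of the shared backbone, which is how the marginal cost of a new model can stay well below the size of a full model.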