In this paper we introduce a novel method that enables parameter-efficient transfer and multi-task learning.
We show that by reusing more than 95\% of the parameters we can repurpose neural networks to solve very
different types of problems, for example going from SSD detection on the COCO dataset to ImageNet classification.
Our approach supports both simultaneous (e.g., multi-task) learning and sequential fine-tuning, in which an already trained network is adapted to solve a different problem. We show that our approach yields a significant increase in accuracy compared to traditional logits-only fine-tuning while using far fewer parameters. Interestingly, in multi-task learning our approach can act as a regularizer, often improving performance relative to models trained on a single task.
Our approach has multiple immediate applications. It can be used to dramatically increase the number of models available in resource-constrained settings, since the marginal cost of each new model is now less than 5\% of the full model. Constrained fine-tuning also enables better generalization when only a limited amount of data is available. We evaluate our approach on multiple datasets and multiple models.