Just Pick a Sign: Reducing Gradient Conflict in Deep Networks with Gradient Sign Dropout

Drago Anguelov
Henrik Kretzschmar
Jiquan Ngiam
Yuning Chai
Zhao Chen
NeurIPS 2020 (to appear)

Abstract

The vast majority of modern deep neural networks produce multiple gradient signals which then attempt to update the same set of scalar weights. Such updates are often incompatible with each other, leading to gradient conflicts which impede optimal network training. We present Gradient Sign Dropout (GradDrop), a probabilistic masking procedure which encourages backpropagation only of gradients which are mutually consistent at a given deep activation layer. GradDrop is simple to implement as a modular layer within any deepnet and is synergistic with other gradient balancing approaches. We show that GradDrop performs better than other state-of-the-art methods for two very common contexts in which gradient conflicts pose a problem: multitask learning and transfer learning.
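To make the idea concrete, below is a minimal NumPy sketch of a sign-consistency mask applied to per-task gradients at a shared activation, in the spirit of the procedure the abstract describes. The function name grad_sign_dropout, the purity formula, and the per-position sampling are assumptions made for illustration, not the authors' exact implementation.

```python
import numpy as np

def grad_sign_dropout(task_grads, rng=None):
    """Sketch of a probabilistic sign-consistency mask over per-task gradients.

    task_grads: list of arrays, each the gradient of one task's loss with
        respect to the same activation tensor.
    Returns a single masked gradient to backpropagate through the shared layers.
    (Illustrative only; not the paper's reference implementation.)
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = np.stack(task_grads)                      # shape: (num_tasks, ...)

    # Fraction of the total gradient mass at each position that is positive.
    total = grads.sum(axis=0)
    mass = np.abs(grads).sum(axis=0) + 1e-12
    p_positive = 0.5 * (1.0 + total / mass)           # in [0, 1]

    # Sample one sign per position: keep positive-signed gradients with
    # probability p_positive, otherwise keep the negative-signed ones.
    keep_positive = rng.random(total.shape) < p_positive
    mask = np.where(keep_positive, grads > 0, grads < 0)

    return (grads * mask).sum(axis=0)

# Example: two task gradients that conflict in sign at some positions.
g1 = np.array([ 0.8, -0.2,  0.5])
g2 = np.array([-0.3, -0.4,  0.1])
print(grad_sign_dropout([g1, g2]))
```

Positions where all task gradients agree in sign pass through unchanged, while conflicting positions keep only the gradients matching the sampled sign, so each update pushes the shared weights in one consistent direction.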

Research Areas