Explaining the Learning Dynamics of Direct Feedback Alignment
Abstract
Two recently developed methods, Feedback Alignment (FA) and Direct Feedback
Alignment (DFA), have been shown to obtain surprisingly strong performance on vision
tasks by replacing the traditional backpropagation update with a random feedback
update. However, it is still not clear what mechanisms allow learning to happen
with these random updates. In this work we argue that DFA can be viewed as a
noisy variant of a layer-wise training method we call Linear Aligned Feedback
Systems (LAFS). We support this connection theoretically by comparing the update
rules for the two methods. We additionally empirically verify that the random
update matrices used in DFA work effectively as readout matrices, and that strong
correlations exist between the error vectors used in the DFA and LAFS updates.
With this new connection between DFA and LAFS, we are able to explain why the
“alignment” happens in DFA.
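For reference, here is a minimal sketch of the two update rules being contrasted, in notation introduced here rather than taken from the abstract (standard fully-connected layers with pre-activations $a_\ell$, activations $h_\ell = f(a_\ell)$, output error $e$, learning rate $\eta$, and a fixed random matrix $B_\ell$):
$$
\delta a_\ell^{\mathrm{BP}} = \bigl(W_{\ell+1}^{\top}\,\delta a_{\ell+1}\bigr)\odot f'(a_\ell),
\qquad
\delta a_\ell^{\mathrm{DFA}} = \bigl(B_\ell\, e\bigr)\odot f'(a_\ell),
\qquad
\Delta W_\ell = -\eta\,\delta a_\ell\, h_{\ell-1}^{\top}.
$$
Under these assumptions, DFA differs from backpropagation only in routing the output error through the fixed random matrix $B_\ell$ instead of the transposed forward weights.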