Multitask Learning Via Interleaving: A Neural Network Investigation

David Mayo
Tyler Scott
Mengye Ren
Katherine Hermann
Matt Jones
Michael Mozer
44th Annual Meeting of the Cognitive Science Society (2023)


The settings most commonly used in machine learning to study multi-task learning assume either i.i.d. task draws on each training trial or training on each task to mastery before moving on to the next. We instead study a setting in which tasks are interleaved, i.e., training proceeds on task $\mathcal{A}$ for some period of time and then switches to another task $\mathcal{B}$ before $\mathcal{A}$ is mastered. We examine properties of standard neural net learning algorithms and architectures in this setting. Taking inspiration from psychological phenomena concerning the influence of task sequence on human learning, we observe qualitatively similar phenomena in networks, including forgetting with relearning savings, task-switching costs, and better memory consolidation with interleaved training. A better understanding of such properties makes it possible to design learning procedures suited to the temporal structure of the environment. We illustrate with a momentum optimizer that resets its momentum following a task switch, which leads to reliably better online cumulative learning accuracy.
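The idea of resetting momentum at a task switch can be sketched in a few lines. The abstract does not specify the optimizer's details, so everything below is an illustrative assumption rather than the authors' implementation: the class name `ResettableMomentumSGD`, the heavy-ball update rule, the helper `train_interleaved`, and the toy quadratic losses standing in for tasks $\mathcal{A}$ and $\mathcal{B}$ are all hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]
GradFn = Callable[[Sequence[float]], Vector]


class ResettableMomentumSGD:
    """SGD with heavy-ball momentum whose velocity can be zeroed on demand.

    Hypothetical sketch; not the optimizer from the paper.
    """

    def __init__(self, params: Sequence[float], lr: float = 0.1, beta: float = 0.9) -> None:
        self.params: Vector = list(params)
        self.lr = lr
        self.beta = beta
        self.velocity: Vector = [0.0] * len(self.params)

    def step(self, grad: Sequence[float]) -> None:
        # Heavy-ball update: v <- beta * v + g, then w <- w - lr * v.
        for i, g in enumerate(grad):
            self.velocity[i] = self.beta * self.velocity[i] + g
            self.params[i] -= self.lr * self.velocity[i]

    def reset_momentum(self) -> None:
        # Discard velocity accumulated on the previous task.
        self.velocity = [0.0] * len(self.params)


def quadratic_grad(target: Sequence[float]) -> GradFn:
    """Gradient of 0.5 * ||w - target||^2, a toy stand-in for a task loss."""
    return lambda w: [wi - ti for wi, ti in zip(w, target)]


def train_interleaved(
    opt: ResettableMomentumSGD,
    task_grads: Sequence[GradFn],
    block_len: int,
    n_blocks: int,
    reset_on_switch: bool = True,
) -> Vector:
    """Train in blocks, cycling through tasks and switching before mastery."""
    for block in range(n_blocks):
        grad_fn = task_grads[block % len(task_grads)]
        if reset_on_switch and block > 0:
            opt.reset_momentum()  # task switch: drop stale velocity
        for _ in range(block_len):
            opt.step(grad_fn(opt.params))
    return opt.params


if __name__ == "__main__":
    task_a = quadratic_grad([1.0, 0.0])   # toy "task A"
    task_b = quadratic_grad([0.0, 1.0])   # toy "task B"
    opt = ResettableMomentumSGD([0.0, 0.0], lr=0.1, beta=0.9)
    print(train_interleaved(opt, [task_a, task_b], block_len=5, n_blocks=4))
```

The intuition behind this sketch is that velocity accumulated on one task points in a direction that may be irrelevant or harmful for the next, so clearing it at the switch avoids carrying stale gradient history across tasks. Whether a reset helps in a given setting, and how the switch is detected, depends on details not covered in the abstract.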