Factorizing declarative and procedural knowledge in structured, dynamic environments

Anirudh Goyal
Alex Lamb
Phanideep Gampa
Philippe Beaudoin
Sergey Levine
Charles Blundell
Yoshua Bengio
Michael Mozer
International Conference on Learning Representations (ICLR) (2021) (to appear)

Abstract

Modeling a structured, dynamic environment like a video game requires keeping track of the objects and their states (declarative knowledge) as well as predicting how objects behave (procedural knowledge). Black-box models with a monolithic hidden state often lack systematicity: they fail to apply procedural knowledge consistently and uniformly. For example, in a video game, correct prediction of one enemy's trajectory does not ensure correct prediction of another's. We address this issue via an architecture that factorizes declarative and procedural knowledge and that imposes modularity within each form of knowledge. The architecture consists of active modules called object files, each of which maintains the state of a single object and invokes passive external knowledge sources called schemata that prescribe state updates. To use a video game as an illustration, two enemies of the same type will share schemata, but each will have its own object file to encode its distinct state (e.g., health, position). We propose to use attention to determine which object files update, to select schemata, and to propagate information between object files. The resulting architecture is a drop-in replacement conforming to the same input-output interface as normal recurrent networks (e.g., LSTM, GRU) yet achieves substantially better generalization on environments like Atari, rolling balls, and visual reasoning.
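
To make the factorization concrete, the following is a minimal NumPy sketch of one recurrent step with the three attention-controlled stages named in the abstract: choosing which object files update, selecting a schema for each, and passing information between object files. It is an illustration under stated assumptions, not the authors' implementation: the class name `ObjectFilesAndSchemata`, the top-k activation rule, the hard argmax schema choice, the plain tanh update, and every weight matrix and dimension are hypothetical choices that the abstract does not specify.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


class ObjectFilesAndSchemata:
    """Illustrative sketch: N object files (state) share M schemata (update rules)."""

    def __init__(self, n_files=4, n_schemata=2, d_in=3, d_h=8, k_active=2, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            return rng.normal(0.0, 0.3, shape)

        self.k_active = k_active
        # Declarative knowledge: one private state vector per object file.
        self.h = init(n_files, d_h)
        # Procedural knowledge: M shared update rules, independent of n_files.
        self.schemata = init(n_schemata, d_in + d_h, d_h)
        # Hypothetical attention parameters (assumed, not from the paper).
        self.w_in = init(d_in, d_h)    # input relevance per object file
        self.w_sel = init(d_h, d_h)    # schema selection
        self.w_q, self.w_k, self.w_v = init(d_h, d_h), init(d_h, d_h), init(d_h, d_h)

    def step(self, x):
        h_new = self.h.copy()

        # 1. Attention over the input decides which object files update
        #    (assumption: the k most relevant files are activated).
        relevance = self.h @ (x @ self.w_in)              # shape (n_files,)
        active = np.argsort(relevance)[-self.k_active:]

        # 2. Each active object file evaluates every schema and keeps one
        #    (assumption: hard argmax over a dot-product score).
        chosen = {}
        for i in active:
            inp = np.concatenate([x, self.h[i]])          # shape (d_in + d_h,)
            candidates = np.tanh(np.einsum("j,mjd->md", inp, self.schemata))
            scores = candidates @ (self.h[i] @ self.w_sel)  # shape (n_schemata,)
            m = int(np.argmax(scores))
            chosen[int(i)] = m
            h_new[i] = candidates[m]

        # 3. Active object files read from all object files via attention.
        q = h_new[active] @ self.w_q
        k = self.h @ self.w_k
        v = self.h @ self.w_v
        attn = softmax(q @ k.T / np.sqrt(k.shape[1]))
        h_new[active] += attn @ v

        # Inactive object files keep their previous state unchanged.
        self.h = h_new
        return h_new, chosen


model = ObjectFilesAndSchemata()
states, chosen = model.step(np.array([1.0, -0.5, 0.25]))
print(states.shape, chosen)  # chosen maps active object file index -> schema index
```

Note that the schemata tensor has no dimension tied to the number of object files, so two object files that pick the same schema index are updated by the same parameters while keeping separate states, which mirrors the "two enemies of the same type" example.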