Show Me the Way: Intrinsic Motivation from Demonstrations
Abstract
In reinforcement learning, exploration of sparse-reward environments remains a major challenge. Most algorithms introduced to tackle this issue rely on an intrinsic motivation derived from the notion of curiosity. While randomness alone allows only very local exploration, these methods generally lead to a more exhaustive search of the state space and thus a higher chance of obtaining any reward. However, in many environments, exhaustive exploration is impossible due to the number of states and actions. Moreover, it is generally not even desirable, as most behaviours in a realistic setting are, to a human, obviously meaningless.
We propose to extract an intrinsic bonus from exploratory demonstrations. We show how to learn this bonus and how it conveys the demonstrator's way of exploring its environment.