Google Research

Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness

Proc. of IJCAI 2021 (to appear)

Abstract

Sparse rewards are double-edged training signals in reinforcement learning: easy to design but hard to optimize. Intrinsic motivation methods have thus been developed to alleviate the resulting exploration problem. They usually incentivize agents to look for new states through novelty signals. Yet such methods encourage exhaustive exploration of the state space rather than focusing on the environment’s salient interaction opportunities. We propose a new exploration method, called Rare Actions Matter (RAM), that shifts the emphasis from state novelty to action novelty. While most actions consistently modify the state when used, e.g., turning left/right or jumping, some actions are only effective in specific conditions, e.g., opening a door or grabbing an object. RAM detects and rewards actions that only seldom affect the environment. We evaluate RAM on the procedurally generated environment MiniGrid against state-of-the-art methods. Experiments consistently show that RAM greatly reduces sample complexity on simple tasks and is the first method to solve the hardest instances, establishing a new state of the art in MiniGrid.
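
The idea of rewarding actions that seldom affect the environment can be illustrated with a small count-based sketch. This is not the paper's formulation: the abstract gives no reward formula, so the rule `1 - effective / used`, the class name `ActionUsefulnessBonus`, and the use of plain equality to detect a state change are all assumptions made for illustration.

```python
from collections import defaultdict


class ActionUsefulnessBonus:
    """Hypothetical intrinsic bonus based on how rarely an action has an effect.

    Assumption: an action is "effective" when next_state != state.
    Assumption: the bonus is 1 - (effective count / usage count).
    """

    def __init__(self):
        self.used = defaultdict(int)       # times each action was taken
        self.effective = defaultdict(int)  # times each action changed the state

    def __call__(self, state, action, next_state):
        self.used[action] += 1
        if next_state == state:
            return 0.0  # the action did nothing here: no bonus
        self.effective[action] += 1
        # Fraction of uses in which this action changed the state so far.
        rate = self.effective[action] / self.used[action]
        # Actions that almost always work (e.g. turning) earn close to 0;
        # actions that rarely work (e.g. opening a door) earn close to 1.
        return 1.0 - rate


# Usage with hypothetical string-labelled states and actions.
bonus = ActionUsefulnessBonus()
r_move = bonus("s0", "move", "s1")      # "move" has always been effective
for _ in range(3):
    bonus("s1", "open", "s1")           # "open" fails away from a door
r_open = bonus("s1", "open", "s2")      # "open" finally succeeds
```

Under these assumptions, an action that always changes the state earns no bonus, while a successful use of a rarely effective action earns a large one, which matches the shift from state novelty to action novelty described above. In practice the bonus would be added to the sparse extrinsic reward with some weighting, which this sketch leaves out.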
