Matching options to tasks using Option-Indexed Hierarchical Reinforcement Learning

Soumya Chatterjee
Akash Reddy
Balaraman Ravindran


The options framework in Hierarchical Reinforcement Learning breaks down overall goals into a combination of options or simpler tasks and associated policies, allowing for abstraction in the action space. Ideally, these options can be reused across different higher-level goals; indeed, many previous approaches have proposed limited forms of transfer of prelearned options to new task settings. We propose a novel "option indexing" approach to hierarchical learning (OI-HRL), where we learn an affinity function between options and the functionalities (or affordances) supported by the environment. This allows us to effectively reuse a large library of pretrained options, in zero-shot generalization at test time, by restricting goal-directed learning to only those options relevant to the task at hand. We develop a meta-training loop that learns the representations of options and environment affordances over a series of HRL problems, by incorporating feedback about the relevance of retrieved options to the higher-level goal. In addition to a substantial decrease in sample complexity compared to learning HRL policies from scratch, we also show significant gains over baselines that have the entire option pool available for learning the hierarchical policy.

Research Areas