Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning

Alexander Toshkov Toshev
Brian Andrew Ichter
Dhruv Shah
Peng Xu
Sergey Levine
Yao Lu
ICLR (2022)
Abstract

Reinforcement learning can train policies that effectively perform complex tasks. However, the performance of these methods degrades as the horizon increases, and performing long-horizon tasks often requires reasoning over and composing multiple lower-level skills. Hierarchical reinforcement learning aims to enable this by providing a bank of low-level skills as action abstractions, in the form of primitives or options. An effective hierarchy, however, should exhibit abstraction both in the space of actions and in the space of states. We posit that a suitable state abstraction for the higher-level policy should depend on the capabilities of the available lower-level policies, and we propose an approach that produces such a representation by using the value functions corresponding to each lower-level skill to capture the affordances for these skills. The resulting abstraction is compact, represents the affordances of the scene, and is robust to distractors. We evaluate our approach in two domains: a long-horizon maze-solving task and a complex image-based robotic manipulation simulator. In both settings, we show empirically that, when provided with a suitable bank of skills, our approach improves long-horizon performance and enables better zero-shot generalization than popular model-free and model-based methods, as well as alternative state representation learning methods.
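
To make the core idea concrete, the following is a minimal sketch (not the authors' implementation) of how a skill-centric state abstraction can be formed from a bank of low-level skills: each skill contributes its value estimate for the current observation, and the stacked vector of these estimates serves as the state for the higher-level policy. The names `vfs_embedding`, `skill_value_fns`, and `high_level_policy` are illustrative assumptions.

```python
import numpy as np

def vfs_embedding(observation, skill_value_fns):
    """Map a raw observation to a skill-centric abstract state.

    skill_value_fns is a list of K callables, one per low-level skill,
    each returning a scalar value estimate for the observation. The k-th
    entry of the result estimates how well skill k is afforded from the
    current state, so the embedding compactly captures scene affordances
    while ignoring task-irrelevant distractors.
    """
    return np.array([v(observation) for v in skill_value_fns])

# Usage sketch: the higher-level policy operates on the K-dimensional
# abstraction rather than on the raw observation, e.g.
#   skill_index = high_level_policy(vfs_embedding(obs, skill_value_fns))
# where high_level_policy selects which low-level skill to execute next.
```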