The difficulty of passive learning in deep reinforcement learning

Georg Ostrovski
Will Dabney
NeurIPS 2021


Offline reinforcement learning, which uses observational data instead of active environmental interaction, has been shown to be a challenging problem. Recent solutions typically involve constraints on the learner’s policy, preventing strong deviations from the state-action distribution of the dataset. Although the suggested methods are evaluated using non-linear function approximation, their theoretical justifications are mostly limited to the tabular or linear cases. Given the impressive results of deep reinforcement learning, we argue for a clearer understanding of the challenges in this setting. In the vein of Held & Hein's classic 1963 experiment, we propose “tandem learning”, an experimental paradigm which facilitates our in-depth empirical analysis of the difficulties in offline reinforcement learning. We identify function approximation in conjunction with inadequate data distributions as the strongest factors, thereby extending but also challenging certain assumptions made in past work. Our results provide a more principled view, and new insights on potential directions for future work on offline reinforcement learning.
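The tandem paradigm described above can be illustrated with a minimal sketch: an "active" agent learns from its own interactions while a "passive" agent receives the identical stream of transitions and applies the same learning updates, but never controls the environment. The toy chain MDP, agent setup, and hyperparameters below are illustrative assumptions, not the paper's actual experimental configuration (which uses deep Q-networks on Atari).

```python
import numpy as np

# Hedged sketch of a tandem setup with tabular Q-learning.
# The 3-state chain MDP and hyperparameters are invented for illustration.
rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 3, 2          # action 1 moves right, action 0 resets
GAMMA, ALPHA, EPS = 0.9, 0.1, 0.3

def step(s, a):
    """Deterministic chain: moving right off the last state pays reward 1."""
    if a == 1:
        if s == N_STATES - 1:
            return 0, 1.0           # reward, then reset to the start
        return s + 1, 0.0
    return 0, 0.0                   # action 0 resets with no reward

q_active = np.zeros((N_STATES, N_ACTIONS))
q_passive = np.zeros((N_STATES, N_ACTIONS))

s = 0
for _ in range(20_000):
    # Only the active agent's Q-values influence behaviour.
    if rng.random() < EPS:
        a = int(rng.integers(N_ACTIONS))
    else:
        a = int(np.argmax(q_active[s]))
    s_next, r = step(s, a)
    # Both agents apply the identical Q-learning update to the same transition.
    for q in (q_active, q_passive):
        td = r + GAMMA * q[s_next].max() - q[s, a]
        q[s, a] += ALPHA * td
    s = s_next

# In this tabular case the passive learner's table is identical to the
# active one at every step, since both see the same data and updates.
print(np.array_equal(q_active, q_passive))
```

In the tabular setting the passive learner trivially matches the active one, since the updates are identical. The paper's empirical finding is that this correspondence breaks down when function approximation is combined with an inadequate data distribution, which is what the tandem experiments are designed to isolate.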
