AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale

Yao Lu
Karol Hausman
Yevgen Chebotar
Mengyuan Yan
Eric Victor Jang
Alexander Herzog
Mohi Khansari
Dmitry Kalashnikov
Sergey Levine
Conference on Robot Learning (2021)


This paper proposes a new algorithm, AW-Opt, that combines imitation learning (IL) and reinforcement learning (RL). Prior methods face significant difficulty on robotics tasks with sparse rewards and image-based inputs. Through careful design of the sample filtering strategy, the exploration strategy, and the Bellman equation, AW-Opt outperforms existing state-of-the-art (SOTA) algorithms. Experimental results in simulation and on real robots show that AW-Opt achieves a reasonable success rate from initial demonstrations, maintains low inference time, fine-tunes to a SOTA success rate, and uses far fewer samples than existing algorithms.
