Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions

Ale Escontrela

Atil Iscen

Jason Peng

Ken Goldberg

Pieter Abbeel

Tingnan Zhang

Wenhao Yu

2022 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (2022) (to appear)


Abstract

Training high-dimensional simulated agents with under-specified reward functions often leads to jerky and unnatural behaviors, resulting in physically infeasible strategies that are generally ineffective when deployed in the real world. To mitigate these unnatural behaviors, reinforcement learning (RL) practitioners often use complex reward functions that encourage more physically plausible behaviors, in conjunction with tricks such as domain randomization, to train policies that satisfy the user's style criteria and can be successfully deployed on real robots. Such an approach has been successful in the realm of legged locomotion, leading to state-of-the-art results. However, designing effective reward functions can be a labor-intensive and tedious tuning process, and these hand-designed rewards do not easily generalize across platforms and tasks. We propose substituting complex reward functions with "style rewards" learned from a dataset of motion capture demonstrations. This learned style reward can be combined with a simple task reward to train policies that perform tasks using naturalistic strategies. These more natural strategies can also facilitate transfer to the real world. We build upon prior work in computer graphics and demonstrate that an adversarial approach to training control policies can produce behaviors that transfer to a real quadrupedal robot without requiring complex reward functions. We also demonstrate that an effective style reward can be learned from a few seconds of motion capture data gathered from a German Shepherd and leads to energy-efficient locomotion strategies with natural gait transitions.
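The abstract does not give formulas, so the following is only a minimal sketch of how a learned style reward might be combined with a simple task reward. Everything in it is an assumption made for illustration: the function names, the bounded least-squares mapping from a discriminator score to a reward, the velocity-tracking task reward, and the equal weights are hypothetical and are not taken from this page.

```python
import numpy as np


def style_reward(disc_score):
    """Map a discriminator score to a bounded reward in [0, 1].

    Assumption: the discriminator is trained to output +1 on reference
    motion-capture transitions and -1 on policy transitions, so a score
    of +1 means "looks like the demonstrations".
    """
    return np.maximum(0.0, 1.0 - 0.25 * (disc_score - 1.0) ** 2)


def task_reward(commanded_velocity, actual_velocity):
    """Hypothetical simple task reward: track a commanded base velocity."""
    error = np.sum((np.asarray(commanded_velocity) - np.asarray(actual_velocity)) ** 2)
    return np.exp(-error)


def combined_reward(disc_score, commanded_velocity, actual_velocity,
                    w_style=0.5, w_task=0.5):
    """Weighted sum of style and task rewards (weights are assumed values)."""
    return (w_style * style_reward(disc_score)
            + w_task * task_reward(commanded_velocity, actual_velocity))
```

In this sketch the policy is never given hand-written terms for smoothness, foot clearance, or energy use; whatever "naturalness" it acquires would have to come from the discriminator score alone, which is the substitution the abstract argues for.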

Research Areas

Machine Intelligence
