Learning Fast Adaptation with Meta Strategy Optimization

Erwin Johan Coumans
Sehoon Ha
Learning Fast Adaptation with Meta Strategy Optimization(2020)
Google Scholar


The ability to walk in new situations is a key milestone on the path toward real-world applications of legged robots. In this work, we introduce a novel algorithm for training locomotion policies for legged robots that can quickly adapt to new scenarios with a handful of trials in the target environment. We extend the framework of strategy optimization that trains a control policy with additional latent parameters in the simulation and transfers to the real robot by optimizing the latent inputs. The key idea in our proposed algorithm, Meta Strategy Optimization (MSO), is to formulate the problem as a meta-learning process by exposing the same strategy optimization to both the training and testing phases. This change allows MSO to effectively learn locomotion skills as well as a latent space that is suitable for fast adaptation. We evaluate our method on a real quadruped robot and demonstrate successful adaptation in various scenarios, including sim-to-real transfer, walking with a weakened motor, or climbing up a slope. Furthermore, we analyze the generalization capability of the trained policy in simulated environments and show that our method outperforms previous methods in both simulated and real environments.