What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study

Marcin Andrychowicz; Anton Raichuk; Piotr Michal Stanczyk; Manu Orsini; Sertan Girgin; Raphaël Marinier; Léonard Hussenot; Matthieu Geist; Olivier Pietquin; Marcin Michalski; Sylvain Gelly; Olivier Frederic Bachem

ICLR (2021)

Abstract

In recent years, reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations involve numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literature, leading to a discrepancy between published descriptions of algorithms and their implementations. This makes it hard to attribute progress in RL and slows down overall progress [Engstrom'20]. As a step towards filling that gap, we implement more than 50 such "choices" in a unified on-policy deep actor-critic framework, allowing us to investigate their impact in a large-scale empirical study. We train over 250,000 agents in five continuous control environments of different complexity and provide insights and practical recommendations for the training of on-policy deep actor-critic RL agents.

Research Areas

Machine intelligence
