A Connection between Actor Regularization and Critic Regularization in Reinforcement Learning

Benjamin Eysenbach

Matthieu Geist

Ruslan Salakhutdinov

Sergey Levine

International Conference on Machine Learning (ICML) (2023)

Abstract

As with any machine learning problem with limited data, effective offline RL algorithms require careful regularization to avoid overfitting, with most methods regularizing either the actor or the critic. These methods appear distinct. Actor regularization (e.g., behavioral cloning penalties) is simpler and has appealing convergence properties, while critic regularization typically requires significantly more compute because it involves solving a game, but it has appealing lower-bound guarantees. Empirically, prior work alternates between claiming better results with actor regularization and critic regularization. In this paper, we show that these two regularization techniques can be equivalent under some assumptions: regularizing the critic using a CQL-like objective is equivalent to updating the actor with a BC-like regularizer and with a SARSA Q-value (i.e., “1-step RL”). Our experiments show that this theoretical model makes accurate, testable predictions about the performance of CQL and one-step RL. While our results do not definitively say whether users should prefer actor regularization or critic regularization, they hint that actor regularization methods may be a simpler way to achieve the desirable properties of critic regularization. The results also suggest that the empirically demonstrated benefits of both types of regularization might be more a function of implementation details than of objective superiority.
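To make the "SARSA Q-value plus BC-like regularizer" recipe concrete, here is a minimal tabular sketch of generic one-step RL. It is an illustration under stated assumptions, not the paper's exact objective: the function names, the toy dataset, and the choice of a KL(pi || beta) penalty with temperature `temperature` are all assumptions made for this example. With that penalty, the regularized improvement step has the well-known closed form pi(a|s) proportional to beta(a|s) * exp(Q(s,a) / temperature).

```python
import numpy as np

def sarsa_q(transitions, n_states, n_actions, gamma=0.9, lr=0.1, epochs=200):
    """Estimate the behavior policy's Q-values from logged
    (s, a, r, s_next, a_next, done) tuples with SARSA-style updates."""
    q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, r, s_next, a_next, done in transitions:
            # Bootstrap from the action actually taken in the data (SARSA),
            # never from a max over actions.
            target = r + (0.0 if done else gamma * q[s_next, a_next])
            q[s, a] += lr * (target - q[s, a])
    return q

def behavior_policy(transitions, n_states, n_actions, smoothing=1e-3):
    """Empirical behavior policy beta(a|s) from action counts (smoothed)."""
    counts = np.full((n_states, n_actions), smoothing)
    for s, a, *_ in transitions:
        counts[s, a] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def one_step_improvement(q, beta, temperature=1.0):
    """Single policy-improvement step with a KL(pi || beta) penalty
    (an assumed, commonly used BC-like regularizer):
    pi(a|s) is proportional to beta(a|s) * exp(Q(s, a) / temperature)."""
    logits = np.log(beta) + q / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    pi = np.exp(logits)
    return pi / pi.sum(axis=1, keepdims=True)

# Hypothetical toy dataset: in state 0, action 1 pays reward 1 and action 0
# pays 0; every transition is terminal. The logged data prefers action 0.
transitions = [
    (0, 0, 0.0, 1, 0, True),
    (0, 1, 1.0, 1, 0, True),
    (0, 0, 0.0, 1, 0, True),
]
q = sarsa_q(transitions, n_states=2, n_actions=2)
beta = behavior_policy(transitions, n_states=2, n_actions=2)
pi = one_step_improvement(q, beta, temperature=1.0)
```

In this sketch the improved policy puts more mass on the rewarding action than the behavior policy does, while the `np.log(beta)` term keeps it close to the data. The critic is trained only once, on the behavior policy, and the actor is updated only once; that is what distinguishes one-step RL from methods that alternate critic and actor updates.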
