Google Research

Theoretical Advantages of Lenient Learners in Multiagent Systems

Proceedings of the Sixth International Conference on Autonomous Agents and Multi-agent Systems (AAMAS-07), ACM (2007)


This paper presents the dynamics of multiple reinforcement learning agents from an Evolutionary Game Theoretic (EGT) perspective. We provide a Replicator Dynamics model for traditional multiagent Q-learning, and we then extend these differential equations to account for lenient learners: agents that forgive teammates' possible mistakes that resulted in lower rewards. We use this extended formal model to visualize the basins of attraction of both traditional and lenient multiagent Q-learners in two benchmark coordination problems. The results indicate that lenience gives learners more accurate estimates of the utility of their actions, resulting in a higher likelihood of convergence to the globally optimal solution. In addition, our research supports the strength of EGT as a backbone for multiagent reinforcement learning.
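To make the idea of lenience concrete, the following is a minimal sketch, not the paper's model. It assumes plain two-population replicator dynamics integrated with Euler steps and omits any exploration term that a Q-learning-specific model would carry. Lenience is approximated by scoring each action with the expected maximum of `leniency` independently sampled rewards. The 2x2 payoff matrix, the starting strategies, and all function and parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np

def expected_max_utility(payoff, partner, leniency):
    """Per-action expected maximum of `leniency` rewards, where each reward
    is drawn by sampling the partner's action from its mixed strategy.
    With leniency=1 this reduces to the ordinary expected payoff."""
    utils = np.zeros(payoff.shape[0])
    for i, row in enumerate(payoff):
        prev_cdf = 0.0
        for v in np.unique(row):  # distinct rewards in ascending order
            # P(max of N draws <= v) = P(single draw <= v) ** N
            cdf = partner[row <= v].sum() ** leniency
            utils[i] += v * (cdf - prev_cdf)
            prev_cdf = cdf
    return utils

def replicator_step(x, y, payoff, leniency=1, dt=0.01):
    """One Euler step of two-population replicator dynamics for a
    cooperative game in which both agents receive payoff[i, j]."""
    ux = expected_max_utility(payoff, y, leniency)
    uy = expected_max_utility(payoff.T, x, leniency)
    x_new = x + dt * x * (ux - x @ ux)
    y_new = y + dt * y * (uy - y @ uy)
    # Renormalize to guard against floating-point drift off the simplex.
    return x_new / x_new.sum(), y_new / y_new.sum()

def run(payoff, x0, y0, leniency=1, steps=5000, dt=0.01):
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    for _ in range(steps):
        x, y = replicator_step(x, y, payoff, leniency, dt)
    return x, y

# Hypothetical coordination game: joint action (0, 0) is optimal,
# (1, 1) is a safer but lower-paying equilibrium, and miscoordination
# is penalized.
payoff = np.array([[10.0, -20.0],
                   [-20.0, 5.0]])
start = [0.3, 0.7]

x_std, y_std = run(payoff, start, start, leniency=1)
x_len, y_len = run(payoff, start, start, leniency=10)
print("standard:", x_std.round(3), "lenient:", x_len.round(3))
```

Under these assumed settings, the non-lenient agents start in the basin of the lower-paying joint action and settle there, while the lenient agents discount the early miscoordination penalties and move toward the optimal joint action. Repeating the run over a grid of starting strategies would give a rough picture of the basins of attraction for this toy game.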
