Google Research

Resilient Computing with Reinforcement Learning on a Dynamical System: Case study in Sorting

57th IEEE Conference on Decision and Control (2018)


This paper poses general computation as a feedback-control problem. This formulation allows the agent to autonomously overcome some limitations of standard procedural-language programming, namely limited resilience to errors and to early program termination. Our formulation treats computation as trajectory generation in the program's variable space. Computing is then posed as a sequential decision-making problem, solved with reinforcement learning (RL), and analyzed with Lyapunov stability theory to assess the agent's progression to the goal and its resilience. We do this through a case study on a quintessential computer science problem, array sorting. Evaluations show that our RL sorting agent makes steady progress to an asymptotically stable goal, is resilient to faulty components, and performs fewer array manipulations than traditional Quicksort and Bubble sort.
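To make the feedback-control view concrete, the following is a minimal illustrative sketch and not the paper's learned agent. It assumes the state is the array itself, an action is a swap of two positions, and the inversion count serves as a Lyapunov-like measure of distance to the sorted goal. A hand-written greedy policy stands in for the RL policy, and the fault model (an occasional random swap replacing the chosen action) is also an assumption. All function names are hypothetical.

```python
import random


def inversions(arr):
    """Count out-of-order pairs; zero exactly when arr is sorted."""
    n = len(arr)
    return sum(1 for i in range(n) for j in range(i + 1, n) if arr[i] > arr[j])


def greedy_policy(arr):
    """Stand-in for a learned policy (assumption): choose the swap (i, j)
    that lowers the inversion count the most. Returns None if arr is sorted."""
    base = inversions(arr)
    best, best_gain = None, 0
    n = len(arr)
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] > arr[j]:
                arr[i], arr[j] = arr[j], arr[i]      # try the swap
                gain = base - inversions(arr)
                arr[i], arr[j] = arr[j], arr[i]      # undo it
                if gain > best_gain:
                    best, best_gain = (i, j), gain
    return best


def run(arr, fault_rate=0.0, max_steps=10_000, seed=0):
    """Closed-loop sorting: observe the state, act, repeat until sorted.

    With probability fault_rate the chosen action is replaced by a random
    swap (hypothetical fault model). Returns the final array and the
    inversion count observed at every step."""
    rng = random.Random(seed)
    arr = list(arr)
    history = [inversions(arr)]
    while history[-1] > 0 and len(history) <= max_steps:
        i, j = greedy_policy(arr)
        if rng.random() < fault_rate:
            i, j = rng.sample(range(len(arr)), 2)    # faulty component
        arr[i], arr[j] = arr[j], arr[i]
        history.append(inversions(arr))
    return arr, history


final, history = run([3, 1, 2])
print(final, history)
```

Because the policy re-reads the array at every step rather than following a fixed instruction sequence, a faulty swap only pushes the state further from the goal and the loop recovers on later steps. Without faults, the inversion count decreases at every step, which is the kind of monotone progress a Lyapunov argument formalizes.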
