Google Research

Logarithmic Regret for Online Control



We study the optimal regret bounds for control in linear dynamical systems with adversarially changing strongly convex cost functions. This framework includes several well-studied and influential settings, such as the Kalman filter and the linear quadratic regulator. State-of-the-art methods achieve regret that scales as $O(\sqrt{T})$, where $T$ is the time horizon, or number of iterations.

We show that the optimal regret in this fundamental setting can be significantly smaller, and scales as $O(\mathrm{poly}(\log T))$, closing the gap in the literature between known upper and lower bounds. This regret bound is achieved by an efficient iterative gradient-based method.
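To give a flavor of what an iterative gradient-based controller looks like, here is a minimal sketch (not the paper's algorithm): gradient descent on the gain $K$ of a linear state-feedback policy $u_t = -K x_t$ for a scalar system $x_{t+1} = a x_t + b u_t$ with a quadratic cost. All parameter values (`a`, `b`, `q`, `r`, the step size `eta`) are illustrative assumptions, and the gradient is approximated by finite differences for simplicity.

```python
def rollout_cost(K, a=0.9, b=1.0, q=1.0, r=0.1, x0=1.0, horizon=200):
    """Total quadratic cost of the policy u = -K x on the noise-free
    scalar system x' = a x + b u, starting from x0."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = -K * x
        total += q * x * x + r * u * u
        x = a * x + b * u
    return total

# Iterative gradient descent on the gain K, using a central
# finite-difference estimate of the gradient of the rollout cost.
K, eta, eps = 0.0, 1e-3, 1e-4
for _ in range(2000):
    grad = (rollout_cost(K + eps) - rollout_cost(K - eps)) / (2 * eps)
    K -= eta * grad

# Sanity check: the learned gain beats the do-nothing policy K = 0.
assert rollout_cost(K) < rollout_cost(0.0)
```

This is a fixed-cost, noise-free toy; the paper's setting has adversarially changing strongly convex costs and requires a more careful online analysis to obtain polylogarithmic regret.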
