Logarithmic Regret for Online Control
Abstract
We study the optimal regret bounds for control in linear dynamical systems with adversarially changing strongly convex cost functions. This framework includes several well-studied and influential algorithms such as the Kalman filter and the linear quadratic regulator. State-of-the-art methods achieve regret that scales as $O(\sqrt{T})$, where $T$ is the time horizon, or number of iterations.
We show that the optimal regret in this fundamental setting can be significantly smaller, scaling as $O(\mathrm{poly}(\log T))$, closing the gap in the literature between known upper and lower bounds. This regret bound is achieved by an efficient iterative gradient-based method.