Logarithmic Regret for Online Control

Elad Hazan; Karan Singh; Naman Agarwal

(2019)

Abstract

We study optimal regret bounds for control in linear dynamical systems with adversarially changing, strongly convex cost functions. This framework includes several well-studied and influential algorithms, such as the Kalman filter and the linear quadratic regulator. State-of-the-art methods achieve regret that scales as $O(\sqrt{T})$, where $T$ is the time horizon, or number of iterations.

We show that the optimal regret in this fundamental setting can be significantly smaller, and scales as $O(\mathrm{poly}(\log T))$, closing the gap in the literature between known upper and lower bounds. This regret bound is achieved by an efficient iterative gradient-based method.
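
For orientation, the quantities named in the abstract can be written out as follows. This is a sketch of a common formulation of online control, not text quoted from the paper: the symbols $A$, $B$, $x_t$, $u_t$, $w_t$, $c_t$ and the comparator class $\Pi$ are notational assumptions introduced here.

$$
x_{t+1} = A x_t + B u_t + w_t, \qquad
\mathrm{Regret}_T = \sum_{t=1}^{T} c_t(x_t, u_t) \;-\; \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\big(x_t^{\pi}, u_t^{\pi}\big).
$$

Here $x_t$ is the state, $u_t$ the control input, $w_t$ a disturbance, and $c_t$ the strongly convex cost revealed at step $t$; $(x_t^{\pi}, u_t^{\pi})$ denotes the trajectory a fixed comparator policy $\pi$ would have produced under the same disturbances. Under this reading, an $O(\sqrt{T})$ bound gives average regret of order $1/\sqrt{T}$, while an $O(\mathrm{poly}(\log T))$ bound gives average regret of order $\mathrm{poly}(\log T)/T$.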

Research Areas

Machine intelligence
