- Dheeraj Nagaraj
- Doina Precup
- Gandharv Patil
- Prashanth L A
We study the finite-time behaviour of the popular temporal difference (TD) learning algorithm when combined with tail-averaging. We derive finite-time bounds on the parameter error of the tail-averaged TD iterate under a step-size choice that does not require information about the eigenvalues of the matrix underlying the projected TD fixed point. Our analysis shows that tail-averaged TD converges at the optimal $O\left(1/t\right)$ rate, both in expectation and with high probability. In addition, our bounds exhibit a sharper rate of decay for the initial error (bias), which is an improvement over averaging all iterates. We also propose and analyze a variant of TD that incorporates regularization. From our finite-time analysis, we conclude that the regularized version of TD is useful for problems with ill-conditioned features.
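To make the tail-averaging idea concrete, here is a minimal Python sketch of TD(0) with linear function approximation, a constant eigenvalue-free step size, and averaging over only the final fraction of iterates. The function name `tail_averaged_td`, the `tail_frac` and `reg` parameters, and the toy random-walk chain are illustrative assumptions, not the paper's exact algorithm, step-size schedule, or regularization scheme.

```python
import numpy as np

def tail_averaged_td(transitions, features, gamma=0.9, step=0.1,
                     tail_frac=0.5, reg=0.0):
    """TD(0) with linear function approximation and tail-averaging.

    transitions: list of (state, reward, next_state) sample tuples
    features:    (num_states, d) feature matrix
    tail_frac:   fraction of final iterates to average (hypothetical knob)
    reg:         optional ridge-style regularization strength (a sketch of
                 the regularized variant; the paper's form may differ)
    """
    d = features.shape[1]
    theta = np.zeros(d)
    iterates = []
    for s, r, s_next in transitions:
        phi, phi_next = features[s], features[s_next]
        # Semi-gradient TD(0) update with a constant step size that needs
        # no knowledge of the eigenvalues of the underlying matrix.
        td_error = r + gamma * phi_next @ theta - phi @ theta
        theta = theta + step * (td_error * phi - reg * theta)
        iterates.append(theta.copy())
    # Tail-averaging: average only the last `tail_frac` of the iterates,
    # which forgets the initial error (bias) faster than averaging all.
    start = int(len(iterates) * (1.0 - tail_frac))
    return np.mean(iterates[start:], axis=0)


# Toy usage: a 5-state chain with uniform random transitions and
# random features (purely illustrative data).
rng = np.random.default_rng(0)
num_states, d = 5, 3
features = rng.standard_normal((num_states, d))
transitions, s = [], 0
for _ in range(10_000):
    s_next = int(rng.integers(num_states))
    r = 1.0 if s_next == num_states - 1 else 0.0
    transitions.append((s, r, s_next))
    s = s_next

theta_tail = tail_averaged_td(transitions, features)
print(theta_tail)
```

Averaging only the tail (here the last half) is what distinguishes this from full Polyak averaging: early, far-from-fixed-point iterates are discarded rather than diluted into the average, which is the mechanism behind the sharper bias-decay rate claimed above.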