Adaptive scale-invariant online algorithms for learning linear models
Abstract
We consider online learning with linear models, where the algorithm predicts on sequentially revealed instances (feature vectors) and is compared against the best linear function (comparator) in hindsight. Popular algorithms in this framework, such as Online Gradient Descent (OGD), have parameters (learning rates) which ideally should be tuned based on the scales of the features and the optimal comparator, but these quantities only become available at the end of the learning process. In this paper, we resolve the tuning problem by proposing online algorithms whose predictions are invariant under arbitrary rescaling of the features. The algorithms have no parameters to tune, do not require any prior knowledge of the scale of the instances or the comparator, and achieve regret bounds matching (up to a logarithmic factor) those of OGD with optimally tuned separate learning rates per dimension, while retaining comparable runtime performance.
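To make the scale-invariance property concrete, the following is a minimal toy sketch (not the algorithm proposed in this paper): a per-coordinate update whose step size is divided by the squared running maximum of each feature's magnitude. The names run_scale_invariant and absolute_loss_grad, and the choice of absolute loss, are illustrative assumptions only; they are used here solely to show that such a normalization makes the sequence of predictions unchanged under coordinate-wise rescaling of the features.

```python
import numpy as np

def absolute_loss_grad(pred, y):
    # Derivative of |pred - y| w.r.t. pred (subgradient at 0 taken as 0).
    return np.sign(pred - y)

def run_scale_invariant(X, y, eta=1.0):
    """Toy per-coordinate update whose predictions do not change when
    features are rescaled coordinate-wise.  Illustrative only; this is
    not the parameter-free algorithm developed in the paper."""
    T, d = X.shape
    w = np.zeros(d)                 # start at zero so weights co-scale with 1/feature scale
    max_abs = np.zeros(d)           # running per-coordinate feature magnitude
    preds = np.zeros(T)
    for t in range(T):
        x = X[t]
        preds[t] = w @ x
        max_abs = np.maximum(max_abs, np.abs(x))
        g = absolute_loss_grad(preds[t], y[t]) * x
        scale = np.where(max_abs > 0, max_abs ** 2, 1.0)
        w -= eta * g / scale        # step size ~ 1 / (feature scale)^2
    return preds

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

S = np.array([1e3, 1e-2, 5.0])      # arbitrary per-coordinate rescaling
p1 = run_scale_invariant(X, y)
p2 = run_scale_invariant(X * S, y)
print(np.allclose(p1, p2))          # True: predictions are unchanged
```

Rescaling coordinate i by c_i multiplies both the gradient component and the running magnitude by c_i, so the weight learned for that coordinate shrinks by 1/c_i and the inner products w @ x, and hence the predictions and losses, are unaffected. Note that this toy sketch still carries a global step size eta, whereas the algorithms in the paper have no parameters to tune.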