Near Optimal Heteroscedastic Regression with Symbiotic Learning

Aniket Das
Dheeraj Baby
Conference on Learning Theory (COLT)(2023)


We consider regression where the noise distribution depends on the covariates (i.e, heteroscedastic noise), which captures popular settings such as linear regression with multiplicative noise occuring due to covariate uncertainty. In particular we consider linear regression where the noise variance is an unknown rank-1 quadratic function of the covariates. While an application of least squares regression can achieve an error rate of $\nicefrac{d}{n}$, this ignores the fact that the magnitude of the noise can be very small for certain values of the covariates, which can aid faster learning. Our algorithm \ouralg~runs a parameter estimation algorithm and a noise distribution model learning algorithm are run alternately, using each other's outputs to iteratively obtain better estimates of the parameter and the noise distribution model respectively. This achieves an error rate of $\nicefrac{1}{n} + \nicefrac{d^2}{n^2}$, which we show is minimax optimal up to logarithmic factors. A sub-routine for \ouralg~performs phase estimation with multiplicative noise maybe of independent interest.