On Learning Mixtures of Linear Regressions in a Non-Realizable Setting

Arya Mazumdar
Avishek Ghosh
Soumyabrata Pal
On Learning Mixture of Linear Regressions in the Non-Realizable Setting (2022)

Abstract

While mixture of linear regression is a well-studied topic, prior works in the literature usually focus on the {\em realizable} setting, i.e., when data is generated by a mixed linear (noisy) model. In this paper we show that a version of the popular AM algorithm finds out the best fit lines in a dataset even when a realizable model is not assumed, under some regularity conditions on the dataset and the initial point. In general, finding the best fit lines in the dataset involved solving a nonconvex optimization problem. We further provide an algorithm that runs in polynomial time in the number of datapoints, and recovers a good approximation of the best fit lines. In addition, we show that, mixture of linear regression can be used in a {\em list} prediction framework, with small prediction error via solving the aforementioned optimization problem.

Research Areas