Generalization and Learnability in Multiple Instance Regression

Lorne Applebaum
Ashwinkumar Badanidiyuru Varadaraja
Chandan Giri
Proc. UAI (2024)

Abstract

Multiple instance regression (MIR), introduced by Ray & Page (2001) as an analogue of multiple instance learning (MIL), is a setting in which we are given bags of feature-vectors (instances), and for each bag there is a bag-label which matches the label of one (unknown) primary instance from that bag. The goal is to compute a hypothesis regressor consistent with the underlying instance-labels. A natural approach is to find the primary-instance assignment and regressor that jointly optimize (say) the MSE loss on the bags, though no formal generalization guarantees were previously known for this approach. Our work is the first to prove generalization error bounds for MIR when the bags are drawn i.i.d. at random: essentially, with high probability, any MIR regressor with low error on the sampled bags also has low error on the underlying instance-label distribution. We next study the complexity of linear regression on MIR bags, shown to be NP-hard in general by Ray & Page (2001), who nevertheless left open the possibility of arbitrarily good approximations. Significantly strengthening previous work, we prove a strong inapproximability bound: even if there exists a zero-loss MIR linear regressor on a collection of size-2 bags with labels in [−1, 1], it is NP-hard to find an MIR linear regressor with loss < C for some absolute constant C > 0.
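To make the setup concrete, the following is a minimal sketch (not the paper's algorithm) of the natural alternating heuristic described above, assuming NumPy: alternately pick, in each bag, the instance whose current prediction is closest to the bag label, then refit a least-squares regressor on the chosen instances. All function and variable names are illustrative.

import numpy as np

def mir_alternating_fit(bags, labels, n_iters=50, seed=0):
    # bags: list of (n_i, d) arrays; labels: length-m array of bag labels.
    d = bags[0].shape[1]
    w = np.random.default_rng(seed).normal(size=d)
    for _ in range(n_iters):
        # Assignment step: in each bag, select the instance whose current
        # prediction best matches the bag label (smallest squared residual).
        chosen = [bag[np.argmin((bag @ w - y) ** 2)]
                  for bag, y in zip(bags, labels)]
        # Regression step: refit by least squares on the selected instances.
        w, *_ = np.linalg.lstsq(np.stack(chosen), np.asarray(labels), rcond=None)
    return w

Note that each assignment step weakly decreases the bag MSE, as does each regression step, so the heuristic converges to a local optimum; the paper's hardness result indicates why no such local procedure can guarantee near-optimal loss in general.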

Our work also proposes two novel model training methods on MIR bags, based on (i) a weighted assignment loss and (ii) EM pseudo-labeling, and handles the case of overlapping bags, which has not previously been studied. We conduct extensive empirical evaluations on synthetic and real-world datasets, showing that our methods outperform baseline MIR methods.
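The abstract does not spell out either training method, so the following is a hedged sketch of what a weighted-assignment-style loss could look like, assuming NumPy: each instance's squared error is weighted by a softmax over how well it currently explains its bag's label. The softmax form and the temperature tau are illustrative assumptions, not the paper's exact loss.

import numpy as np

def weighted_assignment_loss(w, bags, labels, tau=1.0):
    # Soft version of the hard primary-instance assignment: instances that
    # better explain the bag label receive more of the bag's loss weight.
    total = 0.0
    for bag, y in zip(bags, labels):
        residuals = (bag @ w - y) ** 2          # per-instance squared errors
        weights = np.exp(-residuals / tau)      # higher weight for better fits
        weights /= weights.sum()                # soft assignment over the bag
        total += weights @ residuals            # weighted bag loss
    return total / len(bags)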
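Similarly, here is a hedged sketch of an EM-style pseudo-labeling loop that also accommodates overlapping bags: an instance appearing in several bags accumulates responsibility-weighted label mass from each of them, which is then averaged into a single pseudo-label. The responsibility weights, aggregation rule, and parameter names are assumptions for illustration, not the paper's exact procedure.

import numpy as np

def em_pseudo_label_fit(X, bag_indices, labels, n_iters=30, tau=1.0, seed=0):
    # X: (n, d) matrix of all instances; bag_indices[j]: indices of the
    # instances in bag j (bags may overlap); labels[j]: label of bag j.
    n, d = X.shape
    w = np.random.default_rng(seed).normal(size=d)
    for _ in range(n_iters):
        num = np.zeros(n)
        den = np.zeros(n)
        for idx, y in zip(bag_indices, labels):
            idx = np.asarray(idx)
            resid = (X[idx] @ w - y) ** 2
            resp = np.exp(-resid / tau)
            resp /= resp.sum()        # E-step: within-bag responsibilities
            num[idx] += resp * y      # overlapping bags accumulate here
            den[idx] += resp
        covered = den > 0
        pseudo = num[covered] / den[covered]   # per-instance pseudo-labels
        # M-step: responsibility-weighted least squares on the pseudo-labels.
        sw = np.sqrt(den[covered])
        w, *_ = np.linalg.lstsq(X[covered] * sw[:, None],
                                pseudo * sw, rcond=None)
    return w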