Google Research

The Gibbs--Rand Model

PODS 2022 (to appear)


Thanks to its many applications the clustering ensemble problem has been extensively studied in the past. In htis problem we are giving in input $m$ clustering and the objective is to output a clustering that well-represent'' all the input clustering. In this paper, we propose to thee best of our knowledge the first generative model for the problem. Our model is parameterized by acenter'' clustering and a scale; the probability of a particular clustering is an exponential function of its Rand distance to the center, scaled. For this new model, we show: (i) a sampling algorithm that runs in polynomial time when the center has a constant number of clusters and (ii) a simple polynomial time reconstruction algorithm when the scale is small.

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work