Google Research

The Gibbs--Rand Model

PODS 2022 (to appear)

Abstract

Thanks to its many applications the clustering ensemble problem has been extensively studied in the past. In htis problem we are giving in input $m$ clustering and the objective is to output a clustering that well-represent'' all the input clustering. In this paper, we propose to thee best of our knowledge the first generative model for the problem. Our model is parameterized by acenter'' clustering and a scale; the probability of a particular clustering is an exponential function of its Rand distance to the center, scaled. For this new model, we show: (i) a sampling algorithm that runs in polynomial time when the center has a constant number of clusters and (ii) a simple polynomial time reconstruction algorithm when the scale is small.

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work