Near-Optimal Correlation Clustering with Privacy

Ashkan Norouzi Fard
Chenglin Fan
Jakub Tarnawski
Slobodan Mitrović
NeurIPS 2022 (to appear)

Abstract

Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labeling, and many more. In the correlation clustering problem, one receives as input a set of nodes and, for each node, a list of co-clustering preferences; the goal is to output a clustering that minimizes disagreement with the nodes' specified preferences. In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees. Our additive error bound is stronger than the one shown in prior work and is optimal up to polylogarithmic factors for fixed privacy parameters.
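To make the objective concrete, the following minimal sketch counts the disagreements of a given clustering. It illustrates only the cost being minimized, not the paper's algorithm and not any privacy mechanism. The function name `disagreements`, the encoding of preferences as "together" and "apart" pairs, and the toy instance are assumptions made for illustration.

```python
def disagreements(clusters, together, apart):
    """Count the preferences that a clustering violates.

    clusters: dict mapping each node to a cluster label
    together: pairs (u, v) that prefer to share a cluster (assumed encoding)
    apart:    pairs (u, v) that prefer to be separated (assumed encoding)
    """
    cost = 0
    for u, v in together:
        if clusters[u] != clusters[v]:
            cost += 1  # a "together" pair was split
    for u, v in apart:
        if clusters[u] == clusters[v]:
            cost += 1  # an "apart" pair was merged
    return cost


# Hypothetical toy instance: a-b and b-c want to be together, a-c apart.
together = [("a", "b"), ("b", "c")]
apart = [("a", "c")]

one_cluster = {"a": 0, "b": 0, "c": 0}
split_c = {"a": 0, "b": 0, "c": 1}
singletons = {"a": 0, "b": 1, "c": 2}

print(disagreements(one_cluster, together, apart))  # 1
print(disagreements(split_c, together, apart))      # 1
print(disagreements(singletons, together, apart))   # 2
```

In this toy instance the preferences are mutually inconsistent, so no clustering reaches zero disagreements; the best achievable cost is 1.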