Clustering without Over-Representation

Abstract

Clustering is a fundamental problem in unsupervised machine learning. In many applications, clustering must be performed in the presence of additional constraints, such as fairness or diversity constraints. In this paper, we formulate the problem of k-center clustering without over-representation and present approximation algorithms for it, along with hardness results. We empirically evaluate our clusterings on real-world data and show that fairness can be obtained with limited effect on clustering quality.