Consistent k-clustering

ICML '17 (2017)

Abstract

We introduce the {\em consistent $k$-clustering} problem, which asks to minimize the number of re-clusterings necessary to maintain a constant approximate solution as points arrive in an online manner. We prove lower bounds of $\Omega(k \log n)$ on the number of clustering changes necessary, and give an algorithm that needs only $O(k^3 \log^5 n)$ changes to maintain a constant competitive solution. This is an exponential improvement on the naive solution of reclustering at every time step. Finally, we show experimentally that our approach performs much better than the theoretical bound, with the number of changes growing approximately as $O(\log n)$