Jump to Content

Spectral Clustering Oracles in Sublinear Time

Aidasadat Mousavifar
Christian Alexander Sohler
Grzegorz Gluch
Michael Kapralov
SODA 2021 (to appear)
Google Scholar

Abstract

Given a graph G that can be partitioned into k clusters, can we efficiently construct a small space data structure that allows quickly classifying vertices of G according to cluster they belong to? Formally, if G can be partitioned into k disjoint expanders with outer conductance upper bounded by ∈ (0, 1), how efficient can such a data structure be? In this paper we show that surprisingly efficient and robust data structure exist. In particular we prove that storing approximations to a small number of random walks from a few random nodes in the graph G allows one to classify vertices according to cluster structure of G using only poly(k, log n) · n 1/2+O() n time per vertex, i.e. in sublinear time! This runtime is optimal for small constant values of s and k [Chiplunkar et al’FOCS’18]. To the best of our knowledge, our result is the first spectral clustering algorithm that allows classification in sublinear time even when the outer conductance of the partitions is only a small constant (as opposed to vanishingly small).