Spectral Clustering Oracles in Sublinear Time
Abstract
Given a graph G that can be partitioned into k clusters, can we efficiently construct a small
space data structure that allows quickly classifying vertices of G according to cluster they belong to?
Formally, if G can be partitioned into k disjoint expanders with outer conductance upper bounded
by ∈ (0, 1), how efficient can such a data structure be?
In this paper we show that surprisingly efficient and robust data structure exist. In particular we
prove that storing approximations to a small number of random walks from a few random nodes in the
graph G allows one to classify vertices according to cluster structure of G using only poly(k, log n) ·
n
1/2+O() n time per vertex, i.e. in sublinear time! This runtime is optimal for small constant
values of s and k [Chiplunkar et al’FOCS’18].
To the best of our knowledge, our result is the first spectral clustering algorithm that allows
classification in sublinear time even when the outer conductance of the partitions is only a small
constant (as opposed to vanishingly small).