A Massively Parallel Modularity-Maximizing Algorithm With Provable Guarantees
Abstract
Graph clustering is one of the most basic and pop-
ular unsupervised learning problem. Among the
different formulations of the problem, the modu-
larity objective has been particularly successful
for helping design impactful algorithms; Most
notably the Louvain algorithm has become one
of the most used algorithm for clustering graphs.
Yet, one major limitation of the Louvain algorithm
is its sequential nature which makes it impracti-
cal in distributive environments and on massive
datasets.
In this paper, we provide a parallel version of Lou-
vain which works in the massively parallel com-
putation model (MPC). We show that it achieves
optimal cluster recovery in the classic stochastic
block model in only a constant number of parallel
rounds, and so for the same regime of parameters
than the standard Louvain algorithm as shown
recently in Cohen-Addad et al. (2020).
ular unsupervised learning problem. Among the
different formulations of the problem, the modu-
larity objective has been particularly successful
for helping design impactful algorithms; Most
notably the Louvain algorithm has become one
of the most used algorithm for clustering graphs.
Yet, one major limitation of the Louvain algorithm
is its sequential nature which makes it impracti-
cal in distributive environments and on massive
datasets.
In this paper, we provide a parallel version of Lou-
vain which works in the massively parallel com-
putation model (MPC). We show that it achieves
optimal cluster recovery in the classic stochastic
block model in only a constant number of parallel
rounds, and so for the same regime of parameters
than the standard Louvain algorithm as shown
recently in Cohen-Addad et al. (2020).