On the Power of Louvain for Graph Clustering
Abstract
A classic problem in machine learning and data analysis it to partition
the vertices of a network in such a way that vertices in the same set are densely connected.
In practice, the most popular approaches rely on local search algorithms;
not only for the ease of implementation and the efficiency, but also because of the accuracy of these methods on
many real-world graphs. For example, the Louvain algorithm -- a local search based algorithm -- has quickly become
the method of choice for clustering in social networks, accumulating more than 10700 citations over the past 10 years.
However, explaining the success of these methods remains an open problem: in the worst-case, both
their running time and output can be arbitrarily bad.
In this work, we study the Louvain heuristic and aim at explaining its success and identifying the mechanisms that makes
it successful through a classic model for social networks, the Stochastic Block Model.
For a wide range of parameters, we give the first theoretical evidence that Louvain identifies the underlying natural
partition of a network in near-linear time.
the vertices of a network in such a way that vertices in the same set are densely connected.
In practice, the most popular approaches rely on local search algorithms;
not only for the ease of implementation and the efficiency, but also because of the accuracy of these methods on
many real-world graphs. For example, the Louvain algorithm -- a local search based algorithm -- has quickly become
the method of choice for clustering in social networks, accumulating more than 10700 citations over the past 10 years.
However, explaining the success of these methods remains an open problem: in the worst-case, both
their running time and output can be arbitrarily bad.
In this work, we study the Louvain heuristic and aim at explaining its success and identifying the mechanisms that makes
it successful through a classic model for social networks, the Stochastic Block Model.
For a wide range of parameters, we give the first theoretical evidence that Louvain identifies the underlying natural
partition of a network in near-linear time.