Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs
Abstract
Maximum matching and minimum vertex cover are among the most fundamental graph
optimization problems. Recently, randomized composable coresets were introduced as an effective
technique for solving these problems in various models of computation on massive graphs. In this
technique, one partitions the edges of an input graph randomly into multiple pieces, compresses
each piece into a smaller subgraph, namely a coreset, and solves the problem on the union of these
coresets to find the final solution. By designing small size randomized composable coresets, one
can obtain efficient algorithms, in a black-box way, in multiple computational models including
streaming, distributed communication, and the massively parallel computation (MPC) model.
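The partition, compress, and combine steps described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`random_partition`, `greedy_maximal_matching`, `coreset_pipeline`) are hypothetical, and a greedy maximal matching stands in for the per-piece coreset purely to keep the example short. It is not the coreset construction developed in this paper.

```python
import random


def random_partition(edges, k, rng):
    """Assign each edge independently and uniformly at random to one of k pieces."""
    parts = [[] for _ in range(k)]
    for e in edges:
        parts[rng.randrange(k)].append(e)
    return parts


def greedy_maximal_matching(edges):
    """Placeholder 'coreset': scan the edges once and keep a maximal matching."""
    matched = set()
    matching = []
    for u, v in edges:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def coreset_pipeline(edges, k, compress, solve, seed=0):
    """Partition the edges randomly, compress each piece, solve on the union."""
    rng = random.Random(seed)
    parts = random_partition(edges, k, rng)
    union = [e for part in parts for e in compress(part)]
    return solve(union)


# Example: a path on 21 vertices, split across 4 hypothetical machines.
path_edges = [(i, i + 1) for i in range(20)]
result = coreset_pipeline(path_edges, 4, greedy_maximal_matching,
                          greedy_maximal_matching)
```

Because `compress` and `solve` are passed in as parameters, the same skeleton applies to any per-piece summary; only the quality of the final answer depends on which summary is used.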
We develop randomized composable coresets of size Õ(n) that, for any constant ε > 0, give a
(3/2 + ε)-approximation to matching and a (3 + ε)-approximation to vertex cover. Our coresets
improve upon the previous best approximation ratios of O(1) for matching and O(log n) for
vertex cover. Most notably, our result for matching goes beyond a 2-approximation, which is
a natural barrier for maximum matching in many models of computation. Our coresets lead
to improved algorithms for the simultaneous communication model with randomly partitioned
input, the streaming model when the input arrives in a random order, and the MPC model with
Õ(n√n) memory per machine and only two MPC rounds.
Furthermore, inspired by the recent work of Czumaj et al. (arXiv 2017), we study algorithms
for matching and vertex cover in the MPC model with only Õ(n) memory per machine. Building
on our coreset constructions, we develop parallel algorithms that give an O(1)-approximation
to both matching and vertex cover in only O(log log n) MPC rounds and Õ(n) memory per
machine. We further improve the approximation ratio of our matching algorithm to (1 + ε) for
any constant ε > 0. Our results settle multiple open questions posed by Czumaj et al.
A key technical ingredient of our paper is a novel application of edge degree constrained
subgraphs (EDCS) that were previously introduced in the context of maintaining matchings in
dynamic graphs. At the heart of our proofs are new structural properties of EDCS that identify
these subgraphs as sparse certificates for large matchings and small vertex covers which are
quite robust to sampling and composition.
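The abstract does not restate what an EDCS is. As a rough guide, the sketch below checks the definition commonly used in the dynamic-matching literature: a subgraph H of G is an EDCS with parameters β ≥ β⁻ if every edge of H has H-degree sum at most β, and every edge of G missing from H has H-degree sum at least β⁻. The function name `is_edcs` and its arguments are assumptions made for illustration, and the graph is assumed to be simple and undirected.

```python
from collections import Counter


def is_edcs(g_edges, h_edges, beta, beta_minus):
    """Check the two EDCS conditions for H as a subgraph of a simple graph G.

    (P1) every edge (u, v) in H has     deg_H(u) + deg_H(v) <= beta
    (P2) every edge (u, v) in G \\ H has deg_H(u) + deg_H(v) >= beta_minus
    """
    g = {frozenset(e) for e in g_edges}
    h = {frozenset(e) for e in h_edges}
    if not h <= g:
        return False  # H must be a subgraph of G

    # Degrees are measured in H only.
    deg = Counter()
    for e in h:
        for v in e:
            deg[v] += 1

    for e in g:
        u, v = tuple(e)
        edge_degree = deg[u] + deg[v]
        if e in h:
            if edge_degree > beta:
                return False  # violates (P1)
        elif edge_degree < beta_minus:
            return False  # violates (P2)
    return True


# Example: a triangle, with H equal to the whole graph or to a single edge.
triangle = [(0, 1), (1, 2), (0, 2)]
whole_graph_ok = is_edcs(triangle, triangle, beta=4, beta_minus=3)
single_edge_ok = is_edcs(triangle, [(0, 1)], beta=4, beta_minus=1)
```

Condition (P1) keeps H sparse, while (P2) forces H to retain enough edges around every edge it omits; these are the two properties a checker of this kind verifies.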