Jump to Content

Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

Aaron Bernstein
Cliff Stein
Sepehr Assadi
Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM (2019), pp. 1616-1635

Abstract

Maximum matching and minimum vertex cover are among the most fundamental graph optimization problems. Recently, randomized composable coresets were introduced as an effective technique for solving these problems in various models of computation on massive graphs. In this technique, one partitions the edges of an input graph randomly into multiple pieces, compresses each piece into a smaller subgraph, namely a coreset, and solves the problem on the union of these coresets to find the final solution. By designing small size randomized composable coresets, one can obtain efficient algorithms, in a black-box way, in multiple computational models including streaming, distributed communication, and the massively parallel computation (MPC) model. We develop randomized composable coresets of size Oe(n) that for any constant ε > 0, give a (3/2 + ε)-approximation to matching and a (3 + ε)-approximation to vertex cover. Our coresets improve upon the previously best approximation ratio of O(1) for matching and O(log n) for vertex cover. Most notably, our result for matching goes beyond a 2-approximation, which is a natural barrier for maximum matching in many models of computation. Our coresets lead to improved algorithms for the simultaneous communication model with randomly partitioned input, the streaming model when the input arrives in a random order, and the MPC model with O~(n√n) memory per machine and only two MPC rounds. Furthermore, inspired by the recent work of Czumaj et al. (arXiv 2017), we study algorithms for matching and vertex cover in the MPC model with only Oe(n) memory per machine. Building on our coreset constructions, we develop parallel algorithms that give an O(1)-approximation to both matching and vertex cover in only O(log log n) MPC rounds and O~(n) memory per machine. We further improve the approximation ratio of our matching algorithm to (1 + ε) for any constant ε > 0. Our results settle multiple open questions posed by Czumaj et al. A key technical ingredient of our paper is a novel application of edge degree constrained subgraphs (EDCS) that were previously introduced in the context of maintaining matchings in dynamic graphs. At the heart of our proofs are new structural properties of EDCS that identify these subgraphs as sparse certificates for large matchings and small vertex covers which are quite robust to sampling and composition.