Almost Optimal Streaming Algorithms for Coverage Problems
Abstract
Maximum coverage and minimum set cover problems --collectively called coverage problems-- have been studied extensively in streaming models. However, previous research not only achieve sub-optimal approximation factors and space complexities, but also study a restricted set arrival model which makes an explicit or implicit assumption on oracle access to the sets, ignoring the complexity of reading and storing the whole set at once. In this paper, we address the above shortcomings, and present algorithms with improved approximation factor and improved space complexity, and prove that our results are almost tight. Moreover, unlike most of previous work, our results hold on a more general edge arrival model. More specifically, we present (almost) optimal approximation algorithms for maximum coverage and minimum set cover problems in the streaming model with an (almost) optimal space complexity of O~(n), i.e., the space is {\em independent of the size of the sets or the size of the ground set of elements}. These results not only improve over the best known algorithms for the set arrival model, but also are the first such algorithms for the more powerful {\em edge arrival} model. In order to achieve the above results, we introduce a new general sketching technique for coverage functions: This sketching scheme can be applied to convert an α-approximation algorithm for a coverage problem to a $(1-\eps)\alpha$-approximation algorithm for the same problem in streaming, or RAM models. We show the significance of our sketching technique by ruling out the possibility of solving coverage problems via accessing (as a black box) a $(1 \pm \eps)$-approximate oracle (e.g., a sketch function) that estimates the coverage function on any subfamily of the sets.