Optimal Distributed Submodular Optimization via Sketching

Hossein Esfandiari
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2018), pp. 1138-1147

Abstract

As an important special case of submodular optimization problems, coverage problems are a central problem in optimization with a wide range of applications in data mining and machine learning. As
we need to handle larger and larger data sets, there is a clear need to develop distributed solutions to
these problems. While several results have been developed for distributed coverage maximizations, all
the existing method have notable limitations, e.g., they all achieve either suboptimal approximation
guarantees or suboptimal space and memory complexities. Moreover, most previous results for submodular maximization either explicitly or implicitly assume that one has a value oracle access to
the submodular function. Such a value oracle for coverage functions has the following form: given a subfamily of (input) subsets, determine the size of the union of the subsets in this subfamily.