Bicriteria Distributed Submodular Maximization in a Few Rounds
Abstract
We study the problem of efficiently optimizing submodular functions under
cardinality constraints in distributed setting. Recently, several
distributed algorithms for this problem have been introduced
which either achieve a sub-optimal solution or they run in super-constant
number
of rounds of computation. Unlike previous work, we aim
to design distributed algorithms in multiple rounds
with almost optimal approximation guarantees
at the cost of outputting a larger number of elements.
Toward this goal, we present a distributed algorithm that, for any \epsilon >
0
and any constant r,
outputs a set
S of O(rk/\epsilon^{1\over r}) items in r rounds, and achieves
a (1-\epsilon)-approximation of the value of the optimum set with k items.
This is the first
distributed algorithm
that achieves an approximation factor of (1-\epsilon) running in less than
$\log {1\over \epsilon}$ number of rounds.
We also prove a hardness result showing that the output of any $1-\epsilon$ approximation distributed algorithm limited to one distributed round should have at least $\Omega(k/\epsilon)$ items. In light of this hardness result, our distributed algorithm in one round, $r=1$, is asymptotically tight in terms of the output size.
We support the theoretical guarantees
with an extensive empirical study of our
algorithm showing that achieving almost optimum solutions is indeed possible in
a few rounds for large-scale real datasets.
cardinality constraints in distributed setting. Recently, several
distributed algorithms for this problem have been introduced
which either achieve a sub-optimal solution or they run in super-constant
number
of rounds of computation. Unlike previous work, we aim
to design distributed algorithms in multiple rounds
with almost optimal approximation guarantees
at the cost of outputting a larger number of elements.
Toward this goal, we present a distributed algorithm that, for any \epsilon >
0
and any constant r,
outputs a set
S of O(rk/\epsilon^{1\over r}) items in r rounds, and achieves
a (1-\epsilon)-approximation of the value of the optimum set with k items.
This is the first
distributed algorithm
that achieves an approximation factor of (1-\epsilon) running in less than
$\log {1\over \epsilon}$ number of rounds.
We also prove a hardness result showing that the output of any $1-\epsilon$ approximation distributed algorithm limited to one distributed round should have at least $\Omega(k/\epsilon)$ items. In light of this hardness result, our distributed algorithm in one round, $r=1$, is asymptotically tight in terms of the output size.
We support the theoretical guarantees
with an extensive empirical study of our
algorithm showing that achieving almost optimum solutions is indeed possible in
a few rounds for large-scale real datasets.