Efficient Approximation for Restricted Biclique Cover Problems
Abstract
                Covering the edges of a bipartite graph by a minimum set of bipartite complete 
graphs (bicliques) is a basic graph theoretic problem, with numerous
applications. In particular, it is used to characterize parsimonious models of a
set of observations (each biclique corresponds to a {\it factor} or {\it
feature} that relates the observations in the two sets of nodes connected by
the biclique).
\revision{The decision version of the minimum biclique cover problem is NP-Complete}, and
unless $P=NP$, the cover size cannot be approximated in general within less than
a sub-linear factor of the number of nodes (or edges) in the graph.
In this work, we consider two natural restrictions to the problem, motivated by
practical applications. In the first case, we restrict the number of bicliques
a node can belong to. We show that when this number is at least $5$, the
problem is still NP-hard. In contrast, we show that when nodes belong to no more
than 2 bicliques, the problem has efficient approximations.
The second model we consider corresponds to observing a set of independent
samples from an unknown model, governed by a possibly large number of factors.
The model is defined by a bipartite graph
$G=(L,R,E)$, where each node in $L$ is assigned to an arbitrary subset of up to
a constant $f$ factors, while the nodes in $R$ (the independent observations)
are assigned to random subsets of the set of $k$ factors where $k$ can grow
with size of the graph. We show that this practical version of the biclique
cover problem is amenable to efficient approximations.
        graphs (bicliques) is a basic graph theoretic problem, with numerous
applications. In particular, it is used to characterize parsimonious models of a
set of observations (each biclique corresponds to a {\it factor} or {\it
feature} that relates the observations in the two sets of nodes connected by
the biclique).
\revision{The decision version of the minimum biclique cover problem is NP-Complete}, and
unless $P=NP$, the cover size cannot be approximated in general within less than
a sub-linear factor of the number of nodes (or edges) in the graph.
In this work, we consider two natural restrictions to the problem, motivated by
practical applications. In the first case, we restrict the number of bicliques
a node can belong to. We show that when this number is at least $5$, the
problem is still NP-hard. In contrast, we show that when nodes belong to no more
than 2 bicliques, the problem has efficient approximations.
The second model we consider corresponds to observing a set of independent
samples from an unknown model, governed by a possibly large number of factors.
The model is defined by a bipartite graph
$G=(L,R,E)$, where each node in $L$ is assigned to an arbitrary subset of up to
a constant $f$ factors, while the nodes in $R$ (the independent observations)
are assigned to random subsets of the set of $k$ factors where $k$ can grow
with size of the graph. We show that this practical version of the biclique
cover problem is amenable to efficient approximations.
