Fast (1+eps)-Approximation Algorithms for Binary Matrix Factorization
Abstract
We study efficient approximation algorithms for the binary matrix factorization (BMF) problem: given a matrix $\bA\in\{0,1\}^{n\times d}$ and a rank parameter $k>0$, which is typically a small integer, the goal is to approximate $\bA$ by outputting factors $\bU\in\{0,1\}^{n\times k}$ and $\bV\in\{0,1\}^{k\times d}$ that minimize the Frobenius loss $\|\bA-\bU\bV\|_F$.
The current state of the art for this problem is the approximation algorithm of Kumar \etal~[ICML 2019], which achieves a $C$-approximation for some constant $C>576$.
We give the first $(1+\eps)$-approximation algorithm, running in time $2^{\tO{k^2/\eps^4}}\poly(n,d)$, which matches the runtime of previous results up to the dependence on the desired accuracy parameter $\eps>0$.
Our techniques generalize to other common variants of the BMF problem, admitting bicriteria $(1+\eps)$-approximation algorithms for $L_p$ loss functions, as well as the setting where all matrix operations are performed in $\mathbb{F}_2$.
Moreover, our approach can be implemented in standard big data models, such as the streaming model or the distributed model.
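For concreteness, the sketch below spells out the BMF objective together with an exhaustive-search baseline on tiny inputs. It is written in Python with NumPy purely for illustration; the helper names \texttt{bmf\_loss} and \texttt{brute\_force\_bmf} are hypothetical and do not correspond to the algorithm of this paper.

\begin{verbatim}
import numpy as np
from itertools import product

def bmf_loss(A, U, V):
    # Frobenius loss ||A - U V||_F for binary factors U, V.
    return np.linalg.norm(A - U @ V)

def brute_force_bmf(A, k):
    # Exact BMF by enumerating all binary U (n x k) and V (k x d).
    # Time 2^(nk + kd); feasible only for tiny n, d, k (illustration only).
    n, d = A.shape
    best_U, best_V, best_loss = None, None, np.inf
    for u_bits in product([0, 1], repeat=n * k):
        U = np.array(u_bits).reshape(n, k)
        for v_bits in product([0, 1], repeat=k * d):
            V = np.array(v_bits).reshape(k, d)
            loss = bmf_loss(A, U, V)
            if loss < best_loss:
                best_U, best_V, best_loss = U, V, loss
    return best_U, best_V, best_loss

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.integers(0, 2, size=(4, 3))
    U, V, loss = brute_force_bmf(A, k=2)
    print("optimal Frobenius loss:", loss)
\end{verbatim}

The exhaustive search runs in time exponential in $nk+kd$, which is exactly the barrier that approximation algorithms with runtime of the form $2^{\tO{k^2/\eps^4}}\poly(n,d)$ aim to avoid.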