Fast (1+eps)-Approximation Algorithms for Binary Matrix Factorization

Maximilian Vötsch
David P. Woodruff
Samson Zhou
ICML 2023
Google Scholar


We study efficient approximation algorithms for the binary matrix factorization (BMF) problem, where the inputs are a matrix $\bA\in\{0,1\}^{n\times d}$ and a rank parameter $k>0$, which is typically a small integer and the goal is to approximate $\bA$ by outputting factors $\bU\in\{0,1\}^{n\times k}$ and $\bV\in\{0,1\}^{k\times d}$ that minimize the Frobenius loss $\|\bA-\bU\bV\|_F$. Currently, the state-of-the-art for this problem is the approximation algorithm of Kumar \etal~[ICML 2019], which achieves a $C$-approximation for some constant $C>576$. We give the first $(1+\eps)$-approximation algorithm using $2^{\tO{k^2/\eps^4}}\poly(n,d)$ runtime, which matches previous results, the new dependency on the desired approximation parameter $\eps>0$ notwithstanding. Our techniques generalize to other common variants of the BMF problem, admitting bicriteria $(1+\eps)$-approximation algorithms for $L_p$ loss functions, as well as the setting where all matrix operations are performed in $\mathbb{F}_2$. Moreover, our approach can be implemented in standard big data models, such as the streaming model or the distributed model.