Improved Approximation Algorithms and Lower Bounds for Search-Diversification Problems

Amir Abboud
Euiwoong Lee
49th International Colloquium on Automata, Languages and Programming (ICALP) (2022)
Google Scholar

Abstract

We study several questions related to diversifying search results. We give improved approximation algorithms in each of the problem, together with some lower bounds.
\begin{enumerate}
\item We give a polynomial-time approximation scheme (PTAS) for a diversified search ranking problem~\cite{BansalJKN10} whose objective is to minimizes the discounted cumulative gain. Our PTAS runs in time $n^{2^{O(\log(1/\eps)/\eps)}} \cdot m^{O(1)}$ where $n$ denote the number of elements in the databases and $m$ denote the constraints. Complementing this result, we show that no PTAS can run in time $f(\eps) \cdot (nm)^{2^{o(1/\eps)}}$ assuming Gap-ETH and therefore our running time is nearly tight. Both our upper and lower bounds answer open questions from~\cite{BansalJKN10}
\item We next consider the Max-Sum Dispersion problem, whose objective is to select $k$ out of $n$ elements from a database that maximizes the dispersion, which is define as the sum of the pairwise distances under a given metric. We give a quasipolynomial-time approximation scheme (QPTAS) for the problem which runs in time $n^{O_{\eps}(\log n)}$. This improves upon previously known polynomial-time algorithms with approximate ratios 0.5~\cite{HassinRT97,BorodinJLY17} Furthermore, we observe that reductions from previous work rule out approximation schemes that run in $n^{\tilde{o}_\eps(\log n)}$ time assuming ETH.
\item Finally, we consider a generalization of Max-Sum Dispersion called Max-Sum Diversification. In addition to the sum of pairwise distance, the objective also includes another function $f$. For monotone submodular function $f$, we give a quasipolynomial-time algorithm with approximation ratio arbitrarily close to $(1 - 1/e)$. This improves upon the best polynomial-time algorithm which has approximation ratio $0.5$~\cite{BorodinJLY17}. Furthermore, the $(1 - 1/e)$ factor is also tight as achieving better-than-$(1 - 1/e)$ approximation is NP-hard~\cite{Feige98}.