Improved Approximation Algorithms and Lower Bounds for Search-Diversification Problems
Abstract
We study several questions related to diversifying search results. We give improved approximation algorithms in each of the problem, together with some lower bounds.
\begin{enumerate}
\item We give a polynomial-time approximation scheme (PTAS) for a diversified search ranking problem~\cite{BansalJKN10} whose objective is to minimizes the discounted cumulative gain. Our PTAS runs in time $n^{2^{O(\log(1/\eps)/\eps)}} \cdot m^{O(1)}$ where $n$ denote the number of elements in the databases and $m$ denote the constraints. Complementing this result, we show that no PTAS can run in time $f(\eps) \cdot (nm)^{2^{o(1/\eps)}}$ assuming Gap-ETH and therefore our running time is nearly tight. Both our upper and lower bounds answer open questions from~\cite{BansalJKN10}
\item We next consider the Max-Sum Dispersion problem, whose objective is to select $k$ out of $n$ elements from a database that maximizes the dispersion, which is define as the sum of the pairwise distances under a given metric. We give a quasipolynomial-time approximation scheme (QPTAS) for the problem which runs in time $n^{O_{\eps}(\log n)}$. This improves upon previously known polynomial-time algorithms with approximate ratios 0.5~\cite{HassinRT97,BorodinJLY17} Furthermore, we observe that reductions from previous work rule out approximation schemes that run in $n^{\tilde{o}_\eps(\log n)}$ time assuming ETH.
\item Finally, we consider a generalization of Max-Sum Dispersion called Max-Sum Diversification. In addition to the sum of pairwise distance, the objective also includes another function $f$. For monotone submodular function $f$, we give a quasipolynomial-time algorithm with approximation ratio arbitrarily close to $(1 - 1/e)$. This improves upon the best polynomial-time algorithm which has approximation ratio $0.5$~\cite{BorodinJLY17}. Furthermore, the $(1 - 1/e)$ factor is also tight as achieving better-than-$(1 - 1/e)$ approximation is NP-hard~\cite{Feige98}.
\begin{enumerate}
\item We give a polynomial-time approximation scheme (PTAS) for a diversified search ranking problem~\cite{BansalJKN10} whose objective is to minimizes the discounted cumulative gain. Our PTAS runs in time $n^{2^{O(\log(1/\eps)/\eps)}} \cdot m^{O(1)}$ where $n$ denote the number of elements in the databases and $m$ denote the constraints. Complementing this result, we show that no PTAS can run in time $f(\eps) \cdot (nm)^{2^{o(1/\eps)}}$ assuming Gap-ETH and therefore our running time is nearly tight. Both our upper and lower bounds answer open questions from~\cite{BansalJKN10}
\item We next consider the Max-Sum Dispersion problem, whose objective is to select $k$ out of $n$ elements from a database that maximizes the dispersion, which is define as the sum of the pairwise distances under a given metric. We give a quasipolynomial-time approximation scheme (QPTAS) for the problem which runs in time $n^{O_{\eps}(\log n)}$. This improves upon previously known polynomial-time algorithms with approximate ratios 0.5~\cite{HassinRT97,BorodinJLY17} Furthermore, we observe that reductions from previous work rule out approximation schemes that run in $n^{\tilde{o}_\eps(\log n)}$ time assuming ETH.
\item Finally, we consider a generalization of Max-Sum Dispersion called Max-Sum Diversification. In addition to the sum of pairwise distance, the objective also includes another function $f$. For monotone submodular function $f$, we give a quasipolynomial-time algorithm with approximation ratio arbitrarily close to $(1 - 1/e)$. This improves upon the best polynomial-time algorithm which has approximation ratio $0.5$~\cite{BorodinJLY17}. Furthermore, the $(1 - 1/e)$ factor is also tight as achieving better-than-$(1 - 1/e)$ approximation is NP-hard~\cite{Feige98}.