Reconstructing Hidden Permutations Using the Average-Precision (AP) Correlation Statistic
Abstract
We study the problem of learning probabilistic models
for permutations, where the order between highly
ranked items in the observed permutations is more reliable
(i.e., consistent in different rankings) than the order
between lower ranked items, a typical phenomena observed
in many applications such as web search results
and product ranking. We introduce and study a variant
of the Mallows model where the distribution is a function
of the widely used Average-Precision (AP) Correlation
statistic, instead of the standard Kendall’s tau distance.
We present a generative model for constructing
samples from this distribution and prove useful properties
of that distribution. Using these properties we develop
an efficient algorithm that provably computes an
asymptotically unbiased estimate of the center permutation,
and a faster algorithm that learns with high probability
the hidden central permutation for a wide range of
the parameters of the model. We complement our theoretical
analysis with extensive experiments showing that
unsupervised methods based on our model can precisely
identify ground-truth clusters of rankings in real-world
data. In particular, when compared to the Kendall’s tau
based methods, our methods are less affected by noise
in low-rank items.