Multiscale Quantization for Fast Similarity Search

Xiang Wu; Ruiqi Guo; Ananda Theertha Suresh; Sanjiv Kumar; Dan Holtmann-Rice; David Simcha; Felix X. Yu

NIPS (2017)

Abstract

We propose a multiscale quantization approach for fast similarity search on large, high-dimensional datasets. The key insight of the approach is that quantization methods, in particular product quantization, perform poorly when there is large variance in the norms of the data points. This is a common scenario for real-world datasets, especially when doing product quantization of residuals obtained from coarse vector quantization. To address this issue, we propose a multiscale formulation where we learn a separate scalar quantizer of the residual norm scales. All parameters are learned jointly in a stochastic gradient descent framework to minimize the overall quantization error. We provide theoretical motivation for the proposed technique and conduct comprehensive experiments on two large-scale public datasets, demonstrating substantial improvements in recall over existing state-of-the-art methods.
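The multiscale idea in the abstract can be illustrated with a minimal sketch: split each residual into a norm (scale) and a unit-norm direction, and quantize the scale with its own scalar quantizer. This is not the authors' implementation. The function names, the 1-D Lloyd (k-means) fitting procedure, and the toy data are assumptions made for illustration. The direction is left unquantized here, whereas the abstract describes product-quantizing it and learning all parameters jointly with stochastic gradient descent.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def fit_scalar_quantizer(values, levels, iters=20):
    """Fit a 1-D Lloyd (k-means) quantizer and return its centroids.

    Assumption: a simple stand-in for the learned scalar quantizer of
    residual norm scales mentioned in the abstract.
    """
    vs = sorted(values)
    # Initialize centroids at evenly spaced quantiles of the data.
    cents = [vs[int((i + 0.5) * len(vs) / levels)] for i in range(levels)]
    for _ in range(iters):
        buckets = [[] for _ in cents]
        for v in vs:
            j = min(range(levels), key=lambda k: abs(v - cents[k]))
            buckets[j].append(v)
        # Keep the old centroid if a bucket ends up empty.
        cents = [sum(b) / len(b) if b else c for b, c in zip(buckets, cents)]
    return cents


def multiscale_encode(residual, cents):
    """Split a residual into a quantized scale and a unit-norm direction."""
    n = norm(residual)
    direction = [x / n for x in residual] if n > 0 else list(residual)
    scale = min(cents, key=lambda c: abs(n - c))
    return scale, direction


def reconstruct(scale, direction):
    """Approximate the residual as quantized scale times direction."""
    return [scale * x for x in direction]


# Toy residuals whose norms vary widely (hypothetical data).
random.seed(0)
residuals = [
    [random.gauss(0, s) for _ in range(8)]
    for s in (0.1, 1.0, 10.0)
    for _ in range(50)
]
cents = fit_scalar_quantizer([norm(r) for r in residuals], levels=4)
scale, direction = multiscale_encode(residuals[0], cents)
approx = reconstruct(scale, direction)
```

Because every direction has unit norm, a quantizer applied to the directions would no longer have to cover the wide range of magnitudes in the raw residuals, which is the failure mode the abstract attributes to product quantization.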
