Beyond “Near-Duplicates”: Learning Hash Codes for Efficient Similar-Image Retrieval

Shumeet Baluja; Michele Covell

20th International Conference on Pattern Recognition 2010

Abstract

Finding similar images in a large database is an important, but often computationally expensive, task. In this paper, we present a two-tier similar-image retrieval system with the efficiency characteristics found in simpler systems designed to recognize near-duplicates. We compare the efficiency of lookups based on random projections and learned hashes to 100-times-more-frequent exemplar sampling. Both approaches significantly improve on the results from exemplar sampling, despite having significantly lower computational costs. Learned-hash keys provide the best result, in terms of both recall and efficiency.
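As a rough illustration of what a lookup based on random projections can look like, the sketch below turns a feature vector into a short binary key by taking the sign of its dot product with a few random directions, then uses that key to index a bucket table. This is a generic sign-random-projection scheme, not the paper's system: the function names, the Gaussian projections, the 8-bit key length, and the toy 4-dimensional vectors are all assumptions made for the example, and the learned-hash variant described in the abstract is not shown.

```python
import random
from collections import defaultdict


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions of length dim (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_key(vec, projections):
    """Pack the signs of the projections of vec into an integer key."""
    key = 0
    for row in projections:
        dot = sum(w * x for w, x in zip(row, vec))
        # One bit per projection: 1 if the vector lies on the non-negative side.
        key = (key << 1) | (1 if dot >= 0 else 0)
    return key


def build_index(vectors, projections):
    """Map each hash key to the list of vector ids that fall in that bucket."""
    index = defaultdict(list)
    for i, vec in enumerate(vectors):
        index[hash_key(vec, projections)].append(i)
    return index


def lookup(query, index, projections):
    """Return the ids stored in the query's bucket (empty list if none)."""
    return index.get(hash_key(query, projections), [])


if __name__ == "__main__":
    projections = make_projections(dim=4, n_bits=8, seed=1)
    database = [[1.0, 0.0, 2.0, -1.0], [-1.0, 0.5, -2.0, 1.0]]
    index = build_index(database, projections)
    # A rescaled copy of the first vector has the same projection signs,
    # so it lands in the same bucket as the original.
    print(lookup([2.0, 0.0, 4.0, -2.0], index, projections))
```

A lookup of this kind costs one key computation plus one table access, regardless of database size, which is the efficiency property usually sought from hash-based retrieval. Vectors that are similar but not identical may still fall into different buckets, so practical systems typically probe several tables or several keys.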

Research Areas

Machine perception
