VisualRank

May 1, 2008

Posted by Shumeet Baluja and Yushi Jing



At WWW-2008, in Beijing, China, we presented our paper "PageRank for Product Image Search". The paper describes a system that uses visual cues, rather than text information alone, to determine the rank of images. The idea is simple: find common visual themes in a set of images, and then find a small set of images that best represents those themes. The resulting algorithm turned out to be PageRank, but run on an entirely inferred graph of image similarities. Since the release of the paper, we've noticed a lot of press coverage and have received quite a few questions. We thought we could answer a few of them here.
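
For readers who would like to see the core idea in code, here is a minimal Python sketch of PageRank-style power iteration over an image-similarity graph. It is an illustration only, not the system from the paper: the function name `visual_rank`, the damping value of 0.85, and the toy similarity numbers are all assumptions, and how the similarities are actually computed from image features is left out entirely.

```python
def visual_rank(similarity, damping=0.85, iterations=100, tol=1e-12):
    """Rank images with PageRank-style power iteration on a similarity matrix.

    similarity[i][j] is a non-negative visual similarity between images i and j.
    The name, defaults, and normalization here are illustrative assumptions.
    """
    n = len(similarity)
    # Column sums: the total similarity attached to each image j. Dividing by
    # them makes image j split its score in proportion to similarity.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            incoming = 0.0
            for j in range(n):
                if col_sums[j] > 0:
                    incoming += similarity[i][j] / col_sums[j] * rank[j]
                else:
                    # An image similar to nothing spreads its score evenly.
                    incoming += rank[j] / n
            new_rank.append(damping * incoming + (1.0 - damping) / n)
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical similarities: images 0-2 share a visual theme, image 3 is an outlier.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.85, 0.1],
    [0.8, 0.85, 0.0, 0.05],
    [0.1, 0.1, 0.05, 0.0],
]
ranks = visual_rank(similarity)
order = sorted(range(len(ranks)), key=lambda i: ranks[i], reverse=True)
print(order)  # the outlier, image 3, ends up last
```

Images that are similar to many other images collect score from all of them, so the ones that best represent a common theme rise to the top, while an image that resembles little else sinks.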


"Why did we choose to use products for our test case?" First and foremost, product queries are popular in actual usage; addressing them is important. Second, users have strong expectations of what results we should return for these queries; therefore, this category provides an important set of examples that we need to address especially carefully. Third, on a pragmatic note, they lend themselves well to the type of "image features" that we selected in this study. Since the publication of the paper, we've also extended our results to other query types, including travel-related queries. One of the nice features of the approach is that (we hope) it will be easy to extend to new domains; as research in measuring image or object similarity continues, the advances can easily be incorporated into the similarity calculation to compute the underlying graph; the computations on the graph do not change.

"Where are we going from here?" Besides broadening the sets of queries (and sets of features) for which we can use this approach, there are three directions we're exploring. First, estimating similarity measures for all of the images on the web is computationally expensive; approximations or alternative computations are needed. Second, we hope to evaluate our approach with respect to the large number of recently proposed alternative clustering methods. Third, many variations of PageRank can be used in quite interesting ways for image search. For example, we can use some of these previously published methods to reintroduce, in a meaningful manner, the textual information that the VisualRank algorithm removed. In the end, we have an approach that has an easy integration with both text and visual clues. Stay tuned for more on that in the coming months.

And now to answer the most commonly asked question, "Is it live?" Not yet. Currently, it is research in progress (click here to help speed up the process). In the meantime, though, if you'd like another sneak peek of our research on large graphs, this time in the context of YouTube data mining, just follow the link.

Finally, we want to extend our deepest thanks to the people who helped on this project, especially the image-search team; without their help, this research would not have been possible.