Benchmarking Large-Scale Fine-Grained Categorization

Phil Long
IEEE Winter Conference on Applications of Computer Vision (WACV)(2014)

Abstract

This paper presents a systematic evaluation of recent methods in the fine-grained categorization domain, which have shown significant promise. More specifically, we investigate an automatic segmentation algorithm, a region pooling algorithm which is akin to pose-normalized pooling [31] [28], and a multi-class optimization method. We considered the largest and most popular datasets for finegrained categorization available in the field: the CaltechUCSD 200 Birds dataset [27], the Oxford 102 Flowers dataset [19], the Stanford 120 Dogs dataset [16], and the Oxford 37 Cats and Dogs dataset [21]. We view this work from a practitioner’s perspective, answering the question: what are the methods that can create the best possible fine-grained recognition system which can be applied in practice? Our experiments provide insights of the relative merit of these methods. More importantly, after combining the methods, we achieve the top results in the field, outperforming the state-of-the-art methods by 4.8% and 10.3% for birds and dogs datasets, respectively. Additionally, our method achieves a mAP of 37.92 on the of 2012 Imagenet Fine-Grained Categorization Challenge [1], which outperforms the winner of this challenge by 5.7 points.

Research Areas