Revisiting knowledge transfer for training object class detectors
Abstract
We propose to revisit knowledge transfer for training
object detectors on target classes from weakly supervised
training images, helped by a set of source classes with
bounding-box annotations. We present a unified knowledge
transfer framework based on training a single neural network
multi-class object detector over all source classes,
organized in a semantic hierarchy. This generates proposals
with scores at multiple levels in the hierarchy, which we use
to explore knowledge transfer over a broad range of generality,
ranging from class-specific (bicycle to motorbike)
to class-generic (objectness to any class). Experiments
on the 200 object classes in the ILSVRC 2013 detection
dataset show that our technique (1) leads to much better
performance on the target classes (70.3% CorLoc, 36.9%
mAP) than a weakly supervised baseline which uses manually
engineered objectness [10] (50.5% CorLoc, 25.4%
mAP); (2) delivers target object detectors reaching 80% of
the mAP of their fully supervised counterparts; and (3) outperforms
the best reported transfer learning results [17, 42] on
this dataset (+41% CorLoc, +3% mAP). Moreover, we
carry out several across-dataset knowledge transfer experiments
[25, 22, 32] and find that (4) our technique outperforms
the weakly supervised baseline in all dataset pairs by
1.5× to 1.9×, establishing its general applicability.
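
The CorLoc figures above are localization scores on the weakly supervised training images. The abstract does not spell out the metric, so the sketch below assumes the common definition: the fraction of images in which the selected box overlaps a ground-truth box of the target class with intersection-over-union of at least 0.5. The function names `iou` and `corloc`, the `(x1, y1, x2, y2)` box layout, and the dictionary inputs are illustrative choices, not taken from the paper.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(predicted, ground_truth, threshold=0.5):
    """Fraction of images whose selected box matches some ground-truth box.

    predicted:    dict image_id -> one selected box for the target class
    ground_truth: dict image_id -> list of ground-truth boxes of that class
    threshold:    assumed IoU criterion (0.5 is the usual choice)
    """
    if not predicted:
        return 0.0
    hits = sum(
        1
        for image_id, box in predicted.items()
        if any(iou(box, gt) >= threshold for gt in ground_truth.get(image_id, []))
    )
    return hits / len(predicted)


# Hypothetical toy data: one correct localization, one miss.
predicted = {"img_a": (0, 0, 10, 10), "img_b": (0, 0, 10, 10)}
ground_truth = {"img_a": [(0, 0, 10, 10)], "img_b": [(20, 20, 30, 30)]}
score = corloc(predicted, ground_truth)
```

Under this assumed definition, a per-class CorLoc would be computed as above and then averaged over the target classes to give a single dataset-level number.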