Deep Reinforcement Learning of Region Proposal Networks for Object Detection
Abstract
We propose drl-RPN, a deep reinforcement learning-based
visual recognition model consisting of a sequential
region proposal network (RPN) and an object detector. In
contrast to typical RPNs, where candidate object regions
(RoIs) are selected greedily via class-agnostic NMS, drl-RPN
optimizes an objective closer to the final detection
task. This is achieved by replacing the greedy RoI selection
process with a sequential attention mechanism which is
trained via deep reinforcement learning (RL). Our model is
capable of accumulating class-specific evidence over time,
potentially affecting subsequent proposals and classification
scores, and we show that such context integration significantly
boosts detection accuracy. Moreover, drl-RPN
automatically decides when to stop the search process and
has the benefit of being able to jointly learn the parameters
of the policy and the detector, both represented as deep networks.
Our model can further learn to search over a wide
range of exploration-accuracy trade-offs, making it possible
to specify or adapt the exploration extent at test time.
The resulting search trajectories are image- and category-dependent,
yet rely only on a single policy over all object
categories. Results on the MS COCO and PASCAL
VOC challenges show that our approach outperforms established
state-of-the-art object detection pipelines.
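
For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of greedy class-agnostic NMS, the RoI selection step that drl-RPN replaces. It is not the authors' code. The function names (iou, greedy_nms), the (x1, y1, x2, y2) box format, and the default threshold of 0.7 are assumptions made for illustration only.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.7, max_keep=None):
    """Greedy class-agnostic NMS (illustrative sketch, assumed defaults).

    Visits boxes in order of decreasing objectness score and keeps a box
    only if it does not overlap any already-kept box by more than
    iou_threshold. No class information is used at any point.
    Returns the indices of the kept boxes.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if max_keep is not None and len(keep) >= max_keep:
                break
    return keep


# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

The selection here depends only on objectness scores and box overlap, fixed in advance. That is the property the abstract contrasts with a learned sequential policy, which can condition later proposals on evidence accumulated so far.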