An Update to Open Images - Now with Bounding-Boxes
July 20, 2017
Posted by Vittorio Ferrari, Research Scientist, Machine Perception
Quick links
Last year we introduced Open Images, a collaborative release of ~9 million images annotated with labels spanning over 6000 object categories, designed to be a useful dataset for machine learning research. The initial release featured image-level labels automatically produced by a computer vision model similar to Google Cloud Vision API, for all 9M images in the training set, and a validation set of 167K images with 1.2M human-verified image-level labels.
Today, we introduce an update to Open Images, which contains the addition of a total of ~2M bounding-boxes to the existing dataset, along with several million additional image-level labels. Details include:
- 1.2M bounding-boxes around objects for 600 categories on the training set. These have been produced semi-automatically by an enhanced version of the technique outlined in [1], and are all human-verified.
- Complete bounding-box annotation for all object instances of the 600 categories on the validation set, all manually drawn (830K boxes). The bounding-box annotations in the training and validations sets will enable research on object detection on this dataset. The 600 categories offer a broader range than those in the ILSVRC and COCO detection challenges, and include new objects such as fedora hat and snowman.
- 4.3M human-verified image-level labels on the training set (over all categories). This will enable large-scale experiments on object classification, based on a clean training set with reliable labels.
Annotated images from the Open Images dataset. Left: FAMILY MAKING A SNOWMAN by mwvchamber. Right: STANZA STUDENTI.S.S. ANNUNZIATA by ersupalermo. Both images used under CC BY 2.0 license. See more examples here. |
References
[1] We don't need no bounding-boxes: Training object class detectors using only human verification, Papadopoulos, Uijlings, Keller, and Ferrari, CVPR 2016