Discovering the physical parts of an articulated object class from multiple videos
Abstract
We propose a method to discover the physical parts of an articulated object class (e.g. tiger, horse) from multiple videos. Since the individual parts of an object can move independently of one another, we discover them as object regions that consistently move relative to the rest of the object across videos. We then learn a location model of the parts and segment them accurately in the individual videos using an energy function that also enforces temporal and spatial consistency in the motion of the parts. Traditional methods for motion segmentation or non-rigid structure from motion cannot discover parts unless they display independent motion, since they operate on one video at a time. Our method overcomes this problem by discovering the parts across videos, which allows transferring the parts found in videos where they move to videos where they do not.
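To make the role of the energy function concrete, the toy sketch below combines a per-pixel unary cost (standing in for the learned location model) with spatial and temporal Potts-style smoothness penalties. It is only an illustrative stand-in under assumed inputs (a label volume and a unary cost volume), not the formulation used in the paper.

```python
import numpy as np

def part_energy(labels, unary, lambda_s=1.0, lambda_t=1.0):
    """Toy energy for per-pixel part labels over a video.

    labels : (T, H, W) int array, part label per pixel per frame
    unary  : (T, H, W, K) array, cost of assigning each of K part
             labels to each pixel (e.g. from a location model)

    Illustrative only: a unary term plus spatial and temporal
    smoothness penalties, not the authors' actual energy.
    """
    T, H, W = labels.shape
    t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W),
                          indexing='ij')
    # Unary term: cost of the chosen label at every pixel.
    e_unary = unary[t, y, x, labels].sum()

    # Spatial smoothness: penalise label changes between 4-neighbours
    # within each frame.
    e_spatial = (labels[:, 1:, :] != labels[:, :-1, :]).sum() \
              + (labels[:, :, 1:] != labels[:, :, :-1]).sum()

    # Temporal consistency: penalise label changes for the same pixel
    # in consecutive frames.
    e_temporal = (labels[1:] != labels[:-1]).sum()

    return e_unary + lambda_s * e_spatial + lambda_t * e_temporal
```

In practice such an energy would be minimised with a discrete optimiser (e.g. graph cuts or move-making algorithms); the sketch only evaluates a given labelling.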
We evaluate our method on a new dataset of 32 videos of tigers and horses, where we significantly outperform state-of-the-art motion segmentation on the task of part discovery (roughly twice the accuracy).