Unsupervised monocular depth and ego-motion learning with structure and semantics

Soeren Pirk
CVPR Workshop on Visual Odometry & Computer Vision Applications Based on Location Clues (2019)

Abstract

We present an approach that uses both structure and semantics for unsupervised monocular learning of depth and ego-motion. More specifically, we model the motion of individual objects and learn their 3D motion vectors jointly with depth and ego-motion. We obtain more accurate results, especially for challenging dynamic scenes not addressed by previous approaches. This is an extended version of Casser et al. Code and models have been open-sourced at:
https://sites.google.com/corp/view/struct2depth.
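
To illustrate how depth, ego-motion, and a per-object 3D motion vector can jointly determine where a pixel lands in the next frame, the following is a minimal sketch and not the released code. It assumes a pinhole camera with intrinsics (fx, fy, cx, cy) and assumes that each object's motion is a pure 3D translation applied after the camera's rigid motion. All function and parameter names are hypothetical.

```python
# Illustrative sketch only: a per-pixel warp combining depth, ego-motion,
# and an assumed per-object 3D translation. Not the authors' implementation.

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a given depth to a 3D point in the camera frame."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def rigid_transform(point, rotation, translation):
    """Apply a 3x3 rotation (nested lists, row-major) and then a translation."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point back to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def warp_pixel(u, v, depth, intrinsics, ego_rotation, ego_translation,
               object_motion=(0.0, 0.0, 0.0)):
    """Warp one pixel into the next frame.

    Static pixels use only the ego-motion; pixels on a moving object
    additionally receive that object's 3D motion vector (assumed here
    to be a translation).
    """
    fx, fy, cx, cy = intrinsics
    point = backproject(u, v, depth, fx, fy, cx, cy)
    point = rigid_transform(point, ego_rotation, ego_translation)
    point = tuple(p + m for p, m in zip(point, object_motion))
    return project(point, fx, fy, cx, cy)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
INTRINSICS = (100.0, 100.0, 50.0, 50.0)  # hypothetical fx, fy, cx, cy

# With a stationary camera, a static pixel stays in place, while a pixel
# on an object moving sideways is displaced.
static_px = warp_pixel(50.0, 50.0, 10.0, INTRINSICS, IDENTITY, (0.0, 0.0, 0.0))
moving_px = warp_pixel(50.0, 50.0, 10.0, INTRINSICS, IDENTITY, (0.0, 0.0, 0.0),
                       object_motion=(1.0, 0.0, 0.0))
```

In an unsupervised setting, a warp of this kind would typically be used to reconstruct one frame from its neighbor, with the reconstruction error serving as the training signal; the loss itself is omitted here.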