Google Research

Unsupervised Monocular Depth Learning in Dynamic Scenes

Workshop on Perception for Autonomous Driving, ECCV 2020


We present a method for jointly training the estimation of depth, egomotion, and a dense 3D translation field of objects, suitable for dynamic scenes containing multiple moving objects. Monocular photometric consistency is the sole source of supervision. We show that this apparently heavily underdetermined problem can be regularized by imposing the following prior knowledge about 3D translation fields: they are sparse, since most of the scene is static, and they tend to be constant throughout rigid moving objects. We show that this regularization alone is sufficient to train monocular depth prediction models that exceed the accuracy achieved in prior work, including methods that require semantic input.
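
As a rough illustration of the two priors named above (sparsity and constancy within rigid objects), the following NumPy sketch penalizes a dense 3D translation field of shape (H, W, 3). It is not the loss used in the paper: the function names, the plain L1 sparsity term, the finite-difference constancy term, and the weights are all assumptions chosen for brevity.

```python
import numpy as np


def sparsity_penalty(translation):
    """Mean absolute value of an (H, W, 3) translation field.

    Zero for a fully static scene; an assumed stand-in for a sparsity prior.
    """
    return float(np.mean(np.abs(translation)))


def constancy_penalty(translation):
    """Mean absolute difference between neighbouring pixels along both image axes.

    Small when the field is piecewise constant, e.g. constant over a rigid object.
    """
    dy = np.abs(np.diff(translation, axis=0))  # vertical neighbours
    dx = np.abs(np.diff(translation, axis=1))  # horizontal neighbours
    return float(dy.mean() + dx.mean())


def translation_regularizer(translation, sparsity_weight=1.0, constancy_weight=1.0):
    """Weighted sum of the two hypothetical penalties (weights are illustrative)."""
    return (sparsity_weight * sparsity_penalty(translation)
            + constancy_weight * constancy_penalty(translation))


# Toy fields: a static scene, one rigid moving block, and unstructured noise.
static_field = np.zeros((8, 8, 3))

rigid_field = np.zeros((8, 8, 3))
rigid_field[2:6, 2:6] = [0.5, 0.0, 0.1]  # one object with a single translation

rng = np.random.default_rng(0)
noisy_field = rng.normal(scale=0.5, size=(8, 8, 3))

static_cost = translation_regularizer(static_field)
rigid_cost = translation_regularizer(rigid_field)
noisy_cost = translation_regularizer(noisy_field)
```

Under these assumptions the static field costs nothing, the single rigid block incurs a small cost concentrated at its boundary, and the unstructured field is penalized most, which is the ordering the priors are meant to encourage.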
