Dimensions of Motion: Monocular Prediction through Flow Subspaces

Richard Strong Bowen*; Richard Tucker*; Ramin Zabih; Noah Snavely

Proceedings of the International Conference on 3D Vision (3DV) (2022)

Abstract

We introduce a way to learn to estimate a scene representation from a single image by predicting a low-dimensional subspace of optical flow for each training example, which encompasses the variety of possible camera and object movement. Supervision is provided by a novel loss which measures the distance between this predicted flow subspace and an observed optical flow. This provides a new approach to learning scene representation tasks, such as monocular depth prediction or instance segmentation, in an unsupervised fashion using in-the-wild input videos without requiring camera poses, intrinsics, or an explicit multi-view stereo step. We evaluate our method in multiple settings, including an indoor depth prediction task where it achieves comparable performance to recent methods trained with more supervision.
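
To make the idea of a distance between a predicted flow subspace and an observed flow concrete, here is a minimal sketch. It is not the paper's loss, whose exact form the abstract does not give. It assumes the subspace is given as a short list of basis flow fields, each flattened to a vector, and takes the distance to be the Euclidean norm of the part of the observed flow that lies outside their span. The names `orthonormalize` and `subspace_distance` and the two-pixel toy data are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(basis, eps=1e-12):
    """Gram-Schmidt: return orthonormal vectors spanning the same subspace."""
    ortho = []
    for vec in basis:
        w = list(vec)
        for q in ortho:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip vectors that are (nearly) linearly dependent
            ortho.append([wi / norm for wi in w])
    return ortho


def subspace_distance(basis, flow):
    """Norm of the component of `flow` orthogonal to span(basis).

    Assumption: this plain least-squares residual stands in for the
    paper's loss, which may be defined differently.
    """
    residual = list(flow)
    for q in orthonormalize(basis):
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return math.sqrt(dot(residual, residual))


# Toy example: two pixels, flow flattened as (u1, v1, u2, v2).
# Hypothetical basis: uniform horizontal and uniform vertical motion.
basis = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
in_span = [2.0, 3.0, 2.0, 3.0]       # a combination of the two basis flows
out_of_span = [1.0, 0.0, -1.0, 0.0]  # opposite horizontal motion per pixel

d_in = subspace_distance(basis, in_span)
d_out = subspace_distance(basis, out_of_span)
print(d_in, d_out)
```

In this toy case the first flow is a linear combination of the basis flows, so its distance is zero up to rounding. The second is orthogonal to both basis flows, so its distance equals its own norm. In a learning setting the basis would come from a network's prediction for a single image, and a differentiable version of this residual would serve as the training signal.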

Research Areas

Machine perception
