
Associating Objects and their Effects in Unconstrained Monocular Video

Erika Lu
Zhengqi Li
Leonid Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2023


We propose a method to decompose a video into a background and a set of foreground layers, where the background captures stationary elements while the foreground layers capture moving objects along with their associated effects (e.g., shadows and reflections). Our approach is designed for unconstrained monocular videos, with arbitrary camera and object motion. Prior work that tackles this problem assumes that the video can be mapped onto a fixed 2D canvas, severely limiting the possible space of camera motion. Instead, our method applies recent progress in monocular camera pose and depth estimation to create a full, RGBD video layer for the background, along with a video layer for each foreground object. To solve the under-constrained decomposition problem, we propose a new loss formulation based on multi-view consistency. We test our method on challenging videos with complex camera motion and show significant qualitative improvement over current methods.
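As a rough illustration of what a layered decomposition implies, the sketch below recombines one background pixel with a stack of foreground layers using standard back-to-front "over" compositing. The abstract does not state how the layers are recombined, so the compositing rule, the per-layer alpha value, and the name `composite_pixel` are all assumptions made for illustration, not the paper's formulation.

```python
# Hypothetical sketch: standard "over" compositing of foreground layers
# onto a background pixel. Not taken from the paper; alpha values and
# the function name are illustrative assumptions.

def composite_pixel(background_rgb, layers):
    """Blend layers back to front over a background pixel.

    background_rgb: tuple of 3 floats in [0, 1]
    layers: list of (rgb, alpha) pairs ordered back to front, where
            rgb is a tuple of 3 floats and alpha is a float in [0, 1]
    """
    out = list(background_rgb)
    for rgb, alpha in layers:
        # Each layer covers the current result in proportion to its alpha.
        out = [alpha * c + (1.0 - alpha) * o for c, o in zip(rgb, out)]
    return tuple(out)


# Toy example: a black background under a white layer with alpha 0.25,
# such as a soft shadow or reflection attached to a moving object.
pixel = composite_pixel((0.0, 0.0, 0.0), [((1.0, 1.0, 1.0), 0.25)])
```

Under this assumed rule, a layer with alpha 0 leaves the background untouched and a layer with alpha 1 fully replaces it, which is why soft effects like shadows can live in the same layer as the object that casts them.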
