On the Benefit of Adding an Adversarial Loss to Depth Prediction

Rick Groenendijk
Sezer Karaoglu
Theo Gevers
Computer Vision and Image Understanding (CVIU), 2019


Adversarial learning is one of the most promising novel learning paradigms in computer vision. Using Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs), many image manipulation tasks have been addressed, from generating images from text and transferring images from one domain to another to translating sketches to images (and vice versa). In this paper we address the benefit of adding adversarial training to the task of monocular depth estimation when training from stereo pairs of images. For this depth estimation task many losses have been proposed, such as the L1 and SSIM image reconstruction losses, left-right consistency, and occlusion losses. We evaluate three flavours of adversarial training (vanilla GANs, LSGANs, and Wasserstein GANs) applied to our model with varying numbers of image reconstruction losses. Based on extensive experimental evaluation, we conclude that adding a GAN is useful when the reconstruction loss is not too strongly constrained, whereas a model with a strongly constrained reconstruction loss (a combination of five different losses) outperforms, or is on par with, any method trained with a GAN.
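The three adversarial flavours compared in the abstract differ mainly in the discriminator (or critic) objective. A minimal sketch of those objectives, written over scalar discriminator outputs with illustrative function names (not the authors' implementation):

```python
import math

def vanilla_gan_d_loss(d_real, d_fake):
    """Vanilla GAN discriminator loss (cross-entropy form).

    d_real and d_fake are discriminator outputs in (0, 1),
    i.e. probabilities after a sigmoid.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def lsgan_d_loss(d_real, d_fake):
    """LSGAN discriminator loss: regress real outputs to 1, fake to 0."""
    return 0.5 * ((d_real - 1.0) ** 2 + d_fake ** 2)

def wgan_critic_loss(d_real, d_fake):
    """WGAN critic loss: raw (unbounded) scores, no sigmoid.

    Minimizing this maximizes the score gap d_real - d_fake; in
    practice the critic must also be Lipschitz-constrained
    (e.g. via weight clipping or a gradient penalty).
    """
    return d_fake - d_real
```

In the setting the paper studies, the generator's adversarial term would then be weighted and added to the image reconstruction losses, roughly total = reconstruction + lambda_adv * adversarial, with the weighting being a design choice.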