The Devil is in the Decoder: Classification, Regression and GANs
Abstract
Many machine vision applications require predictions for every pixel of the input image (for example, semantic segmentation and boundary detection). Models for such problems usually consist of encoders, which decrease spatial resolution while learning a high-dimensional representation, followed by decoders, which recover the original input resolution and result in low-dimensional predictions. While encoders have been studied rigorously, relatively few studies address the decoder side. Therefore, this paper presents an extensive comparison of a variety of decoders for a variety of pixel-wise tasks ranging from classification and regression to synthesis. Our contributions are: (1) Decoders matter: we observe significant variance in results between different types of decoders on various problems. (2) We introduce new residual-like connections for decoders. (3) We introduce a novel decoder: bilinear additive upsampling. (4) We explore prediction artefacts.
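To make contribution (3) concrete, the sketch below illustrates one plausible form of a bilinear additive upsampling step: the feature map is upsampled bilinearly, then every few consecutive channels are averaged, so spatial resolution grows without introducing learned parameters. The upsampling factor of 2 and the channel group size of 4 are illustrative assumptions, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def bilinear_additive_upsample(x, scale=2, channel_reduction=4):
    """Minimal sketch of bilinear additive upsampling.

    Upsamples spatially with bilinear interpolation, then reduces the
    channel dimension by averaging every `channel_reduction` consecutive
    channels. Parameter-free; group size and scale are assumptions here.
    """
    n, c, h, w = x.shape
    assert c % channel_reduction == 0, "channels must divide evenly into groups"
    # Spatial upsampling by bilinear interpolation.
    x = F.interpolate(x, scale_factor=scale, mode="bilinear", align_corners=False)
    # Group consecutive channels and average each group to shrink channel count.
    x = x.view(n, c // channel_reduction, channel_reduction, h * scale, w * scale)
    return x.mean(dim=2)

# Example: a 64-channel 16x16 feature map becomes 16 channels at 32x32.
feat = torch.randn(1, 64, 16, 16)
out = bilinear_additive_upsample(feat)
print(out.shape)  # torch.Size([1, 16, 32, 32])
```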