Sharing Decoders: Network Fission for Multi-task Pixel Prediction

Steven Hickson

Karthik Raveendran

Irfan Essa

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, IEEE/CVF (2022), pp. 3771-3780

Abstract

We examine the benefits of splitting encoder-decoders for multi-task learning and showcase results on three tasks (semantics, surface normals, and depth) while adding very few floating point operations (FLOPS) per task. Current hard parameter sharing methods for multi-task pixel-wise labeling use one shared encoder with separate decoders for each task. We generalize this notion and term the splitting of encoder-decoder architectures at different points fission. Our ablation studies on fission show that sharing most of the decoder layers in multi-task encoder-decoder networks improves results while adding far fewer parameters per task. Our proposed method trains faster, uses less memory, achieves better accuracy, and uses significantly fewer FLOPS than conventional multi-task methods, with additional tasks requiring only 0.017% more FLOPS than the single-task network.
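
The effect of the fission point on per-task cost can be illustrated with a small counting sketch. This is not the authors' code: the layer sizes, the names `ENCODER_LAYERS`, `DECODER_LAYERS`, `total_params`, and `added_per_task`, and the idea of describing each layer only by its parameter count are all assumptions made for illustration.

```python
# Hypothetical toy model: each layer is described only by its parameter count.
# The numbers are made up for illustration and do not come from the paper.
ENCODER_LAYERS = [1000, 2000, 4000]
DECODER_LAYERS = [4000, 2000, 1000, 100]  # the last entry acts as the task head


def total_params(num_tasks, split):
    """Total parameters when the first `split` decoder layers are shared.

    split = 0 reproduces the conventional layout (shared encoder, fully
    separate decoders). split = len(DECODER_LAYERS) - 1 shares everything
    except the final task-specific layer.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    if not 0 <= split < len(DECODER_LAYERS):
        raise ValueError("split must leave at least one task-specific layer")
    shared = sum(ENCODER_LAYERS) + sum(DECODER_LAYERS[:split])
    per_task = sum(DECODER_LAYERS[split:])
    return shared + num_tasks * per_task


def added_per_task(split):
    """Extra parameters incurred by each task beyond the first."""
    return total_params(2, split) - total_params(1, split)


if __name__ == "__main__":
    for split in range(len(DECODER_LAYERS)):
        print(split, total_params(3, split), added_per_task(split))
```

With these made-up sizes, splitting right after the encoder adds 7100 parameters per extra task, while splitting just before the final layer adds 100. The same bookkeeping applies to FLOPS if each layer is described by its operation count instead.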

Research Areas

Machine perception
