
Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation

Andrew Howard
Liang-Chieh Chen
Menglong Zhu
CVPR (2018)

Abstract

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple benchmarks across a spectrum of different model sizes. MobileNetV2 is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers, while the intermediate layer is an expanded representation that uses lightweight depthwise convolutions to filter features. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide the intuition that led to this design. Finally, our approach allows a decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on the ImageNet \cite{Russakovsky:2015:ILS:2846547.2846559} classification, VOC image segmentation \cite{PASCAL}, and COCO object detection \cite{COCO} datasets, and evaluate the trade-offs between accuracy, number of multiply-adds, and number of parameters.
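To make the inverted residual with linear bottleneck concrete, below is a minimal PyTorch-style sketch (not the authors' implementation; the layer names, the default expansion factor of 6, and the use of ReLU6 after the expansion and depthwise convolutions are assumptions based on the description above). The block expands a thin input with a 1x1 convolution, filters it with a 3x3 depthwise convolution, and projects back to a thin output with a linear 1x1 convolution that deliberately has no activation, adding a residual connection between the thin bottlenecks when the stride is 1 and channel counts match.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Sketch of an inverted residual block with a linear bottleneck.

    Thin input -> 1x1 expansion -> 3x3 depthwise -> linear 1x1 projection.
    The projection has no non-linearity, reflecting the observation that
    activations in the narrow layers hurt representational power.
    """
    def __init__(self, in_ch, out_ch, stride=1, expand_ratio=6):
        super().__init__()
        hidden = in_ch * expand_ratio  # expanded intermediate width (assumed factor)
        self.use_residual = (stride == 1 and in_ch == out_ch)
        self.block = nn.Sequential(
            # 1x1 pointwise expansion to a wider intermediate representation
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise convolution filters features channel-by-channel
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1,
                      groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # linear 1x1 projection back to a thin bottleneck (no activation)
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        # residual connection between the thin bottleneck layers
        return x + out if self.use_residual else out

# usage example: a stride-1 block that preserves the 24-channel bottleneck
x = torch.randn(1, 24, 56, 56)
y = InvertedResidual(24, 24, stride=1, expand_ratio=6)(x)
print(y.shape)  # torch.Size([1, 24, 56, 56])
```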