Fine-Grained Stochastic Architecture Search
Abstract
State-of-the-art deep networks are often too large to deploy on mobile devices and
embedded systems. Mobile neural architecture search (NAS) methods automate
the design of small models, but state-of-the-art NAS methods are expensive to
run. Differentiable neural architecture search (DNAS) methods reduce the search
cost but explore only a limited subspace of candidate architectures. In this paper, we
introduce Fine-Grained Stochastic Architecture Search (FiGS), a differentiable
search method that searches over a much larger set of candidate architectures. FiGS
simultaneously selects and modifies operators in the search space by applying a
structured sparse regularization penalty based on the Logistic-Sigmoid distribution.
We show results across three existing search spaces, matching or outperforming the
original search algorithms and producing state-of-the-art parameter-efficient models
on ImageNet (e.g., 75.4% top-1 with 2.6M params). Using our architectures as
backbones for object detection with SSDLite, we achieve significantly higher mAP
on COCO (e.g., 25.8 with 3.0M params) than MobileNetV3 and MnasNet.