Structured Transforms for Small-footprint Deep Learning

Neural Information Processing Systems (NIPS), 2015

Abstract

We consider the task of building compact deep learning pipelines suitable for deployment
on storage- and power-constrained mobile devices. We propose a unified
framework to learn a broad family of structured parameter matrices that are
characterized by the notion of low displacement rank. Our structured transforms
admit fast function and gradient evaluation, and span a rich range of parameter
sharing configurations whose statistical modeling capacity can be explicitly tuned
along a continuum from structured to unstructured. Experimental results show
that these transforms can significantly accelerate inference and forward/backward
passes during training, and offer superior accuracy-compactness-speed tradeoffs
in comparison to a number of existing techniques. In keyword spotting applications
in mobile speech recognition, our methods are much more effective than
standard linear low-rank bottleneck layers and nearly retain the performance of
state-of-the-art models, while providing more than 3.5-fold compression.
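
To make the notion of a fast structured transform concrete, here is a minimal sketch, not the paper's implementation: Toeplitz matrices are among the simplest low-displacement-rank families, and an n x n Toeplitz matrix is parameterized by only 2n - 1 values, with a matrix-vector product computable in O(n log n) time by embedding it in a circulant matrix and applying the FFT. All function and variable names below are illustrative.

```python
import numpy as np

def toeplitz_matvec(c, r, x):
    """Multiply the Toeplitz matrix T by x in O(n log n) time.

    c: first column of T, shape (n,)
    r: first row of T, shape (n,); r[0] must equal c[0]
    x: input vector, shape (n,)
    """
    n = len(c)
    # Embed T in a 2n x 2n circulant matrix whose first column is
    # [c_0, ..., c_{n-1}, *, r_{n-1}, ..., r_1] (the * entry is arbitrary).
    col = np.concatenate([c, [0.0], r[-1:0:-1]])
    x_pad = np.concatenate([x, np.zeros(n)])
    # A circulant matvec is a circular convolution, diagonalized by the FFT.
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(x_pad))
    # The first n entries of the circulant product recover T @ x.
    return np.real(y[:n])

# Sanity check against the dense O(n^2) product.
rng = np.random.default_rng(0)
n = 8
c, r, x = rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(n)
r[0] = c[0]
T = np.array([[c[i - j] if i >= j else r[j - i] for j in range(n)]
              for i in range(n)])
assert np.allclose(T @ x, toeplitz_matvec(c, r, x))
```

The same circulant-embedding trick underlies fast multiplication for richer low-displacement-rank families, where the displacement rank parameter trades off parameter sharing against modeling capacity.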