TapNet: The Design, Training, Implementation, and Applications of a Multi-Task Learning CNN for Off-Screen Mobile Input

Michael Xuelin Huang
Nazneen Nazneen
Alex Chao
ACM CHI Conference on Human Factors in Computing Systems, ACM (2021)

Abstract

Off-screen interaction offers great potential for one-handed and eyes-free mobile interaction. While a few existing studies have explored using built-in mobile phone sensors to sense off-screen signals, none has met practical requirements. This paper discusses the design, training, implementation, and applications of TapNet, a multi-task network that detects tapping on the smartphone using the built-in accelerometer and gyroscope. With sensor location as auxiliary information, TapNet can jointly learn from data across devices and simultaneously recognize multiple tap properties, including tap direction and tap location. We developed four datasets consisting of over 180K training samples, 38K testing samples, and 87 participants in total. Experimental evaluation demonstrated the effectiveness of the TapNet design and its significant improvement over the state of the art. Along with the datasets, codebase, and extensive experiments, TapNet establishes a new technical foundation for off-screen mobile input.