Google Research

Elf: Accelerate High-resolution Mobile Deep Vision with Content-aware Parallel Offloading

  • Wuyang Zhang
  • Zhezhi He
  • Luyang Liu
  • Zhenhua Jia
  • Yunxin Liu
  • Marco Gruteser
  • Dipankar Raychaudhuri
  • Yanyong Zhang
The 27th Annual International Conference on Mobile Computing and Networking (ACM MobiCom 2021). (2021)

Abstract

A broad class of computer vision algorithms that operate on images or videos collected by mobile devices benefits greatly from deep learning to achieve high application performance. Meanwhile, these applications often demand real-time responses (e.g., <100 ms), which mobile devices with limited computation capability can hardly deliver. Offloading the computation from mobile devices to edge clouds has recently been proposed as a promising approach. However, previous work assumes that dedicated and powerful edge servers are always available and devote all of their computing resources to a single offloading job. This assumption rarely holds consistently, given the distributed nature of edge clouds and the dynamic resource needs of mobile users.

In this work, we propose and design a system called Elf to accelerate deep neural network inference for vision applications running on mobile devices. Elf is designed to minimize end-to-end latency through intelligent content partitioning and multi-edge-server offloading. In particular, instead of naively offloading entire high-resolution video clips to a single edge server, we partition the high-resolution video clips in a content- and resource-aware fashion. The partitioning leverages several techniques, including Region-Proposal (RP) complexity estimation, RP location prediction, and Low-Resolution Compensation (LRC), and the resulting partitions are dynamically assigned to different computation servers. Comprehensive experiments demonstrate that Elf reduces the end-to-end latency of multi-object segmentation tasks on DAVIS2017 by 88.5%, 94.6%, and 33.8% in comparison to NVIDIA Jetson TX2, Nano, and single object counterparts, respectively.
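
To make the idea of resource-aware assignment concrete, the following is a minimal illustrative sketch, not the Elf implementation. It assumes that each region proposal is a bounding box, that its inference cost is roughly proportional to its pixel area, and that each server exposes a relative capacity score. The names `Server`, `estimate_cost`, and `assign_regions` are hypothetical, and the greedy longest-job-first heuristic is a stand-in for whatever scheduling policy the system actually uses.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# A region proposal as (x, y, width, height) in pixels (assumed layout).
Box = Tuple[int, int, int, int]


@dataclass
class Server:
    """Hypothetical edge server with a relative throughput score."""
    name: str
    capacity: float  # higher means faster; an assumed, unitless score
    boxes: List[Box] = field(default_factory=list)
    load: float = 0.0  # total estimated cost assigned so far

    def finish_time(self, extra: float = 0.0) -> float:
        # Estimated completion time if `extra` cost were added.
        return (self.load + extra) / self.capacity


def estimate_cost(box: Box) -> float:
    # Assumption: cost grows linearly with the pixel area of the region.
    _, _, width, height = box
    return float(width * height)


def assign_regions(boxes: List[Box], servers: List[Server]) -> List[Server]:
    """Greedy longest-job-first assignment of regions to servers.

    Each region goes to the server that would finish it earliest,
    which tends to balance completion times across unequal servers.
    """
    for box in sorted(boxes, key=estimate_cost, reverse=True):
        cost = estimate_cost(box)
        target = min(servers, key=lambda s: s.finish_time(cost))
        target.boxes.append(box)
        target.load += cost
    return servers


# Example: two servers, one twice as fast as the other.
example_servers = assign_regions(
    [(0, 0, 100, 100), (200, 50, 50, 50), (400, 300, 80, 60)],
    [Server("edge-a", 2.0), Server("edge-b", 1.0)],
)
for s in example_servers:
    print(s.name, s.boxes, s.finish_time())
```

Because all partitions are processed in parallel, the end-to-end latency in this sketch is bounded by the slowest server's finish time, which is why the heuristic minimizes the per-server completion estimate rather than the raw load.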
