Raman Sarokin
Authored Publications
Efficient Heterogeneous Video Segmentation at the Edge
Jamie Lin
Siargey Pisarchyk
David Cong Tian
Tingbo Hou
Sixth Workshop on Computer Vision for AR/VR (CV4ARVR) (2022)
We introduce an efficient video segmentation system for resource-limited edge devices leveraging heterogeneous compute. Specifically, we design network models by searching across multiple dimensions of specifications for the neural architectures and operations on top of already light-weight backbones, targeting commercially available edge inference engines. We further analyze and optimize the heterogeneous data flows in our systems across the CPU, the GPU and the NPU. Our approach has empirically factored well into our real-time AR system, enabling remarkably higher accuracy with quadrupled effective resolutions, yet at much shorter end-to-end latency, much higher frame rate, and even lower power consumption on edge platforms.
On-Device Neural Net Inference with Mobile GPUs
Nikolay Chirkov
Yury Pisarchyk
Mogan Shieh
Fabio Riccardi
Efficient Deep Learning for Computer Vision CVPR 2019 (ECV2019) (to appear)
On-device inference of machine learning models for mobile phones is desirable due to its lower latency and increased privacy. Running such a compute-intensive task solely on the mobile CPU, however, can be difficult due to limited computing power, thermal constraints, and energy consumption. App developers and researchers have begun exploiting hardware accelerators to overcome these challenges. Recently, device manufacturers are adding neural processing units into high-end phones for on-device inference, but these account for only a small fraction of hand-held devices. In this paper, we present how we leverage the mobile GPU, a ubiquitous hardware accelerator on virtually every phone, to run inference of deep neural networks in real-time for both Android and iOS devices. By describing our architecture, we also discuss how to design networks that are mobile GPU-friendly. Our state-of-the-art mobile GPU inference engine is integrated into the open-source project TensorFlow Lite and publicly available at https://tensorflow.org/lite.