Google Research

MediaPipe: A Framework for Perceiving and Processing Reality

  • Camillo Lugaresi
  • Jiuqiang Tang
  • Hadon Nash
  • Chris McClanahan
  • Esha Uboweja
  • Michael Hays
  • Fan Zhang
  • Chuo-Ling Chang
  • Ming Yong
  • Juhyun Lee
  • Wan-Teh Chang
  • Wei Hua
  • Manfred Georg
  • Matthias Grundmann
Third Workshop on Computer Vision for AR/VR at IEEE Computer Vision and Pattern Recognition (CVPR) 2019

Abstract

Building an application that processes perceptual inputs involves more than running an ML model. Developers have to harness the capabilities of a wide range of devices; balance resource usage against the quality of results; run multiple operations in parallel and with pipelining; and ensure that time-series data is properly synchronized. The MediaPipe framework addresses these challenges. A developer can use MediaPipe to rapidly combine existing and new perception components into prototypes and advance them to polished cross-platform applications. The developer can configure an application built with MediaPipe to manage CPU and GPU resources efficiently for low-latency performance, to handle synchronization of time-series data such as audio and video frames, and to measure performance and resource consumption. We show that these features enable a developer to focus on algorithm or model development and to use MediaPipe as an environment for iteratively improving their application, with results reproducible across different devices and platforms. MediaPipe will be open-sourced at https://github.com/google/mediapipe.
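
To make the synchronization requirement concrete, the following is a minimal, generic sketch of aligning two timestamp-sorted streams (for example, video frames and audio chunks) so that downstream code sees packets in ascending timestamp order. It does not use MediaPipe's API and is not a description of MediaPipe's scheduling policy; the `Packet` type, the function name `align_by_timestamp`, and the sample payloads are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical packet type: (timestamp in milliseconds, payload).
Packet = Tuple[int, str]


def align_by_timestamp(
    video: Iterable[Packet], audio: Iterable[Packet]
) -> Iterator[Tuple[int, Optional[str], Optional[str]]]:
    """Merge two timestamp-sorted streams into one ascending sequence.

    Yields (timestamp, video_payload, audio_payload). A payload is None
    when that stream has no packet at the given timestamp.
    """
    v_iter, a_iter = iter(video), iter(audio)
    v, a = next(v_iter, None), next(a_iter, None)
    while v is not None or a is not None:
        # Pick the earliest pending timestamp across both streams.
        ts = min(p[0] for p in (v, a) if p is not None)
        v_payload = a_payload = None
        if v is not None and v[0] == ts:
            v_payload, v = v[1], next(v_iter, None)
        if a is not None and a[0] == ts:
            a_payload, a = a[1], next(a_iter, None)
        yield ts, v_payload, a_payload


if __name__ == "__main__":
    # Illustrative data only: frames at roughly 30 fps, audio at other instants.
    frames = [(0, "frame0"), (33, "frame1"), (66, "frame2")]
    chunks = [(0, "audio0"), (20, "audio1"), (66, "audio2")]
    for row in align_by_timestamp(frames, chunks):
        print(row)
```

The sketch assumes each input is already sorted by timestamp and that timestamps within a stream are unique; a real pipeline would also need to handle late or dropped packets.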
