Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild

Danner Stodolsky
Jeff Calow
Jeremy Dorfman
Clint Smullen
Aki Kuusela
Aaron James Laursen
Alex Ramirez
Amir Salek
Anna Cheung
Ben Gelb
Brian Fosco
Cho Mon Kyaw
Dake He
David Alexander Munday
David Wickeraad
Devin Persaud
Don Stark
Elisha Indupalli
Fong Lou
Hon Kwan Wu
In Suk Chong
Indira Jayaram
Jia Feng
JP Maaninen
Maire Mahony
Mark Steven Wachsler
Mercedes Tan
Niranjani Dasharathi
Poonacha Kongetira
Prakash Chauhan
Raghuraman Balasubramanian
Ramon Macias
Richard Ho
Rob Springer
Roy W Huffman
Sandeep Bhatia
Sathish K Sekar
Srikanth Muroor
Ville-Mikko Rautio
Yolanda Ripley
Yoshiaki Hase
Yuan Li
Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Association for Computing Machinery, New York, NY, USA (2021), pp. 600-615


Video sharing (e.g., YouTube, Vimeo, Facebook, TikTok) accounts for the majority of internet traffic, and video processing is also foundational to several other key workloads (video conferencing, virtual/augmented reality, cloud gaming, video in Internet-of-Things devices, etc.). The importance of these workloads motivates larger video processing infrastructures and – with the slowing of Moore’s law – specialized hardware accelerators to deliver more computing at higher efficiencies. This paper describes the design and deployment, at scale, of a new accelerator targeted at warehouse-scale video transcoding. We present our hardware design including a new accelerator building block – the video coding unit (VCU) – and discuss key design trade-offs for balanced systems at data center scale and co-designing accelerators with large-scale distributed software systems. We evaluate these accelerators “in the wild” serving live data center jobs, demonstrating 20-33x improved efficiency over our prior well-tuned non-accelerated baseline. Our design also enables effective adaptation to changing bottlenecks and improved failure management, and new workload capabilities not otherwise possible with prior systems. To the best of our knowledge, this is the first work to discuss video acceleration at scale in large warehouse-scale environments.