Kyle Alan Lucke

Authored Publications
    Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild
    Danner Stodolsky
    Jeff Calow
    Jeremy Dorfman
    Clint Smullen
    Aki Kuusela
    Aaron James Laursen
    Alex Ramirez
    Alvin Adrian Wijaya
    Amir Salek
    Anna Cheung
    Ben Gelb
    Brian Fosco
    Cho Mon Kyaw
    Dake He
    David Alexander Munday
    David Wickeraad
    Devin Persaud
    Don Stark
    Drew Walton
    Elisha Indupalli
    Fong Lou
    Hon Kwan Wu
    In Suk Chong
    Indira Jayaram
    Jia Feng
    JP Maaninen
    Maire Mahony
    Mark Steven Wachsler
    Mercedes Tan
    Niranjani Dasharathi
    Poonacha Kongetira
    Prakash Chauhan
    Raghuraman Balasubramanian
    Ramon Macias
    Richard Ho
    Rob Springer
    Roy W Huffman
    Sandeep Bhatia
    Sarah J. Gwin
    Sathish K Sekar
    Srikanth Muroor
    Ville-Mikko Rautio
    Yolanda Ripley
    Yoshiaki Hase
    Yuan Li
    Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Association for Computing Machinery, New York, NY, USA (2021), pp. 600-615
    Video sharing (e.g., YouTube, Vimeo, Facebook, TikTok) accounts for the majority of internet traffic, and video processing is also foundational to several other key workloads (video conferencing, virtual/augmented reality, cloud gaming, video in Internet-of-Things devices, etc.). The importance of these workloads motivates larger video processing infrastructures and – with the slowing of Moore’s law – specialized hardware accelerators to deliver more computing at higher efficiencies. This paper describes the design and deployment, at scale, of a new accelerator targeted at warehouse-scale video transcoding. We present our hardware design including a new accelerator building block – the video coding unit (VCU) – and discuss key design trade-offs for balanced systems at data center scale and co-designing accelerators with large-scale distributed software systems. We evaluate these accelerators “in the wild” serving live data center jobs, demonstrating 20-33x improved efficiency over our prior well-tuned non-accelerated baseline. Our design also enables effective adaptation to changing bottlenecks and improved failure management, and new workload capabilities not otherwise possible with prior systems. To the best of our knowledge, this is the first work to discuss video acceleration at scale in large warehouse-scale environments.