System performance

About the team

Our team guides the roadmap, architecture and design of Google’s global computer infrastructure. We bring together experts in computer architecture, machine learning, software systems, compilers and operating systems to define and build the next generation of technology that powers Google.

Our research encompasses the entire system stack, from distributed software and runtime systems to microarchitecture and circuits. We seek to propose new computing substrates and accelerators, build and optimize large-scale real-world systems, research techniques to maximize code efficiency and define new machine-learning-based systems and paradigms. Research and open-ended exploration are key aspects of our work and we seek to share this work externally with the broader research community. We publish at a wide array of conferences, including ISCA, ASPLOS, MICRO, NeurIPS, ICML and ICLR.

Team focus summaries

Computer architecture

The combination of the end of Moore’s law and exponential increases in demand for computing and data has created an opportunity to redefine many of the layers that power computing. We architect state-of-the-art hardware accelerators, define new microarchitectures, and drive hardware and software co-design for Google-scale workloads.

ML-for-Systems

Using machine learning to improve computing systems lets us replace many traditional heuristics within Google’s large-scale systems in the short term and, in the longer term, automate the processes that we use to architect computer systems. We research, propose, and prototype ML-based techniques and then seek to deploy those techniques at scale across Google.

Runtime systems

Google’s data centers operate on a global scale. We seek to understand how to optimize a wide range of workloads and computing resources so that they run at peak performance and efficiency. Research into runtime systems at Google exposes us to the scale and complexity of warehouse-scale computing.

Efficiency and profiling

To optimize Google’s workloads, we must understand how they execute at data center scale. This requires cutting-edge research focused on code efficiency, new profiling techniques and co-design across layers of the stack, including operating systems and compilers.

Featured publications

Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild

Parthasarathy Ranganathan, Danner Stodolsky, Jeff Calow, Jeremy Dorfman, Marisabel Guevara Hechtman, Clint Smullen, Aki Kuusela, Aaron James Laursen, Alex Ramirez, Alvin Adrian Wijaya, Amir Salek, Anna Cheung, Ben Gelb, Brian Fosco, Cho Mon Kyaw, Dake He, David Alexander Munday, David Wickeraad, Devin Persaud, Don Stark, Drew Walton, Elisha Indupalli, Eric Perkins-Argueta, Fong Lou, Hon Kwan Wu, In Suk Chong, Indira Jayaram, Jia Feng, JP Maaninen, Kyle Alan Lucke, Maire Mahony, Mark Steven Wachsler, Mercedes Tan, Narayana Penukonda, Niranjani Dasharathi, Poonacha Kongetira, Prakash Chauhan, Raghuraman Balasubramanian, Ramon Macias, Richard Ho, Rob Springer, Roy W Huffman, Samuel Foss, Sandeep Bhatia, Sarah J. Gwin, Sathish K Sekar, Sergey N. Sokolov, Srikanth Muroor, Ville-Mikko Rautio, Yolanda Ripley, Yoshiaki Hase, Yuan Li

Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Association for Computing Machinery, New York, NY, USA (2021), pp. 600-615