System Performance

About the team

Our team guides the roadmap, architecture and design of Google’s global computer infrastructure. We bring together experts in computer architecture, machine learning, software systems, compilers and operating systems to define and build the next generation of technology that powers Google.

Our research encompasses the entire system stack, from distributed software and runtime systems to microarchitecture and circuits. We seek to propose new computing substrates and accelerators, build and optimize large-scale real-world systems, research techniques to maximize code efficiency and define new machine-learning-based systems and paradigms. Research and open-ended exploration are key aspects of our work and we seek to share this work externally with the broader research community. We publish at a wide array of conferences, including ISCA, ASPLOS, MICRO, NeurIPS, ICML and ICLR.

Team focus summaries

Computer architecture

The combination of the end of Moore’s law and exponential increases in demand for computing and data has created an opportunity to redefine many of the layers that power computing. We architect state-of-the-art hardware accelerators, define new microarchitectures, and drive hardware and software co-design for Google-scale workloads.

ML-for-Systems

Using machine learning to improve computing systems lets us replace many traditional heuristics within Google’s large-scale systems in the short term; our longer-term focus is to automate the processes that we use to architect computer systems. We research, propose, and prototype ML-based techniques and then seek to deploy those techniques at scale across Google.

Runtime systems

Google’s data centers operate on a global scale. We seek to understand how to optimize a wide range of workloads and computing resources to ensure that Google’s workloads operate at peak performance and efficiency. Research into runtime systems at Google exposes us to the scale and complexity of warehouse computing.

Efficiency and profiling

To optimize Google’s workloads, we must understand how they execute at datacenter scale. That requires cutting-edge research on code efficiency, new profiling techniques, and co-design across layers of the stack, including operating systems and compilers.

Featured publications

Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild

Parthasarathy Ranganathan, Danner Stodolsky, Jeff Calow, Jeremy Dorfman, Marisabel Guevara Hechtman, Clint Smullen, Aki Kuusela, Aaron James Laursen, Alex Ramirez, Alvin Adrian Wijaya, Amir Salek, Anna Cheung, Ben Gelb, Brian Fosco, Cho Mon Kyaw, Dake He, David Alexander Munday, David Wickeraad, Devin Persaud, Don Stark, Drew Walton, Elisha Indupalli, Eric Perkins-Argueta, Fong Lou, Hon Kwan Wu, In Suk Chong, Indira Jayaram, Jia Feng, JP Maaninen, Kyle Alan Lucke, Maire Mahony, Mark Steven Wachsler, Mercedes Tan, Narayana Penukonda, Niranjani Dasharathi, Poonacha Kongetira, Prakash Chauhan, Raghuraman Balasubramanian, Ramon Macias, Richard Ho, Rob Springer, Roy W Huffman, Samuel Foss, Sandeep Bhatia, Sarah J. Gwin, Sathish K Sekar, Sergey N. Sokolov, Srikanth Muroor, Ville-Mikko Rautio, Yolanda Ripley, Yoshiaki Hase, Yuan Li

Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Association for Computing Machinery, New York, NY, USA (2021), pp. 600-615

Learning Memory Access Patterns

Milad Hashemi, Kevin Jordan Swersky, Jamie Alexander Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis, Parthasarathy Ranganathan

ICML (2018)

Software-defined far memory in warehouse-scale computers

Andres Lagar-Cavilla, Junwhan Ahn, Suleiman Souhlal, Neha Agarwal, Radoslaw Burny, Shakeel Butt, Jichuan Chang, Ashwin Chaugule, Nan Deng, Junaid Shahid, Greg Thelen, Kamil Adam Yurtsever, Yu Zhao, Parthasarathy Ranganathan

International Conference on Architectural Support for Programming Languages and Operating Systems (2019)

Sage: Practical & Scalable ML-Driven Performance Debugging in Microservices

Yu Gan, Mingyu Liang, Sundar Dev, David Lo, Christina Delimitrou

ASPLOS (2021)

A Hierarchical Neural Model of Data Prefetching

Zhan Shi, Akanksha Jain, Kevin Swersky, Milad Hashemi, Parthasarathy Ranganathan, Calvin Lin

Architectural Support for Programming Languages and Operating Systems (ASPLOS) (2021)

Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

Christopher Joseph Maddison, David Duvenaud, Kevin Jordan Swersky, Milad Hashemi, Will Grathwohl

ICML (2021)

Searching for Fast Models on Datacenter Accelerators

Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun (Legion) Cheng, Quoc V. Le, Norm Jouppi

Conference on Computer Vision and Pattern Recognition (2021)

Learning Execution through Neural Code Fusion

Zhan Shi, Kevin Jordan Swersky, Danny Tarlow, Parthasarathy Ranganathan, Milad Hashemi

ICLR (2020)

Thunderbolt: Throughput-Optimized, Quality-of-Service-Aware Power Capping at Scale

Shaohong Li, Xi Wang, Xiao Zhang, Vasileios Kontorinis, Sreekumar Kodakara, David Lo, Parthasarathy Ranganathan

14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), USENIX Association (2020), pp. 1241-1255

Neural Execution Engines: Learning to Execute Subroutines

Yujun Yan, Kevin Swersky, Danai Koutra, Parthasarathy Ranganathan, Milad Hashemi

NeurIPS (2020)
