Luigi Rizzo
Luigi Rizzo works in the host networking team where he helps design and develop high performance network support for the fleet. Before joining Google in 2016 he was a professor of computer engineering at the Universita` di Pisa, Italy, where he did research on computer networks and operating systems
Research Areas
Authored Publications
Sort By
Understanding Host Interconnect Congestion
Khaled Elmeleegy
Masoud Moshref
Rachit Agarwal
Saksham Agarwal
Sylvia Ratnasamy
Association for Computing Machinery, New York, NY, USA (2022), 198–204
Preview abstract
We present evidence and characterization of host congestion in production clusters: adoption of high-bandwidth access links leading to emergence of bottlenecks within the host interconnect (NIC-to-CPU data path). We demonstrate that contention on existing IO memory management units and/or the memory subsystem can significantly reduce the available NIC-to-CPU bandwidth, resulting in hundreds of microseconds of queueing delays and eventual packet drops at hosts (even when running a state-of-the-art congestion control protocol that accounts for CPU-induced host congestion). We also discuss implications of host interconnect congestion to design of future host architecture, network stacks and network protocols.
View details
ghOSt: Fast and Flexible User-Space Delegation of Linux Scheduling
Jack Tigar Humphries
Neel Natu
Ofir Weisse
Barret Rhoden
Josh Don
Oleg Rombakh
Christos Kozyrakis
Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles CD-ROM, Association for Computing Machinery, New York, NY, USA (2021), 588–604
Preview abstract
We present ghOSt, our infrastructure for delegating kernel scheduling decisions to userspace code. ghOSt is designed to support the rapidly evolving needs of our data center workloads and platforms.
Improving scheduling decisions can drastically improve the throughput, tail latency, scalability, and security of important workloads. However, kernel schedulers are difficult to implement, test, and deploy efficiently across a large fleet. Recent research suggests bespoke scheduling policies, within custom data plane operating systems, can provide compelling performance results in a data center setting. However, these gains have proved difficult to realize as it is impractical to deploy a custom OS image(s) at an application granularity, particularly in a multi-tenant environment, limiting the practical applications of these new techniques.
ghOSt provides general-purpose delegation of scheduling policies to userspace processes in a Linux environment. ghOSt provides state encapsulation, communication, and action mechanisms that allow complex expression of scheduling policies within a userspace agent, while assisting in synchronization. Programmers use any language to develop and optimize policies, which are modified without a host reboot. ghOSt supports a wide range of scheduling models, from per-CPU to centralized, run-to-completion to preemptive, and incurs low overheads for scheduling actions. We demonstrate ghOSt's performance on both academic and real-world workloads, including Google Snap and Google Search. We show that by using ghOSt instead of the kernel scheduler, we can quickly achieve comparable throughput and latency while enabling policy optimization, non-disruptive upgrades, and fault isolation for our data center workloads. We open-source our implementation to enable future research and development based on ghOSt.
View details
Preview abstract
These slides document kstats, a framework to collect statistics in the Linux kernel previously submitted to the upstream linux mailing lists.
View details