Stanko Novakovic

I’m a member of SystemsResearch@Google where I work on efficient and secure systems infrastructure. My background is in operating systems, distributed systems and computer architecture. At Google, I focus on building systems that improve the cost efficiency and security of running large-scale datacenter services. These systems include sandboxing tools and hypervisors for virtual machines, control plane infrastructure and systems for machine learning. Before Google, I was at Microsoft Research working on systems that leverage emerging hardware to improve the efficiency of cloud platforms. Before Microsoft, I worked on distributed systems that use a high-performance network for low-latency access to remote data. I won the Best Paper Award at SYSTOR 2019, Best Paper Honorable Mention at SIGMOD 2019, and a Distinguished Paper Award at ASPLOS 2023. I hold a PhD in Computer Science from École Polytechnique Fédérale de Lausanne (EPFL).
Authored Publications
    Wave: Offloading Resource Management to SmartNIC Cores
    Jack Humphries
    Neel Natu
    Kostis Kaffes
    Hank Levy
    Christos Kozyrakis
    2025
    SmartNICs are increasingly deployed in datacenters to offload tasks from server CPUs, improving the efficiency and flexibility of datacenter security, networking and storage. Optimizing cloud server efficiency in this way is critically important to ensure that virtually all server resources are available to paying customers. Userspace system software, specifically, decision-making tasks performed by various operating system subsystems, is particularly well suited for execution on mid-tier SmartNIC ARM cores. To this end, we introduce Wave, a framework for offloading userspace system software to processes/agents running on the SmartNIC. Wave uses Linux userspace systems to better align system functionality with SmartNIC capabilities. It also introduces a new host-SmartNIC communication API that enables offloading of even μs-scale system software. To evaluate Wave, we offloaded preexisting userspace system software including kernel thread scheduling, memory management, and an RPC stack to SmartNIC ARM cores, which showed a performance degradation of 1.1%-7.4% in an apples-to-apples comparison with on-host implementations. Wave recovered host resources consumed by on-host system software for memory management (saving 16 host cores), RPCs (saving 8 host cores), and virtual machines (an 11.2% performance improvement). Wave highlights the potential for rethinking system software placement in modern datacenters, unlocking new opportunities for efficiency and scalability.
    PageFlex: Flexible and Efficient User-space Delegation of Linux Paging Policies with eBPF
    Kan Wu
    Zhiyuan Guo
    Suli Yang
    Rajath Shashidhara
    Wei Xu
    Alex Snoeren
    Kim Keeton
    2025
    To increase platform memory efficiency, hyperscalers like Google and Meta transparently demote “cold” application data to cheaper cost-per-byte memory tiers like compressed memory and NVMe SSDs. These systems rely on standard kernel paging policies and mechanisms to maximize the achievable memory savings without hurting application performance. Although the literature promises better policies, implementing and deploying them within the Linux kernel is challenging. Delegating policies and mechanisms to user space, through userfaultfd or library-based approaches, incurs overheads and may require modifying application code. We present PageFlex, a framework for delegating Linux paging policies to user space with minimal overhead and full compatibility with existing real-world deployments. PageFlex uses eBPF to delegate policy decisions while providing low-overhead access to in-kernel memory state and access information, thus balancing flexibility and performance. Additionally, PageFlex supports different paging strategies for distinct memory regions and application phases. We show that PageFlex can delegate existing kernel-based policies with little (< 1%) application slowdown, effectively realizing the benefits of state-of-the-art policies like Hyperbolic caching and Leap prefetching, and unlocking application-specific benefits through region- and phase-aware policy specialization.