Milo Martin

Authored Publications
    Aquila: A unified, low-latency fabric for datacenter networks
    Hema Hariharan
    Eric Lance
    Moray Mclaren
    Stephen Wang
    Zhehua Wu
    Sunghwan Yoo
    Raghuraman Balasubramanian
    Prashant Chandra
    Michael Cutforth
    Peter James Cuy
    David Decotigny
    Rakesh Gautam
    Rick Roy
    Zuowei Shen
    Ming Tan
    Ye Tang
    Monica C Wong-Chan
    Joe Zbiciak
    (2022)
    Datacenter workloads have evolved from the data intensive, loosely-coupled workloads of the past decade to more tightly coupled ones, wherein ultra-low latency communication is essential for resource disaggregation over the network and to enable emerging programming models. We introduce Aquila, an experimental datacenter network fabric built with ultra-low latency support as a first-class design goal, while also supporting traditional datacenter traffic. Aquila uses a new Layer 2 cell-based protocol, GNet, an integrated switch, and a custom ASIC with low-latency Remote Memory Access (RMA) capabilities co-designed with GNet. We demonstrate that Aquila is able to achieve under 40 μs tail fabric Round Trip Time (RTT) for IP traffic and sub-10 μs RMA execution time across hundreds of host machines, even in the presence of background throughput-oriented IP traffic. This translates to more than 5x reduction in tail latency for a production quality key-value store running on a prototype Aquila network.
    CliqueMap: Productionizing an RMA-Based Distributed Caching System
    Aditya Akella
    Amanda Strominger
    Arjun Singhvi
    Maggie Anderson
    Rob Cauble
    Thomas F. Wenisch
    SIGCOMM 2021 (2021) (to appear)
    Distributed caching is a key component in the design of performant, scalable Internet services, but accessing such caches via RPC incurs high cost. Remote Memory Access (RMA) offers a promising, less costly alternative, but achieving a rich production feature set with RMA-based systems is a significant challenge, as the rich abstraction of RPC lends itself to solutions for interoperability and upgradeability requirements of real systems. This work describes CliqueMap, a fully productionized RMA/RPC hybrid serving and caching system, and the production experience derived from three years of operation in Google’s datacenters. Building on internal technologies, CliqueMap serves multiple internal product areas and underlies several end-user-visible services.
    1RMA: Re-Envisioning Remote Memory Access for Multi-Tenant Datacenters
    Aditya Akella
    Arjun Singhvi
    Joel Scherpelz
    Monica C Wong-Chan
    Moray Mclaren
    Prashant Chandra
    Rob Cauble
    Sean Clark
    Simon Sabato
    Thomas F. Wenisch
    Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, Association for Computing Machinery, New York, NY, USA (2020), 708–721
    Remote Direct Memory Access (RDMA) plays a key role in supporting performance-hungry datacenter applications. However, existing RDMA technologies are ill-suited to multi-tenant datacenters, where applications run at massive scales, tenants require isolation and security, and the workload mix changes over time. Our experiences seeking to operationalize RDMA at scale indicate that these ills are rooted in standard RDMA's basic design attributes: connection-orientedness and complex policies baked into hardware. We describe a new approach to remote memory access -- One-Shot RMA (1RMA) -- suited to the constraints imposed by our multi-tenant datacenter settings. The 1RMA NIC is connection-free and fixed-function; it treats each RMA operation independently, assisting software by offering fine-grained delay measurements and fast failure notifications. 1RMA software provides operation pacing, congestion control, failure recovery, and inter-operation ordering, when needed. The NIC, deployed in our production datacenters, supports encryption at line rate (100Gbps and 100M ops/sec) with minimal performance/availability disruption for encryption key rotation.
    It is our pleasure to introduce the 2015 Top Picks in Computer Architecture. We co-chaired the Selection Committee that had the formidable task of selecting the best computer architecture papers that were published in conferences in the previous year. Many excellent papers are published every year, and choosing among them is challenging, not least because of the need to define “best.” The committee identified 11 papers as being Top Picks this year. The range of topics is wide and reflects the healthy broadening of what the community considers to be computer architecture.