Milo Martin
Authored Publications
Aquila: A unified, low-latency fabric for datacenter networks
Hema Hariharan
Eric Lance
Moray Mclaren
Stephen Wang
Zhehua Wu
Sunghwan Yoo
Raghuraman Balasubramanian
Prashant Chandra
Michael Cutforth
Peter James Cuy
David Decotigny
Rakesh Gautam
Rick Roy
Zuowei Shen
Ming Tan
Ye Tang
Monica C Wong-Chan
Joe Zbiciak
Aquila: A unified, low-latency fabric for datacenter networks (2022)
Datacenter workloads have evolved from the data-intensive, loosely coupled workloads of the past decade to more tightly coupled ones, wherein ultra-low latency communication is essential for resource disaggregation over the network and to enable emerging programming models.
We introduce Aquila, an experimental datacenter network fabric built with ultra-low latency support as a first-class design goal, while also supporting traditional datacenter traffic. Aquila uses a new Layer 2 cell-based protocol, GNet, an integrated switch, and a custom ASIC with low-latency Remote Memory Access (RMA) capabilities co-designed with GNet. We demonstrate that Aquila is able to achieve under 40 μs tail fabric Round Trip Time (RTT) for IP traffic and sub-10 μs RMA execution time across hundreds of host machines, even in the presence of background throughput-oriented IP traffic. This translates to a more than 5x reduction in tail latency for a production-quality key-value store running on a prototype Aquila network.
CliqueMap: Productionizing an RMA-Based Distributed Caching System
Aditya Akella
Amanda Strominger
Arjun Singhvi
Maggie Anderson
Rob Cauble
Thomas F. Wenisch
SIGCOMM 2021 (2021) (to appear)
Distributed caching is a key component in the design of performant, scalable Internet services, but accessing such caches via RPC incurs high cost. Remote Memory Access (RMA) offers a promising, less costly alternative, but achieving a rich production feature set with RMA-based systems is a significant challenge, as the rich abstraction of RPC lends itself to solutions for interoperability and upgradeability requirements of real systems. This work describes CliqueMap, a fully productionized RMA/RPC hybrid serving and caching system, and the production experience derived from three years of operation in Google’s datacenters. Building on internal technologies, CliqueMap serves multiple internal product areas and underlies several end-user-visible services.
1RMA: Re-Envisioning Remote Memory Access for Multi-Tenant Datacenters
Aditya Akella
Arjun Singhvi
Joel Scherpelz
Monica C Wong-Chan
Moray Mclaren
Prashant Chandra
Rob Cauble
Sean Clark
Simon Sabato
Thomas F. Wenisch
Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, Association for Computing Machinery, New York, NY, USA (2020), 708–721
Remote Direct Memory Access (RDMA) plays a key role in supporting performance-hungry datacenter applications. However, existing RDMA technologies are ill-suited to multi-tenant datacenters, where applications run at massive scales, tenants require isolation and security, and the workload mix changes over time. Our experiences seeking to operationalize RDMA at scale indicate that these ills are rooted in standard RDMA's basic design attributes: connection-orientedness and complex policies baked into hardware.
We describe a new approach to remote memory access -- One-Shot RMA (1RMA) -- suited to the constraints imposed by our multi-tenant datacenter settings. The 1RMA NIC is connection-free and fixed-function; it treats each RMA operation independently, assisting software by offering fine-grained delay measurements and fast failure notifications. 1RMA software provides operation pacing, congestion control, failure recovery, and inter-operation ordering, when needed. The NIC, deployed in our production datacenters, supports encryption at line rate (100Gbps and 100M ops/sec) with minimal performance/availability disruption for encryption key rotation.
Preview abstract
It is our pleasure to introduce the 2015 Top Picks in Computer Architecture. We co-chaired the Selection Committee that had the formidable task of selecting the best computer architecture papers that were published in conferences in the previous year. Many excellent papers are published every year, and choosing among them is challenging, not least because of the need to define “best.” The committee identified 11 papers as being Top Picks this year. The range of topics is wide and reflects the healthy broadening of what the community considers to be computer architecture.