Aequitas: Admission Control for latency-critical RPCs in Datacenters

Yiwen Zhang

Gautam Kumar

Nandita Dukkipati

Xian Wu

Priyaranjan Jha

Mosharaf Chowdhury

Amin Vahdat

SIGCOMM(2022)

Download Google Scholar

Abstract

A modern datacenter hosts thousands of services with a mix of latency-sensitive, throughput-intensive, and best-effort traffic with high degrees of fan-out and fan-in patterns. Maintaining low tail latency under high overload conditions is difficult, especially for latency-sensitive (LS) RPCs. In this paper, we consider the challenging case of providing service-level objectives (SLO) to LS RPCs when there are unpredictable surges in demand. We present Aequitas, a distributed sender-driven admission control scheme that is anchored on the key conceptual insight: Weighted-Fair Quality of Service (QoS) queues, found in standard NICs and switches, can be used to guarantee RPC level latency SLOs by a judicious selection of QoS weights and traffic-mix across QoS queues. Aequitas installs cluster-wide RPC latency SLOs by mapping LS RPCs to higher weight QoS queues, and coping with overloads by adaptively apportioning LS RPCs amongst QoS queues based on measured completion times for each queue. When the network demand spikes unexpectedly to 25× of provisioned capacity, Aequitas achieves a latency SLO that is 3.8× lower than the state-of-art congestion control at the 99.9th-p and admits 15× more RPCs meeting SLO target compared to pFabric when RPC sizes are not aligned with priorities.

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations  & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Aequitas: Admission Control for latency-critical RPCs in Datacenters

Abstract

Research Areas

Learn more about how we conduct our research

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Aequitas: Admission Control for latency-critical RPCs in Datacenters

Abstract

Research Areas

Learn more about how we conduct our research

AI/ML Foundations  & Capabilities