Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization

Mike Dalton
David Schultz
Ahsan Arefin
Alex Docauer
Anshuman Gupta
Brian Matthew Fahs
Dima Rubinstein
Enrique Cauich Zermeno
Erik Rubow
Jake Adriaens
Jesse L Alpert
Jing Ai
Jon Olson
Kevin P. DeCabooter
Nan Hua
Nathan Lewis
Nikhil Kasinadhuni
Riccardo Crepaldi
Srinivas Krishnan
Subbaiah Venkata
Yossi Richter
15th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2018
Google Scholar

Abstract

This paper presents our design and experience with Andromeda, Google Cloud Platform’s network virtualization stack. Our production deployment poses several challenging requirements, including performance isolation among customer virtual networks, scalability, rapid provisioning of large numbers of virtual hosts, bandwidth and latency largely indistinguishable from the underlying hardware, and high feature velocity combined with high availability.

Andromeda is designed around a flexible hierarchy of flow processing paths. Flows are mapped to a programming path dynamically based on feature and performance requirements. We introduce the Hoverboard programming model, which uses gateways for the long tail of low bandwidth flows, and enables the control plane to program network connectivity for tens of thousands of VMs in seconds. The on-host dataplane is based around a high-performance OS bypass software packet processing path. CPU-intensive per packet operations with higher latency targets are executed on coprocessor threads. This architecture allows Andromeda to decouple feature growth from fast path performance, as many features can be implemented solely on the coprocessor path. We demonstrate that the Andromeda datapath achieves performance that is competitive with hardware while maintaining the flexibility and velocity of a software-based architecture.

Research Areas