- Leon Poutievski
- Omid Mashayekhi
- Joon Ong
- Arjun Singh
- Mukarram Tariq
- Rui Wang
- Jianan Zhang
- Virginia Beauregard
- Patrick Conner
- Steve Gribble
- Rishi Kapoor
- Stephen Kratzer
- Nanfang Li
- Hong Liu
- Karthik Nagaraj
- Jason Ornstein
- Samir Sawhney
- Ryohei Urata
- Lorenzo Vicisano
- Kevin Yasumura
- Shidong Zhang
- Junlan Zhou
- Amin Vahdat
Abstract
We present a decade of evolution and production experience with Jupiter datacenter network fabrics. Over this period, Jupiter has delivered 5x higher speed and capacity, a 30% reduction in capex, a 41% reduction in power, and incremental deployment and technology refresh, all while serving live production traffic. A key enabler for these improvements is evolving Jupiter from a Clos to a direct-connect topology among the machine aggregation blocks. Critical architectural changes for this include a datacenter interconnection layer employing Micro-ElectroMechanical Systems (MEMS) based Optical Circuit Switches (OCSes) to enable dynamic topology reconfiguration, centralized Software-Defined Networking (SDN) control for traffic engineering, and automated network operations for incremental capacity delivery and topology engineering. We show that the combination of traffic and topology engineering on direct-connect fabrics achieves throughput similar to that of Clos fabrics for our production traffic patterns. We also optimize for path lengths: 60% of the traffic takes a direct path from source to destination aggregation blocks, while the remainder transits one additional block, achieving an average block-level path length of 1.4 in our fleet today. OCS also achieves 3x faster fabric reconfiguration compared to pre-evolution Clos fabrics that used a patch-panel-based interconnect.
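The 1.4 average path length follows from the traffic split stated in the abstract: 60% of traffic crosses one block-level hop and the remaining 40% crosses two. A minimal sketch of that arithmetic, assuming a direct path counts as one hop and a transit path as two; the function name and the dictionary layout are illustrative and do not come from the paper:

```python
def average_path_length(share_by_hops):
    """Return the traffic-weighted mean block-level path length.

    share_by_hops maps a hop count to the integer percentage of traffic
    that uses paths of that length (hypothetical input format).
    """
    total = sum(share_by_hops.values())
    if total != 100:
        raise ValueError(f"traffic shares must sum to 100, got {total}")
    # Weight each hop count by its share of traffic, then normalize.
    return sum(hops * share for hops, share in share_by_hops.items()) / 100


# Split reported in the abstract: 60% direct (1 hop), 40% via one transit block (2 hops).
jupiter_shares = {1: 60, 2: 40}
print(average_path_length(jupiter_shares))
```

With the abstract's split, the weighted mean is (60 x 1 + 40 x 2) / 100, which matches the reported 1.4.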