We present Orion, a distributed Software-Defined Networking platform deployed globally in Google’s datacenter (Jupiter) as well as Wide Area (B4) networks. Orion was designed around a modular, micro-service architecture with a central publish-subscribe database to enable a distributed, yet tightly-coupled, software-defined network control system. Orion enables intent-based management and control, is highly scalable and amenable to global control hierarchies.
Over the years, Orion has matured, with continuous improvements in convergence (up to 40x faster), throughput (up to 1.16 million network updates per second), system scalability (supporting 16x larger networks), and data plane availability (50x and 100x reductions in unavailable time in Jupiter and B4, respectively), while maintaining high development velocity with a bi-weekly release cadence. Today, Orion robustly enables all of Google’s Software-Defined Networks, defending against failure modes that are both generic to large-scale production networks and unique to SDN systems.
Private WANs are increasingly important to the operation of enterprises, telecoms, and cloud providers. For example, B4, Google’s private software-defined WAN, is larger and growing faster than our connectivity to the public Internet. In this paper, we present the five-year evolution of B4. We describe the techniques we employed to incrementally move from offering best-effort content-copy services to carrier-grade availability, while concurrently scaling B4 to accommodate 100x more traffic. Our key challenge is balancing the tension introduced by the hierarchy required for scalability, the partitioning required for availability, and the capacity asymmetry inherent to the construction and operation of any large-scale network. We discuss our approach to managing this tension: i) we design a custom hierarchical network topology for both horizontal and vertical software scaling, ii) we manage inherent capacity asymmetry in hierarchical topologies using a novel traffic engineering algorithm without packet encapsulation, and iii) we re-architect switch forwarding rules via two-stage matching/hashing to deal with asymmetric network failures at scale.
One of the goals of traffic engineering is to achieve a
flexible trade-off between fairness and throughput so that users
are satisfied with their bandwidth allocation and the network
operator is satisfied with the utilization of network resources. In
this paper, we propose a novel way to balance the throughput
and fairness objectives with linear programming. It allows the
network operator to precisely control the trade-off by bounding
the fairness degradation for each commodity compared to the
max-min fair solution or the throughput degradation compared
to the optimal throughput. We also present improvements to a
previous algorithm that achieves max-min fairness by solving a
series of linear programs. We significantly reduce the number
of steps needed when the access rate of commodities is limited.
We extend the algorithm to two important practical use cases:
importance weights and piece-wise linear utility functions for
commodities. Our experiments on synthetic and real networks
show that our algorithms achieve a significant speedup and
provide practical insights on the trade-off between fairness and throughput.
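
To make the max-min fair baseline concrete, the sketch below computes a max-min fair allocation by classic progressive filling. This is a stand-in for illustration, not the linear-programming algorithm the abstract describes: it assumes each commodity follows one fixed path, whereas the paper's formulation is more general. The function name, the link and flow labels, and the capacities are all hypothetical.

```python
from typing import Dict, List


def max_min_fair(capacity: Dict[str, float],
                 paths: Dict[str, List[str]],
                 demand: Dict[str, float]) -> Dict[str, float]:
    """Progressive filling for single-path flows with finite demand caps.

    All unfrozen flows grow at the same rate until a link on their path
    saturates or they reach their demand (access rate) cap.
    """
    eps = 1e-12
    rate = {f: 0.0 for f in paths}
    residual = dict(capacity)
    active = {f for f in paths if demand[f] > eps}
    while active:
        # Largest equal increment that every active flow can still take.
        step = min(demand[f] - rate[f] for f in active)
        for link, cap in residual.items():
            sharers = sum(1 for f in active if link in paths[f])
            if sharers:
                step = min(step, cap / sharers)
        for f in active:
            rate[f] += step
            for link in paths[f]:
                residual[link] -= step
        # Freeze flows that met their demand or cross a saturated link.
        saturated = {l for l, c in residual.items() if c <= eps}
        active = {f for f in active
                  if demand[f] - rate[f] > eps
                  and not any(l in saturated for l in paths[f])}
    return rate


# Hypothetical two-link example: f2 crosses both links.
alloc = max_min_fair(
    capacity={"A": 10.0, "B": 4.0},
    paths={"f1": ["A"], "f2": ["A", "B"], "f3": ["B"]},
    demand={"f1": 100.0, "f2": 100.0, "f3": 100.0},
)
print(alloc, sum(alloc.values()))
```

In this toy instance the max-min fair rates are 8, 2, and 2, for a total of 12, while starving f2 entirely (f1 = 10, f3 = 4) would carry 14. That gap is the kind of fairness-versus-throughput tension the abstract proposes to bound.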