Global networking

The Global Networking (GN) team at Google is responsible for the design, development, build and operation of the networks that connect our data centers to our customers.

About the team

The Global Networking team is responsible for the design, development, build and operation of Google’s global network that every Google service, including the Google Cloud Platform, runs on. We develop cutting-edge networking technologies that allow Google's global WAN to be zero touch, build some of the largest software-defined networking (SDN) infrastructure ever deployed (B4, Espresso), scale Google's global content delivery networks (CDNs) that support Google services, and develop sophisticated software systems for network capacity forecasting, planning and optimization. We leverage the latest advances in AI/ML to drive greater autonomy in network design and operations.

We continuously expand the reach of Google's network across the world by laying new optical fibers and building hundreds of points of presence worldwide. This global footprint allows us to optimize the end-to-end speed and reliability of the traffic that we carry for our users and for Google Cloud customers, delivering optimal performance and best-in-class availability.

In doing all this, we develop and rely on the most advanced techniques in network hardware and software, traffic engineering, and network management to deliver unprecedented scale, availability and performance at industry-leading cost points. We are also advancing the state of the art in data analytics and machine learning to drive network efficiency and optimization at scale.

Google has a long history of fundamental research in networking. We have engaged in a collaborative research effort with the National Science Foundation and other industrial partners to launch a $40 million program in academic research for Resilient and Intelligent Next-Generation (NextG) Systems, or RINGS. In addition to funding, Google offers expertise, research collaborations, infrastructure, and in-kind support for researchers and students as they advance knowledge and progress in the field.

Team focus summaries

Congestion control and traffic management

All networks are subject to congestion; we operate ours at high utilization levels, while meeting strict performance objectives. We’re inventing new congestion avoidance protocols, and improving our global-scale, near-real-time, automated traffic engineering system. We’re building better ways to measure our networks, accurately and at scale, to drive our evaluation of congestion-control techniques, and as real-time input to automated traffic management.

Data mining and telemetry

We collect traffic statistics all around our network infrastructure to track performance, quickly detect unusual events, and compute SLA compliance. We rely on the most advanced data science techniques, machine learning in particular, to reduce the time it takes to detect and root-cause events. We have designed and deployed techniques that can detect, pinpoint and mitigate network problems within a few minutes without human intervention. We use predictive analytics to anticipate some types of problems and adjust our traffic engineering, or to plan capacity increases.

Network management

We’re building automated network management systems, enabling us to rapidly repair and improve our networks with little or no downtime. We’re using techniques such as formal modeling of network topologies and highly-available distributed systems, while working closely with vendors and network operators to implement open software APIs to enable greater levels of automation and programmability across the entire network management lifecycle.

Optical networking - terrestrial and submarine

We work on developing and deploying cutting-edge optical solutions to scale cost-effectively and to increase network availability. These include new coherent transmission technologies, disaggregated line systems, high-capacity submarine wet plants, subsea switching technologies, transport SDN configurations, and sophisticated physical and logical layer design and optimization tools.

Programmable packet processing

We are developing new mechanisms for low-latency, CPU-efficient communication. We want our network switches and endpoints to implement novel on-device packet processing functions, without compromising cost or performance. We’re exploring hardware and software techniques for fast, flexible, safe packet processing, including onload, offload, RDMA, P4, and more.

Rapid and reactive development and testing

To introduce network innovations into production as rapidly as possible, without compromising availability, we test our designs and implementations early, often, and extensively. We are developing advanced software validation techniques. We embrace automation in all aspects of testing and qualification, and we build powerful infrastructure for testing, debugging, and root-causing, in both physical and emulated testbeds, building a “digital twin” for our network infrastructure.

Software-defined networking (SDN)

We employ SDN extensively. We were early users of, and contributors to, OpenFlow, and continue, with the P4 programming language, to raise the level of abstraction for silicon-agnostic switching. We are developing SDN controller platforms that can handle Google’s needs for scale and reliability, and SDN applications for routing, traffic management, and other functions. We recently published a paper on our experiments with decentralizing SDN architecture for increased scalability and faster convergence times.

WAN design

We’ve developed one of the world’s largest, most cost-effective wide area networks, and we continue to increase its scale and reliability, while extracting the best possible performance from WAN hardware and fiber links. We’re employing Google-designed and vendor-supplied hardware, SDN controllers, and global-scale automated traffic engineering to address these challenges.

AIOps

Google is rapidly expanding its global network to meet the surging and unpredictable demands of its products, cloud customers, and groundbreaking AI/ML applications. To support this growth, AIOps is pioneering transformative approaches to build and manage networks powered by best-in-class AI algorithms and GenAI models. Specifically, we are leveraging Gemini to develop intelligent agents that enhance network autonomy. Additionally, we are training custom machine learning models on Google's internal network data to drive robust forecasting, failure predictions and prevention, and precise root cause analysis and mitigation.

Featured publications

CAPA: An Architecture For Operating Cluster Networks With High Availability
Bingzhe Liu
Brighten Godfrey
Omid Alipourfard
Joon Ong
Virginia Beauregard
Mukarram Tariq
Mayur Patel
Prerepa Viswanadham
Manish Verma
Xander Lin
Patrick Conner
Deepak Arulkannan
Amr Sabaa
Rich Alimi
Alex Smirnov
Google, 1600 Amphitheatre Pkwy, Mountain View, CA 94043 (2023)
Management operations are a major source of outages for networks. A number of best practices designed to reduce and mitigate such outages are well known, but their enforcement has been challenging, leaving the network vulnerable to inadvertent mistakes and gaps which repeatedly result in outages. We present our experiences with CAPA, Google’s “containment and prevention architecture” for regulating management operations on our cluster networking fleet. Our goal with CAPA is to limit the systems where strict adherence to best practices is required, so that availability of the network is not dependent on the good intentions of every engineer and operator. We enumerate the features of CAPA which we have found to be necessary to effectively enforce best practices within a thin “regulation” layer. We evaluate CAPA based on case studies of outages prevented, counterfactual analysis of past incidents, and known limitations. Management-plane-related outages have substantially reduced both in frequency and severity, with an 82% reduction in cumulative duration of incidents normalized to fleet size over five years.
A Decentralized SDN Architecture for the WAN
Hakim Weatherspoon
Sylvia Ratnasamy
Ashok Narayanan
Nitika Saran
Ankit Singla
2024 ACM Special Interest Group on Data Communication (SIGCOMM) (2024)
Motivated by our experiences operating a global WAN, we argue that SDN’s reliance on infrastructure external to the data plane has significantly complicated the challenge of maintaining high availability. We propose a new decentralized SDN (dSDN) architecture in which SDN control logic instead runs within routers, eliminating the control plane’s reliance on external infrastructure and restoring fate sharing between control and data planes. We present dSDN as a simpler approach to realizing the benefits of SDN in the WAN. Despite its much simpler design, we show that dSDN is practical from an implementation viewpoint, and outperforms centralized SDN in terms of routing convergence and SLO impact.
The Case for Validating Inputs in Software-Defined WANs
Isaac Keslassy
Rishabh Iyer
Sylvia Ratnasamy
The 23rd ACM Workshop on Hot Topics in Networks (HOTNETS ’24), ACM, Irvine, CA (2024) (to appear)
We highlight a problem that the networking community has largely overlooked: ensuring that the inputs to network controllers in software-defined WANs are accurate. We show that “incorrect” inputs are a common cause of major outages in practice and propose new directions to address these.
We're roughly 10 years into the OpenConfig journey. We have implementations in hand from various vendors, and we've gained significant operational experience in the domains of Streaming Telemetry and in Developing Configuration Systems to leverage the developed models. What have we learned? Are the abstractions we've generated the right ones? If not, why? Were we too influenced by the tools and inertia of the time when we made some critical decisions? How do we need to evolve going forward? This discussion is part retrospective/introspective, a candid look at where we've been and what we need to think about as we evolve the next generation of our management (and control) planes. What should we be thinking about as network engineers who write software?
Improving Network Availability with Protective ReRoute
Abdul Kabbani
Brad Morrey
Uma Parthavi Moravapalle
Steven Knight
Van Jacobson
Jim Winget
SIGCOMM 2023
We present PRR (Protective ReRoute), a transport technique for shortening user-visible outages that complements routing repair. It can be added to any transport to provide benefits in multipath networks. PRR responds to flow connectivity failure signals, e.g., retransmission timeouts, by changing the FlowLabel on packets of the flow, which causes switches and hosts to choose a different network path that may avoid the outage. To enable it, we shifted our IPv6 network architecture to use the FlowLabel, so that hosts can change the paths of their flows without application involvement. PRR is deployed fleetwide at Google for TCP and Pony Express, where it has been protecting all production traffic for several years. It is also available to our Cloud customers. We find it highly effective for real outages. In a measurement study on our network backbones, adding PRR reduced the cumulative region-pair outage time for RPC traffic by 63-84%. This is the equivalent of adding 0.4-0.8 "nines" of availability.
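The "nines" figure quoted above follows from simple arithmetic: availability with n nines corresponds to an unavailability of 10^-n, so cutting outage time by a fraction r adds -log10(1 - r) nines. A minimal Python check of the quoted range, where the function name is illustrative and not taken from the paper:

```python
import math


def added_nines(outage_reduction: float) -> float:
    """Nines of availability gained when outage time drops by the given fraction.

    Availability with n nines has unavailability 10**-n, so scaling
    unavailability by (1 - r) adds -log10(1 - r) nines.
    """
    if not 0.0 <= outage_reduction < 1.0:
        raise ValueError("outage_reduction must be in [0, 1)")
    return -math.log10(1.0 - outage_reduction)


low = added_nines(0.63)   # 63% less cumulative outage time
high = added_nines(0.84)  # 84% less cumulative outage time
print(f"{low:.2f} to {high:.2f} nines")
```

Rounded to one decimal place, the two results match the 0.4 and 0.8 nines stated in the abstract.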
Invisinets: Removing Networking from Cloud Networks
Karthick Jayaraman
Ashok Narayanan
Sarah McClure
Deepak Bansal
Jitendra Padhye
Rishabh Tewari
Sylvia Ratnasamy
Zeke Medley
2023
Cloud tenant networks are complex to provision, configure, and manage. Tenants must figure out how to assemble, configure, test, etc. a large set of low-level building blocks in order to achieve their high-level goals. As these networks are increasingly spanning multiple clouds and on-premises infrastructure, the complexity scales poorly. We argue that the current cloud abstractions place an unnecessary burden on the tenant to become a seasoned network operator. We thus propose an alternative interface to the cloud provider's network resources in which a tenant's connectivity needs are reduced to a set of parameters associated with compute endpoints. Our API removes the tenant networking layer of cloud deployments altogether, placing its former duties primarily upon the cloud provider. We demonstrate that this API reduces the complexity experienced by tenants by 80-90% while maintaining a scalable and secure architecture. We provide a prototype of the underlying infrastructure changes necessary to support new functionality introduced by our interface and implement our API on top of current cloud APIs.
Optimal Probing with Statistical Guarantees for Network Monitoring at Scale
Branislav Kveton
Shawn Yang
Dimitris Konomis
Jehangir Amjad
Augustin Soule
Computer Communication, 192 (2022), pp. 119-131 (to appear)
Monitoring large-scale cloud networks is a complex task because their scale is prohibitively large, monitoring budgets are limited, network topologies are not entirely regular and the estimates produced are a function of traffic patterns. In this work, we take a statistical approach to estimating a network metric, such as the latency of a set of paths, with guarantees on the estimation error. We aim to do so in an intelligent and scalable manner, without observing all existing traffic, and minimizing the estimation error at a fixed probing budget per unit of time. Our proposed algorithms produce a distribution of probes/samples across network paths which can be used in conjunction with existing probers (or samplers). These algorithms are based on A- and E-optimal experimental designs in statistics, which guarantee a bounded estimation error for any monitoring budget. Unfortunately, these designs are too computationally intensive to be used in production at scale. We propose scalable and near-optimal approximate implementations based on the Frank-Wolfe algorithm. We validate our approaches with two metrics (latency and loss) in simulations on real network topologies, and also using a production probing system in a real cloud network. We show major gains in reducing the probing budget compared to both production and academic baselines, while maintaining low errors in estimates, even with very low probing budgets.
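To give a feel for the idea, here is a minimal Python sketch of Frank-Wolfe applied to a deliberately simplified A-optimal design. It is not the paper's algorithm: it assumes each path is probed independently with a known per-probe standard deviation, so the design problem reduces to choosing budget fractions w that minimize sum(sigma_i^2 / w_i). The function name and inputs are hypothetical.

```python
def frank_wolfe_probe_allocation(sigmas, iterations=3000):
    """Split a probing budget across paths to minimise total estimation variance.

    Simplified setting (an assumption of this sketch): path i is probed
    independently and has per-probe standard deviation sigmas[i]. With a
    fraction w[i] of the budget on path i, the summed variance of the
    per-path mean estimates is proportional to sum(sigma_i**2 / w_i),
    which is the A-optimality criterion for this diagonal case.
    """
    n = len(sigmas)
    w = [1.0 / n] * n  # start from a uniform allocation
    for k in range(iterations):
        # Gradient of sum(sigma_i^2 / w_i) with respect to each w_i.
        grad = [-(s * s) / (wi * wi) for s, wi in zip(sigmas, w)]
        # Linear minimisation over the simplex: the best vertex is the
        # coordinate with the most negative gradient.
        j = min(range(n), key=lambda i: grad[i])
        gamma = 1.0 / (k + 2)  # diminishing step; keeps every w_i > 0
        w = [(1.0 - gamma) * wi for wi in w]
        w[j] += gamma
    return w


weights = frank_wolfe_probe_allocation([1.0, 2.0, 3.0])
print([round(x, 3) for x in weights])
```

In this simplified case the optimum has a closed form, w_i proportional to sigma_i, so the iterates should approach roughly one sixth, one third and one half of the budget. Noisier paths receive more probes, which is the intuition behind spending a fixed budget where it reduces error the most.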
CloudCluster: Unearthing the Functional Structure of a Cloud Service
Weiwu Pang
Ramesh Govindan
Sourav Panda
Jehangir Amjad
NSDI 2022, USENIX (2022)
In their quest to provide customers with good tools to manage cloud services, cloud providers are hampered by having very little visibility into cloud service functionality; a provider often only knows where VMs of a service are placed, how the virtual networks are configured, how VMs are provisioned, and how VMs communicate with each other. In this paper, we show that, using the VM-to-VM traffic matrix, we can unearth the functional structure of a cloud service and use it to aid cloud service management. Leveraging the observation that cloud services use well-known design patterns for scaling (e.g., replication, communication locality), we show that clustering the VM-to-VM traffic matrix yields the functional structure of the cloud service. Our clustering algorithm, CloudCluster, must overcome challenges imposed by scale (cloud services contain tens of thousands of VMs) and must be robust to orders-of-magnitude variability in traffic volume and measurement noise. To do this, CloudCluster uses a novel combination of feature scaling, dimensionality reduction, and hierarchical clustering to achieve clustering with over 92% homogeneity and completeness. We show that CloudCluster can be used to explore opportunities to reduce cost for customers, identify anomalous traffic and potential misconfigurations.
Orion: Google’s Software-Defined Networking Control Plane
Shawn Chen
Lorenzo Vicisano
Karthik Swaminathan Nagaraj
Shidong Zhang
Joon Suan Ong
Min Zhu
Mike Conley
Waqar Mohsin
Henrik Muehe
Amr Sabaa
KondapaNaidu Bollineni
Rich Alimi
(2021)
We present Orion, a distributed Software-Defined Networking platform deployed globally in Google’s datacenter (Jupiter) as well as Wide Area (B4) networks. Orion was designed around a modular, micro-service architecture with a central publish-subscribe database to enable a distributed, yet tightly-coupled, software-defined network control system. Orion enables intent-based management and control, is highly scalable and amenable to global control hierarchies. Over the years, Orion has matured with continuously improving performance in convergence (up to 40x faster), throughput (handling up to 1.16 million network updates per second), system scalability (supporting 16x larger networks), and data plane availability (50x, 100x reduction in unavailable time in Jupiter and B4, respectively) while maintaining high development velocity with bi-weekly release cadence. Today, Orion robustly enables all of Google’s Software-Defined Networks defending against failure modes that are both generic to large scale production networks as well as unique to SDN systems.
Network management is becoming increasingly automated, and automation depends on detailed, explicit representations of data about both the state of a network, and about an operator’s intent for its networks. In particular, we must explicitly represent the desired and actual topology of a network; almost all other network-management data either derives from its topology, constrains how to use a topology, or associates resources (e.g., addresses) with specific places in a topology. We describe MALT, a Multi-Abstraction-Layer Topology representation, which supports virtually all of our network management phases: design, deployment, configuration, operation, measurement, and analysis. MALT provides interoperability across software systems, and its support for abstraction allows us to explicitly tie low-level network elements to high-level design intent. MALT supports a declarative style that simplifies what-if analysis and testbed support. We also describe the software base that supports efficient use of MALT, as well as numerous, sometimes painful lessons we have learned about curating the taxonomy for a comprehensive, and evolving, representation for topology.
Classification of load balancing in the Internet
Italo Cunha
Darryl Veitch
Rafael Almeida
Renata Cruz Teixeira
Proceedings of IEEE INFOCOM, IEEE, Beijing, China (2020)
Recent advances in programmable data planes, software-defined networking, and the adoption of IPv6, support novel, more complex load balancing strategies. We introduce the Multipath Classification Algorithm (MCA), a probing algorithm that extends traceroute to identify and classify load balancing in Internet routes. MCA extends existing formalism and techniques to consider that load balancers may use arbitrary combinations of bits in the packet header for load balancing. We propose optimizations to reduce probing cost that are applicable to MCA and existing load balancing measurement techniques. Through large-scale measurement campaigns, we characterize and study the evolution of load balancing on the IPv4 and IPv6 Internet with multiple transport protocols. Our results show that load balancing is more prevalent and that load balancing strategies are more mature than previous characterizations have found.
Open Optical Communication Systems at a Hyperscale Operator
Rene Marcel Schmogrow
Vijay Vusirikala
Matt Newland
Journal of Optical Communications (2020)
Open optical networks present a variety of benefits such as single vendor independence and the opportunity to select best in class devices for each individual role. In this paper we review two degrees of open optical networks, namely ones with transponder-line system and line system-line system interoperability. In this context we discuss Google's experiences with respect to optical link design, software and controls, deployment, and operation.
Network Error Logging: Client-side measurement of end-to-end web service reliability
Harsha V. Madhyastha
Douglas Creager
Ben Jones
Charles Stahl
Lily Chen
Julia Elizabeth Tuttle
Brian Rogan
Ilya Grigorik
Misha Efimov
17th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2020
We present NEL (Network Error Logging), Google’s planet scale, client-side, network reliability measurement system. NEL is implemented in Chrome and has been proposed as a new W3C standard, letting any web site operator collect reports of clients’ successful and failed requests to their sites. These reports are similar to web server logs, but include information about failed requests that never reach serving infrastructure. Reports are uploaded via redundant failover paths, reducing the likelihood of shared-fate failures of report uploads. We have used NEL to monitor all of Google’s domains since 2014, allowing us to detect and investigate instances of DNS hijacking, BGP route leaks, protocol deployment bugs, and other problems where packets might never reach our servers. This paper presents the design of NEL, case studies of real outages, and deployment lessons for other operators who choose to use NEL to monitor their traffic.
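As a rough illustration of how a site operator opts in, the Python sketch below builds the two response headers described in the W3C Network Error Logging draft. The field names (`report_to`, `max_age`, `failure_fraction`, `success_fraction`, and the `Report-To` group and endpoints) follow that draft as best understood here and should be checked against the current specification. The helper name, group name and endpoint URL are placeholders, not anything taken from the paper.

```python
import json


def nel_headers(endpoint_url, group="network-errors", max_age=2592000,
                failure_fraction=1.0, success_fraction=0.0):
    """Build response headers that ask a browser to send NEL reports.

    endpoint_url and group are placeholders chosen for this sketch.
    max_age is in seconds (2592000 is 30 days).
    """
    # Report-To names a reporting group and where its reports are uploaded.
    report_to = {
        "group": group,
        "max_age": max_age,
        "endpoints": [{"url": endpoint_url}],
    }
    # NEL points at that group and sets sampling rates for failed and
    # successful requests.
    nel = {
        "report_to": group,
        "max_age": max_age,
        "failure_fraction": failure_fraction,
        "success_fraction": success_fraction,
    }
    return {"Report-To": json.dumps(report_to), "NEL": json.dumps(nel)}


headers = nel_headers("https://reports.example.com/upload")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Listing more than one entry under `endpoints` is one way to provide the redundant upload paths the abstract mentions.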
The subsea fiber as a Shannon channel
Maxim Bolshtyansky
Omar Ait Sab
Eduardo Mateo
Georg Mohs
Takanori Inoue
Olivier Gautheron
Vincent Letellier
Olivier Courtois
Stephen Grubb
Elizabeth Rivera Hartling
Yoshihisa Inada
Alexei Pilipetskii
Pascal Pecci
Priyanth Mehta
Dmitry Kovsh
Massimiliano Salsi
Valey Kamalov
Vijay Vusirikala
SubOptic 2019
For many years, the Q-budget table (normalized by the ITU-T G.977) has been widely used to characterize the transmission performance of subsea cables: this table detailed the margin allowance breakdown for any modulated wavelength. The fiber achievable transmission capacity was then deduced from the wavelength spacing and the system operating bandwidth. However, the emergence of coherent detection and Digital Signal Processing (DSP) capabilities has enabled the deployment of a wide range of modulation schemes featuring various bit rate, FEC encoding, constellation and spectral shaping, non-linear effect mitigation, thus leading to a transponder-dependent fiber transmission capacity. Combined with the recent trend of the industry to deploy “open” cables, it is now time to define a new method to characterize the subsea fiber performance independently of the transponder type. This is emphasized by the introduction of Space Division Multiplexing (SDM) systems equipped with a high fiber pair count, bringing the granularity to the fiber level: easy to swap, to sell and to manage. Cable capacity will be evaluated via the sum of fiber capacities deduced from any SLTE (Submarine Line Terminal Equipment) at any time with any margin. The proposed method for non-dispersion-managed undersea systems relies on the General Signal to Noise ratio (GSNR) to remove the effect of baud rate, which is changing rapidly in each generation of SLTE. These metrics have already been widely debated at conferences and in publications. Topics such as accuracy, Gaussian Noise (GN) model, assumptions, and measurability are discussed to clarify definitions and a methodology. Finally, the paper reviews and discusses fiber capacity based on a given GSNR-based performance budget and various transponder types.
B4 and After: Managing Hierarchy, Partitioning, and Asymmetry for Availability and Scale in Google's Software-Defined WAN
Chandan Bhagat
Jay Kaimal
Jeffrey Liang
Joon Ong
Min Zhu
Kirill Mendelev
Saikat Ray
Faro Thomas Rabe
Malveeka Tewari
Sourabh Jain
Monika Zahn
Kondapa Naidu Bollineni
Rich Alimi
SIGCOMM'18 (2018)
Private WANs are increasingly important to the operation of enterprises, telecoms, and cloud providers. For example, B4, Google’s private software-defined WAN, is larger and growing faster than our connectivity to the public Internet. In this paper, we present the five-year evolution of B4. We describe the techniques we employed to incrementally move from offering best-effort content-copy services to carrier-grade availability, while concurrently scaling B4 to accommodate 100x more traffic. Our key challenge is balancing the tension introduced by hierarchy required for scalability, the partitioning required for availability, and the capacity asymmetry inherent to the construction and operation of any large-scale network. We discuss our approach to managing this tension: i) we design a custom hierarchical network topology for both horizontal and vertical software scaling, ii) we manage inherent capacity asymmetry in hierarchical topologies using a novel traffic engineering algorithm without packet encapsulation, and iii) we re-architect switch forwarding rules via two-stage matching/hashing to deal with asymmetric network failures at scale.
Modern networks have significantly outpaced the monitoring capabilities of SNMP and command-line scraping. Over the last three years we at Google have been working with members of the networking industry via the OpenConfig.net effort to redefine network monitoring. We have now deployed Streaming Telemetry in production to monitor devices from multiple vendors. We will talk about the experience and highlight the open source components we are providing to the community to accelerate industry-wide adoption.
Taking the Edge off with Espresso: Scale, Reliability and Programmability for Global Internet Peering
Calvin Ying
TaeEun Kim
Matthew Holliman
Ashok Narayanan
Colin Rice
Puneet Sood
Mukarram Tariq
Gary Baldus
Dzevad Trumic
Victor Lin
Bert Tanaka
Manish Verma
Brian Rogan
Vytautas Valancius
Mahesh Kallahalla
Marcus Hines
SIGCOMM (2017)
We present the design of Espresso, Google’s SDN-based Internet peering edge routing infrastructure. This architecture grew out of a need to exponentially scale the Internet edge cost-effectively and to enable application-aware routing at Internet-peering scale. Espresso utilizes commodity switches and host-based routing/packet processing to implement a novel fine-grained traffic engineering capability. Overall, Espresso provides Google a scalable peering edge that is programmable, reliable, and integrated with global traffic systems. Espresso also greatly accelerated deployment of new networking features at our peering edge. Espresso has been in production for two years and serves over 22% of Google’s total traffic to the Internet.
An Internet-Wide Analysis of Traffic Policing
Luis Pedrosa
Ethan Katz-Bassett
Tobias Flach
Ramesh Govindan
Tayeb Karim
SIGCOMM (2016)
Large flows like videos consume significant bandwidth. Some ISPs actively manage these high volume flows with techniques like policing, which enforces a flow rate by dropping excess traffic. While the existence of policing is well known, our contribution is an Internet-wide study quantifying its prevalence and impact on video quality metrics. We developed a heuristic to identify policing from server-side traces and built a pipeline to deploy it at scale on hundreds of servers worldwide within one of the largest online content providers. Using a dataset of 270 billion packets served to 28,400 client ASes, we find that, depending on region, up to 7% of lossy transfers are policed. Loss rates are on average 6× higher when a trace is policed, and it impacts video playback quality. We show that alternatives to policing, like pacing and shaping, can achieve traffic management goals while avoiding the deleterious effects of policing.
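The difference between policing and shaping can be sketched with a textbook token bucket. The Python below is a generic illustration, not the detection heuristic or any configuration from the paper. The function names, rate and burst values are assumptions, and the shaper assumes no packet is larger than the bucket depth.

```python
def police(arrivals, rate, burst):
    """Token-bucket policer: forward a packet if tokens cover it, else drop it.

    arrivals: list of (time_seconds, size_bytes), sorted by time.
    rate: token refill rate in bytes per second.
    burst: bucket depth in bytes.
    Returns (forwarded, dropped) lists of packets.
    """
    tokens, last = burst, 0.0
    forwarded, dropped = [], []
    for t, size in arrivals:
        tokens = min(burst, tokens + (t - last) * rate)  # refill since last packet
        last = t
        if size <= tokens:
            tokens -= size
            forwarded.append((t, size))
        else:
            dropped.append((t, size))  # excess traffic is discarded
    return forwarded, dropped


def shape(arrivals, rate, burst):
    """Token-bucket shaper: delay packets until tokens are available.

    Assumes every packet size is <= burst. Returns (departure_time, size) pairs.
    """
    tokens, clock = burst, 0.0
    departures = []
    for t, size in arrivals:
        start = max(t, clock)  # FIFO: cannot leave before the previous packet
        tokens = min(burst, tokens + (start - clock) * rate)
        wait = max(0.0, (size - tokens) / rate)  # time until enough tokens exist
        clock = start + wait
        tokens = min(burst, tokens + wait * rate) - size
        departures.append((clock, size))
    return departures


packets = [(0.0, 1000), (0.1, 1000), (0.2, 1000), (0.3, 1000)]
forwarded, dropped = police(packets, rate=1000, burst=1500)
print(len(forwarded), "forwarded,", len(dropped), "dropped by the policer")
print("shaper departures:", [round(t, 2) for t, _ in shape(packets, rate=1000, burst=1500)])
```

With these inputs the policer forwards one packet and drops three, while the shaper delivers all four, the last leaving at about t = 2.5 s. Shaping trades loss for delay, which is why it avoids the retransmissions that policing triggers.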