Networking

Networking is central to modern computing, from WANs connecting cell phones to massive data stores, to the data-center interconnects that deliver seamless storage and fine-grained distributed computing. Because our distributed computing infrastructure is a key differentiator for the company, Google has long focused on building network infrastructure to support our scale, availability, and performance needs, and to apply our expertise and infrastructure to solve similar problems for Cloud customers. Our research combines building and deploying novel networking systems at unprecedented scale, with recent work focusing on fundamental questions around data center architecture, cloud virtual networking, and wide-area network interconnects. We helped pioneer the use of Software Defined Networking, the application of ML to networking, and the development of large-scale management infrastructure including telemetry systems. We are also addressing congestion control and bandwidth management, capacity planning, and designing networks to meet traffic demands. We build cross-layer systems to ensure high network availability and reliability. By publishing our findings at premier research venues, we continue to engage both academic and industrial partners to further the state of the art in networked systems.

Recent Publications

This is an invited OFC 2024 conference workshop talk on a new, lower-power design choice for datacenter optics: linear pluggable optics. In this talk I will discuss the fundamental performance constraints facing linear pluggable optics and their implications for DCN and ML use cases.
On the Benefits of Traffic “Reprofiling”: The Single Hop Case
Henry Sariowan
Jiaming Qiu
Jiayi Song
Roch Guerin
IEEE/ACM Transactions on Networking (2024)
Datacenters have become a significant source of traffic, much of which is carried over private networks. The operators of those networks commonly have access to detailed traffic profiles and performance goals, which they seek to meet as efficiently as possible. Of interest are solutions that guarantee latency while minimizing network bandwidth. The paper explores a basic building block towards realizing such solutions, namely, a single hop configuration. The main results are in the form of optimal solutions for meeting local deadlines under schedulers of varying complexity and therefore cost. The results demonstrate how judiciously modifying flows’ traffic profiles, i.e., reprofiling them, can help simple schedulers reduce the bandwidth they require, often performing nearly as well as more complex ones.
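As a rough illustration of why reprofiling can help a simple scheduler, the sketch below assumes each flow is described by a token-bucket profile (rate, burst) plus a deadline, and uses the textbook FIFO delay bound (total burst divided by link rate). The token-bucket model, the example numbers, and the function names `fifo_min_rate` and `reprofile` are assumptions made for illustration; this is not the paper's formulation or its optimal solution.

```python
# Illustrative sketch only: token-bucket flows on a single FIFO hop.
# A flow is a tuple (rate, burst, deadline) in consistent units.

def fifo_min_rate(flows):
    """Smallest FIFO link rate that meets every flow's deadline.

    Worst-case FIFO delay for aggregated token buckets is
    total_burst / link_rate, and it must not exceed the tightest deadline.
    The link must also sustain the sum of the long-term rates.
    """
    total_rate = sum(rate for rate, _, _ in flows)
    total_burst = sum(burst for _, burst, _ in flows)
    tightest_deadline = min(deadline for _, _, deadline in flows)
    return max(total_rate, total_burst / tightest_deadline)


def reprofile(flow, new_burst):
    """Shape a flow down to a smaller burst before it reaches the hop.

    The shaper itself can delay traffic by (burst - new_burst) / rate,
    so only the remainder of the deadline is left for the hop.
    """
    rate, burst, deadline = flow
    if not 0 < new_burst <= burst:
        raise ValueError("new_burst must be in (0, burst]")
    shaping_delay = (burst - new_burst) / rate
    remaining = deadline - shaping_delay
    if remaining <= 0:
        raise ValueError("reprofiling would consume the whole deadline")
    return (rate, new_burst, remaining)


# Hypothetical example: one tight-deadline flow and one loose-deadline flow.
flows = [(1.0, 10.0, 1.0), (1.0, 10.0, 10.0)]
print(fifo_min_rate(flows))                       # 20.0

reshaped = [flows[0], reprofile(flows[1], 1.0)]
print(fifo_min_rate(reshaped))                    # 11.0
```

In this made-up case, shrinking the burst of the flow with the loose deadline cuts the FIFO bandwidth requirement from 20 to 11 units, because the slack in that flow's deadline is spent in the shaper rather than in the shared queue.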
As with most large-scale migration efforts, the last 20% of Alphabet's BeyondCorp migration required disproportionate effort. After successfully transitioning most of the company's workflows to BeyondCorp, we still had a long tail of specific, oddball, or challenging situations to resolve. This article examines how we created processes, tools, and solutions to handle use cases that were not easily adapted to our core HTTPS-based workflow.
Bolt is a congestion-control algorithm designed to provide single-digit microsecond tail network queuing at near line-rate utilization. The work is motivated by the need for ultra-low latency to support applications such as NVMe: as line rates reach 200G and beyond, most transfers fit within a single bandwidth-delay product (BDP), so transfer times become predominantly a function of queuing and propagation delays. Bolt is an attempt to push congestion control to its theoretical limits by harnessing the power of programmable dataplanes such as Tofino and Trident3+ chips. Bolt is founded on three key ideas: (i) Sub-RTT reaction (SRR): reacting to congestion faster than the RTT control-loop delay, (ii) Proactive Ramp-up (PRU): tracking future flow completions, and (iii) Supply matching (SM): leveraging Network Calculus concepts to maximize utilization. Our current results achieve a 75% reduction in queuing delays over Swift, with up to 3x improvement in completion times for short transfers.
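To make the "most transfers fit within a single BDP" observation concrete, here is a small back-of-the-envelope sketch of the bandwidth-delay product. The 10 microsecond round-trip time and the 64 KiB transfer size are assumed values chosen for illustration, and the helper names are hypothetical; none of them come from the Bolt work itself.

```python
# Illustrative arithmetic only: bandwidth-delay product (BDP).

def bdp_bytes(line_rate_gbps, rtt_us):
    """Bytes in flight needed to fill the pipe for one round trip.

    1 Gbit/s sustained for 1 microsecond carries 1000 bits,
    so BDP in bits = line_rate_gbps * rtt_us * 1000.
    """
    return line_rate_gbps * rtt_us * 1000 / 8


def fits_in_one_bdp(transfer_bytes, line_rate_gbps, rtt_us):
    """True if the whole transfer can be in flight within a single RTT."""
    return transfer_bytes <= bdp_bytes(line_rate_gbps, rtt_us)


# Assumed example: 200 Gbit/s line rate, 10 microsecond RTT.
print(bdp_bytes(200, 10))                     # 250000.0 bytes, about 250 KB
print(fits_in_one_bdp(64 * 1024, 200, 10))    # True
```

Under these assumed numbers a 64 KiB transfer is fully in flight before the first acknowledgment can return, which is why completion time is dominated by queuing and propagation delay rather than by any per-RTT rate adjustment.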
RFC 9476 - The .alt Special-Use Top-Level Domain
Warren Kumari
Paul Hoffman
IETF Request For Comments, RFC Editor (2023), pp. 7
This document reserves a Top-Level Domain (TLD) label "alt" to be used in non-DNS contexts. It also provides advice and guidance to developers creating alternative namespaces.
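A minimal sketch of how software might recognize such names follows. It assumes that an application wants to keep names whose rightmost label is "alt" out of its DNS lookup path; the function names `is_alt_name` and `choose_resolver` and the returned strings are hypothetical and are not defined by the RFC.

```python
# Illustrative sketch only: detect names under the reserved "alt" label.

def is_alt_name(name: str) -> bool:
    """Return True if the rightmost label of `name` is "alt".

    The comparison is case-insensitive and tolerates one trailing dot,
    as in a fully qualified name such as "example.alt.".
    """
    if name.endswith("."):
        name = name[:-1]
    labels = name.lower().split(".")
    return labels[-1] == "alt"


def choose_resolver(name: str) -> str:
    """Hypothetical dispatch: keep .alt names out of the DNS path."""
    return "non-dns" if is_alt_name(name) else "dns"


print(choose_resolver("example.alt"))        # non-dns
print(choose_resolver("alternative.com"))    # dns
```

Matching on the whole rightmost label, rather than on a string suffix, avoids misclassifying names such as "cobalt" or "example.alt.com".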
SAC123 - SSAC Report on the Evolution of Internet Name Resolution
Warren Kumari
Internet Corporation for Assigned Names and Numbers (ICANN), vol. ICANN Security and Stability Advisory Committee (SSAC) Reports and Advisories (2023), pp. 36
New technologies are changing how name resolution happens on the Internet. The DNS remains the prominent, or default, naming system for the Internet, but alternative naming systems are in use as well. This is nothing particularly new, as there have always been naming systems besides the DNS in use throughout the Internet’s history. These alternative naming systems use the same syntax as the DNS, dot-separated labels. There are many motivations for copying this syntax, but the primary reason is because designers of these alternative naming systems wish to benefit from the existence of software applications built to receive DNS names as input. This has the potential to create situations where the same name exists in DNS and in an alternative system, potentially causing name collisions. However, there is only one domain namespace and its referential integrity is important for Internet users and for the stability and security of Internet names. Thus, as alternative naming systems increase in popularity their use threatens to increase ambiguity in the shared single domain namespace. This increased ambiguity in Internet naming threatens to undermine the trust that users have in Internet identifiers and the services that rely on them. Additionally, names are becoming less visible to Internet end users, yet they remain vital to the security and stability of Internet infrastructure. Technologies such as QR codes and URL shorteners offer great utility to Internet users while also obscuring the underlying domain names used and creating new opportunities for malicious behavior. Meanwhile, QR codes and URL shorteners use domain names to access the Internet resource, even if the human user does not see it. These are the two main trends that the SSAC identifies in this report. The same name can resolve in different ways (ambiguous name resolution), and names of service endpoints are less visible (names are less conspicuous to end users).
It is the combination of these two trends that fundamentally threatens to undermine confidence in services on the Internet.