Jeffrey C. Mogul

Jeff Mogul works on fast, cheap, reliable, and flexible networking infrastructure for Google. Until 2013, he was a Fellow at HP Labs, doing research primarily on computer networks and operating-system issues for enterprise and cloud computer systems; previously, he worked at the DEC/Compaq Western Research Lab. He received his PhD from Stanford in 1986, an MS from Stanford in 1980, and an SB from MIT in 1979. He is an ACM Fellow. Jeff is the author or co-author of several Internet Standards; he contributed extensively to the HTTP/1.1 specification. He was an associate editor of Internetworking: Research and Experience, and has been the chair or co-chair of a variety of conferences and workshops, including SIGCOMM, OSDI, NSDI, USENIX, HotOS, and ANCS. You can find a mostly up-to-date CV at http://jmogul.com/mogulcv.pdf
Authored Publications
    Physical Deployability Matters
    Proc. HotNets 2023: Twenty-Second ACM Workshop on Hot Topics in Networks
    Abstract: While many network research papers address issues of deployability, with a few exceptions this has been limited to protocol compatibility or switch-resource constraints, such as flow-table sizes. We argue that good network designs must also consider the costs and complexities of deploying the design within the constraints of the physical environment in a datacenter: "physical" deployability. The traditional metrics of network "goodness" mostly do not account for these costs and constraints, and this may partially explain why some otherwise attractive designs have not been deployed in real-world datacenters.
    Change Management in Physical Network Lifecycle Automation
    Virginia Beauregard
    Kevin Grant
    Angus Griffith
    Jahangir Hasan
    Chen Huang
    Quan Leng
    Jiayao Li
    Alexander Lin
    Zhoutao Liu
    Ahmed Mansy
    Bill Martinusen
    Nikil Mehta
    Andrew Narver
    Anshul Nigham
    Melanie Obenberger
    Sean Smith
    Kurt Steinkraus
    Sheng Sun
    Edward Thiele
    Proc. 2023 USENIX Annual Technical Conference (USENIX ATC 23)
    Abstract: Automated management of a physical network's lifecycle is critical for large networks. At Google, we manage network design, construction, evolution, and management via multiple automated systems. In our experience, one of the primary challenges is to reliably and efficiently manage change in this domain -- additions of new hardware and connectivity, planning and sequencing of topology mutations, introduction of new architectures, new software systems and fixes to old ones, etc. We especially have learned the importance of supporting multiple kinds of change in parallel without conflicts or mistakes (which cause outages) while also maintaining parallelism between different teams and between different processes. We now know that this requires automated support. This paper describes some of our network lifecycle goals, the automation we have developed to meet those goals, and the change-management challenges we encountered. We then discuss in detail our approaches to several specific kinds of change management: (1) managing conflicts between multiple operations on the same network; (2) managing conflicts between operations spanning the boundaries between networks; (3) managing representational changes in the models that drive our automated systems. These approaches combine both novel software systems and software-engineering practices. While this paper reports on our experience with large-scale datacenter network infrastructures, we are also applying the same tools and practices in several adjacent domains, such as the management of WAN systems, of machines, and of datacenter physical designs. Our approaches are likely to be useful at smaller scales, too.
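    A minimal sketch, assuming a hypothetical ChangeLedger API (not Google's actual systems), of the first change-management approach above: rejecting an operation up front when it touches network entities already claimed by another in-flight operation, so that non-overlapping changes can proceed in parallel.

        class ConflictError(Exception):
            pass

        class ChangeLedger:
            """Tracks which network entities each in-flight operation has
            claimed; overlapping claims are conflicts, disjoint claims may
            run in parallel. Illustrative only."""

            def __init__(self):
                self._claims = {}  # entity name -> operation id

            def claim(self, op_id, entities):
                clashes = {e: o for e, o in self._claims.items()
                           if e in entities and o != op_id}
                if clashes:
                    raise ConflictError(f"{op_id} conflicts with {clashes}")
                for e in entities:
                    self._claims[e] = op_id

            def release(self, op_id):
                self._claims = {e: o for e, o in self._claims.items()
                                if o != op_id}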
    Abstract: We (Google's networking teams) would like to increase our collaborations with academic researchers related to data-driven networking research. There are some significant constraints on our ability to directly share data, and in case not everyone in the community understands these, this document provides a brief summary. There are some models which can work (primarily, interns and visiting scientists). We describe some specific areas where we would welcome proposals to work within those models.
    Abstract: We are accustomed to thinking of computers as fail-stop, especially the cores that execute instructions, and most system software implicitly relies on that assumption. During most of the VLSI era, processors that passed manufacturing tests and were operated within specifications have insulated us from this fiction. As fabrication pushes towards smaller feature sizes and more elaborate computational structures, and as increasingly specialized instruction-silicon pairings are introduced to improve performance, we have observed ephemeral computational errors that were not detected during manufacturing tests. These defects cannot always be mitigated by techniques such as microcode updates, and may be correlated to specific components within the processor, allowing small code changes to effect large shifts in reliability. Worse, these failures are often "silent": the only symptom is an erroneous computation. We refer to a core that develops such behavior as "mercurial." Mercurial cores are extremely rare, but in a large fleet of servers we can observe the correlated disruption they cause, often enough to see them as a distinct problem -- one that will require collaboration between hardware designers, processor vendors, and systems software architects. This paper is a call-to-action for a new focus in systems research; we speculate about several software-based approaches to mercurial cores, ranging from better detection and isolating mechanisms, to methods for tolerating the silent data corruption they cause.
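    A minimal sketch of one software-based approach the abstract speculates about: detecting silent corruption by redundant execution and comparison. The function name is hypothetical, not from the paper; note that a deterministically wrong core can fail the same way twice, so a real deployment would re-run on a different core.

        def run_checked(fn, args, trials=2):
            # Execute fn redundantly; divergent results suggest a silent
            # computational error rather than a crash. In practice each
            # trial should run on a different core, since a mercurial
            # core may repeat the same wrong answer.
            results = [fn(*args) for _ in range(trials)]
            if any(r != results[0] for r in results[1:]):
                raise RuntimeError("divergent results: possible silent data corruption")
            return results[0]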
    Abstract: To reduce cost, datacenter network operators are exploring blocking network designs. An example of such a design is a "spine-free" form of a Fat-Tree, in which pods directly connect to each other, rather than via spine blocks. To maintain application-perceived performance in the face of dynamic workloads, these new designs must be able to reconfigure routing and the inter-pod topology. Gemini is a system designed to achieve these goals on commodity hardware while reconfiguring the network infrequently, rendering these blocking designs practical enough for deployment in the near future. The key to Gemini is the joint optimization of topology and routing, using as input a robust estimation of future traffic derived from multiple historical traffic matrices. Gemini "hedges" against unpredicted bursts, by spreading these bursts across multiple paths, to minimize packet loss in exchange for a small increase in path lengths. It incorporates a robust decision algorithm to determine when to reconfigure, and whether to use hedging. Data from tens of production fabrics allows us to categorize these as either low- or high-volatility; these categories seem stable. For the former, Gemini finds topologies and routing with near-optimal performance and cost. For the latter, Gemini's use of multi-traffic-matrix optimization and hedging avoids the need for frequent topology reconfiguration, with only marginal increases in path length. As a result, Gemini can support existing workloads on these production fabrics using a spine-free topology that is half the cost of the existing topology on these fabrics.
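    A toy sketch of the two ideas named above, robust demand estimation from historical traffic matrices and "hedging" across multiple paths; the function names and the percentile choice are illustrative, not Gemini's actual optimizer.

        import numpy as np

        def robust_demand(historical_tms, pct=95):
            # Per-pod-pair demand estimate: a high percentile across many
            # historical traffic matrices, so the plan covers most observed
            # load without chasing every burst.
            return np.percentile(np.stack(historical_tms), pct, axis=0)

        def hedged_split(candidate_paths, k=2):
            # Hedging: spread a pod pair's traffic evenly over its k
            # shortest paths instead of committing to one, trading a small
            # path-length increase for tolerance to unpredicted bursts.
            chosen = sorted(candidate_paths, key=len)[:k]
            return {tuple(path): 1.0 / len(chosen) for path in chosen}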
    Abstract: Network management is becoming increasingly automated, and automation depends on detailed, explicit representations of data about both the state of a network, and about an operator's intent for its networks. In particular, we must explicitly represent the desired and actual topology of a network; almost all other network-management data either derives from its topology, constrains how to use a topology, or associates resources (e.g., addresses) with specific places in a topology. We describe MALT, a Multi-Abstraction-Layer Topology representation, which supports virtually all of our network management phases: design, deployment, configuration, operation, measurement, and analysis. MALT provides interoperability across software systems, and its support for abstraction allows us to explicitly tie low-level network elements to high-level design intent. MALT supports a declarative style that simplifies what-if analysis and testbed support. We also describe the software base that supports efficient use of MALT, as well as numerous, sometimes painful lessons we have learned about curating the taxonomy for a comprehensive, and evolving, representation for topology.
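    A minimal sketch of the multi-abstraction-layer idea: entities at different layers linked by containment relationships, so a high-level design entity can be expanded into the low-level elements that implement it. The kind names are illustrative stand-ins, not the actual MALT schema.

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Entity:
            kind: str  # e.g. "CHASSIS", "PORT" (illustrative kinds)
            name: str  # globally unique, e.g. "fabric1/pod2/sw3"

        @dataclass(frozen=True)
        class Relationship:
            kind: str  # e.g. "CONTAINS" ties an abstraction to its parts
            a: Entity
            z: Entity

        def expand(relationships, parent):
            # Recursively follow CONTAINS edges from a high-level entity
            # down to the concrete elements that implement it.
            out = []
            for r in relationships:
                if r.kind == "CONTAINS" and r.a == parent:
                    out.append(r.z)
                    out.extend(expand(relationships, r.z))
            return out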
    Minimal Rewiring: Efficient Live Expansion for Clos Data Center Networks
    Shizhen Zhao
    Joon Ong
    Proc. 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2019), USENIX Association
    Abstract: Clos topologies have been widely adopted for large-scale data center networks (DCNs), but it has been difficult to support incremental expansions of Clos DCNs. Some prior work has assumed that it is impossible to design DCN topologies that are both well-structured (non-random) and incrementally expandable at arbitrary granularities. We demonstrate that it is indeed possible to design such networks, and to expand them while they are carrying live traffic, without incurring packet loss. We use a layer of patch panels between blocks of switches in a Clos network, which makes physical rewiring feasible, and we describe how to use integer linear programming (ILP) to minimize the number of patch-panel connections that must be changed, which makes expansions faster and cheaper. We also describe a block-aggregation technique that makes our ILP approach scalable. We tested our "minimal-rewiring" solver on two kinds of fine-grained expansions using 2250 synthetic DCN topologies, and found that the solver can handle 99% of these cases while changing under 25% of the connections. Compared to prior approaches, this solver (on average) reduces the number of "stages" per expansion by about 3.1X -- a significant improvement to our operational costs, and to our exposure (during expansions) to capacity-reducing faults.
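    A brute-force stand-in, usable on toy inputs only, for the ILP objective described above: among candidate target wirings, pick the one that preserves the most existing patch-panel connections.

        def least_rewiring(current_links, candidate_targets):
            # Count connections that would be removed or added, and pick
            # the target minimizing that count (the ILP minimizes this at
            # scale; here we simply enumerate).
            def links_changed(target):
                return len(current_links.symmetric_difference(target))
            return min(candidate_targets, key=links_changed)

        # Expanding a two-block fabric to three: prefer the target that
        # keeps the existing A-B connection on the patch panel.
        current = {("A", "B")}
        targets = [{("A", "C"), ("B", "C")},              # rewires A-B
                   {("A", "B"), ("A", "C"), ("B", "C")}]  # keeps A-B
        assert least_rewiring(current, targets) == targets[1]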
    Nines are Not Enough: Meaningful Metrics for Clouds
    Proc. 17th Workshop on Hot Topics in Operating Systems (HotOS) (2019)
    Abstract: Cloud customers want reliable, understandable promises from cloud providers that their applications will run reliably and with adequate performance, but today, providers offer only limited guarantees, which creates uncertainty for customers. Providers also must define internal metrics to allow them to operate their systems without violating customer promises or expectations. We explore why these guarantees are hard to define. We show that this problem shares some similarities with the challenges of applying statistics to make decisions based on sampled data. We also suggest that defining guarantees in terms of defense against threats, rather than guarantees for application-visible outcomes, can reduce the complexity of these problems. Overall, we offer a partial framework for thinking about Service Level Objectives (SLOs), and discuss some unsolved challenges.
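    The sampling analogy above can be made concrete: with very few observed failures, a confidence interval on measured availability is far wider than the point estimate suggests. A sketch using the standard Wilson score interval (this illustration is ours, not a formula from the paper):

        import math

        def availability_ci(good, total, z=1.96):
            # 95% Wilson score interval for an observed success fraction.
            p = good / total
            denom = 1 + z * z / total
            center = (p + z * z / (2 * total)) / denom
            half = (z / denom) * math.sqrt(p * (1 - p) / total
                                           + z * z / (4 * total * total))
            return center - half, center + half

        # One failure in a million requests *measures* six nines, but the
        # 95% lower bound is about 0.999994 -- five nines, not six.
        lo, hi = availability_ci(999_999, 1_000_000)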
    Abstract: We increasingly depend on the availability of online services, either directly as users, or indirectly, when cloud-provider services support directly-accessed services. The availability of these "visible services" depends in complex ways on the availability of a complex underlying set of invisible infrastructure services. In our experience, most software engineers lack useful frameworks to create and evaluate designs for individual services that support end-to-end availability in these infrastructures, especially given cost, performance, and other constraints on viable commercial services. Even given the extensive research literature on techniques for replicated state machines and other fault-tolerance mechanisms, we found little help in this literature for addressing infrastructure-wide availability. Past research has often focused on point solutions, rather than end-to-end ones. In particular, it seems quite difficult to define useful targets for infrastructure-level availability, and then to translate these to design requirements for individual services. We argue that, in many but not all ways, one can think about availability with the mindset that we have learned to use for security, and we discuss some general techniques that appear useful for implementing and operating high-availability infrastructures. We encourage a shift in emphasis for academic research into availability.
    Condor: Better Topologies through Declarative Design
    Brandon Schlinker
    Radhika Niranjan Mysore
    Sean Smith
    Amin Vahdat
    Minlan Yu
    Ethan Katz-Bassett
    Michael Rubin
    Proc. SIGCOMM '15 (2015)
    Abstract: The design space for large, multipath datacenter networks is large and complex, and no one design fits all purposes. Network architects must trade off many criteria to design cost-effective, reliable, and maintainable networks, and typically cannot explore much of the design space. We present Condor, our approach to enabling a rapid, efficient design cycle. Condor allows architects to express their requirements as constraints via a Topology Description Language (TDL), rather than having to directly specify network structures. Condor then uses constraint-based synthesis to rapidly generate candidate topologies, which can be analyzed against multiple criteria. We show that TDL supports concise descriptions of topologies such as fat-trees, BCube, and DCell; that we can generate known and novel variants of fat-trees with simple changes to a TDL file; and that we can synthesize large topologies in tens of seconds. We also show that Condor supports the daunting task of designing multi-phase network expansions that can be carried out on live networks.
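    A toy illustration of generating a topology from a declarative parameter rather than enumerating links by hand; this is a stand-in for Condor's TDL and constraint-based synthesis, not its actual language or API.

        def fat_tree(k):
            # Switch-level wiring of a k-ary fat-tree: k pods, each with
            # k/2 edge and k/2 aggregation switches, plus (k/2)^2 cores.
            assert k % 2 == 0, "k-ary fat-trees need even k"
            half = k // 2
            links = []
            for p in range(k):
                for e in range(half):        # every edge switch connects to
                    for a in range(half):    # every aggregation switch in its pod
                        links.append((f"pod{p}/edge{e}", f"pod{p}/agg{a}"))
                for a in range(half):        # aggregation switch a connects to
                    for c in range(half):    # core group a*half .. a*half+half-1
                        links.append((f"pod{p}/agg{a}", f"core{a * half + c}"))
            return links

        # fat_tree(4) yields 32 links: 16 intra-pod, 16 pod-to-core.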