Network Infrastructure

About the team

Our team brings together experts in networking, distributed systems, kernel and systems programming, end-host stacks, and advanced algorithms to create the datacenter networks that power Google. Our networks are among the world’s largest and fastest, and we design them to be reliable, cheap, and easy to evolve. We often use new technologies unavailable outside Google.

We exemplify Google’s Hybrid Approach to Research: we deploy real-world systems at global scale. Many members of our team have extensive research experience. We publish papers in conferences such as SIGCOMM, NSDI, SOSP, and OSDI, and we work closely with interns and faculty from leading universities.

Every Google product relies on the technologies we develop. Our networks support complex, highly available, planetary-scale distributed systems with billions of users. We constantly evolve our networks to meet the requirements of, and create opportunities for, new and better Google products, especially the rapidly growing Google Cloud.

Our team works in many locations: Sunnyvale CA, New York City, Madison WI, Boulder CO, Reston VA, and Seattle WA.

Team focus summaries

Congestion control, network measurement, and traffic management

All networks are subject to congestion; we want to operate ours at high utilization levels (to reduce costs) while meeting strict performance objectives. We’re inventing new congestion avoidance protocols, and improving our global-scale, near-real-time, automated traffic engineering system. We’re building better ways to measure our networks, accurately and at scale, to drive our evaluation of congestion-control techniques, and as real-time input to automated traffic management.

Data-center network design

We continue to innovate in designs for scalable, fast, cheap, reliable, and evolvable data-center networks. When necessary, we design our own hardware, and innovate in network topology and routing protocols. We use automatic techniques to optimize network designs.

Network management

We’re building automated network management systems that let us rapidly repair and improve our networks with little or no downtime. We’re using techniques such as formal modeling of network topologies and highly available distributed systems, while working closely with Google’s network engineers and operators to implement automated workflows.

Programmable packet processing

We’re developing new mechanisms for low-latency, CPU-efficient communication. We want our network switches and endpoints to implement novel packet-processing functions without compromising on cost or performance. We’re exploring hardware and software techniques for fast, flexible, safe packet processing, including onload, offload, RDMA, P4, and more.

Software-defined networking (SDN)

We employ SDN extensively. We were early users of, and contributors to, OpenFlow, and continue, with P4, to raise the level of abstraction for silicon-agnostic switching. We are developing SDN controller platforms that can handle Google’s needs for scale and reliability, and SDN applications for routing, traffic management, and other functions.

High-velocity development and testing

To introduce network innovations into production as rapidly as possible, without compromising availability, we test our designs and implementations early, often, and extensively. We’re developing advanced software validation techniques, we embrace automation in all aspects of testing and qualification, and we build powerful infrastructure for testing, debugging, and root-causing, in both physical and emulated testbeds.

Featured publications

Orion: Google’s Software-Defined Networking Control Plane

Amin Vahdat, Amr Sabaa, Andrew Ferguson, Arjun Singh, Chi-yao Hong, Chip Killian, Henrik Muehe, Joon Suan Ong, Karthik Swaminathan Nagaraj, KondapaNaidu Bollineni, Leon Poutievski, Lorenzo Vicisano, Mike Conley, Min Zhu, Rich Alimi, Shawn Chen, Shidong Zhang, Steve Gribble, Subhasree Mandal, Waqar Mohsin

(2021)

1RMA: Re-Envisioning Remote Memory Access for Multi-Tenant Datacenters

Aditya Akella, Amin Vahdat, Arjun Singhvi, Behnam Montazeri, Dan Gibson, Hassan Wassel, Joel Scherpelz, Milo M. K. Martin, Monica C Wong-Chan, Moray Mclaren, Prashant Chandra, Rob Cauble, Sean Clark, Simon Sabato, Thomas F. Wenisch

Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, Association for Computing Machinery, New York, NY, USA (2020), 708–721

Swift: Delay is Simple and Effective for Congestion Control in the Datacenter

Gautam Kumar, Nandita Dukkipati, Keon Jang, Hassan Wassel, Xian Wu, Behnam Montazeri, Yaogong Wang, Kevin Springborn, Christopher Alfeld, Mike Ryan, David J. Wetherall, Amin Vahdat

SIGCOMM 2020 (2020)

Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization

Mike Dalton, David Schultz, Ahsan Arefin, Alex Docauer, Anshuman Gupta, Brian Matthew Fahs, Dima Rubinstein, Enrique Cauich Zermeno, Erik Rubow, Jake Adriaens, Jesse L Alpert, Jing Ai, Jon Olson, Kevin P. DeCabooter, Marc Asher de Kruijf, Nan Hua, Nathan Lewis, Nikhil Kasinadhuni, Riccardo Crepaldi, Srinivas Krishnan, Subbaiah Venkata, Yossi Richter, Uday Naik, Amin Vahdat

15th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2018

Snap: a Microkernel Approach to Host Networking

Michael Marty, Marc de Kruijf, Jacob Adriaens, Christopher Alfeld, Sean Bauer, Carlo Contavalli, Mike Dalton, Nandita Dukkipati, William C. Evans, Steve Gribble, Nicholas Kidd, Roman Kononov, Gautam Kumar, Carl Mauer, Emily Musick, Lena Olson, Mike Ryan, Erik Rubow, Kevin Springborn, Paul Turner, Valas Valancius, Xi Wang, Amin Vahdat

In ACM SIGOPS 27th Symposium on Operating Systems Principles, ACM, New York, NY, USA (2019) (to appear)

Experiences with Modeling Network Topologies at Multiple Levels of Abstraction

Jeffrey C. Mogul, Drago Goricanec, Martin Pool, Anees Shaikh, Douglas Turk, Bikash Koley, Xiaoxue Zhao

17th Symposium on Networked Systems Design and Implementation (NSDI) (2020)

Nines are Not Enough: Meaningful Metrics for Clouds

Jeffrey C. Mogul, John Wilkes

Proc. 17th Workshop on Hot Topics in Operating Systems (HotOS) (2019)

Minimal Rewiring: Efficient Live Expansion for Clos Data Center Networks

Shizhen Zhao, Rui Wang, Junlan Zhou, Joon Ong, Jeffrey C. Mogul, Amin Vahdat

Proc. 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2019), USENIX Association (to appear)

BBR: Congestion-Based Congestion Control

Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, Van Jacobson

Communications of the ACM, 60 (2017), pp. 58-66

Thinking about Availability in Large Service Infrastructures

Jeffrey C. Mogul, Rebecca Isaacs, Brent Welch

Proc. HotOS XVI (2017)

Join our team

Internships

We have a vigorous internship program, with a strong focus on PhD-level students who would like to understand how large-scale networks are designed, built, and operated. We also hire Bachelors and Masters interns. Most of our internship projects are focused on building software, especially distributed systems and kernels, and do not necessarily require a prior background in networking.

Please check again in September or October 2024 to find out about internships for 2025.

Open role(s)

Software Engineer, Systems and Infrastructure, PhD University Graduate
- PhD-level software engineers in Network Infrastructure apply their research training to the toughest problems of designing and building large-scale, high-performance, high-availability distributed systems to design, manage, measure, and control our datacenter, WAN, and peering-edge SDN networks (each of which has been the subject of at least one SIGCOMM paper). We're also creating innovative end-host stacks, to support CPU-efficient, low-latency, congestion-aware communication, with secure isolation between users. You'll work with other skillful, creative people, including people who wrote research papers you've read, and you'll keep connected with the academic research community.
- Note that this job opening covers teams besides Network Infrastructure; we have several teams looking for candidates with a mix of various "Systems" skills.
