Urs Hölzle
Urs Hölzle is a Google Fellow in Google Cloud. Until 2023, he was Senior Vice President for Technical Infrastructure at Google, where he oversaw the design, installation, and operation of the servers, networks, and data centers that power Google's services. Through efficiency innovations, Urs and his team reduced the energy used by Google data centers to less than 50% of the industry average. Urs is renowned for both his red socks and his free-range Leonberger, Yoshka (Google's top dog).
Urs grew up in Switzerland and received a master's degree in computer science from ETH Zurich and, as a Fulbright scholar, a Ph.D. from Stanford. While at Stanford (and then a small start-up that was later acquired by Sun Microsystems) he invented fundamental techniques used in most of today's leading Java compilers. Before joining Google he was a professor of computer science at the University of California, Santa Barbara. He is a Fellow of the ACM and a member of the US National Academy of Engineering and the Swiss Academy of Technical Sciences.
Authored Publications
Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network
Joon Ong
Amit Agarwal
Glen Anderson
Ashby Armistead
Roy Bannon
Seb Boving
Gaurav Desai
Bob Felderman
Paulie Germano
Anand Kanagala
Jeff Provost
Jason Simmons
Eiichi Tanda
Jim Wanderer
Stephen Stuart
Communications of the ACM, Vol. 59, No. 9 (2016), pp. 88-97
We present our approach for overcoming the cost, operational complexity, and limited scale endemic to datacenter networks a decade ago. Three themes unify the five generations of datacenter networks detailed in this paper. First, multi-stage Clos topologies built from commodity switch silicon can support cost-effective deployment of building-scale networks. Second, the general but complex decentralized routing and management protocols that support arbitrary deployment scenarios were overkill for single-operator, pre-planned datacenter networks; we instead built a centralized control mechanism based on a global configuration pushed to all datacenter switches. Third, modular hardware design coupled with simple, robust software allowed our design to also support inter-cluster and wide-area networks. Our datacenter networks run at dozens of sites across the planet, scaling in capacity by 100x over 10 years to more than 1 Pbps of bisection bandwidth.
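The scaling argument in the abstract can be made concrete with the textbook fat-tree form of a 3-stage Clos network. This is an illustrative sketch, not Jupiter's actual design: the switch radices and link speeds below are assumptions chosen only to show why building-scale networks of commodity k-port switches can exceed 1 Pbps of bisection bandwidth.

```python
def fat_tree_capacity(ports_per_switch, link_gbps):
    """Capacity of a standard 3-stage fat-tree (a folded Clos) built
    entirely from identical k-port commodity switches."""
    k = ports_per_switch
    hosts = k ** 3 // 4          # servers supported at full line rate
    switches = 5 * k ** 2 // 4   # k^2/4 core switches + k^2 pod switches
    # Full bisection: half the hosts can send at line rate to the other half.
    bisection_gbps = hosts * link_gbps // 2
    return hosts, switches, bisection_gbps

# Hypothetical generations: modest switches at 10 Gb/s vs. denser
# silicon at 40 Gb/s. The jump in scale comes from topology + radix,
# not from any single big router.
for k, gbps in ((24, 10), (64, 40)):
    hosts, switches, bisection = fat_tree_capacity(k, gbps)
    print(f"{k}-port switches @ {gbps} Gb/s: {hosts} hosts, "
          f"{switches} switches, {bisection / 1e6:.2f} Pb/s bisection")
```

With 64-port switches and 40 Gb/s links, this idealized construction already crosses the 1 Pb/s bisection mark the abstract cites.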
Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network
Joon Ong
Amit Agarwal
Glen Anderson
Ashby Armistead
Roy Bannon
Seb Boving
Gaurav Desai
Paulie Germano
Jeff Provost
Jason Simmons
Eiichi Tanda
Jim Wanderer
Amin Vahdat
Proceedings of the ACM SIGCOMM Conference, London, UK (2015)
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition
Luiz André Barroso
Jimmy Clidaras
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers (2013)
As computation continues to move into the cloud, the computing platform of interest no longer resembles a pizza box or a refrigerator, but a warehouse full of computers. These new large datacenters are quite different from traditional hosting facilities of earlier times and cannot be viewed simply as a collection of co-located servers. Large portions of the hardware and software resources in these facilities must work in concert to efficiently deliver good levels of Internet service performance, something that can only be achieved by a holistic approach to their design and deployment. In other words, we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base. We hope it will be useful to architects and programmers of today’s WSCs, as well as those of future many-core platforms which may one day implement the equivalent of today’s WSCs on a single board.
Notes for the Second Edition
After nearly four years of substantial academic and industrial developments in warehouse-scale computing, we are delighted to present our first major update to this lecture. The increased popularity of public clouds has made WSC software techniques relevant to a larger pool of programmers since our first edition. Therefore, we expanded Chapter 2 to reflect our better understanding of WSC software systems and the toolbox of software techniques for WSC programming. In Chapter 3, we added to our coverage of the evolving landscape of wimpy vs. brawny server trade-offs, and we now present an overview of WSC interconnects and storage systems that was promised but lacking in the original edition. Thanks largely to the help of our new co-author, Google Distinguished Engineer Jimmy Clidaras, the material on facility mechanical and power distribution design has been updated and greatly extended (see Chapters 4 and 5). Chapters 6 and 7 have also been revamped significantly. We hope this revised edition continues to meet the needs of educators and professionals in this area.
B4: Experience with a Globally Deployed Software Defined WAN
Sushant Jain
Joon Ong
Subbaiah Venkata
Jim Wanderer
Junlan Zhou
Min Zhu
Amin Vahdat
Proceedings of the ACM SIGCOMM Conference, Hong Kong, China (2013)
Brawny cores still beat wimpy cores, most of the time
IEEE MICRO (2010)
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Luiz André Barroso
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers (2009)
The Case for Energy-Proportional Computing
Luiz André Barroso
IEEE Computer, Vol. 40, No. 12 (2007)
In current servers, the lowest energy-efficiency region corresponds to their most common operating mode. Addressing this perfect mismatch will require significant rethinking of components and systems. To that end, we propose that energy proportionality should become a primary design goal. Energy-proportional designs would enable large energy savings in servers, potentially doubling their efficiency in real-life use. Achieving energy proportionality will require significant improvements in the energy usage profile of every system component, particularly the memory and disk subsystems. Although our experience in the server space motivates these observations, we believe that energy-proportional computing also will benefit other types of computing devices.
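The mismatch the abstract describes can be sketched numerically. The linear power model and the 50% idle-power figure below are my illustrative assumptions, not numbers from the paper: power grows linearly from an idle floor to peak, and efficiency is useful work delivered per unit of power.

```python
def efficiency(utilization, idle_fraction):
    """Relative efficiency of a server whose idle power is
    idle_fraction * peak power, under a linear power model:
    P(u) = idle_fraction + (1 - idle_fraction) * u, peak power = 1."""
    if utilization == 0:
        return 0.0
    power = idle_fraction + (1.0 - idle_fraction) * utilization
    return utilization / power

# A server that idles at ~50% of peak power (typical of the era) vs. an
# energy-proportional one, across the low-utilization band where servers
# spend most of their time.
for u in (0.1, 0.3, 0.5, 1.0):
    legacy = efficiency(u, idle_fraction=0.5)
    proportional = efficiency(u, idle_fraction=0.0)
    print(f"util={u:.0%}  legacy={legacy:.2f}  proportional={proportional:.2f}")
```

At 30% utilization the legacy server delivers roughly 0.46 of its peak efficiency while the proportional one stays at 1.0, which is the "potentially doubling their efficiency in real-life use" claim in miniature.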
The focus of our message is efficiency: power efficiency and programming efficiency. There are several hard technical problems surrounding power efficiency of computers, but we've found one that is actually not particularly challenging and could have a huge impact on the energy used by home computers and low-end servers: increasing power supply efficiency.
Monkey See, Monkey Do: A Tool for TCP Tracing and Replaying
Stefan Savage
Geoffrey M. Voelker
USENIX Annual Technical Conference, General Track (2004)
Web Search for a Planet: The Google Cluster Architecture
Luiz André Barroso
Jeffrey Dean
IEEE Micro, Vol. 23, No. 2 (2003)
Amenable to extensive parallelization, Google's Web search application lets different queries run on different processors and, by partitioning the overall index, also lets a single query use multiple processors. To handle this workload, Google's architecture features clusters of more than 15,000 commodity class PCs with fault-tolerant software. This architecture achieves superior performance at a fraction of the cost of a system built from fewer, but more expensive, high-end servers.
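The two levels of parallelism in the abstract can be sketched as follows. This is a toy illustration, not Google's implementation: the shard layout, document scores, and `search` helper are all hypothetical, chosen only to show how a partitioned index lets one query fan out across machines and merge ranked results.

```python
import heapq

# Hypothetical inverted-index shards, each of which would live on its own
# commodity machine: term -> [(doc_id, score), ...]
SHARDS = [
    {"cloud": [(101, 0.9), (102, 0.4)]},
    {"cloud": [(201, 0.7)], "server": [(202, 0.8)]},
    {"cloud": [(301, 0.6), (302, 0.5)]},
]

def search(term, top_k=3):
    # Fan out: each shard scans only its slice of the index. In the real
    # system this happens in parallel on separate machines, with replicated
    # shards providing both extra throughput and fault tolerance.
    partial = [shard.get(term, []) for shard in SHARDS]
    # Merge: combine per-shard hits into one globally ranked list.
    ranked = heapq.merge(
        *[sorted(hits, key=lambda h: -h[1]) for hits in partial],
        key=lambda h: -h[1],
    )
    return [doc for doc, _ in list(ranked)[:top_k]]

print(search("cloud"))  # top-scoring documents drawn from every shard
```

Different queries land on different replicas (inter-query parallelism), while a single query touches every shard at once (intra-query parallelism), which is why throughput scales with commodity machine count rather than per-machine speed.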