Bikash Koley
Bikash is currently the Vice President of Google Global Networking (GGN). His team is responsible for the design, development, build, and operation of Google’s massive global network, which every Google service and the Google Cloud Platform rely on. The GGN team develops the cutting-edge, intent-driven networking technologies that allow Google's global WAN to be zero touch; builds out some of the largest-scale SDN infrastructure ever deployed (B4, Espresso); scales Google's global CDN, which fuels YouTube and all other Google services; and expands the reach of Google's network by building the highest-capacity submarine cables and the most programmable terrestrial optical networks. This infrastructure is also the core network that underpins all Google Cloud services.
Prior to this, Bikash was the Executive Vice President and Chief Technology Officer of Juniper Networks. In this role, Bikash charted Juniper’s technology strategy and led the execution of the company’s critical technology innovations. Specifically, Bikash was responsible for Juniper’s telco cloud and virtualization, multicloud enterprise data center, and software-defined enterprise networking products and technologies.
Prior to Juniper, Bikash spent close to ten years at Google, where he was a Distinguished Engineer and the Head of Network Architecture, Engineering and Planning.
Bikash received a BTech from IIT, India, and MS and PhD degrees from the University of Maryland at College Park, all in Electrical Engineering.
Authored Publications
A Decentralized SDN Architecture for the WAN
Nitika Saran
Ashok Narayanan
Sylvia Ratnasamy
Ankit Singla
Hakim Weatherspoon
2024 ACM Special Interest Group on Data Communication (SIGCOMM) (2024)
Abstract
Motivated by our experiences operating a global WAN, we argue that SDN’s reliance on infrastructure external to the data plane has significantly complicated the challenge of maintaining high availability. We propose a new decentralized SDN (dSDN) architecture in which SDN control logic instead runs within routers, eliminating the control plane’s reliance on external infrastructure and restoring fate sharing between control and data planes.
We present dSDN as a simpler approach to realizing the benefits of SDN in the WAN. Despite its much simpler design, we show that dSDN is practical from an implementation viewpoint, and outperforms centralized SDN in terms of routing convergence and SLO impact.
Experiences with Modeling Network Topologies at Multiple Levels of Abstraction
Martin Pool
Xiaoxue Zhao
17th Symposium on Networked Systems Design and Implementation (NSDI) (2020)
Abstract
Network management is becoming increasingly automated, and automation depends on detailed, explicit representations of data about both the state of a network, and about an operator’s intent for its networks. In particular, we must explicitly represent the desired and actual topology of a network; almost all other network-management data either derives from its topology, constrains how to use a topology, or associates resources (e.g., addresses) with specific places in a topology.
We describe MALT, a Multi-Abstraction-Layer Topology representation, which supports virtually all of our network management phases: design, deployment, configuration, operation, measurement, and analysis. MALT provides interoperability across software systems, and its support for abstraction allows us to explicitly tie low-level network elements to high-level design intent. MALT supports a declarative style that simplifies what-if analysis and testbed support.
We also describe the software base that supports efficient use of MALT, as well as numerous, sometimes painful lessons we have learned about curating the taxonomy for a comprehensive, and evolving, representation for topology.
Taking the Edge off with Espresso: Scale, Reliability and Programmability for Global Internet Peering
Matthew Holliman
Gary Baldus
Marcus Hines
TaeEun Kim
Ashok Narayanan
Victor Lin
Colin Rice
Brian Rogan
Bert Tanaka
Manish Verma
Puneet Sood
Mukarram Tariq
Dzevad Trumic
Vytautas Valancius
Calvin Ying
Mahesh Kallahalla
SIGCOMM (2017)
Abstract
We present the design of Espresso, Google’s SDN-based Internet peering edge routing infrastructure. This architecture grew out of a need to exponentially scale the Internet edge cost-effectively and to enable application-aware routing at Internet-peering scale. Espresso utilizes commodity switches and host-based routing/packet processing to implement a novel fine-grained traffic engineering capability.
Overall, Espresso provides Google a scalable peering edge that is programmable, reliable, and integrated with global traffic systems. Espresso also greatly accelerated deployment of new networking features at our peering edge. Espresso has been in production for two years and serves over 22% of Google’s total traffic to the Internet.
The Zero Touch Network
International Conference on Network and Service Management (2016) (to appear)
Abstract
Large scale content and cloud infrastructure providers strive to offer the highest level of availability across the infrastructure stack. This however is not an easy feat given the fast pace of technology evolution, infrastructure expansion and global reach. Google’s network infrastructure has been built to achieve scale, efficiency and very high reliability by following a set of key architectural principles, which we refer to as the “zero touch network”. Failures do happen in any global scale network infrastructure such as Google’s. By analyzing past failures, we found that a large number of them happened when a network management operation was in progress. To minimize such failures, we have built a network infrastructure where all network operations are automated, requiring no additional steps beyond the instantiation of intent. The network infrastructure is fully declarative and changes applied to individual network elements are derived by the network infrastructure from the high-level network-wide intent. Any network changes are automatically halted and automatically rolled-back by the management infrastructure if the network displays unintended behavior. Finally, the infrastructure does not allow operations which violate network policies.
While it might be tempting to limit the rate at which the network evolves to minimize risk of network failures, we have internally come to the opposite conclusion. In a zero-touch network, continuous incremental evolution results in a more robust infrastructure than infrequent large changes do.
Abstract
Maintaining the highest levels of availability for content providers is challenging in the face of scale, network evolution, and complexity. Little, however, is known about the network failures large content providers are susceptible to, and what mechanisms they employ to ensure high availability. From a detailed analysis of over 100 high-impact failure events within Google’s network, encompassing many data centers and two WANs, we quantify several dimensions of availability failures. We find that failures are evenly distributed across different network types and across data, control, and management planes, but that a large number of failures happen when a network management operation is in progress within the network. We discuss some of these failures in detail, and also describe our design principles for high availability motivated by these failures. These include using defense in depth, maintaining consistency across planes, failing open on large failures, carefully preventing and avoiding failures, and assessing root cause quickly. Our findings suggest that, as networks become more complicated, failures lurk everywhere, and, counter-intuitively, continuous incremental evolution of the network can, when applied together with our design principles, result in a more robust network.
Capacity planning for the Google backbone network
Ajay Kumar Bangla
Ben Preskill
Christoph Albrecht
Emilie Danna
Xiaoxue Zhao
ISMP 2015 (International Symposium on Mathematical Programming) (to appear)
Abstract
Google operates one of the largest backbone networks in the world. In this talk, we present optimization and simulation techniques we use to design the network topology and provision its capacity to achieve conflicting objectives such as scale, cost, availability, and latency.
Software Defined Networking at Scale
Light Reading (2014), pp. 22
Abstract
Software Defined Networks require Software Defined Operations. Google has made great progress in the SDN data and control planes. This talk discusses how we are working with the industry to transform the network management plane into a software defined framework.
The Prospect of Inter-Data-Center Optical Networks
Xiaoxue Zhao
Vijay Vusirikala
Valey Kamalov
IEEE Communications Magazine, 51 (2013), pp. 32-38
Abstract
Mega data centers and their interconnection networks have drawn great attention in recent years because of the rapid public adoption of cloud-based services. The unprecedented amount of data that needs to be communicated between data centers imposes new requirements and challenges to inter-data-center optical networks. In this article, we discuss the traffic growth trends and capacity demands of Google’s inter-data-center network, and how they drive the network architectures and technologies to scale capacities and operational ease on existing fiber plants. We extensively review recent research findings and emerging technologies, such as digital coherent detection and the flexgrid dense wavelength-division multiplexed channel plan, and propose practical implementations, such as C+L-band transmission, packet and optical layer integration, and a software-defined networking enabled network architecture for both capacity and operational scaling. In addition, we point out a few critical areas that require more attention and research to improve efficiency and flexibility of an inter-data-center optical network: optical regeneration, data rate mismatch between Ethernet and optical transport, and real-time optical performance monitoring.
Drivers and applications of optical technologies for Internet Data Center networks
Paul Schultz
Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC) (2011), pp. 1-3
Abstract
The rise of large-scale Data Centers to power the Internet infrastructure is driving new architectural directions for optical networking. This paper highlights these architectural options and discusses technology building blocks for scaling inter-Datacenter connectivity.