Bikash Koley

Bikash is currently the Vice President of Google Global Networking (GGN). Bikash’s team is responsible for the design, development, build, and operation of Google’s massive global network, which every Google service and the Google Cloud Platform rely upon. The GGN team develops the cutting-edge, intent-driven networking technologies that allow Google's global WAN to be zero touch; builds out some of the largest-scale SDN infrastructure ever deployed (B4, Espresso); scales Google's global CDN, which fuels YouTube and all other Google services; and expands the reach of Google's network by building the highest-capacity submarine cables and the most programmable terrestrial optical networks. This infrastructure is also the core network that underpins all Google Cloud services. Prior to this, Bikash was the Executive Vice President and Chief Technology Officer of Juniper Networks. In this role, Bikash charted Juniper’s technology strategy and led the execution of the company’s critical technology innovations. Specifically, Bikash was responsible for Juniper’s telco cloud and virtualization, multicloud enterprise datacenter, and software-defined enterprise networking products and technologies. Prior to Juniper, Bikash spent close to ten years at Google, where he was a Distinguished Engineer and the Head of Network Architecture, Engineering and Planning. Bikash received a BTech from IIT, India, and MS and PhD degrees from the University of Maryland at College Park, all in Electrical Engineering.
Authored Publications
    Network management is becoming increasingly automated, and automation depends on detailed, explicit representations of data about both the state of a network, and about an operator’s intent for its networks. In particular, we must explicitly represent the desired and actual topology of a network; almost all other network-management data either derives from its topology, constrains how to use a topology, or associates resources (e.g., addresses) with specific places in a topology. We describe MALT, a Multi-Abstraction-Layer Topology representation, which supports virtually all of our network management phases: design, deployment, configuration, operation, measurement, and analysis. MALT provides interoperability across software systems, and its support for abstraction allows us to explicitly tie low-level network elements to high-level design intent. MALT supports a declarative style that simplifies what-if analysis and testbed support. We also describe the software base that supports efficient use of MALT, as well as numerous, sometimes painful lessons we have learned about curating the taxonomy for a comprehensive, and evolving, representation for topology.
    Taking the Edge off with Espresso: Scale, Reliability and Programmability for Global Internet Peering
    Matthew Holliman
    Gary Baldus
    Marcus Hines
    TaeEun Kim
    Ashok Narayanan
    Victor Lin
    Colin Rice
    Brian Rogan
    Bert Tanaka
    Manish Verma
    Puneet Sood
    Mukarram Tariq
    Dzevad Trumic
    Vytautas Valancius
    Calvin Ying
    Mahesh Kallahalla
    Sigcomm (2017)
    We present the design of Espresso, Google’s SDN-based Internet peering edge routing infrastructure. This architecture grew out of a need to exponentially scale the Internet edge cost-effectively and to enable application-aware routing at Internet-peering scale. Espresso utilizes commodity switches and host-based routing/packet processing to implement a novel fine-grained traffic engineering capability. Overall, Espresso provides Google a scalable peering edge that is programmable, reliable, and integrated with global traffic systems. Espresso also greatly accelerated deployment of new networking features at our peering edge. Espresso has been in production for two years and serves over 22% of Google’s total traffic to the Internet.
    The Zero Touch Network
    International Conference on Network and Service Management (2016) (to appear)
    Large-scale content and cloud infrastructure providers strive to offer the highest level of availability across the infrastructure stack. This, however, is not an easy feat given the fast pace of technology evolution, infrastructure expansion and global reach. Google’s network infrastructure has been built to achieve scale, efficiency and very high reliability by following a set of key architectural principles, which we refer to as the “zero touch network”. Failures do happen in any global-scale network infrastructure such as Google’s. By analyzing past failures, we found that a large number of them happened when a network management operation was in progress. To minimize such failures, we have built a network infrastructure where all network operations are automated, requiring no additional steps beyond the instantiation of intent. The network infrastructure is fully declarative, and changes applied to individual network elements are derived by the network infrastructure from the high-level network-wide intent. Any network changes are automatically halted and rolled back by the management infrastructure if the network displays unintended behavior. Finally, the infrastructure does not allow operations which violate network policies. While it might be tempting to limit the rate at which the network evolves to minimize risk of network failures, we have internally come to the opposite conclusion. In a zero touch network, continuous incremental evolution, rather than infrequent large changes, results in a more robust infrastructure.
    Evolve or Die: High-Availability Design Principles Drawn from Google's Network Infrastructure
    Ramesh Govindan
    Ina Minei
    Mahesh Kallahalla
    Amin Vahdat
    ACM SIGCOMM (2016)
    Maintaining the highest levels of availability for content providers is challenging in the face of scale, network evolution, and complexity. Little, however, is known about the network failures large content providers are susceptible to, and what mechanisms they employ to ensure high availability. From a detailed analysis of over 100 high-impact failure events within Google’s network, encompassing many data centers and two WANs, we quantify several dimensions of availability failures. We find that failures are evenly distributed across different network types and across data, control, and management planes, but that a large number of failures happen when a network management operation is in progress within the network. We discuss some of these failures in detail, and also describe our design principles for high availability motivated by these failures. These include using defense in depth, maintaining consistency across planes, failing open on large failures, carefully preventing and avoiding failures, and assessing root cause quickly. Our findings suggest that, as networks become more complicated, failures lurk everywhere, and, counter-intuitively, continuous incremental evolution of the network can, when applied together with our design principles, result in a more robust network.
    Capacity planning for the Google backbone network
    Ajay Kumar Bangla
    Ben Preskill
    Christoph Albrecht
    Emilie Danna
    Xiaoxue Zhao
    ISMP 2015 (International Symposium on Mathematical Programming) (to appear)
    Google operates one of the largest backbone networks in the world. In this talk, we present optimization and simulation techniques we use to design the network topology and provision its capacity to achieve conflicting objectives such as scale, cost, availability, and latency.
    Software Defined Networks require Software Defined Operations. Google has made great progress in the SDN data and control planes. This talk discusses how we are working with the industry to transform the network management plane into a software-defined framework.
    The Prospect of Inter-Data-Center Optical Networks
    Xiaoxue Zhao
    Valey Kamalov
    IEEE Communications Magazine, vol. 51 (2013), pp. 32-38
    Mega data centers and their interconnection networks have drawn great attention in recent years because of the rapid public adoption of cloud-based services. The unprecedented amount of data that needs to be communicated between data centers imposes new requirements and challenges to inter-data-center optical networks. In this article, we discuss the traffic growth trends and capacity demands of Google’s inter-data-center network, and how they drive the network architectures and technologies to scale capacities and operational ease on existing fiber plants. We extensively review recent research findings and emerging technologies, such as digital coherent detection and the flexgrid dense wavelength-division multiplexed channel plan, and propose practical implementations, such as C+L-band transmission, packet and optical layer integration, and a software-defined networking enabled network architecture for both capacity and operational scaling. In addition, we point out a few critical areas that require more attention and research to improve efficiency and flexibility of an inter-data-center optical network: optical regeneration, data rate mismatch between Ethernet and optical transport, and real-time optical performance monitoring.
    Drivers and applications of optical technologies for Internet Data Center networks
    Paul Schultz
    Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2011, pp. 1-3
    The rise of large-scale Data Centers to power the Internet infrastructure is driving new architectural directions for optical networking. This paper highlights these architectural options and discusses technology building blocks for scaling inter-Datacenter connectivity.
    Field verification of 40G DPSK upgrade in a legacy 10G network
    Valey Kamalov
    Xiaoxue Zhao
    Optical Fiber Communication, IEEE (2010), NTuC2
    We report verification of a 1,200 km field upgrade of 10 G NRZ wavelengths with 40 G DPSK channels. A non-symmetric dispersion map results in a pronounced intra-channel nonlinear effect, which could be significantly reduced by dispersion pre-compensation.
    100GbE and Beyond for Warehouse Scale Computing
    Vijay Gill
    OptoElectronics and Communications Conference (OECC) Technical Digest (2010), pp. 106-107
    As computation and storage continue to move from desktops to large internet services, computing platforms running such services are transforming into warehouse-scale computers. 100 Gigabit Ethernet and beyond will be instrumental in scaling the interconnection within and between these ubiquitous warehouse-scale computing infrastructures. In this paper, we describe the drivers for such interfaces and some methods of scaling Ethernet interfaces to speeds beyond 100GbE.
    Fiber Optic Communication Technologies: What’s Needed for Datacenter Network Operations
    Xiaoxue Zhao
    Valey Kamalov
    Vijay Gill
    IEEE Communications Magazine, vol. 48, no. 7 (2010)
    The authors review the growing trend of warehouse-scale mega-datacenter computing, the Internet transformation driven by mega-datacenter applications, and the opportunities and challenges for fiber optic communication technologies to support the growth of mega-datacenter computing in the next three to four years.