- Ben Jones
- Brian Rogan
- Charles Stahl
- Douglas Creager
- Harsha V. Madhyastha
- Ilya Grigorik
- Julia Elizabeth Tuttle
- Lily Chen
- Misha Efimov
- Pavlos Papageorge
- Sam Burnett
Abstract
We present NEL (Network Error Logging), Google’s planet scale, client-side, network reliability measurement system. NEL is implemented in Chrome and has been proposed as a new W3C standard, letting any web site operator collect reports of clients’ successful and failed requests to their sites. These reports are similar to web server logs, but include information about failed requests that never reach serving infrastructure. Reports are uploaded via redundant failover paths, reducing the likelihood of shared-fate failures of report uploads. We have used NEL to monitor all of Google’s domains since 2014, allowing us to detect and investigate instances of DNS hijacking, BGP route leaks, protocol deployment bugs, and other problems where packets might never reach our servers. This paper presents the design of NEL, case studies of real outages, and deployment lessons for other operators who choose to use NEL to monitor their traffic.
Research Areas
Learn more about how we do research
We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work