Google Research

Network Error Logging: Client-side measurement of end-to-end web service reliability

17th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2020

Abstract

We present NEL (Network Error Logging), Google’s planet-scale, client-side network reliability measurement system. NEL is implemented in Chrome and has been proposed as a new W3C standard, letting any web site operator collect reports of clients’ successful and failed requests to their sites. These reports are similar to web server logs, but include information about failed requests that never reach serving infrastructure. Reports are uploaded via redundant failover paths, reducing the likelihood of shared-fate failures of report uploads. We have used NEL to monitor all of Google’s domains since 2014, allowing us to detect and investigate instances of DNS hijacking, BGP route leaks, protocol deployment bugs, and other problems where packets might never reach our servers. This paper presents the design of NEL, case studies of real outages, and deployment lessons for other operators who choose to use NEL to monitor their traffic.
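As a rough illustration of how a site operator might opt in to the reporting described in the abstract, the sketch below builds the two response headers that, to our understanding of the W3C Network Error Logging and Reporting API drafts, configure a NEL policy: `Report-To` names a group of collector endpoints, and `NEL` points at that group. The helper name `build_nel_headers`, the group name, the default values, and the collector URLs are all hypothetical and do not come from the paper; listing several endpoints with increasing `priority` values is one way to approximate the redundant failover upload paths the abstract mentions.

```python
import json


def build_nel_headers(endpoints, group="nel", max_age=86400,
                      success_fraction=0.0, failure_fraction=1.0):
    """Return a dict of HTTP response headers for a NEL policy.

    Field names follow the W3C NEL and Reporting API drafts as we
    understand them; all values here are illustrative assumptions.
    """
    # Report-To: defines a named endpoint group. Lower priority numbers
    # are tried first, so later URLs act as failover collectors.
    report_to = {
        "group": group,
        "max_age": max_age,
        "endpoints": [
            {"url": url, "priority": index + 1}
            for index, url in enumerate(endpoints)
        ],
    }
    # NEL: the policy itself, referring to the group by name and giving
    # sampling rates for successful and failed requests.
    nel = {
        "report_to": group,
        "max_age": max_age,
        "success_fraction": success_fraction,
        "failure_fraction": failure_fraction,
    }
    return {"Report-To": json.dumps(report_to), "NEL": json.dumps(nel)}


if __name__ == "__main__":
    # Hypothetical collectors on separate domains, so that an outage of
    # the monitored site is less likely to also block report uploads.
    headers = build_nel_headers([
        "https://collector-a.example/upload",
        "https://collector-b.example/upload",
    ])
    for name, value in headers.items():
        print(f"{name}: {value}")
```

Hosting the collectors on infrastructure separate from the monitored site is the design point here: a report about an unreachable site is only useful if the upload path does not share the same failure.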
