Assessing Web Fingerprinting Risk
Abstract
Modern Web APIs allow developers to provide extensively customized experiences for website visitors, but the richness of the device information they provide also make them vulnerable to being abused by malign actors to construct browser fingerprints, device-specific identifiers that enable covert tracking of users even when cookies are disabled.
Previous research has established entropy, a measure of information, as the key metric for quantifying fingerprinting risk. Earlier studies that estimated the entropy of Web APIs were based on data from a single website or were limited to an extremely small sample of clients. They also analyzed each Web API separately and then summed their entropies to quantify overall fingerprinting risk, an approach that can lead to gross overestimates.
We provide the first study of browser fingerprinting which addresses the limitations of prior work. Our study is based on actual visited pages and Web API function calls reported by tens of millions of real Chrome browsers in-the-wild. We accounted for the dependencies and correlations among Web APIs, which is crucial for obtaining more realistic entropy estimates. We also developed a novel experimental design that accurately estimates entropy while never observing too much information from any single user. Our results provide an understanding of the distribution of entropy for different website categories, confirm the utility of entropy as a fingerprinting proxy, and offer a method for evaluating browser enhancements which are intended to mitigate fingerprinting.
Previous research has established entropy, a measure of information, as the key metric for quantifying fingerprinting risk. Earlier studies that estimated the entropy of Web APIs were based on data from a single website or were limited to an extremely small sample of clients. They also analyzed each Web API separately and then summed their entropies to quantify overall fingerprinting risk, an approach that can lead to gross overestimates.
We provide the first study of browser fingerprinting which addresses the limitations of prior work. Our study is based on actual visited pages and Web API function calls reported by tens of millions of real Chrome browsers in-the-wild. We accounted for the dependencies and correlations among Web APIs, which is crucial for obtaining more realistic entropy estimates. We also developed a novel experimental design that accurately estimates entropy while never observing too much information from any single user. Our results provide an understanding of the distribution of entropy for different website categories, confirm the utility of entropy as a fingerprinting proxy, and offer a method for evaluating browser enhancements which are intended to mitigate fingerprinting.