Igor Bilogrevic
I am a Staff Research Scientist and research lead. I work on applied machine learning to build novel privacy and security features in our products. I have a PhD in applied cryptography and machine learning for privacy-enhancing technologies from EPFL.
Previously, I worked in collaboration with the Nokia Research Center on privacy challenges in pervasive mobile networks, encompassing data, location and information-sharing privacy. I've spent a summer at PARC (a Xerox Company), conducting research on topics related to private data analytics. I am a co-inventor on several patents filed by Nokia, PARC and Google.
I am interested in several domains that are related to the applications of machine learning and AI to privacy and security, such as web browser privacy and contextual intelligence.
Authored Publications
Shorts vs. Regular Videos on YouTube: A Comparative Analysis of User Engagement and Content Creation Trends
Caroline Violot
Tugrulcan Elmais
Mathias Humbert
ACM Web Science Conference 2024 (WEBSCI24) (2024)
YouTube introduced the Shorts video format in 2021, allowing users to upload short videos that are prominently displayed on its website and app. Despite this large visual footprint, no studies to date have examined the impact that the introduction of Shorts had on the production and consumption of content on YouTube.
This paper presents the first comparative analysis of YouTube Shorts versus regular videos with respect to user engagement (i.e., views, likes, and comments), content creation frequency and video categories. We collected a dataset containing information about 70k channels that posted at least one Short, and we analyzed the metadata of all the videos (9.9M Shorts and 6.9M regular videos) they uploaded between January 2021 and December 2022, spanning a two-year period including the introduction of Shorts. Our longitudinal analysis shows that content creators consistently increased the frequency of Shorts production over this period, especially for newly-created channels, where it surpassed that of regular videos. We also observe that Shorts target mostly entertainment categories, while regular videos cover a wide variety of categories. In general, Shorts attract more views and likes per view than regular videos, but attract fewer comments per view. However, Shorts do not outperform regular videos in the education and political categories as much as they do in other categories.
Our study contributes to understanding social media dynamics, to quantifying the spread of short-form content, and to motivating future research on its impact on society.
Assessing Web Fingerprinting Risk
Robert Busa-Fekete
Antonio Sartori
Proceedings of the ACM Web Conference (WWW 2024)
Modern Web APIs allow developers to provide extensively customized experiences for website visitors, but the richness of the device information they provide also makes them vulnerable to being abused by malign actors to construct browser fingerprints, device-specific identifiers that enable covert tracking of users even when cookies are disabled.
Previous research has established entropy, a measure of information, as the key metric for quantifying fingerprinting risk. Earlier studies that estimated the entropy of Web APIs were based on data from a single website or were limited to an extremely small sample of clients. They also analyzed each Web API separately and then summed their entropies to quantify overall fingerprinting risk, an approach that can lead to gross overestimates.
We provide the first study of browser fingerprinting which addresses the limitations of prior work. Our study is based on actual visited pages and Web API function calls reported by tens of millions of real Chrome browsers in-the-wild. We accounted for the dependencies and correlations among Web APIs, which is crucial for obtaining more realistic entropy estimates. We also developed a novel experimental design that accurately estimates entropy while never observing too much information from any single user. Our results provide an understanding of the distribution of entropy for different website categories, confirm the utility of entropy as a fingerprinting proxy, and offer a method for evaluating browser enhancements which are intended to mitigate fingerprinting.
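The overestimation caused by summing per-API entropies can be illustrated with a minimal sketch. The client population and the two attributes below (platform, touch_points) are hypothetical and are not taken from the paper; the code only shows the general point that, for correlated attributes, the joint Shannon entropy can be much lower than the sum of the individual entropies.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical population: each client reports (platform, touch_points).
# The two attributes are perfectly correlated in this toy data set.
clients = [("desktop", 0), ("desktop", 0), ("mobile", 5), ("mobile", 5)]

h_platform = entropy([platform for platform, _ in clients])
h_touch = entropy([touch for _, touch in clients])

# Naive estimate: treat the attributes as independent and add their entropies.
h_summed = h_platform + h_touch

# Joint estimate: measure the entropy of the combined observation.
h_joint = entropy(clients)

print(f"summed: {h_summed} bits, joint: {h_joint} bits")
```

In this toy population the second attribute adds no identifying information once the first is known, so the summed estimate is twice the joint one.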
View details
Preview abstract
Browser fingerprinting is often associated with cross-site user tracking, a practice that many browsers (e.g., Safari, Brave, Edge, Firefox and Chrome) want to block. However, less is publicly known about its uses to enhance online safety, where it can provide an additional security layer against service abuses (e.g., in combination with CAPTCHAs) or during user authentication. To the best of our knowledge, no fingerprinting defenses deployed thus far consider this important distinction when blocking fingerprinting attempts, so they might negatively affect website functionality and security.
To address this issue, we make three main contributions. First, we propose and evaluate a novel machine learning-based method to automatically identify authentication pages (i.e., sign-in and sign-up pages). Our algorithm -- which relies on a hybrid unsupervised/supervised approach -- achieves 96-98% precision and recall on a large, manually-labelled dataset of 10,000 popular sites. Second, we compare our algorithm with other methods from prior works on the same dataset, showing that it significantly outperforms all of them (+83% F1-score). Third, we quantify the prevalence of fingerprinting scripts across sign-in and sign-up pages (9.2%) versus those executed on other pages (8.9%); while the rates of fingerprinting are similar, home pages and authentication pages differ in the third-party scripts they include and how often these scripts are labeled as tracking. We also highlight the substantial differences in fingerprinting behavior on login and sign-up pages.
Our work sheds light on the complicated reality that fingerprinting is used to both protect user security and invade user privacy, and that this dual nature must be considered by fingerprinting mitigations.
FP-Fed: Privacy-Preserving Federated Detection of Browser Fingerprinting
Meenatchi Sundaram Muthu Selva Annamalai
Emiliano De Cristofaro
Network and Distributed System Security (NDSS) Symposium (2024)
Browser fingerprinting often provides an attractive alternative to third-party cookies for tracking users across the web. In fact, the increasing restrictions on third-party cookies placed by common web browsers and recent regulations like the GDPR may accelerate the transition. To counter browser fingerprinting, previous work proposed several techniques to detect its prevalence and severity. However, these rely on 1) centralized web crawls and/or 2) computationally intensive operations to extract and process signals (e.g., information-flow and static analysis).
To address these limitations, we present FP-Fed, the first distributed system for browser fingerprinting detection. Using FP-Fed, users can collaboratively train on-device models based on their real browsing patterns, without sharing their training data with a central entity, by relying on Differentially Private Federated Learning (DP-FL). To demonstrate its feasibility and effectiveness, we evaluate FP-Fed’s performance on a set of 18.3k popular websites with different privacy levels, numbers of participants, and features extracted from the scripts. Our experiments show that FP-Fed achieves reasonably high detection performance and can perform both training and inference efficiently, on-device, by only relying on runtime signals extracted from the execution trace, without requiring any resource-intensive operation.
Don’t Interrupt Me – A Large-Scale Study of On-Device Permission Prompt Quieting in Chrome
Marian Harbach
Ravjit Uppal
Andy Paicu
Elias Klim
Balazs Engedy
(2024)
A recent large-scale experiment conducted by Chrome has demonstrated that a "quieter" web permission prompt can reduce unwanted interruptions while only marginally affecting grant rates. However, the experiment and the partial roll-out were missing two important elements: (1) an effective and context-aware activation mechanism for such a quieter prompt, and (2) an analysis of user attitudes and sentiment towards such an intervention. In this paper, we address these two limitations by means of a novel ML-based activation mechanism -- and its real-world on-device deployment in Chrome -- and a large-scale user study with 13.1k participants from 156 countries. First, the telemetry-based results, computed on more than 20 million samples from Chrome users in-the-wild, indicate that the novel on-device ML-based approach is both extremely precise (>99% post-hoc precision) and has very high coverage (96% recall for notifications permission). Second, our large-scale, in-context user study shows that quieting is often perceived as helpful and does not cause high levels of unease for most respondents.
"Shhh...be Quiet!" Reducing the Unwanted Interruptions of Notification Permission Prompts on Chrome
Balazs Engedy
Jud Porter
Kamila Hasanbega
Andrew Paseltiner
Hwi Lee
Edward Jung
PJ McLachlan
Jason James
30th USENIX Security Symposium (USENIX Security 21), USENIX Association, Vancouver, B.C. (2021)
Push notifications are an extremely useful feature. In web browsers, they allow users to receive timely updates even if the website is not currently open. On Chrome, the feature has become extremely popular since its inception in 2015, but it is also the least likely to be accepted by users. Our telemetry shows that, although 74% of all permission prompts are about notifications, they are also the least likely to be granted with only a 10% grant rate on desktop and 21% grant rate on Android. In order to preserve its utility for the websites and to reduce unwanted interruptions for the users, we designed and tested a new UI for notification permission prompt on Chrome.
In this paper, we conduct two large-scale studies of Chrome users' interactions with the notifications permission prompt in the wild, in order to understand how users interact with such prompts and to evaluate a novel design that we introduced in Chrome version 80 in February 2020. Our main goal for the redesigned UI is to reduce the unwanted interruptions due to notification permission prompts for Chrome users and the frequency at which users have to suppress them, and to make it easier to change a previously made choice.
Our results, based on an A/B test using behavioral data from more than 40 million users who interacted with more than 100 million prompts on more than 70 thousand websites, show that the new UI is very effective at reducing the unwanted interruptions and their frequency (up to 30% fewer unnecessary actions on the prompts), with a minimal impact (less than 5%) on the grant rates, across all types of users and websites. We achieve these results thanks to a novel adaptive activation mechanism coupled with a block list of interrupting websites, which is derived from crowd-sourced telemetry from Chrome clients.
Nothing Standard About It: An Analysis of Minimum Security Standards in Organizations
Jake Weidman
Jens Grossklags
ESORICS 2020, Computer Security, Springer International Publishing, pp. 263-282
Written security policies are an important part of the complex set of measures to protect organizations from adverse events. However, research detailing these policies and their effectiveness is comparatively sparse. We tackle this research gap by conducting an analysis of a specific user-oriented sub-component of a full information security policy, the Minimum Security Standard.
Specifically, we conduct an analysis of 29 publicly accessible minimum security standard documents from U.S. academic institutions. We study the prevalence of an extensive set of user-oriented provisions across these statements such as who is being addressed, whether the standard is considered binding and how it is being enforced, and which specific procedures and practices for users are introduced. We demonstrate significant diversity in focus, style and comprehensiveness in this sample of minimum security standards and discuss their significance within the overall security landscape of organizations.
Reducing Permission Requests in Mobile Apps
Martin Pelikan
Giles Hogben
Proceedings of ACM Internet Measurement Conference (IMC) (2019)
Users of mobile apps sometimes express discomfort or concerns with what they see as unnecessary or intrusive permission requests by certain apps. However, encouraging mobile app developers to request fewer permissions is challenging because there are many reasons why permissions are requested; furthermore, prior work has shown it is hard to disambiguate the purpose of a particular permission with high certainty. In this work we describe a novel, algorithmic mechanism intended to discourage mobile-app developers from asking for unnecessary permissions. Developers are incentivized by an automated alert, or "nudge", shown in the Google Play Console when their apps ask for permissions that are requested by very few functionally-similar apps (in other words, by their competition). Empirically, this incentive is effective, with significant developer response since its deployment. Permissions have been redacted by 59% of apps that were warned, and this attenuation has occurred broadly across both app categories and app popularity levels. Importantly, billions of users' app installs from Google Play have benefited from these redactions.
Towards Usable Checksums: Automating Web Downloads Verification for the Masses
Alexandre Meylan
Bertil Chapuis
Kevin Huguenin
Mathias Humbert
Mauro Cherubini
ACM CCS (2018)
Internet users can download software for their computers from app stores (e.g., Mac App Store and Windows Store) or from other sources, such as the developers' websites. Most Internet users in the US rely on the latter, according to our representative study, which makes them directly responsible for the content they download. To enable users to detect if the downloaded files have been corrupted, developers can publish a checksum together with the link to the program file; users can then manually verify that the checksum matches the one they obtain from the downloaded file.
In this paper, we assess the prevalence of such behavior among the general Internet population in the US (N=2,000), and we develop easy-to-use tools for users and developers to automate both the process of checksum verification and generation. Specifically, we propose an extension to the recent W3C specification for sub-resource integrity in order to provide integrity protection for download links. Also, we develop an extension for the popular Chrome browser that computes and verifies checksums of downloaded files automatically, and an extension for the WordPress CMS that developers can use to easily attach checksums to their remote content. Our in situ experiments with 40 participants demonstrate the usability and effectiveness issues of checksum verification, and show user desirability for our extension.
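The verification step that such tools automate can be sketched as follows. This is only an illustration under stated assumptions, not the extension described in the paper: it assumes the published checksum is a hex-encoded SHA-256 digest, and the helper names sha256_of_file and matches_published_checksum are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare the file's digest with the checksum published by the developer."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)


# Demo with a stand-in for a downloaded file (contents are made up).
payload = b"example installer bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

published = hashlib.sha256(payload).hexdigest()  # what a developer would publish
ok = matches_published_checksum(demo_path, published)
tampered = matches_published_checksum(demo_path, "0" * 64)  # wrong checksum
os.remove(demo_path)

print(ok, tampered)
```

Reading the file in chunks keeps memory use constant for large downloads; any other hash algorithm supported by hashlib could be substituted if the developer publishes a different kind of checksum.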
Privacy in Geospatial Applications and Location-Based Social Networks
Handbook of Mobile Data Privacy, Springer (2018), pp. 195-228
The use of location data has greatly benefited from the availability of location-based services, the popularity of social networks, and the accessibility of public location data sets. However, in addition to providing users with the ability to obtain accurate driving directions or the convenience of geo-tagging friends and pictures, location is also a very sensitive type of data, as attested by more than a decade of research on different aspects of privacy related to location data.
In this chapter, we focus on two domains that rely on location data as their core component: Geospatial applications (such as thematic maps and crowdsourced geo-information) and location-based social networks. We discuss the increasing relevance of geospatial applications to the current location-aware services, and we describe relevant concepts such as volunteered geographic information, geo-surveillance and how they relate to privacy. Then, we focus on a subcategory of geospatial applications, location-based social networks, and we introduce the different entities (such as users, services and providers) that are involved in such networks, and we characterize their role and interactions. We present the main privacy challenges and we discuss the approaches that have been proposed to mitigate privacy risks in location-based social networks. Finally, we conclude with a discussion of open research questions and promising directions that will contribute to improving privacy for users of location-based social networks.