Elie Bursztein
I lead Google's anti-abuse research team, which invents ways to protect users against cyber-criminal activities and Internet threats. I've redesigned Google's CAPTCHA to make it easier, and I've made Chrome safer and faster by implementing better cryptography. I spend my spare time doing video game research, photography, and magic tricks. I was born in Paris, France, wear berets, and now live with my wife in Mountain View, California.
Authored Publications
Hybrid Post-Quantum Signatures in Hardware Security Keys
Diana Ghinea
Jennifer Pullman
Julien Cretin
Rafael Misoczki
Stefan Kölbl
Applied Cryptography and Network Security Workshop (2023)
Recent advances in quantum computing are increasingly jeopardizing the security of cryptosystems currently in widespread use, such as RSA or elliptic-curve signatures. To address this threat, researchers and standardization institutes have accelerated the transition to quantum-resistant cryptosystems, collectively known as Post-Quantum Cryptography (PQC). These PQC schemes present new challenges due to their larger memory and computational footprints and their higher chance of latent vulnerabilities.
In this work, we address these challenges by introducing a scheme to upgrade the digital signatures used by security keys to PQC. We introduce a hybrid digital signature scheme based on two building blocks: a classically-secure scheme, ECDSA, and a post-quantum secure one, Dilithium.
Our hybrid scheme maintains the guarantees of each underlying building block even if the other one is broken, thus being resistant to classical and quantum attacks.
We experimentally show that our hybrid signature scheme can successfully execute on current security keys, even though secure PQC schemes are known to require substantial resources.
We publish an open-source implementation of our scheme at https://github.com/google/OpenSK/releases/tag/hybrid-pqc so that other researchers can reproduce our results on an nRF52840 development kit.
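The hybrid idea in this abstract can be illustrated with a minimal sketch. It assumes the simplest composition, in which both schemes sign the same message and verification requires both signatures to check out; the abstract does not spell out the exact construction, so this is an illustration rather than the paper's scheme. The two "schemes" below are keyed-hash stand-ins so the example runs with the standard library only. They are not ECDSA or Dilithium and are not real signature schemes, and the names `Scheme`, `hybrid_sign`, and `hybrid_verify` are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Scheme:
    """A (sign, verify) pair; stand-in for a real signature scheme."""
    sign: Callable[[bytes, bytes], bytes]
    verify: Callable[[bytes, bytes, bytes], bool]


def toy_scheme(label: bytes) -> Scheme:
    """Keyed-hash placeholder. NOT a real signature scheme (the key is symmetric)."""
    def sign(key: bytes, msg: bytes) -> bytes:
        return hmac.new(key, label + msg, hashlib.sha256).digest()

    def verify(key: bytes, msg: bytes, sig: bytes) -> bool:
        return hmac.compare_digest(sign(key, msg), sig)

    return Scheme(sign, verify)


def hybrid_sign(classical: Scheme, pq: Scheme,
                keys: Tuple[bytes, bytes], msg: bytes) -> bytes:
    """Sign the same message with both schemes and concatenate the results."""
    sig_c = classical.sign(keys[0], msg)
    sig_pq = pq.sign(keys[1], msg)
    # A 2-byte length prefix lets the verifier split the two signatures.
    return len(sig_c).to_bytes(2, "big") + sig_c + sig_pq


def hybrid_verify(classical: Scheme, pq: Scheme,
                  keys: Tuple[bytes, bytes], msg: bytes, sig: bytes) -> bool:
    """Accept only if BOTH component signatures verify."""
    if len(sig) < 2:
        return False
    n = int.from_bytes(sig[:2], "big")
    sig_c, sig_pq = sig[2:2 + n], sig[2 + n:]
    ok_c = classical.verify(keys[0], msg, sig_c)
    ok_pq = pq.verify(keys[1], msg, sig_pq)
    return ok_c and ok_pq


classical = toy_scheme(b"classical-stand-in")
post_quantum = toy_scheme(b"pq-stand-in")
keys = (b"key-for-classical", b"key-for-pq")
signature = hybrid_sign(classical, post_quantum, keys, b"authenticate me")
```

Because verification is a logical AND, a forger has to defeat both components, which is the intuition behind the claim that the hybrid stays secure if only one building block is broken.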
SoK: Hate, Harassment, and the Changing Landscape of Online Abuse
Devdatta Akhawe
Michael Bailey
Dan Boneh
Nicola Dell
Zakir Durumeric
Patrick Gage Kelley
Deepak Kumar
Damon McCoy
Sarah Meiklejohn
Thomas Ristenpart
Gianluca Stringhini
(2021)
We argue that existing security, privacy, and anti-abuse protections fail to address the growing threat of online hate and harassment. In order for our community to understand and address this gap, we propose a taxonomy for reasoning about online hate and harassment. Our taxonomy draws on over 150 interdisciplinary research papers that cover disparate threats ranging from intimate partner violence to coordinated mobs. In the process, we identify seven classes of attacks (such as toxic content and surveillance) that each stem from different attacker capabilities and intents. We also provide longitudinal evidence from a three-year survey that hate and harassment is a pervasive, growing experience for online users, particularly for at-risk communities like young adults and people who identify as LGBTQ+. Responding to each class of hate and harassment requires a unique strategy, and we highlight five such potential research directions that ultimately empower individuals, communities, and platforms to do so.
Designing Toxic Content Classification for a Diversity of Perspectives
Deepak Kumar
Patrick Gage Kelley
Joshua Mason
Zakir Durumeric
Michael Bailey
(2021)
In this work, we demonstrate how existing classifiers for identifying toxic comments online fail to generalize to the diverse concerns of Internet users. We survey 17,280 participants to understand how user expectations for what constitutes toxic content differ across demographics, beliefs, and personal experiences. We find that groups historically at risk of harassment (such as people who identify as LGBTQ+ or young adults) are more likely to flag a random comment drawn from Reddit, Twitter, or 4chan as toxic, as are people who have personally experienced harassment in the past. Based on our findings, we show how current one-size-fits-all toxicity classification algorithms, like the Perspective API from Jigsaw, can improve in accuracy by 86% on average through personalized model tuning. Ultimately, we highlight current pitfalls and new design directions that can improve the equity and efficacy of toxic content classifiers for all users.
“Why wouldn’t someone think of democracy as a target?”: Security practices & challenges of people involved with U.S. political campaigns
Patrick Gage Kelley
Tara Matthews
Lee Carosi Dunn
Proceedings of the USENIX Security Symposium (2021)
People who are involved with political campaigns face increased digital security threats from well-funded, sophisticated attackers, especially nation-states. Improving political campaign security is a vital part of protecting democracy. To identify campaign security issues, we conducted qualitative research with 28 participants across the U.S. political spectrum to understand the digital security practices, challenges, and perceptions of people involved in campaigns. A main, overarching finding is that a unique combination of threats, constraints, and work culture leads people involved with political campaigns to use technologies from across platforms and domains in ways that leave them, and democracy, vulnerable to security attacks. Sensitive data was kept in a plethora of personal and work accounts, with ad hoc adoption of strong passwords, two-factor authentication, encryption, and access controls. No individual company, committee, organization, campaign, or academic institution can solve the identified problems on its own. To this end, we provide an initial understanding of this complex problem space and recommendations for how a diverse group of experts can begin working together to improve security for political campaigns.
Who is targeted by email-based phishing and malware? Measuring factors that differentiate risk
Camelia Simoiu
Proceedings of the Internet Measurement Conference (2020)
As technologies to defend against phishing and malware often impose an additional financial and usability cost on users (such as security keys), a question remains as to who should adopt these heightened protections. We measure over 1.2 billion email-based phishing and malware attacks against Gmail users to understand what factors place a person at heightened risk of attack. We find that attack campaigns are typically short-lived and at first glance indiscriminately target users on a global scale. However, by modeling the distribution of targeted users, we find that a person's demographics, location, email usage patterns, and security posture all significantly influence the likelihood of attack. Our findings represent a first step towards empirically identifying the most at-risk users.
Spotlight: Malware Lead Generation at Scale
Bernhard Grill
Jennifer Pullman
Cecilia M. Procopiuc
David Tao
Borbala Benko
Proceedings of Annual Computer Security Applications Conference (ACSAC) (2020)
Malware is one of the key threats to online security today, with applications ranging from phishing mailers to ransomware and trojans. Due to the sheer size and variety of the malware threat, it is impractical to combat it as a whole. Instead, governments and companies have instituted teams dedicated to identifying, prioritizing, and removing specific malware families that directly affect their population or business model. The identification and prioritization of the most disconcerting malware families (known as malware hunting) is a time-consuming activity, accounting for more than 20% of the work hours of a typical threat intelligence researcher, according to our survey. To save this precious resource and amplify the team’s impact on users’ online safety, we present Spotlight, a large-scale malware lead-generation framework. Spotlight first sifts through a large malware data set to remove known malware families, based on first and third-party threat intelligence. It then clusters the remaining malware into potentially-undiscovered families, and prioritizes them for further investigation using a score based on their potential business impact.
We evaluate Spotlight on 67M malware samples, to show that it can produce top-priority clusters with over 99% purity (i.e., homogeneity), which is higher than simpler approaches and prior work. To showcase Spotlight’s effectiveness, we apply it to ad-fraud malware hunting on real-world data. Using Spotlight’s output, threat intelligence researchers were able to quickly identify three large botnets that perform ad fraud.
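The purity figure quoted above can be made concrete with a short sketch. It assumes the standard definition of cluster purity (the fraction of samples that carry the majority label of their cluster); the abstract does not state which exact formula the evaluation uses. The function name `cluster_purity` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def cluster_purity(cluster_ids: Sequence[Hashable],
                   family_labels: Sequence[Hashable]) -> float:
    """Fraction of samples that share the majority label of their cluster."""
    if len(cluster_ids) != len(family_labels):
        raise ValueError("cluster_ids and family_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one sample is required")

    # Count how often each ground-truth label appears inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, family_labels):
        label_counts[cluster][label] += 1

    # Each cluster contributes the size of its largest single-label group.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: cluster 0 mixes two families, cluster 1 is homogeneous.
example_purity = cluster_purity(
    [0, 0, 0, 1, 1],
    ["family_a", "family_a", "family_b", "family_c", "family_c"],
)
```

A purity of 1.0 means every cluster contains samples from a single family; mixing families inside a cluster lowers the score.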
Preview abstract
Traffic monetization is a crucial component of running most for-profit online businesses. One of its latest incarnations is cryptocurrency mining, where a website instructs the visitor’s browser to participate in building a cryptocurrency ledger (e.g., Bitcoin, Monero) in exchange for a small reward in the same currency.
In its essence, this practice trades the user’s electric bill (or battery level) for cryptocurrency. With user consent, this exchange can be a legitimate funding source. For example, UNICEF has collected over 27k charity donations on a website dedicated to this purpose, thehopepage.org. Regrettably, this practice also easily lends itself to abuse: in this form, called cryptojacking, attackers surreptitiously mine in the user’s browser, and profits are collected either by website owners or by hackers who planted the mining script in a vulnerable page.
Understandably, users frown upon this practice and have sought to mitigate it by installing blacklist-based browser extensions (the top 3 for Chrome total over one million installs), whereas researchers have devised more robust methods to detect it [1]–[6]. In turn, cryptojackers have been improving their evasion techniques, incorporating domain fluxing, content obfuscation, the use of WebAssembly, and throttling into their toolkits. The latter, for example, grew from being a niche feature, adopted by only one in ten sites in 2018 [2], to become commonplace in 2019, reaching an adoption ratio of 58%. Whereas most state-of-the-art defenses address several of these evasion techniques, none is resistant against all.
In this paper, we offer a novel detection method, CoinPolice, that is robust against all of the aforementioned evasion techniques. CoinPolice flips throttling against cryptojackers, artificially varying the browser’s CPU power to observe the presence of throttling. Based on a deep neural network classifier, CoinPolice can detect 97.87% of hidden miners with a low false positive rate (0.74%). We compare CoinPolice's performance with the current state of the art and show that our approach outperforms it when detecting aggressively throttled miners.
Finally, we deploy CoinPolice to perform the largest-scale cryptomining investigation to date, identifying 6700 sites that monetize traffic in this fashion.
"They Don't Leave Us Alone Anywhere We Go": Gender and Digital Abuse in South Asia
Nithya Sambasivan
Amna Batool
Nova Ahmed
Tara Matthews
Sane Gaytán
David Nemer
(2019) (to appear)
South Asia faces one of the largest gender gaps online globally, and online safety is one of the main barriers to gender-equitable Internet access [GSMA, 2015]. To better understand the gendered risks and coping practices online in South Asia, we present a qualitative study of the online abuse experiences and coping practices of 199 people who identified as women and 6 NGO staff from India, Pakistan, and Bangladesh, using a feminist analysis. We found that a majority of our participants regularly contended with online abuse, experiencing three major abuse types: cyberstalking, impersonation, and personal content leakages. Consequences of abuse included emotional harm, reputation damage, and physical and sexual violence. Participants coped through informal channels rather than through technological protections or law enforcement. Altogether, our findings point to opportunities for designs, policies, and algorithms to improve women's safety online in South Asia.
Five Years of the Right to be Forgotten
Theo Bertram
Stephanie Caro
Hubert Chao
Rutledge Chin Feman
Peter Fleischer
Albin Gustafsson
Jess Hemerly
Chris Hibbert
Lanah Kammourieh Donnelly
Jason Ketover
Jay Laefer
Paul Nicholas
Yuan Niu
Harjinder Obhi
David Price
Andrew Strait
Al Verney
Proceedings of the Conference on Computer and Communications Security (2019)
The “Right to be Forgotten” is a privacy ruling that enables Europeans to delist certain URLs appearing in search results related to their name. In order to illuminate the effect this ruling has on information access, we conducted a retrospective measurement study of 3.2 million URLs that were requested for delisting from Google Search over five years. Our analysis reveals the countries and anonymized parties generating the largest volume of requests (just 1,000 requesters generated 16% of requests); the news, government, social media, and directory sites most frequently targeted for delisting (17% of removals relate to a requester’s legal history including crimes and wrongdoing); and the prevalence of extraterritorial requests. Our results dramatically increase transparency around the Right to be Forgotten and reveal the complexity of weighing personal privacy against public interest when resolving multi-party privacy conflicts that occur across the Internet. The results of our investigation have since been added to Google’s transparency report.
Rethinking the detection of child sexual abuse imagery on the Internet
Travis Bright
Michelle DeLaune
David M. Eliff
Nick Hsu
Lindsey Olson
John Shehan
Madhukar Thakur
(2019)
Over the last decade, the illegal distribution of child sexual abuse imagery (CSAI) has transformed alongside the rise of online sharing platforms. In this paper, we present the first longitudinal measurement study of CSAI distribution online and the threat it poses to society's ability to combat child sexual abuse. Our results illustrate that CSAI has grown exponentially (to nearly 1 million detected events per month), exceeding the capabilities of independent clearinghouses and law enforcement to take action. In order to scale CSAI protections moving forward, we discuss techniques for automating detection and response by using recent advancements in machine learning.
Protecting accounts from credential stuffing with password breach alerting
Jennifer Pullman
Kevin Yeo
Ananth Raghunathan
Patrick Gage Kelley
Borbala Benko
Sarvar Patel
Dan Boneh
Proceedings of the USENIX Security Symposium, Usenix (2019)
Protecting accounts from credential stuffing attacks remains burdensome due to an asymmetry of knowledge: attackers have wide-scale access to billions of stolen usernames and passwords, while users and identity providers remain in the dark as to which accounts require remediation. In this paper, we propose a privacy-preserving protocol whereby a client can query a centralized breach repository to determine whether a specific username and password combination is publicly exposed, but without revealing the information queried. Here, a client can be an end user, a password manager, or an identity provider. To demonstrate the feasibility of our protocol, we implement a cloud service that mediates access to over 4 billion credentials found in breaches and a Chrome extension serving as an initial client. Based on anonymous telemetry from nearly 670,000 users and 21 million logins, we find that 1.5% of logins on the web involve breached credentials. By alerting users to this breach status, 26% of our warnings result in users migrating to a new password, at least as strong as the original. Our study illustrates how secure, democratized access to password breach alerting can help mitigate one dimension of account hijacking.
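To make "query without revealing the information queried" concrete, here is a toy sketch of one well-known approach: blinding the query with a secret exponent so the server can apply its own key without seeing the underlying hash. The abstract does not describe the protocol's internals, so this technique, the group modulus `P`, and the names `BreachServer` and `is_breached` are all assumptions made for illustration. The parameters are far too small and simple for real use.

```python
import hashlib
import math
import secrets

# Toy modulus (a Mersenne prime). Chosen for brevity, NOT for security.
P = 2**127 - 1


def hash_to_group(username: str, password: str) -> int:
    """Map a credential pair to an integer in [1, P-1]."""
    digest = hashlib.sha256(f"{username}\x00{password}".encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def random_exponent() -> int:
    """Pick a secret exponent that is invertible modulo P-1."""
    while True:
        e = secrets.randbelow(P - 3) + 2
        if math.gcd(e, P - 1) == 1:
            return e


class BreachServer:
    """Hypothetical breach repository that only stores keyed values H(cred)^b."""

    def __init__(self, leaked_credentials):
        self._key = random_exponent()
        self._db = {pow(hash_to_group(u, p), self._key, P)
                    for u, p in leaked_credentials}

    def answer(self, blinded_query: int):
        # The server sees H(cred)^a only, which hides H(cred) behind a.
        return pow(blinded_query, self._key, P), self._db


def is_breached(server: BreachServer, username: str, password: str) -> bool:
    a = random_exponent()                      # client's one-time blinding key
    h = hash_to_group(username, password)
    reply, keyed_db = server.answer(pow(h, a, P))
    a_inv = pow(a, -1, P - 1)                  # removes the blinding exponent
    return pow(reply, a_inv, P) in keyed_db    # equals H(cred)^b if unblinded


server = BreachServer([("alice@example.com", "hunter2")])
```

The client learns only whether its own credential appears in the keyed set, and the server never receives the unblinded hash. A deployment at the scale mentioned in the abstract would need additional machinery that this sketch leaves out.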
Towards gender-equitable privacy and security in South Asia
Amna Batool
David Nemer
Nithya Sambasivan
Nova Ahmed
Sane Gaytán
Tara Matthews
IEEE Security & Privacy (2019)
2017 marked the year when half the world went online. But women remain under-represented on the Internet. Nearly two-thirds of countries have more men than women online [1]. South Asia has one of the largest gender gaps when it comes to mobile and Internet access: 29% of users from India are women and they are 26% less likely than South Asian men to own a phone [2]. A large and growing population of nearly 760 million women live in India, Bangladesh, and Pakistan [3-5]. As a result of growing affordability and ease of access, women will comprise a significant proportion of new Internet users. As the gaps close online, there is enormous potential for security and privacy technologies to turn towards gender-equitable designs and enable women to equitably participate online.
Data Breaches: User Comprehension, Expectations, and Concerns with Handling Exposed Data
SOUPS: Fourteenth Symposium on Usable Privacy and Security, USENIX (2018)
Data exposed by breaches persist as a security and privacy threat for Internet users. Despite this, best practices for how companies should respond to breaches, or how to responsibly handle data after it is leaked, have yet to be identified. We bring users into this discussion through two surveys. In the first, we examine the comprehension of 551 participants on the risks of data breaches and their sentiment towards potential remediation steps. In the second survey, we ask 10,212 participants to rate their level of comfort towards eight different scenarios that capture real-world examples of security practitioners, researchers, journalists, and commercial entities investigating leaked data. Our findings indicate that users readily understand the risk of data breaches and have consistent expectations for technical and non-technical remediation steps. We also find that participants are comfortable with applications that examine leaked data (such as threat sharing or a "hacked or not" service) when the application has a direct, tangible security benefit. Our findings help to inform a broader discussion on responsible uses of data exposed by breaches.
Tracking Ransomware End-to-end
Danny Y. Huang
Maxwell Matthaios Aliapoulios
Vector Guo Li
Kylie McRoberts
Jonathan Levin
Kirill Levchenko
Alex C. Snoeren
Damon McCoy
Security & Privacy 2018 (2018)
Ransomware is a type of malware that encrypts the files of infected hosts and demands payment, often in a cryptocurrency such as bitcoin. In this paper, we create a measurement framework that we use to perform a large-scale, two-year, end-to-end measurement of ransomware payments, victims, and operators. By combining an array of data sources, including ransomware binaries, seed ransom payments, victim telemetry from infections, and a large database of bitcoin addresses annotated with their owners, we sketch the outlines of this burgeoning ecosystem and associated third-party infrastructure. In particular, we trace the financial transactions, from the moment victims acquire bitcoins, to when ransomware operators cash them out. We find that many ransomware operators cashed out using BTC-e, a now-defunct bitcoin exchange. In total we are able to track over $16 million in likely ransom payments made by 19,750 potential victims during a two-year period. While our study focuses on ransomware, our methods are potentially applicable to other cybercriminal operations that have similarly adopted bitcoin as their payment channel.
Pinning Down Abuse on Google Maps
Danny Y. Huang
Doug Grundman
Abhishek Kumar
Kirill Levchenko
Alex C. Snoeren
Proceedings of the International Conference on World Wide Web (WWW) (2017)
In this paper, we investigate a new form of blackhat search engine optimization that targets local listing services like Google Maps. Miscreants register abusive business listings in an attempt to siphon search traffic away from legitimate businesses and funnel it to deceptive service industries (such as unaccredited locksmiths) or to traffic-referral scams, often for the restaurant and hotel industry. In order to understand the prevalence and scope of this threat, we obtain access to over a hundred thousand business listings on Google Maps that were suspended for abuse. We categorize the types of abuse affecting Google Maps; analyze how miscreants circumvented the protections against fraudulent business registration such as postcard mail verification; identify the volume of search queries affected; and ultimately explore how miscreants generated a profit from traffic that necessitates physical proximity to the victim. This physical requirement leads to unique abusive behaviors that are distinct from other online fraud such as pharmaceutical and luxury product scams.
Understanding the Mirai Botnet
Manos Antonakakis
Tim April
Michael Bailey
Matt Bernhard
Jaime Cochran
Zakir Durumeric
J. Alex Halderman
Michalis Kallitsis
Deepak Kumar
Chaz Lever
Zane Ma
Joshua Mason
Damian Menscher
Chad Seaman
Nick Sullivan
Yi Zhou
Proceedings of the 26th USENIX Security Symposium (2017)
The Mirai botnet, composed primarily of embedded and IoT devices, took the Internet by storm in late 2016 when it overwhelmed several high-profile targets with massive distributed denial-of-service (DDoS) attacks. In this paper, we provide a seven-month retrospective analysis of Mirai’s growth to a peak of 600k infections and a history of its DDoS victims. By combining a variety of measurement perspectives, we analyze how the botnet emerged, what classes of devices were affected, and how Mirai variants evolved and competed for vulnerable hosts. Our measurements serve as a lens into the fragile ecosystem of IoT devices. We argue that Mirai may represent a sea change in the evolutionary development of botnets: the simplicity through which devices were infected and its precipitous growth demonstrate that novice malicious techniques can compromise enough low-end devices to threaten even some of the best-defended targets. To address this risk, we recommend technical and nontechnical interventions, as well as propose future research directions.
Data breaches, phishing, or malware? Understanding the risks of stolen credentials
Frank Li
Juri Ranieri
Yarik Markov
Vijay Eranti
Daniel Margolis
Vern Paxson
(2017)
In this paper, we present the first longitudinal measurement study of the underground ecosystem fueling credential theft and assess the risk it poses to millions of users. From March 2016 to March 2017, we identify 788,000 potential victims of off-the-shelf keyloggers; 12.4 million potential victims of phishing kits; and 1.9 billion usernames and passwords exposed via data breaches and traded on blackmarket forums. Using this dataset, we explore to what degree the stolen passwords (which originate from thousands of online services) enable an attacker to obtain a victim's valid email credentials, and thus complete control of their online identity due to transitive trust. Drawing upon Google as a case study, we find 7-25% of exposed passwords match a victim's Google account. For these accounts, we show how hardening authentication mechanisms to include additional risk signals such as a user's historical geolocations and device profiles helps to mitigate the risk of hijacking. Beyond these risk metrics, we delve into the global reach of the miscreants involved in credential theft and the blackhat tools they rely on. We observe a remarkable lack of external pressure on bad actors, with phishing kit playbooks and keylogger capabilities remaining largely unchanged since the mid-2000s.
Cloak of Visibility: Detecting When Machines Browse a Different Web
Alexandros Kapravelos
Proceedings of the 37th IEEE Symposium on Security and Privacy (2016)
The contentious battle between web services and miscreants involved in blackhat search engine optimization and malicious advertisements has driven the underground to develop increasingly sophisticated techniques that hide the true nature of malicious sites. These web cloaking techniques hinder the effectiveness of security crawlers and potentially expose Internet users to harmful content. In this work, we study the spectrum of blackhat cloaking techniques that target browser, network, or contextual cues to detect organic visitors. As a starting point, we investigate the capabilities of ten prominent cloaking services marketed within the underground. This includes a first look at multiple IP blacklists that contain over 50 million addresses tied to the top five search engines and tens of anti-virus and security crawlers. We use our findings to develop an anti-cloaking system that detects split-view content returned to two or more distinct browsing profiles with an accuracy of 95.5% and a false positive rate of 0.9% when tested on a labeled dataset of 94,946 URLs. We apply our system to an unlabeled set of 135,577 search and advertisement URLs keyed on high-risk terms (e.g., luxury products, weight loss supplements) to characterize the prevalence of threats in the wild and expose variations in cloaking techniques across traffic sources. Our study provides the first broad perspective of cloaking as it affects Google Search and Google Ads and underscores the minimum capabilities necessary of security crawlers to bypass the state of the art in mobile, rDNS, and IP cloaking.
Picasso: Lightweight Device Class Fingerprinting for Web Clients
Artem Malyshey
Workshop on Security and Privacy in Smartphones and Mobile Devices (2016)
In this work we present Picasso: a lightweight device class fingerprinting protocol that allows a server to verify the software and hardware stack of a mobile or desktop client. As an example, Picasso can distinguish between traffic sent by an authentic iPhone running Safari on iOS and traffic sent by an emulator or desktop client spoofing the same configuration. Our fingerprinting scheme builds on unpredictable yet stable noise introduced by a client's browser, operating system, and graphical stack when rendering HTML5 canvases. Our algorithm is resistant to replay and includes a hardware-bound proof of work that forces a client to expend a configurable amount of CPU and memory to solve challenges. We demonstrate that Picasso can distinguish 52 million Android, iOS, Windows, and OSX clients running a diversity of browsers with 100% accuracy. We discuss applications of Picasso in abuse fighting, including protecting the Play Store or other mobile app marketplaces from inorganic interactions; or identifying login attempts to user accounts from previously unseen device classes.
Users Really Do Plug in USB Drives They Find
Matthew Tischer
Zakir Durumeric
Sam Foster
Sunny Duan
Alec Mori
Michael Bailey
Security and Privacy, IEEE (2016)
We investigate the anecdotal belief that end users will pick up and plug in USB flash drives they find by completing a controlled experiment in which we drop 297 flash drives on a large university campus. We find that the attack is effective with an estimated success rate of 45–98% and expeditious with the first drive connected in less than six minutes. We analyze the types of drives users connected and survey those users to understand their motivation and security profile. We find that a drive’s appearance does not increase attack success. Instead, users connect the drive with the altruistic intention of finding the owner. These individuals are not technically incompetent, but are rather typical community members who appear to take more recreational risks than their peers. We conclude with lessons learned and discussion on how social engineering attacks, while less technical, continue to be an effective attack vector that our community has yet to successfully address.
Remedying Web Hijacking: Notification Effectiveness and Webmaster Comprehension
Frank Li
Grant Ho
Eric Kuan
Yuan Niu
Lucas Ballard
Vern Paxson
International World Wide Web Conference (2016)
As miscreants routinely hijack thousands of vulnerable web servers weekly for cheap hosting and traffic acquisition, security services have turned to notifications both to alert webmasters of ongoing incidents and to expedite recovery. In this work we present the first large-scale measurement study on the effectiveness of combinations of browser, search, and direct webmaster notifications at reducing the duration a site remains compromised. Our study captures the life cycle of 760,935 hijacking incidents from July 2014 to June 2015, as identified by Google Safe Browsing and Search Quality. We observe that direct communication with webmasters increases the likelihood of cleanup by over 50% and reduces infection lengths by at least 62%. Absent this open channel for communication, we find that browser interstitials, while intended to alert visitors to potentially harmful content, correlate with faster remediation. As part of our study, we also explore whether webmasters exhibit the necessary technical expertise to address hijacking incidents. Based on appeal logs where webmasters alert Google that their site is no longer compromised, we find 80% of operators successfully clean up symptoms on their first appeal. However, a sizeable fraction of site owners do not address the root cause of compromise, with over 12% of sites falling victim to a new attack within 30 days. We distill these findings into a set of recommendations for improving web security and best practices for webmasters.
The Abuse Sharing Economy: Understanding the Limits of Threat Exchanges
Rony Amira
Adi Ben-Yoash
Ori Folger
Amir Hardon
Ari Berger
Michael Bailey
Proceedings of the International Symposium on Research in Attacks, Intrusions and Defenses (2016)
The underground commoditization of compromised hosts suggests a tacit capability where miscreants leverage the same machine (subscribed by multiple criminal ventures) to simultaneously profit from spam, fake account registration, malicious hosting, and other forms of automated abuse. To expedite the detection of these commonly abusive hosts, there are now multiple industry-wide efforts that aggregate abuse reports into centralized threat exchanges. In this work, we investigate the potential benefit of global reputation tracking and the pitfalls therein. We develop our findings from a snapshot of 45 million IP addresses abusing six Google services including Gmail, YouTube, and ReCaptcha between April 7 and April 21, 2015. We estimate the scale of end hosts controlled by attackers, expose underground biases that skew the abuse perspectives of individual web services, and examine the frequency that criminals re-use the same infrastructure to attack multiple, heterogeneous services. Our results indicate that an average Google service can block 14% of abusive traffic based on threats aggregated from seemingly unrelated services, though we demonstrate that outright blacklisting incurs an untenable volume of false positives.
Investigating Commercial Pay-Per-Install and the Distribution of Unwanted Software
Ryan Rasti
Cait Phillips
Marc-André (MAD) Decoste
Chris Sharp
Fabio Tirelo
Ali Tofigh
Marc-Antoine Courteau
Lucas Ballard
Robert Shield
Nav Jagpal
Niels Provos
Damon McCoy
Proceedings of the USENIX Security Symposium (2016)
In this work, we explore the ecosystem of commercial pay-per-install (PPI) and the role it plays in the proliferation of unwanted software. Commercial PPI enables companies to bundle their applications with more popular software in return for a fee, effectively commoditizing access to user devices. We develop an analysis pipeline to track the business relationships underpinning four of the largest commercial PPI networks and classify the software families bundled. In turn, we measure their impact on end users and enumerate the distribution techniques involved. We find that unwanted ad injectors, browser settings hijackers, and cleanup utilities dominate the software families buying installs. Developers of these families pay $0.10 to $1.50 per install, upfront costs that they recuperate by monetizing users without their consent or by charging exorbitant subscription fees. Based on Google Safe Browsing telemetry, we estimate that PPI networks drive over 60 million download attempts every week, nearly three times that of malware. While anti-virus and browsers have rolled out defenses to protect users from unwanted software, we find evidence that PPI networks actively interfere with or evade detection. Our results illustrate the deceptive practices of some commercial PPI operators that persist today.
A Comparison of Questionnaire Biases Across Sample Providers
Victoria Sosik
American Association for Public Opinion Research, 2015 Annual Conference (2015)
Survey research, like all methods, is fraught with potential sources of error that can significantly affect the validity and reliability of results. There are four major types of error common to surveys as a data collection method: (1) coverage error arising from certain segments of a target population being excluded, (2) nonresponse error where not all those selected for a sample respond, (3) sampling error which results from the fact that surveys only collect data from a subset of the population being measured, and (4) measurement error. Measurement error can arise from the wording and design of survey questions (i.e., instrument error), as well as the variability in respondent ability and motivation (i.e., respondent error) [17].
This paper focuses primarily on measurement error as a source of bias in surveys. It is well established that instrument error [34, 40] and respondent error (e.g., [21]) can yield meaningful differences in results. For example, variations in response order, response scales, descriptive text, or images used in a survey can lead to instrument error which can result in skewed response distributions. Certain types of questions can trigger other instrument error biases, such as the tendency to agree with statements presented in an agree/disagree format (acquiescence bias) or the hesitancy to admit undesirable behaviors or overreport desirable behaviors (social desirability bias). Respondent error is largely related to the amount of cognitive effort required to answer a survey and arises when respondents are either unable or unwilling to exert the required effort [21].
Such measurement error has been compared across survey modes, such as face-to-face, telephone, and Internet (e.g., [9, 18]), but little work has compared different Internet samples, such as crowdsourcing task platforms (e.g., Amazon’s Mechanical Turk), paywall surveys (e.g., Google Consumer Surveys), opt-in panels (e.g., Survey Sampling International), and probability-based panels (e.g., the Gfk KnowledgePanel). Because these samples differ in recruiting, context, and incentives, respondents may be more or less motivated to respond to questions with effort, leading to different degrees of bias in different samples. The specific instruments deployed to respondents in these different modes can also exacerbate the situation by requiring more or less cognitive effort to answer satisfactorily.
The present study has two goals:
(1) Investigate the impact of question wording on response distributions in order to measure the strength of common survey biases arising from instrument and respondent error.
(2) Compare the variance in the degree of these biases across Internet survey samples with differing characteristics in order to determine whether certain types of samples are more susceptible to certain biases than others.
Neither Snow Nor Rain Nor MITM ... An Empirical Analysis of Email Delivery Security
Zakir Durumeric
David Adrian
Ariana Mirian
James Kasten
Nicolas Lidzborski
Vijay Eranti
Michael Bailey
J. Alex Halderman
Proceedings of the Internet Measurement Conference (2015)
The SMTP protocol is responsible for carrying some of users' most intimate communication, but like other Internet protocols, authentication and confidentiality were added only as an afterthought. In this work, we present the first report on global adoption rates of SMTP security extensions, including STARTTLS, SPF, DKIM, and DMARC. We present data from two perspectives: SMTP server configurations for the Alexa Top Million domains, and over a year of SMTP connections to and from Gmail. We find that the top mail providers (e.g., Gmail, Yahoo, and Outlook) all proactively encrypt and authenticate messages. However, these best practices have yet to reach widespread adoption in a long tail of over 700,000 SMTP servers, of which only 35% successfully configure encryption, and 1.1% specify a DMARC authentication policy. This security patchwork -- paired with SMTP policies that favor failing open to allow gradual deployment -- exposes users to attackers who downgrade TLS connections in favor of cleartext and who falsify MX records to reroute messages. We present evidence of such attacks in the wild, highlighting seven countries where more than 20% of inbound Gmail messages arrive in cleartext due to network attackers.
Understanding Sensitivity by Analyzing Anonymity
Aleksandra Korolova
IEEE Security & Privacy, vol. 13 (2015), pp. 14-21
The range of topics that users of online services consider sensitive is often broader than what service providers or regulators deem sensitive. A data-driven approach can help providers improve products with features that let users exercise privacy preferences more effectively.
Ad Injection at Scale: Assessing Deceptive Advertisement Modifications
Chris Grier
Grant Ho
Nav Jagpal
Alexandros Kapravelos
Damon McCoy
Antonio Nappa
Vern Paxson
Paul Pearce
Niels Provos
Proceedings of the IEEE Symposium on Security and Privacy (2015)
Today, web injection manifests in many forms, but fundamentally occurs when malicious and unwanted actors tamper directly with browser sessions for their own profit. In this work we illuminate the scope and negative impact of one of these forms, ad injection, in which users have ads imposed on them in addition to, or different from, those that websites originally sent them. We develop a multi-staged pipeline that identifies ad injection in the wild and captures its distribution and revenue chains. We find that ad injection has entrenched itself as a cross-browser monetization platform impacting more than 5% of unique daily IP addresses accessing Google—tens of millions of users around the globe. Injected ads arrive on a client’s machine through multiple vectors: our measurements identify 50,870 Chrome extensions and 34,407 Windows binaries, 38% and 17% of which are explicitly malicious. A small number of software developers support the vast majority of these injectors who in turn syndicate from the larger ad ecosystem. We have contacted the Chrome Web Store and the advertisers targeted by ad injectors to alert each of the deceptive practices involved.
Secrets, Lies, and Account Recovery: Lessons from the Use of Personal Knowledge Questions at Google
Joseph Bonneau
Ilan Caron
Rob Jackson
Mike Williamson
WWW'15 - Proceedings of the 22nd international conference on World Wide Web, ACM (2015)
We examine the first large real-world data set on personal knowledge questions' security and memorability from their deployment at Google. Our analysis confirms that secret questions generally offer a security level that is far lower than user-chosen passwords. It turns out to be even lower than proxies such as the real distribution of surnames in the population would indicate. Surprisingly, we found that a significant cause of this insecurity is that users often don't answer truthfully. A user survey we conducted revealed that a significant fraction of users (37%) who admitted to providing fake answers did so in an attempt to make them "harder to guess", although in aggregate this behavior had the opposite effect, as people "harden" their answers in a predictable way.
On the usability side, we show that secret answers have surprisingly poor memorability despite the assumption that reliability motivates their continued deployment. From millions of account recovery attempts we observed a significant fraction of users (e.g., 40% of our English-speaking US users) were unable to recall their answers when needed. This is lower than the success rate of alternative recovery mechanisms such as SMS reset codes (over 80%).
Comparing question strength and memorability reveals that the questions that are potentially the most secure (e.g., what is your first phone number) are also the ones with the worst memorability.
We conclude that it appears next to impossible to find secret questions that are both secure and memorable. Secret questions continue to have some use when combined with other signals, but they should not be used alone and best practice should favor more reliable alternatives.
Framing Dependencies Introduced by Underground Commoditization
Danny Huang
David Wang
Chris Grier
Thomas J. Holt
Christopher Kruegel
Damon McCoy
Stefan Savage
Giovanni Vigna
Workshop on the Economics of Information Security (2015)
Internet crime has become increasingly dependent on the underground economy: a loose federation of specialists selling capabilities, services, and resources explicitly tailored to the abuse ecosystem. Through these emerging markets, modern criminal entrepreneurs piece together dozens of à la carte components into entirely new criminal endeavors. From an abuse fighting perspective, criminal reliance on this black market introduces fragile dependencies that, if disrupted, undermine entire operations that as a composite appear intractable to protect against. However, without a clear framework for examining the costs and infrastructure behind Internet crime, it becomes impossible to evaluate the effectiveness of novel intervention strategies.
In this paper, we survey a wealth of existing research in order to systematize the community’s understanding of the underground economy. In the process, we develop a taxonomy of profit centers and support centers for reasoning about the flow of capital (and thus dependencies) within the black market. Profit centers represent activities that transfer money from victims and institutions into the underground. These activities range from selling products to unwitting customers (in the case of spamvertised products) to outright theft from victims (in the case of financial fraud). Support centers provide critical resources that other miscreants request to streamline abuse. These include exploit kits, compromised credentials, and even human services (e.g., manual CAPTCHA solvers) that have no credible non-criminal applications. We use this framework to contextualize the latest intervention strategies and their effectiveness. In the end, we champion a drastic departure from solely focusing on protecting users and systems (tantamount to a fire fight) and argue security practitioners must also strategically disrupt frail underground relationships that underpin the entire for-profit abuse ecosystem--including actors, infrastructure, and access to capital.
Easy Does It: More Usable CAPTCHAs
Celine Fabry
Steven Bethard
John C. Mitchell
Dan Jurafsky
CHI '14 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 1600 Amphitheatre Pkwy (2014), pp. 2637-2646
Websites present users with puzzles called CAPTCHAs to curb abuse caused by computer algorithms masquerading as people. While CAPTCHAs are generally effective at stopping abuse, they might impair website usability if they are not properly designed. In this paper we describe how we designed two new CAPTCHA schemes for Google that focus on maximizing usability. We began by running an evaluation on Amazon Mechanical Turk with over 27,000 respondents to test the usability of different feature combinations. Then we studied user preferences using Google’s consumer survey infrastructure. Finally, drawing on the insights gleaned during those studies, we tested our new captcha schemes first on Mechanical Turk and then on a fraction of production traffic. The resulting scheme is now an integral part of our production system and is served to millions of users. Our scheme achieved a 95.3% human accuracy, a 6.7% improvement.
Dialing Back Abuse on Phone Verified Accounts
Dmytro Iatskiv
Chris Grier
Damon McCoy
Proceedings of the 21st ACM Conference on Computer and Communications Security (2014)
In the past decade the increase of for-profit cybercrime has given rise to an entire underground ecosystem supporting large-scale abuse, a facet of which encompasses the bulk registration of fraudulent accounts. In this paper, we present a 10-month longitudinal study of the underlying technical and financial capabilities of criminals who register phone-verified accounts (PVA). To carry out our study, we purchase 4,695 Google PVA as well as acquire a random sample of 300,000 Google PVA through a collaboration with Google. We find that miscreants rampantly abuse free VOIP services to circumvent the intended cost of acquiring phone numbers, in effect undermining phone verification. Combined with short-lived phone numbers from India and Indonesia that we suspect are tied to human verification farms, this confluence of factors correlates with a market-wide price drop of 30--40% for Google PVA until Google penalized verifications from frequently abused carriers. We distill our findings into a set of recommendations for any service performing phone verification as well as highlight open challenges related to PVA abuse moving forward.
Handcrafted Fraud and Extortion: Manual Account Hijacking in the Wild
Borbala Benko
Daniel Margolis
Andy Archer
Allan Aquino
Andreas Pitsillidis
Stefan Savage
IMC '14 Proceedings of the 2014 Conference on Internet Measurement Conference, ACM, 1600 Amphitheatre Parkway, pp. 347-358
Online accounts are inherently valuable resources---both for the data they contain and the reputation they accrue over time. Unsurprisingly, this value drives criminals to steal, or hijack, such accounts. In this paper we focus on manual account hijacking---account hijacking performed manually by humans instead of botnets. We describe the details of the hijacking workflow: the attack vectors, the exploitation phase, and post-hijacking remediation. Finally we share, as a large online company, which defense strategies we found effective to curb manual hijacking.
Cloak and Swagger: Understanding Data Sensitivity through the Lens of User Anonymity
Aleksandra Korolova
2014 IEEE Symposium on Security and Privacy, SP 2014, Berkeley, CA, USA, May 18-21, 2014, IEEE Computer Society, pp. 493-508
The End is Nigh: Generic Solving of Text-based CAPTCHAs
Jonathan Aigrain
John C. Mitchell
WOOT'14 Proceedings of the 8th USENIX conference on Offensive Technologies, Usenix (2014)
Over the last decade, it has become well-established that a captcha’s ability to withstand automated solving lies in the difficulty of segmenting the image into individual characters. The standard approach to solving captchas automatically has been a sequential process wherein a segmentation algorithm splits the image into segments that contain individual characters, followed by a character recognition step that uses machine learning. While this approach has been effective against particular captcha schemes, its generality is limited by the segmentation step, which is hand-crafted to defeat the distortion at hand. No general algorithm is known for the character collapsing anti-segmentation technique used by most prominent real world captcha schemes.
This paper introduces a novel approach to solving captchas in a single step that uses machine learning to attack the segmentation and the recognition problems simultaneously. Performing both operations jointly allows our algorithm to exploit information and context that is not available when they are done sequentially. At the same time, it removes the need for any hand-crafted component, making our approach generalize to new captcha schemes where the previous approach cannot. We were able to solve all the real world captcha schemes we evaluated accurately enough to consider the scheme insecure in practice, including Yahoo (5.33%) and ReCaptcha (33.34%), without any adjustments to the algorithm or its parameters. Our success against the Baidu (38.68%) and CNN (51.09%) schemes that use occluding lines as well as character collapsing leads us to believe that our approach is able to defeat occluding lines in an equally general manner. The effectiveness and universality of our results suggest that combining segmentation and recognition is the next evolution of captcha solving, and that it supersedes the sequential approach used in earlier works. More generally, our approach raises questions about how to develop sufficiently secure captchas in the future.
Online Microsurveys for User Experience Research
Victoria Schwanda Sosik
Gueorgi Kossinets
Kerwell Liao
Paul McDonald
CHI '14 Extended Abstracts on Human Factors in Computing Systems (2014)