Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 11347 publications
LLM-Powered Analysis of IoT User Reviews: Tracking and Ranking Security and Privacy Concerns
Taufiq Islam Protick
Anupam Das
Proceedings of the International AAAI Conference on Web and Social Media (ICWSM) (2026)
Understanding the security and privacy (S&P) concerns of IoT users benefits both developers and users. To learn about users' views, we examine IoT product reviews on Amazon, one of the biggest IoT markets. This work presents a state-of-the-art methodology to identify and categorize reviews in which users express S&P concerns. We developed an automated pipeline by fine-tuning GPT-3.5-Turbo to build two models: the Classifier-Rationalizer-Categorizer and the Thematic Mapper. By leveraging dynamic few-shot prompting and the model's large context size, our pipeline achieved over 97% precision and recall, significantly outperforming keyword-based and classical ML methods. We applied our pipeline to 91K Amazon reviews about fitness trackers, smart speakers and cameras, over multiple years. We found that on average 5% contained S&P concerns, while security cameras exhibited the highest prevalence at 10%. Our method detected significantly more S&P-relevant reviews than prior works: 15x more for fitness trackers, 29% more for smart speakers, and 70% more for cameras. Our longitudinal analysis reveals that concerns like surveillance and data control have persisted for years, suggesting limited industry progress. We demonstrate that across all device types, users consistently demand more precise control over what data is collected and shared. We uncover challenges in multi-user and multi-device interactions, identifying two previously unreported themes concerning inadequate controls for account separation and data access. These findings, ranging from broad persistent trends to specific instances of customer loss, offer actionable insights for developers to improve user satisfaction and trust.
Generative AI is reshaping software development, yet its psychological impact remains under-researched. During May and August 2025 we conducted reflexive thematic analysis of interviews with 12 senior engineers (≥5 years experience) recruited from Western technology hubs to explore shifts in professional identity. We identify a central transition from "coder to conductor," where AI acts as a cognitive partner. Key findings include: (1) a re-architecting of focus from implementation to strategy; (2) a shift in productivity metrics from output to impact; and (3) a dual-impact on agency, where AI empowers autonomy but threatens competence through de-skilling anxieties. These findings suggest that as implementation becomes commoditised, organisational training and career progression must prioritise architectural mastery and metacognitive oversight to ensure sustained developer motivation and system integrity.
Pixel Watch: Robust Heart Rate Sensing from Multipath PPG and On-Device Deep Learning Trained on 10,000 hours of Free-Living and Fitness Data
Megan Walker
Yojan Patel
Shyam Tailor
Matt Wimmer
Brennan Garrett
Dan Howe
Hamed Vavadi
Tien Le
Steve Diamond
Oleksiy Vyalov
Vik Sharma
Pete Richards
Tracy Giest
Erika Siegel
Tuan Phan
Sam Mravca
Derrick Vickers
Benjamin Stone
Katarina Vukosavljević
Justin Phillips
YongSuk Cho
Stefanie Hollidge
Antony Siahaan
Soren Brage
Shwetak Patel
Robert Harle
IEEE Sensors Letters (2026)
The Pixel Watch 2 (PW2) is the first Google smartwatch to combine multipath photoplethysmography (PPG) with deep learning-based heart rate inference, designed to significantly improve sensing accuracy during motion-heavy activities. The device processes 10 optical channels using an on-device, 15-layer temporally dilated convolutional neural network (~300K parameters) to yield a 1 Hz heart rate output. Crucial to this model's performance was its training on a massive dataset comprising 10,000 hours of data from 962 participants, curated from a broader corpus of controlled and free-living activities. We evaluated the PW2's sensing performance across two independent validation sets: an in-house fitness dataset (229 participants, 250 hours) and an external free-living dataset (27 participants, 1000+ hours). The system achieved 95% Limits of Agreement of -10.34 to 8.66 BPM during exercise and -6.57 to 7.48 BPM during free-living activities, demonstrating substantially tighter error margins than previous Google devices. Finally, we discuss key design lessons, emphasizing that large-scale deep learning was instrumental in fully leveraging multipath PPG hardware over traditional signal processing approaches.
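For readers unfamiliar with the metric, 95% limits of agreement are conventionally computed in the Bland-Altman style as the mean device-minus-reference difference plus or minus 1.96 standard deviations of that difference. The sketch below illustrates that convention only. The function name and the sample readings are hypothetical, and the abstract does not state the exact procedure the authors used (for example, how repeated measurements per participant were handled).

```python
import statistics


def limits_of_agreement(device_bpm, reference_bpm):
    """Return (lower, upper) 95% limits of agreement for paired readings.

    Assumes one device reading per reference reading, in the same order.
    """
    if len(device_bpm) != len(reference_bpm):
        raise ValueError("readings must be paired")
    # Per-sample error: device minus reference.
    diffs = [d - r for d, r in zip(device_bpm, reference_bpm)]
    bias = statistics.mean(diffs)
    spread = statistics.stdev(diffs)  # sample standard deviation
    return bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical readings, for illustration only.
lower, upper = limits_of_agreement([61, 59, 62, 58], [60, 60, 60, 60])
print(f"95% LoA: {lower:.2f} to {upper:.2f} BPM")
```

A narrower interval means the device and the reference disagree by less for most samples, which is the sense in which the abstract reports tighter error margins.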
The management of a hybrid workforce comprising human and autonomous computational agents may be challenged by the use of separate systems for human capital and software assets, which can create a governance gap. A system can provide a unified framework for managing a hybrid workforce. For example, the system may utilize a labor service mesh to analyze and route tasks to either a human intent tier or an agentic execution tier. A potential principle of the system is structural symmetry, where computational agents can be assigned digital identities and managed through a lifecycle process that may parallel human resource functions, such as onboarding, performance evaluation, and structured offboarding. This integrated approach can facilitate a unified system of record and governance model for an organization's intelligence capacity.
Deep-learning methods have boosted the analytical power of Raman spectroscopy, yet they still require large, task-specific, labeled datasets and often fail to transfer across application domains. The study explores pre-trained encoders as a solution. Pre-trained encoders have significantly impacted Natural Language Processing and Computer Vision with their ability to learn transferable representations that can be applied to a variety of datasets, significantly reducing the amount of time and data required to create capable models. The following work puts forward a new approach that applies these benefits to Raman Spectroscopy. The proposed approach, RSPTE (Raman Spectroscopy Pre-Trained Encoder), is designed to learn generalizable spectral representations without labels. RSPTE employs a novel domain adaptation strategy using unsupervised Barlow Twins decorrelation objectives to learn fundamental spectral patterns from multi-domain Raman Spectroscopy datasets containing samples from medicine, biology, and mineralogy. Transferability is demonstrated through evaluation on several models created by fine-tuning RSPTE for different application domains: Medicine (detection of Melanoma and COVID), Biology (Pathogen Identification), and Agriculture. As an example, using only 20% of the dataset, models trained with RSPTE achieve accuracies ranging from 50% to 86% (depending on the dataset used), while without RSPTE the range is 9% to 57%. Using the full dataset, accuracies with RSPTE range from 81% to 97%, and without pretraining from 51% to 97%. Current methods and state-of-the-art models in Raman Spectroscopy are compared to RSPTE for context, and RSPTE exhibits competitive results, especially with less data. These results provide evidence that the proposed RSPTE model can effectively learn and transfer generalizable spectral features across different domains, achieving accurate results with less data in less time (both data collection time and training time).
Here’s a thought experiment. Say I wave a magic wand across a codebase and an entire class of technical debt, poof, goes away and immediately evaporates if introduced in the future. For example, maybe I make it so that dead feature flags are simply no longer a problem: they just delete themselves as soon as the engineer wills it. Or maybe large-scale migrations just migrate themselves. Maybe we magically have 100% test coverage, without an engineer lifting a finger. What will happen to developer productivity? Surely, developer productivity increases overall. But will the productivity metrics that we all use as a proxy for “developer productivity” move up and to the right? Let’s explore this idea.
Differential Sensitivity of Impedance Plethysmography and Photoplethysmography Sensors to Temperature-Induced Peripheral Vasoconstriction
Seobin Jung
Alexandros Pantelopoulos
Lindsey Sunden
Pete Richards
Shwetak Patel
Sam Sheng
Scientific Reports (2026)
Impedance plethysmography (IPG) and photoplethysmography (PPG) are non-invasive techniques for measuring blood volume changes. This study investigated the differential responses of IPG and PPG to temperature-mediated vasoconstriction induced by localized cooling. Twenty-one participants underwent control and treatment conditions, with fake or real ice cubes applied to the forearm. Blood pressure remained stable, while heart rate decreased. PPG signal amplitude significantly decreased with cooling (p_adj = 0.004), indicating sensitivity to superficial blood flow changes. In contrast, IPG signal amplitude remained stable (p_adj = 1.0). No statistically significant differences were observed in timing-derived metrics. These findings suggest IPG is less sensitive to superficial changes in blood flow than PPG, and may be more suitable for monitoring deeper blood flow. This study provides insights into the distinct sensitivities of IPG and PPG, with implications for wearable device development and cardiovascular monitoring.
Tech Worker Challenges Managing Humanlike GenAI
Eric Corbett
Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems, ACM (2026), pp. 1-18
Organizations are adopting or exploring anthropomorphic genAI, meaning XYZ. Anthropomorphic AI is often held up for its potential to improve the productivity and efficiency of workers and technologies; however, there are not yet accepted industry-wide standards for the responsible development of anthropomorphic technologies. Given their roles as central figures responsible for implementing anthropomorphic genAI into technologies that are served to the broader public, we must understand workers’ reasoning about anthropomorphic genAI to understand its impacts. However, there is a dearth of empirical knowledge about technology workers’ perspectives on anthropomorphic technologies, including their perspectives on potential risks and benefits. To address this gap, we conducted focus groups with 31 technology workers across 6 job roles (UX, software engineers, product managers, designers, marketing, and trust and safety) regarding how they define anthropomorphic genAI, their perceptions of anthropomorphic genAI, and their experiences working with anthropomorphic genAI. We find that workers have expansive definitions of what constitutes “humanlike” AI, which at times sit in tension with each other. They draw on their personal and professional standpoints to make sense of real and possible anthropomorphic genAI hazards to people, knowledge work fields, and society at large. Importantly, we find that these social hazards map to different facets of anthropomorphic genAI, suggesting that effective mitigation of personal and social risks requires developer attention to specific dimensions of anthropomorphism. We mapped the relationships between dimensions of anthropomorphism and hazards, to support technology workers. We argue that effective mitigation of the risks of anthropomorphism requires attention to the multiple facets of anthropomorphism.
Who Controls the Curriculum for AI? The Limits of Participatory Design for Educational AI
Michael Madaio
Learning Under Algorithmic Conditions, University of Minnesota Press (2026)
Participatory design is a long-standing effort to shift control over technology design from technologists to users and communities impacted by technologies. For educational AI, this means involving students, families, teachers, and other stakeholders in shaping the design of AI systems. While this approach is promising, in this article I situate the recent calls for participatory design of educational AI systems within a different historical tradition: that of contests over local control of educational curricula. I argue that approaches that attempt to steer the design and development of educational AI through participatory methods may inadvertently reproduce the history of political contestation of educational curricula, in ways that may privilege the most powerful communities, rather than those inequitably impacted. What might it look like to treat participatory AI design as a site for political contestation? How might these approaches avoid reproducing the same majoritarian tendencies that led to educational inequities in the first place?
This paper demonstrates that artificial intelligence can accelerate mathematical discovery by autonomously solving an open problem in theoretical physics. We present a neuro-symbolic system, combining the Gemini Deep Think large language model with a systematic Tree Search (TS) framework and automated numerical feedback, that successfully derived novel, exact analytical solutions for the power spectrum of gravitational radiation emitted by cosmic strings. Specifically, the agent evaluated the core integral for arbitrary loop geometries, directly improving upon recent AI-assisted attempts that only yielded partial asymptotic solutions. To substantiate our methodological claims regarding AI-accelerated discovery and to ensure transparency, we detail system prompts, search constraints, and intermittent feedback loops that guided the model. The agent identified a suite of 6 different analytical methods, the most elegant of which expands the kernel in Gegenbauer polynomials to naturally absorb the integrand's singularities. The methods lead to an asymptotic result for at large that both agrees with numerical results and also connects to the continuous Feynman parameterization of Quantum Field Theory. We detail both the algorithmic methodology that enabled this discovery and the resulting mathematical derivations.
This study examines the psychological and ethical implications of generative-AI chatbot use among youth, introducing the CTRL framework (Cognitive Trust, Reliance, and Learning Diminution) to explain how repeated use fosters cognitive offloading and reduced verification behavior. Survey data from 420 participants analyzed through factor analysis and structural equation modeling reveal that higher trust predicts greater reliance and diminished critical evaluation, alongside elevated concerns around privacy and academic integrity. Findings highlight the need for AI literacy and responsible design to mitigate unintended cognitive impacts.
A growing body of qualitative research has identified contextual risk factors that elevate people’s chances of experiencing digital-safety attacks. However, the lack of quantitative data on the population-level distribution of these risk factors prevents policymakers and tech companies from developing targeted, evidence-based interventions to improve digital safety. To address this gap, we surveyed 5,001 adults in the United States to analyze: (1) the frequency of and relationship between digital-safety attacks (e.g., scams, harassment, account hacking), and (2) how these attacks align with 10 contextual risk factors. Nearly half of our respondents identify as resource constrained, which significantly correlates with a higher likelihood of experiencing four common attacks. We also present qualitative insights to expand our understanding of the factors beyond the existing literature (e.g., “prominence” included high-visibility roles in local communities). This study provides the first large-scale quantitative analysis correlating digital-safety attacks with contextual risk factors and demographics.
Generative AI (GenAI) is evolving from standalone tools to interconnected ecosystems that integrate chatbots, cloud platforms, and third-party services. While this ecosystem model enables personalization and extended services, it also introduces complex information flows and amplifies privacy risks. Existing solutions focus on system-level protections, offering little support for users to make meaningful privacy choices. To address this gap, we conducted two vignette-based survey studies with 486 participants and a follow-up interview study with 16 participants. We also explored users’ needs and preferences for privacy choice design across both GenAI personalization and data-sharing. Our results reveal paradoxical patterns: participants sometimes trusted third-party ecosystems more for personalization but perceived greater control in first-party ecosystems when data was shared externally. We discuss design implications for privacy choice interfaces that enhance transparency, control, and trust in GenAI ecosystems.
Fair Allocation of Indivisible Goods with Variable Groups
Paul Golz
Warut Suksompong
Ayumi Igarashi
AAAI (2026)
We study the fair allocation of indivisible goods with variable groups. In this model, the goal is to partition the agents into groups of given sizes and allocate the goods to the groups in a fair manner. We show that for any number of groups and corresponding sizes, there always exists an envy-free up to one good (EF1) outcome, thereby generalizing an important result from the individual setting. Our result holds for arbitrary monotonic utilities and comes with an efficient algorithm. We also prove that the EF1 existence can be guaranteed even when the goods lie on a path and each group must receive a connected bundle. In addition, we consider a probabilistic model where the utilities are additive and drawn randomly from a distribution. We show that if there are n agents and the number of goods m is divisible by the number of groups k, then an envy-free outcome exists with high probability if m = ω(log n), and this bound is tight. On the other hand, if m is not divisible by k, then an envy-free outcome is unlikely to exist as long as m = o(√n).
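To make the EF1 condition concrete, here is a minimal sketch of a checker for a given grouping and allocation. It only verifies an outcome and is not the paper's algorithm for finding one. It assumes the usual group reading of EF1 (every agent values their own group's bundle at least as much as any other group's bundle after removing some single good from it) and additive, non-negative utilities, which is narrower than the arbitrary monotonic utilities the abstract covers. The name `is_ef1` and the data layout are hypothetical.

```python
def is_ef1(utilities, groups, bundles):
    """Check envy-freeness up to one good for a grouped allocation.

    utilities[i][g]: agent i's (additive, non-negative) value for good g.
    groups[j]:       list of agent indices in group j.
    bundles[j]:      list of good indices allocated to group j.
    """
    for j, members in enumerate(groups):
        for i in members:
            own_value = sum(utilities[i][g] for g in bundles[j])
            for k, other in enumerate(bundles):
                if k == j:
                    continue
                other_value = sum(utilities[i][g] for g in other)
                if own_value >= other_value:
                    continue  # agent i does not envy group k
                # Envy exists: dropping the good agent i values most in
                # the other bundle must be enough to remove it.
                best_good = max(utilities[i][g] for g in other)
                if own_value < other_value - best_good:
                    return False
    return True


# Three agents, three goods; agents 0 and 1 share a group.
utilities = [[5, 1, 1], [1, 5, 1], [1, 1, 5]]
print(is_ef1(utilities, groups=[[0, 1], [2]], bundles=[[0, 1], [2]]))
```

With additive utilities, removing the single most valued good is the best case for eliminating envy, so checking that one good is sufficient.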
Voice activity detection (VAD) plays a vital role in enabling applications such as speech recognition. We analyze the impact of window size on the accuracy of three VAD algorithms: Silero, WebRTC, and Root Mean Square (RMS) across a set of diverse real-world digital audio streams. We additionally explore the use of hysteresis on top of each VAD output. Our results offer practical references for optimizing VAD systems. Silero significantly outperforms WebRTC and RMS, and hysteresis provides a benefit for WebRTC.
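As a rough illustration of the simplest of the three detectors, the sketch below computes RMS energy over fixed windows and applies hysteresis to the result. It assumes one common form of hysteresis, a higher threshold to switch speech on and a lower one to switch it off; the abstract does not say which scheme or threshold values the authors used, and all names and numbers here are hypothetical.

```python
import math


def rms_frames(samples, window_size):
    """RMS energy of consecutive, non-overlapping windows of samples."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + window_size]) / window_size)
        for i in range(0, len(samples) - window_size + 1, window_size)
    ]


def vad_with_hysteresis(energies, on_threshold, off_threshold):
    """Per-window speech decisions using two thresholds.

    Speech turns on when energy reaches on_threshold and stays on
    until energy falls below the lower off_threshold.
    """
    active = False
    decisions = []
    for energy in energies:
        if not active and energy >= on_threshold:
            active = True
        elif active and energy < off_threshold:
            active = False
        decisions.append(active)
    return decisions


# Synthetic signal: silence, loud segment, quieter tail, silence.
samples = [0.0] * 4 + [0.5] * 4 + [0.2] * 4 + [0.0] * 4
energies = rms_frames(samples, window_size=4)
print(vad_with_hysteresis(energies, on_threshold=0.4, off_threshold=0.1))
```

In this toy signal the quieter tail stays classified as speech, whereas a single threshold at 0.4 would drop it; the window size controls how much audio each decision averages over.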