Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10463 publications
    Fine-grained Measurement of Vehicle Delay Fairness
    Eliav Buchnik
    Tom Kalvari
    Jack Haddad
    Dan Karliner
    Danny Veikherman
    Ron Tsibulsky
    Shai Ferster
    Ori Rottenstreich
    2025
    Optimizing signal timing in traffic lights helps to improve traffic flow and reduce emissions by reducing delays. At intersections, vehicles from different movements experience different delays depending on the traffic light plan. This paper analyzes delay fairness among vehicles at intersections. We study three cities: Rio de Janeiro, Hamburg, and Seattle, with a total of over 5100 intersections. We present an intuitive methodology to compute delay fairness based on the Gini index, a common fairness measure in economics. We evaluate fairness on real traffic data and provide insights into how fairness relates to time of day and traffic demand. We also examine real changes in traffic light plans that occurred in practice to check whether improving delay is often aligned with increasing fairness.
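To make the Gini-based fairness measure mentioned in this abstract concrete, here is a minimal sketch of the standard Gini index applied to a list of per-vehicle delays. It is an illustration only, not the paper's methodology: the function name `gini` and the sample delay values are assumptions, and the paper may aggregate or weight delays differently.

```python
def gini(values):
    """Gini index of non-negative values.

    0 means all values are equal; values close to 1 mean the total
    is concentrated in very few entries.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention chosen for this sketch: empty or all-zero input is "equal"
    # Sorted-form formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-vehicle delays in seconds at one intersection.
equal_delays = [10, 10, 10, 10]
skewed_delays = [0, 0, 0, 40]
print(gini(equal_delays), gini(skewed_delays))
```

With n values, the index reaches at most (n - 1) / n, which happens when a single vehicle absorbs all of the delay.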
    We introduce efficient differentially private (DP) algorithms for several linear algebraic tasks, including solving linear equalities over arbitrary fields, linear inequalities over the reals, and computing affine spans and convex hulls. As an application, we obtain efficient DP algorithms for learning halfspaces and affine subspaces. Our algorithms addressing equalities are strongly polynomial, whereas those addressing inequalities are weakly polynomial. Furthermore, this distinction is inevitable: no DP algorithm for linear programming can be strongly polynomial-time efficient.
    Initially conceived as a way to explain memory sharing in romantic couples, the concept of transactive memory systems (TMS) has been adopted by organizational psychology, information management, and other fields of study to examine team performance in corporate settings. While findings highlight a clear advantage for human teams with TMS, it is not evident whether AI-human teams could also develop such a psychological dynamic. This paper considers AI-human interaction through the lens of TMS and identifies potential opportunities for improvement in this area.
    PROTECT: A Framework to Foster Digital Resilience for Youth Navigating Technology-Facilitated Abuse
    Diana Freed
    Natalie Bazarova
    Dan Cosley
    Patrick Gage Kelley
    Social Sciences Journal, 14(6) (2025)
    Youth are increasingly exposed to a broad range of technology-facilitated abuse that challenges their safety and well-being. Building on previous work that examined youth help-seeking behaviors, coping strategies, threats they encounter, and the social support systems around them, we articulate a framework called PROTECT (Problem recognition, Reaching out, Organizing support, Training, Engaging experts, Continuous support, and Tackling safety measures), which integrates existing models of support, help-seeking, and digital skills to offer a high-level, structured approach to adults who serve as a support system as youth navigate technology-facilitated abuse. The framework unpacks social and contextual dynamics that influence help-seeking behaviors, providing a foundation for educators, advocates, health professionals, developers, and other adult stakeholders to design and develop trauma-informed, timely interventions to promote resilience.
    Global earthquake detection and warning using Android phones
    Marc Stogaitis
    Youngmin Cho
    Richard Allen
    Boone Spooner
    Patrick Robertson
    Micah Berman
    Greg Wimpey
    Robert Bosch
    Nivetha Thiruverahan
    Steve Malkos
    Alexei Barski
    Science, 389 (2025), pp. 254-259
    Earthquake early-warning systems are increasingly being deployed as a strategy to reduce losses in earthquakes, but the regional seismic networks they require do not exist in many earthquake-prone countries. We use the global Android smartphone network to develop an earthquake detection capability, an alert delivery system, and a user feedback framework. Over 3 years of operation, the system detected an average of 312 earthquakes per month with magnitudes from M 1.9 to M 7.8 in Türkiye. Alerts were delivered in 98 countries for earthquakes with M ≥4.5, corresponding to ~60 events and 18 million alerts per month. User feedback shows that 85% of people receiving an alert felt shaking, and 36, 28, and 23% received the alert before, during, and after shaking, respectively. We show how smartphone-based earthquake detection algorithms can be implemented at scale and improved through postevent analysis.
    Neural Speech and Audio Coding
    Minje Kim
    IEEE Signal Processing Magazine, 41 (2025), pp. 85-93
    This paper explores the integration of model-based and data-driven approaches within the realm of neural speech and audio coding systems. It highlights the challenges posed by the subjective evaluation processes of speech and audio codecs and discusses the limitations of purely data-driven approaches, which often require inefficiently large architectures to match the performance of model-based methods. The study presents hybrid systems as a viable solution, offering significant improvements to the performance of conventional codecs through carefully chosen design enhancements. Specifically, it introduces a neural network-based signal enhancer designed to post-process existing codecs’ output, along with autoencoder-based end-to-end models and LPCNet, which are hybrid systems that combine linear predictive coding (LPC) with neural networks. Furthermore, the paper examines predictive models operating within custom feature spaces (TF-Codec) or predefined transform domains (MDCTNet), as well as the use of psychoacoustically calibrated loss functions to train end-to-end neural audio codecs. Through these investigations, the paper demonstrates the potential of hybrid systems to advance the field of speech and audio coding by bridging the gap between traditional model-based approaches and modern data-driven techniques.
    Day-of-the-week Awareness in Time of Day Breakpoints for Traffic Light Plans
    Ori Rottenstreich
    Eliav Buchnik
    Shai Ferster
    Tom Kalvari
    Ron Tsibulsky
    Danny Veikherman
    Jack Haddad
    2025
    Time-of-day breakpoints (TODs) refer to the times over the day in which the plan of a traffic light is changed. Traditionally, TODs are selected jointly for all weekdays (Monday-Friday), typically with additional TODs dedicated to weekends. In this paper, we present an alternative approach motivated by traffic characteristics that can differ among the weekdays Monday-Friday and consider TODs which are day-of-the-week aware. The traffic-aware approach studies similarities among days and computes TODs that can be shared among days with similar characteristics but can also have other forms for weekdays with unique characteristics. Based on traffic properties derived from anonymized trajectories, we apply the new methodology to compute time-of-day breakpoints that are day-of-the-week aware in the city of Rio de Janeiro, Brazil and estimate the impact of the new methodology.
    Summary: "Silent Data Corruption by 10x Test Escapes Threatens Reliable Computing" highlights a critical issue: manufacturing defects, dubbed "test escapes," are evading current testing methods at an alarming rate, ten times higher than industry targets. These defects lead to Silent Data Corruption (SDC), where applications produce incorrect outputs without error indications, costing companies significantly in debugging, data recovery, and service disruptions. The paper proposes a three-pronged approach: quick diagnosis of defective chips directly from system-level behaviors, in-field detection using advanced testing and error detection techniques like CASP, and new, rigorous test experiments to validate these solutions and improve manufacturing testing practices.
    Beyond Digital Literacy: Building Youth Digital Resilience Through Existing “Information Sensibility” Practices
    Mia Hassoun
    Ian Beacock
    Todd Carmody
    Patrick Gage Kelley
    Beth Goldberg
    Devika Kumar
    Laura Murray
    Rebekah Park
    Behzad Sarmadi
    Social Sciences Journal, 14(4) (2025)
    Youth media consumption and disordered eating practices have historically been subjects of moral panics, often resulting in protective, deficit-based interventions like content removal. We argue for interventions which instead equip youth to evaluate and manage risks in their online environments, building upon their existing “information sensibility” practices. Drawing upon ethnographic research and intervention testing with 77 participants in the US and India, we analyze how youth (aged 13–26), including those with diverse political perspectives and those recovering from disordered eating (DE), engage with online news and health information. Participants generally encountered information online through algorithms rather than searching for it, and their engagement was shaped more by social motivations (like belonging) than truth seeking. Participants interpreted online information collaboratively, relying on social cues and peer validation within their online communities. They demonstrated preference for personal testimonies and relatable sources, particularly those with similar social identities. We propose resilience-building interventions that build upon these youth online information practices by: (1) leveraging peer networks, promoting critical information engagement through collaborative learning and peer-to-peer support within online communities; (2) developing social media sensibility, equipping youth to critically evaluate information sources in situ; (3) providing pathways offline, connecting youth to desired in-person communities; and (4) encouraging probabilistic thinking.
    Too many defective compute chips are escaping today’s manufacturing tests – at least an order of magnitude more than industrial targets across all compute chip types in data centers. Silent data corruptions (SDCs) caused by test escapes, when left unaddressed, pose a major threat to reliable computing. We present a three-pronged approach outlining future directions for overcoming test escapes: (a) Quick diagnosis of defective chips directly from system-level incorrect behaviors. Such diagnosis is critical for gaining insights into why so many defective chips escape existing manufacturing testing. (b) In-field detection of defective chips. (c) New test experiments to understand the effectiveness of new techniques for detecting defective chips. These experiments must overcome the drawbacks and pitfalls of previous industrial test experiments and case studies.
    Intuitively, the more complex a software system is, the harder it is to maintain. Statistically, it is not clear which complexity measures correlate with maintenance effort; in fact, it is not even clear how to objectively measure maintenance burden, so that developers’ sentiment and intuition can be supported by numbers. Without effective complexity and maintenance measures, it remains difficult to objectively monitor maintenance, control complexity, or justify refactoring. In this paper, we report a large-scale study of 1200+ projects written in C++ and Java from Google LLC. In this study, we collected three categories of measures: (1) architectural complexity, measured using propagation cost (PC), decoupling level (DL), and structural anti-patterns; (2) maintenance activity, measured using the number of changes, lines of code (LOC) written, and active coding time (ACT) spent on feature-addition vs. bug-fixing; and (3) developer sentiment on complexity and productivity, collected from 7200 survey responses. We statistically analysed the correlations among these measures and obtained significant evidence of the following findings: (1) the more complex the architecture is (higher propagation cost, more instances of anti-patterns), the more LOC is spent on bug-fixing, rather than adding new features; (2) developers who commit more changes for features, spend more lines of code on features, or spend more time on features also feel that they are less hindered by technical debt and complexity. To the best of our knowledge, this is the first large-scale empirical study establishing the statistical correlation among architectural complexity, maintenance activity, and developer sentiment.
    The implication is that, instead of solely relying upon developer sentiment and intuitions to detect degraded structure or increased burden to evolve, it is possible to objectively and continuously measure and monitor architectural complexity and maintenance difficulty, increasing feature delivery efficiency by reducing architectural complexity and anti-patterns.
    Fast Tensor Completion via Approximate Richardson Iteration
    Mehrdad Ghadiri
    Yunbum Kook
    Ali Jadbabaie
    Proceedings of the 42nd International Conference on Machine Learning (2025)
    We study tensor completion (TC) through the lens of low-rank tensor decomposition (TD). Many TD algorithms use fast alternating minimization methods, which solve highly structured linear regression problems at each step (e.g., for CP, Tucker, and tensor-train decompositions). However, such algebraic structure is lost in TC regression problems, making direct extensions unclear. To address this, we propose a lifting approach that approximately solves TC regression problems using structured TD regression algorithms as blackbox subroutines, enabling sublinear-time methods. We theoretically analyze the convergence rate of our approximate Richardson iteration based algorithm, and we demonstrate on real-world tensors that its running time can be 100x faster than direct methods for CP completion.
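For readers unfamiliar with the name, the classical Richardson iteration that this abstract builds on can be sketched in a few lines. This is the plain textbook method for a small linear system, not the paper's lifted, approximate variant for tensor completion. The function name `richardson`, the step size `omega`, and the 2x2 example system are all assumptions chosen for illustration.

```python
def richardson(A, b, omega, iters=200):
    """Plain Richardson iteration for A x = b: x <- x + omega * (b - A x).

    For a symmetric positive definite A this converges when
    0 < omega < 2 / lambda_max(A).
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        # Residual of the current iterate.
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Move along the residual by a fixed step size.
        x = [x[i] + omega * residual[i] for i in range(n)]
    return x


# Hypothetical symmetric positive definite system; its exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = richardson(A, b, omega=0.25)
print(x)
```

The largest eigenvalue of this matrix is about 4.62, so `omega=0.25` sits inside the convergence range. An approximate variant replaces the exact update with an inexact solve at each step, which is the general setting the abstract refers to.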
    Julia's strength in mathematical computation and high execution performance makes it a popular choice across scientific fields. It is a language of choice to implement new numerical algorithms, but it really shines in modelling for optimisation thanks to JuMP.jl and MathOptInterface.jl. These libraries are, first and foremost, made for mathematical optimisation (linear, mixed-integer, conic, etc.), yet they are now generic enough to support more paradigms, such as constraint programming. This talk will introduce the basic principles behind the current implementation of JuMP.jl and explain why and how they are very good matches for modelling using constraint programming, and for solving using any kind of mixed-integer-programming solver. Constraint-programming solvers can also be implemented using linear programming, in a great collaboration between discrete and continuous optimisation. This talk will briefly explain the connection and its implementation in Google’s CP-SAT, a leading, award-winning constraint solver that uses linear programs in its solving process and that will soon be available in Julia too.
    Necro-reaper: Pruning away Dead Memory Traffic in Warehouse-Scale Computers
    Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Association for Computing Machinery (2025)
    Memory bandwidth is emerging as a critical bottleneck in warehouse-scale computing (WSC). This work reveals that a significant portion of memory traffic in WSC is unnecessary, consisting of writebacks of deallocated data and fetches of uninitialized data. This issue is particularly acute in WSC, where short-lived heap allocations bigger than a cache line are prevalent. To address this problem, this work proposes a pragmatic approach tailored to WSC. Leveraging the existing WSC ecosystem of vertical integration, profile-guided compilation flows, and customized memory allocators, this work presents Necro-reaper, a novel software/hardware co-design that avoids dead memory traffic without requiring the hardware tracking of prior work. New ISA instructions enable the hardware to avoid unnecessary dead traffic, while extended software components, including a profile-guided compiler and memory allocator, optimize the utilization of these instructions. Evaluation across a diverse set of 10 WSC workloads demonstrates that Necro-reaper achieves a geomean memory traffic reduction of 26% and a geomean IPC increase of 6%.
    Scalability of Generative AI Models: Challenges and Opportunities in Large-Scale Data Generation and Training
    International Journal of Computer Science and Information Technology Research (IJCSITR) (2025)