Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

people standing in front of a screen with images and a chipboard

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
1 - 15 of 10464 publications
    PageFlex: Flexible and Efficient User-space Delegation of Linux Paging Policies with eBPF
    Kan Wu
    Zhiyuan Guo
    Suli Yang
    Rajath Shashidhara
    Wei Xu
    Alex Snoeren
    Kim Keeton
    2025
    Preview abstract To increase platform memory efficiency, hyperscalers like Google and Meta transparently demote “cold” application data to cheaper cost-per-byte memory tiers like compressed memory and NVMe SSDs. These systems rely on standard kernel paging policies and mechanisms to maximize the achievable memory savings without hurting application performance. Although the literature promises better policies, implementing and deploying them within the Linux kernel is challenging. Delegating policies and mechanisms to user space, through userfaultfd or library-based approaches, incurs overheads and may require modifying application code. We present PageFlex, a framework for delegating Linux paging policies to user space with minimal overhead and full compatibility with existing real-world deployments. PageFlex uses eBPF to delegate policy decisions while providing low-overhead access to in-kernel memory state and access information, thus balancing flexibility and performance. Additionally, PageFlex supports different paging strategies for distinct memory regions and application phases. We show that PageFlex can delegate existing kernel-based policies with little (< 1%) application slowdown, effectively realizing the benefits of state-of-the-art policies like Hyperbolic caching and Leap prefetching, and unlocking application-specific benefits through region- and phase-aware policy specialization. View details
    Preview abstract We study the existence of almost fair and near-optimal solutions to a routing problem as defined in the seminal work of Rosenthal. We focus on the setting where multiple alternative routes are available for each potential request (which corresponds to a potential user of the network). This model captures a collection of diverse applications such as packet routing in communication networks, routing in road networks with multiple alternative routes, and the economics of transportation of goods. Our recommended routes have provable guarantees in terms of both the total cost and fairness concepts such as approximate envy-freeness. We employ and appropriately combine tools from algorithmic game theory and fair division. Our results apply on two distinct models: the splittable case where the request is split among the selected paths (e.g., routing a fleet of trucks) and the unsplittable case where the request is assigned to one of its designated paths (e.g., a single user request). Finally, we conduct an empirical analysis to test the performance of our approach against simpler baselines using the real world road network of New York City. View details
    Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts
    Marc Stogaitis
    Tajinder Gadh
    Richard Allen
    Alexei Barski
    Robert Bosch
    Patrick Robertson
    Youngmin Cho
    Nivetha Thiruverahan
    Aman Raj
    Geophysical Journal International (2025), ggae436
    Preview abstract This paper presents a novel approach for estimating the ground shaking intensity using real-time social media data and CCTV footage. Employing the Gemini 1.5 Pro’s (Reid et al. 2024) model, a multi-modal language model, we demonstrate the ability to extract relevant information from unstructured data utilizing generative AI and natural language processing. The model’s output, in the form of Modified Mercalli Intensity (MMI) values, align well with independent observational data. Furthermore, our results suggest that beyond its advanced visual and auditory understanding abilities, Gemini appears to utilize additional sources of knowledge, including a simplified understanding of the general relationship between earthquake magnitude, distance, and MMI intensity, which it presumably acquired during its training, in its reasoning and decision-making processes. These findings raise intriguing questions about the extent of Gemini's general understanding of the physical world and its phenomena. Gemini’s ability to generate results consistent with established scientific knowledge highlights the potential of LLMs like Gemini in augmenting our understanding of complex physical phenomena such as earthquakes. More specifically, the results of this study highlight the potential of LLMs like Gemini to revolutionize citizen seismology by enabling rapid, effective, and flexible analysis of crowdsourced data from eyewitness accounts for assessing earthquake impact and providing crisis situational awareness. This approach holds a great promise for improving early warning systems, disaster response, and overall resilience in earthquake-prone regions. This study provides a significant step toward harnessing the power of social media and AI for earthquake disaster mitigation. View details
    RemapRoute: Local Remapping of Internet Path Changes
    renata cruz teixeira
    italo cunha
    Elverton Fazzion
    Darryl Veitch
    2025
    Preview abstract Several systems rely on traceroute to track a large number of Internet paths as they change over time. Monitoring systems perform this task by remapping paths periodically or whenever a change is detected. This paper shows that such complete remapping is inefficient, because most path changes are localized to a few hops of a path. We develop RemapRoute, a tool to remap a path locally given the previously known path and a change point. RemapRoute sends targeted probes to locate and remap the often few hops that have changed. Our evaluation with trace-driven simulations and in a real deployment shows that local remapping reduces the average number of probes issued during remapping by 63% and 79%, respectively, when compared with complete remapping. At the same time, our results show that local remapping has little impact on the accuracy of inferred paths. View details
    Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale
    Anil Raghunath Iyer
    Neel Bagora
    Chang Yu
    Olivier Pomerleau
    Vivek Kumar
    Prunthaban Kanthakumar
    Usenix Annual Technical Conference (2025)
    Preview abstract Low-latency message delivery is crucial for real-time systems. Data originating from a producer must be delivered to consumers, potentially distributed in clusters across metropolitan and continental boundaries. With the growing scale of computing, there can be several thousand consumers of the data. Such systems require a robust messaging system capable of transmitting messages containing data across clusters and efficiently delivering them to consumers. The system must offer guarantees like ordering and at-least-once delivery while avoiding overload on consumers, allowing them to consume messages at their own pace. This paper presents the design of Fast ACS (an abbreviation for Ads Copy Service), a file-based ordered message delivery system that leverages a combination of two-sided (inter-cluster) and one-sided (intra-cluster) communication primitives—namely, Remote Procedure Call and Remote Direct Memory Access, respectively—to deliver messages. The system has been successfully deployed to dozens of production clusters and scales to accommodate several thousand consumers within each cluster, which amounts to Tbps-scale intra-cluster consumer traffic at peak. Notably, Fast ACS delivers messages to consumers across the globe within a few seconds or even sub-seconds (p99) based on the message volume and consumer scale, at a low resource cost. View details
    Preview abstract Many AI applications of interest require specialized multi-modal models. Yet, relevant data for training these models is inherently scarce. Human annotation is prohibitively expensive, error-prone, and time-consuming. Meanwhile, existing synthetic data generation methods often rely on manual prompts, evolutionary algorithms, or extensive seed data from the target distribution - limiting scalability and control. In this paper, we introduce Simula, a novel, seedless framework that balances global and local reasoning to generate synthetic datasets. We utilize taxonomies to capture a global coverage space and use a series of agentic refinements to promote local diversity and complexity. Our approach allows users to define desired dataset characteristics through an explainable and controllable process, without relying on seed data. This unlocks new opportunities for developing and deploying AI in domains where data scarcity or privacy concerns are paramount. View details
    Preview abstract Estimating Origin-Destination (OD) travel demand is vital for effective urban planning and traffic management. Developing universally applicable OD estimation methodologies is significantly challenged by the pervasive scarcity of high-fidelity traffic data and the difficulty in obtaining city-specific prior OD estimates (or seed ODs), which are often prerequisite for traditional approaches. Our proposed method directly estimates OD travel demand by systematically leveraging aggregated, anonymized statistics from Google Maps Traffic Trends, obviating the need for conventional census or city-provided OD data. The OD demand is estimated by formulating a single-level, one-dimensional, continuous nonlinear optimization problem with nonlinear equality and bound constraints to replicate highway path travel times. The method achieves efficiency and scalability by employing a differentiable analytical macroscopic network model. This model by design is computationally lightweight, distinguished by its parsimonious parameterization that requires minimal calibration effort and its capacity for instantaneous evaluation. These attributes ensure the method's broad applicability and practical utility across diverse cities globally. Using segment sensor counts from Los Angeles and San Diego highway networks, we validate our proposed approach, demonstrating a two-thirds to three-quarters improvement in the fit to segment count data over a baseline. Beyond validation, we establish the method's scalability and robust performance in replicating path travel times across diverse highway networks, including Seattle, Orlando, Denver, Philadelphia, and Boston. In these expanded evaluations, our method not only aligns with simulation-based benchmarks but also achieves an average 13% improvement in it's ability to fit travel time data compared to the baseline during afternoon peak hours. View details
    Software development is a team sport
    Jie Chen
    Alison Chang
    Rayven Plaza
    Marie Huber
    Claire Taylor
    IEEE Software (2025)
    Preview abstract In this article, we describe our human-centered research focused on understanding the role of collaboration and teamwork in productive software development. We describe creation of a logs-based metric to identify collaboration through observable events and a survey-based multi-item scale to assess team functioning. View details
    Preview abstract Cloud application development faces the inherent challenge of balancing rapid innovation with high availability. This blog post details how Google Workspace's Site Reliability Engineering team addresses this conflict by implementing vertical partitioning of serving stacks. By isolating application servers and storage into distinct partitions, the "blast radius" of code changes and updates is significantly reduced, minimizing the risk of global outages. This approach, which complements canary deployments, enhances service availability, provides flexibility for experimentation, and facilitates data localization. While challenges such as data model complexities and inter-service partition misalignment exist, the benefits of improved reliability and controlled deployments make partitioning a crucial strategy for maintaining robust cloud applications View details
    Preview abstract While large language models (LLMs) have shown promise in diagnostic dialogue, their capabilities for effective management reasoning - including disease progression, therapeutic response, and safe medication prescription - remain under-explored. We advance the previously demonstrated diagnostic capabilities of the Articulate Medical Intelligence Explorer (AMIE) through a new LLM-based agentic system optimised for clinical management and dialogue, incorporating reasoning over the evolution of disease and multiple patient visit encounters, response to therapy, and professional competence in medication prescription. To ground its reasoning in authoritative clinical knowledge, AMIE leverages Gemini's long-context capabilities, combining in-context retrieval with structured reasoning to align its output with relevant and up-to-date clinical practice guidelines and drug formularies. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) study, AMIE was compared to 21 primary care physicians (PCPs) across 100 multi-visit case scenarios designed to reflect UK NICE Guidance and BMJ Best Practice guidelines. AMIE was non-inferior to PCPs in management reasoning as assessed by specialist physicians and scored better in both preciseness of treatments and investigations, and in its alignment with and grounding of management plans in clinical guidelines. To benchmark medication reasoning, we developed RxQA, a multiple-choice question benchmark derived from two national drug formularies (US, UK) and validated by board-certified pharmacists. While AMIE and PCPs both benefited from the ability to access external drug information, AMIE outperformed PCPs on higher difficulty questions. While further research would be needed before real-world translation, AMIE's strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management. View details
    Ransomware over Modern Web Browsers: A Novel Strain and A New Defense Mechanism
    Harun Oz
    Ahmet Aris
    Leonardo Babun
    Selcuk Uluagac
    Abbas Acar
    ACM Transactions on the Web (2025)
    Preview abstract Ransomware is an increasingly prevalent form of malware targeting end-users, governments, and businesses. As it has evolved, adversaries added new capabilities to their arsenal. Throughout the ransomware evolution, the adversaries propose a next-generation browser-based ransomware, RøB, that performs its malicious actions via emerging web technologies, File System Access API (FSA) and WebAssembly (Wasm). RøB uses this API through the victims’ browsers; hence, it does not require the victims to download and install malicious binaries. We performed extensive evaluations with 3 different OSs, 23 file formats, 29 distinct directories, 5 cloud providers, and 4 antivirus solutions. Our evaluations show that RøB can encrypt various types of files in the local and cloud-integrated directories, external storage devices, and network-shared folders of victims. Our experiments also reveal that popular cloud solutions, Box Individual and Apple iCloud can be severely affected by RøB. Moreover, we conducted tests with commercial antivirus software such as AVG, Avast, Kaspersky, Malware Bytes that perform sensitive directory and suspicious behavior monitoring against ransomware. We verified that RøB can evade these antivirus software and encrypt victim files. Moreover, existing ransomware detection solutions in the literature also cannot be a remedy against RøB due to its distinct features. Therefore, in this paper, we also propose broguard, a new detection system for RøB-like attacks. broguard monitors the web applications that use the FSA API via function hooking and uses a machine learning classifier to detect RøB-like attacks in real-time without any file loss. Performance evaluations of broguard on a comprehensive dataset show that broguard can detect RøB-like browser-based ransomware attacks with over 99% accuracy and minimal overhead. View details
    Performance of a Deep Learning Diabetic Retinopathy Algorithm in India
    Arthur Brant
    Xiang Yin
    Lu Yang
    Divleen Jeji
    Sunny Virmani
    Anchintha Meenu
    Naresh Babu Kannan
    Florence Thng
    Lily Peng
    Ramasamy Kim
    JAMA Network Open (2025)
    Preview abstract Importance: While prospective studies have investigated the accuracy of artificial intelligence (AI) for detection of diabetic retinopathy (DR) and diabetic macular edema (DME), to date, little published data exist on the clinical performance of these algorithms. Objective: To evaluate the clinical performance of an automated retinal disease assessment (ARDA) algorithm in the postdeployment setting at Aravind Eye Hospital in India. Design, Setting, and Participants: This cross-sectional analysis involved an approximate 1% sample of fundus photographs from patients screened using ARDA. Images were graded via adjudication by US ophthalmologists for DR and DME, and ARDA’s output was compared against the adjudicated grades at 45 sites in Southern India. Patients were randomly selected between January 1, 2019, and July 31, 2023. Main Outcomes and Measures: Primary analyses were the sensitivity and specificity of ARDA for severe nonproliferative DR (NPDR) or proliferative DR (PDR). Secondary analyses focused on sensitivity and specificity for sight-threatening DR (STDR) (DME or severe NPDR or PDR). Results: Among the 4537 patients with 4537 images with adjudicated grades, mean (SD) age was 55.2 (11.9) years and 2272 (50.1%) were male. Among the 3941 patients with gradable photographs, 683 (17.3%) had any DR, 146 (3.7%) had severe NPDR or PDR, 109 (2.8%) had PDR, and 398 (10.1%) had STDR. ARDA’s sensitivity and specificity for severe NPDR or PDR were 97.0% (95% CI, 92.6%-99.2%) and 96.4% (95% CI, 95.7%-97.0%), respectively. Positive predictive value (PPV) was 50.7% and negative predictive value (NPV) was 99.9%. The clinically important miss rate for severe NPDR or PDR was 0% (eg, some patients with severe NPDR or PDR were interpreted as having moderate DR and referred to clinic). ARDA’s sensitivity for STDR was 95.9% (95% CI, 93.0%-97.4%) and specificity was 94.9% (95% CI, 94.1%-95.7%); PPV and NPV were 67.9% and 99.5%, respectively. Conclusions and Relevance: In this cross-sectional study investigating the clinical performance of ARDA, sensitivity and specificity for severe NPDR and PDR exceeded 96% and caught 100% of patients with severe  NPDR and PDR for ophthalmology referral. This preliminary large-scale postmarketing report of the performance of ARDA after screening 600 000 patients in India underscores the importance of monitoring and publication an algorithm's clinical performance, consistent with recommendations by regulatory bodies. View details
    Circadian rhythm of heart rate and activity: a cross-sectional study
    Maryam Khalid
    Logan Schneider
    Aravind Natarajan
    Conor Heneghan
    Karla Gleichauf
    Chronobiology International (2025)
    Preview abstract ABSTRACT Background: Circadian rhythms are commonly observed in a number of physiological processes. Consumer wearable devices have made it possible to obtain continuous time series data from a large number of individuals. We study circadian rhythms from measurements of heart rate, movement, and sleep, from a cohort of nearly 20,000 participants over the course of 30 days. Methods: Participation was restricted to Fitbit users of age 21 years or older residing in the United States or Canada. Participants were enrolled through a recruitment banner shown on the Fitbit App. The advertisement was shown to 531,359 Fitbit users, and 23,239 enrolled in the program. Of these, we obtained heart rate data from 19,350 participants. We obtain the underlying circadian rhythm from time series heart rate by modeling the circadian rhythm as a sum over the first two Fourier harmonics. The first Fourier harmonic accounts for the 24-hour rhythmicity, while the second harmonic accounts for non-sinusoidal perturbations. Findings: We observe a circadian rhythm in both heart rate and acceleration. From the diurnal modulation, we obtain the following circadian parameters: (i) amplitude of modulation, (ii) bathyphase, (iii) acrophase, (iv) non-sinusoidal fraction, and (v) fraction of day when the heart rate is greater than the mean. The amplitude, bathyphase, and acrophase depend on sex, and decrease with age. The waketime on average, follows the bathyphase by 2.4 hours. In most individuals, the circadian rhythm of heart rate lags the circadian rhythm of activity. Interpretation: Circadian metrics for heart rate and activity can be reliably obtained from commercially available wearable devices. Distributions of circadian metrics can be valuable tools for individual-level interpretation. View details
    Preview abstract Natural disasters, including earthquakes, wildfires and cyclones, bear a huge risk on human lives as well as infrastructure assets. An effective response to disaster depends on the ability to rapidly and efficiently assess the intensity of damage. Artificial Intelligence (AI) and Generative Artificial Intelligence (GenAI) presents a breakthrough solution, capable of combining knowledge from multiple types and sources of data, simulating realistic scenarios of disaster, and identifying emerging trends at a speed previously unimaginable. In this paper, we present a comprehensive review on the prospects of AI and GenAI in damage assessment for various natural disasters, highlighting both its strengths and limitations. We talk about its application to multimodal data such as text, image, video, and audio, and also cover major issues of data privacy, security, and ethical use of the technology during crises. The paper also recognizes the threat of Generative AI misuse, in the form of dissemination of misinformation and for adversarial attacks. Finally, we outline avenues of future research, emphasizing the need for secure, reliable, and ethical Generative AI systems for disaster management in general. We believe that this work represents the first comprehensive survey of Gen-AI techniques being used in the field of Disaster Assessment and Response. View details
    Fast Tensor Completion via Approximate Richardson Iteration
    Mehrdad Ghadiri
    Yunbum Kook
    Ali Jadbabaie
    Proceedings of the 42nd International Conference on Machine Learning (2025)
    Preview abstract We study tensor completion (TC) through the lens of low-rank tensor decomposition (TD). Many TD algorithms use fast alternating minimization methods, which solve highly structured linear regression problems at each step (e.g., for CP, Tucker, and tensor-train decompositions). However, such algebraic structure is lost in TC regression problems, making direct extensions unclear. To address this, we propose a lifting approach that approximately solves TC regression problems using structured TD regression algorithms as blackbox subroutines, enabling sublinear-time methods. We theoretically analyze the convergence rate of our approximate Richardson iteration based algorithm, and we demonstrate on real-world tensors that its running time can be 100x faster than direct methods for CP completion. View details