Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10383 publications
    Ransomware over Modern Web Browsers: A Novel Strain and A New Defense Mechanism
    Harun Oz
    Ahmet Aris
    Leonardo Babun
    Selcuk Uluagac
    Abbas Acar
    ACM Transactions on the Web (2025)
    Abstract: Ransomware is an increasingly prevalent form of malware targeting end-users, governments, and businesses. As it has evolved, adversaries added new capabilities to their arsenal. Throughout the ransomware evolution, the adversaries propose a next-generation browser-based ransomware, RøB, that performs its malicious actions via emerging web technologies: the File System Access (FSA) API and WebAssembly (Wasm). RøB uses this API through the victims’ browsers; hence, it does not require the victims to download and install malicious binaries. We performed extensive evaluations with 3 different OSs, 23 file formats, 29 distinct directories, 5 cloud providers, and 4 antivirus solutions. Our evaluations show that RøB can encrypt various types of files in the local and cloud-integrated directories, external storage devices, and network-shared folders of victims. Our experiments also reveal that popular cloud solutions, Box Individual and Apple iCloud, can be severely affected by RøB. Moreover, we conducted tests with commercial antivirus software such as AVG, Avast, Kaspersky, and Malwarebytes that perform sensitive directory and suspicious behavior monitoring against ransomware. We verified that RøB can evade these antivirus products and encrypt victim files. Moreover, existing ransomware detection solutions in the literature also cannot be a remedy against RøB due to its distinct features. Therefore, in this paper, we also propose broguard, a new detection system for RøB-like attacks. broguard monitors the web applications that use the FSA API via function hooking and uses a machine learning classifier to detect RøB-like attacks in real-time without any file loss. Performance evaluations of broguard on a comprehensive dataset show that broguard can detect RøB-like browser-based ransomware attacks with over 99% accuracy and minimal overhead.
    Abstract: Users of routing services like Apple Maps, Google Maps, and Waze frequently wonder why a given route is proposed. This question particularly arises when dynamic conditions like traffic and road closures cause unusual routes to be proposed. While many such dynamic conditions may exist in a road network at any time, only a small fraction of those conditions are typically relevant to a given user's route. In this work, we give a simple algorithm that identifies a small set of traffic-laden road segments that answer the following question: Which traffic conditions cause a particular shortest traffic-aware route to differ from the shortest traffic-free route? We theoretically and experimentally show that our algorithm generates small and interpretable answers to this question.
    Development and Evaluation of ML Models for Cardiotocography Interpretation
    Nicole Chiou
    Nichole Young-Lin
    Abdoulaye Diack
    Christopher Kelly
    Sanmi Koyejo
    NPJ Women's Health (2025)
    Abstract: The inherent variability in the visual interpretation of cardiotocograms (CTGs) by obstetric clinical experts, both intra- and inter-observer, presents a substantial challenge in obstetric care. In response, we investigate automated CTG interpretation as a potential solution to enhance the early detection of fetal hypoxia during labor, thereby reducing unnecessary operative interventions and improving overall maternal and neonatal care. This study employs deep learning techniques to reduce the subjectivity associated with visual CTG interpretation. Our results demonstrate that employing objective cord blood pH measurements, rather than clinician-defined Apgar scores, yields more consistent and robust model performance. Additionally, through a series of ablation studies, we investigate the impact of temporal distribution shifts on the performance of these deep learning models. We examine tradeoffs between performance and fairness, specifically evaluating performance across demographic and clinical subgroups. Finally, we discuss the practical implications of our findings for the real-world deployment of such systems, emphasizing their potential utility in medical settings with limited resources.
    Improving simulation-based origin-destination demand calibration using sample segment counts data
    Arwa Alanqary
    Yechen Li
    The 12th Triennial Symposium on Transportation Analysis conference (TRISTAN XII), Okinawa, Japan (2025) (to appear)
    Abstract: This paper introduces a novel approach to demand estimation that utilizes partial observations of segment-level track counts. Building on established simulation-based demand estimation methods, we present a modified formulation that integrates sample track counts as a regularization term. This approach effectively addresses the underdetermination challenge in demand estimation, moving beyond the conventional reliance on a prior OD matrix. The proposed formulation aims to preserve the distribution of the observed track counts while optimizing the demand to align with observed path-level travel times. We tested this approach on Seattle's highway network with various congestion levels. Our findings reveal significant enhancements in the solution quality, particularly in accurately recovering ground truth demand patterns at both the OD and segment levels.
    Abstract: We study the existence of almost fair and near-optimal solutions to a routing problem as defined in the seminal work of Rosenthal. We focus on the setting where multiple alternative routes are available for each potential request (which corresponds to a potential user of the network). This model captures a collection of diverse applications such as packet routing in communication networks, routing in road networks with multiple alternative routes, and the economics of transportation of goods. Our recommended routes have provable guarantees in terms of both the total cost and fairness concepts such as approximate envy-freeness. We employ and appropriately combine tools from algorithmic game theory and fair division. Our results apply to two distinct models: the splittable case where the request is split among the selected paths (e.g., routing a fleet of trucks) and the unsplittable case where the request is assigned to one of its designated paths (e.g., a single user request). Finally, we conduct an empirical analysis to test the performance of our approach against simpler baselines using the real-world road network of New York City.
    Abstract: We consider the Coalition Structure Learning (CSL) problem in multi-agent systems, motivated by the existence of coalitions in many real-world systems, e.g., trading platforms and auction systems. In this problem, there is a hidden coalition structure within a set of $n$ agents, which affects the behavior of the agents in games. Our goal is to actively design a sequence of games for the agents to play, such that observations in these games can be used to learn the hidden coalition structure. In particular, we consider the setting where in each round, we design and present a game together with a strategy profile to the agents, and receive a multiple-bit observation -- for each agent, we observe whether or not they would like to deviate from the specified strategy in this given game. Our contributions are three-fold: First, we show that we can learn the coalition structure in $O(\log n)$ rounds if we are allowed to choose any normal-form game in each round, matching the information-theoretical lower bound, and the result can be extended to congestion games. Second, in a more restricted setting where we can only choose a graphical game with degree limit $d$, we develop an algorithm to learn the coalition structure in $O(n/d+\log d)$ rounds. Third, when we can only learn the coalition structure through running second-price auctions with personalized reserve prices, we show that the coalition structure can be learned in $O(c\log n)$ rounds, where $c$ is the size of the largest coalition.
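The logarithmic lower bound mentioned in this abstract can be motivated by a standard counting argument, sketched here under stated assumptions; it is not necessarily the paper's own proof. The sketch assumes a deterministic learner that receives exactly $n$ bits per round (one per agent). The symbols $B_n$ (the number of ways to partition $n$ agents into coalitions) and $r$ (the number of rounds) are notation introduced for this illustration.

```latex
% After r rounds the learner has seen at most 2^{nr} distinct transcripts,
% and it needs a different transcript for each possible coalition structure:
2^{nr} \;\ge\; B_n
\quad\Longrightarrow\quad
r \;\ge\; \frac{\log_2 B_n}{n}.

% A crude bound on B_n: put the first \lfloor n/2 \rfloor agents in separate
% coalitions, then let each of the remaining \lceil n/2 \rceil agents join
% any one of them. Every choice gives a different partition, so
B_n \;\ge\; \lfloor n/2 \rfloor^{\lceil n/2 \rceil}
\quad\Longrightarrow\quad
r \;\ge\; \tfrac{1}{2}\,\log_2 \lfloor n/2 \rfloor \;=\; \Omega(\log n).
```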
    Abstract: Recent work suggested utilizing inference compute, showing that scaling the number of samples consistently improves the fraction of problems solved by any attempt, namely the coverage. In this work, we suggest that inference scaling gains should be compared with proper baselines, as some datasets become degenerate when allowing a large number of attempts. We focus on two domains, mathematical reasoning and factual knowledge, showing that for the MATH and Entity Questions datasets, informed answer enumeration obtains similar or even better results than repeated model sampling, with a much lower sample budget. While we believe that inference scaling is a promising approach for unlocking the potential of language models, we recommend carefully selecting models and datasets when applying this method. Otherwise, the results of inference scaling should be interpreted with caution.
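To make the coverage metric in this abstract concrete, here is a minimal Python sketch. It assumes each problem comes with an ordered list of per-attempt correctness flags; the function name `coverage` and the input layout are hypothetical and not taken from the paper.

```python
from typing import Sequence


def coverage(results: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of problems solved by at least one of the first k attempts.

    results[i][j] is True when attempt j on problem i was correct.
    This data layout is an assumption made for illustration.
    """
    if not results:
        raise ValueError("results must contain at least one problem")
    if k < 1:
        raise ValueError("k must be at least 1")
    # A problem counts as solved if any of its first k attempts is correct.
    solved = sum(1 for attempts in results if any(attempts[:k]))
    return solved / len(results)


# Toy data: four problems, three attempts each.
toy_results = [
    [False, True, False],
    [False, False, False],
    [True, False, False],
    [False, False, True],
]
```

On the toy data, `coverage(toy_results, k)` cannot decrease as `k` grows, which is why the abstract argues that such gains should be compared against a baseline given the same attempt budget.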
    Google's Approach for Secure AI Agents
    Santiago (Sal) Díaz
    Kara Olive
    Google (2025)
    Abstract: As part of Google's ongoing efforts to define best practices for secure AI systems, we’re sharing our aspirational framework for secure AI agents. We advocate for a hybrid, defense-in-depth strategy that combines the strengths of traditional, deterministic security controls with dynamic, reasoning-based defenses. This approach is grounded in three core principles: agents must have well-defined human controllers, their powers must be carefully limited, and their actions and planning must be observable. This paper reflects our current thinking and the direction of our efforts as we work towards ensuring that AI agents can be powerful, useful, and secure by default.
    Abstract: Multimodal models represent a significant advancement in Artificial Intelligence. A single model is trained to understand unstructured modalities: text, image, video, and audio. Open-source variants of multimodal models have made these breakthroughs more accessible. ML practitioners adopt, finetune, and deploy open-source models in real-world applications. However, considering the vast landscape of adversarial attacks across these modalities, these models also inherit vulnerabilities of all the modalities, and eventually, the adversarial threat amplifies. While broad research is available on possible attacks within or across these modalities, a practitioner-focused view of outlining attack types remains absent in the multimodal world. This paper addresses the gap by surveying adversarial attacks targeting all four modalities: text, image, video, and audio. This survey provides a view of the adversarial attack landscape and presents how multimodal adversarial threats have evolved. To the best of our knowledge, this survey is the first comprehensive summarization of the threat landscape in the multimodal world.
    Abstract: Generative AI (GenAI), particularly Large Language Models (LLMs), offers powerful capabilities for interpreting the complex data landscape in healthcare. In this paper, we present a comprehensive overview of the capabilities, requirements and applications of GenAI for deriving clinical insights and improving clinical efficiency. We first provide some background on the forms and sources of patient data, namely real-time Remote Patient Monitoring (RPM) streams and traditional Electronic Health Records (EHR). The sheer volume and heterogeneity of this combined data present significant challenges to clinicians and contribute to information overload. In addition, we explore the potential of LLM-powered applications for improving clinical efficiency. These applications can enhance navigation of longitudinal patient data and provide actionable clinical decision support through natural language dialogue. We discuss the opportunities this presents for streamlining clinician workflows and personalizing care, alongside critical challenges such as data integration complexity, ensuring data quality and RPM data reliability, maintaining patient privacy, validating AI outputs for clinical safety, mitigating bias, and ensuring clinical acceptance. We believe this work represents the first summarization of GenAI techniques for managing clinician data overload due to combined RPM / EHR data complexities.
    ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
    Alexander Immer
    Alex Bo-Yuan Chen
    Mariela D. Petkova
    Nirmala A. Iyer
    Luuk Willem Hesselink
    Aparna Dev
    Gudrun Ihrke
    Woohyun Park
    Alyson Petruncio
    Aubrey Weigel
    Wyatt Korff
    Florian Engert
    Jeff W. Lichtman
    Misha B. Ahrens
    International Conference on Learning Representations (ICLR) (2025)
    Abstract: Data-driven benchmarks have led to significant progress in key scientific modeling domains including weather and structural biology. Here, we present the Zebrafish Activity Prediction Benchmark (ZAPBench), which quantitatively measures progress on the problem of predicting cellular-resolution neural activity throughout an entire vertebrate brain. The benchmark is based on a novel dataset containing 4D light-sheet microscopy recordings of more than 70,000 neurons in a larval zebrafish brain, along with motion-stabilized and voxel-level cell segmentations of these data that facilitate development of a variety of forecasting methods. Initial results from a selection of time series and volumetric video modeling approaches achieve better performance than naive baseline methods, but also show room for further improvement. The specific brain used in the activity recording is also undergoing synaptic-level anatomical mapping, which will enable future integration of detailed structural information into ZAP forecasting methods.
    Toward Sensor-In-the-Loop LLM Agent: Benchmarks and Implications
    Zhiwei Ren
    Junbo Li
    Minjia Zhang
    Di Wang
    Longfei Shangguan
    SenSys 2025 - The 23rd ACM Conference on Embedded Networked Sensor Systems (2025)
    Abstract: This paper advocates for sensor-informed personal agents that can take advantage of sensor hints on wearables to enhance the personal agent's response. We demonstrate that such a sensor-in-the-loop design paradigm can be easily integrated into existing LLM agents by building a prototype named WellMax based on existing well-developed techniques such as structured prompt tuning and few-shot prompting. The head-to-head comparison with a non-sensor-informed agent across five use scenarios demonstrates that this sensor-in-the-loop design can better address users' needs and improve their overall experience. The deep-dive into agents' replies and participants' feedback further reveals that sensor-in-the-loop agents not only provide more contextually relevant responses but also exhibit a greater understanding of user priorities and situational nuances. We further conduct two case studies to examine the potential pitfalls and distill key insights from this sensor-in-the-loop agent. We believe this work sets the stage for more intelligent, empathetic, and effective interactions in future AI-driven personal assistants.
    HueManity: Probing Fine-Grained Visual Perception in MLLMs
    Rynaa Grover
    Jayant Tamarapalli
    Sahiti Yerramilli
    Nilay Pande
    (2025)
    Abstract: Multimodal Large Language Models (MLLMs) excel at high-level visual reasoning, but their performance on nuanced perceptual tasks remains surprisingly limited. We present HueManity, a benchmark designed to assess visual perception in MLLMs. The dataset comprises 83,850 images featuring two-character alphanumeric strings embedded in Ishihara test style dot patterns, challenging models on precise pattern recognition. Our evaluation of nine state-of-the-art MLLMs on HueManity demonstrates a significant performance deficit compared to human and traditional computer vision baselines. The best-performing MLLM achieved 33.6% accuracy on the numeric "easy" task and a striking 3% on the alphanumeric "hard" task. In contrast, human participants achieved near-perfect scores (100% and 95.6%), and a fine-tuned ResNet50 model reached accuracies of 96.5% and 94.5%. These results highlight a critical gap in the visual capabilities of current MLLMs. Our analysis further explores potential architectural and training-paradigm factors contributing to this perceptual gap in MLLMs. We will open-source the HueManity dataset and code to foster further research in improving perceptual robustness of MLLMs.
    Abstract: Specific quantum algorithms exist that can, in theory, break elliptic curve cryptographic protocols. Implementing these algorithms requires designing quantum circuits that perform elliptic curve arithmetic. To accurately judge a cryptographic protocol’s resistance against future quantum computers, researchers figure out minimal resource-count circuits for performing these operations while still being correct. To assure the correctness of a circuit, it is integral to restore all ancilla qubits used to their original states. Failure to do so could result in decoherence of the computation’s final result. Through rigorous classical simulation and unit testing, I surfaced four inconsistencies in the state-of-the-art quantum circuit for elliptic curve point addition where the circuit diagram states the qubits are returned in the original (|0⟩) state, but the intermediate values are not uncomputed. I provide fixes to the circuit without increasing the leading-order gate cost.