Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
1–15 of 10,795 publications
Abstract
AI coding assistants are rapidly becoming integral to modern software development. A key challenge in this space is the continual need to migrate and modernize codebases in response to evolving software ecosystems. Traditionally, such migrations have relied on rule-based systems and human intervention. With the advent of powerful large language models (LLMs), AI-driven agentic frameworks offer a promising alternative, but their effectiveness remains underexplored. In this paper, we introduce FreshBrew, a novel benchmark for evaluating AI-based agentic frameworks on project-level Java migrations. We benchmark several such frameworks, powered by state-of-the-art LLMs, and compare their performance against established rule-based tools. Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, Gemini 2.5 Flash, successfully migrates 56.5% of projects to JDK 17. Our empirical analysis reveals the critical strengths and limitations of current agentic approaches, offering actionable insights into their real-world applicability. By releasing FreshBrew publicly upon acceptance, we aim to facilitate rigorous, reproducible evaluation and catalyze progress in AI-driven codebase modernization.
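The abstract does not spell out the success criterion beyond "migrate to JDK 17"; a minimal sketch of one plausible project-level check (build and test under the target JDK, assuming Maven projects; all names here are hypothetical, not the paper's harness) might look like:

```python
import os
import subprocess

def migration_succeeded(repo_dir: str, jdk17_home: str) -> bool:
    """Hypothetical check: a migration counts as successful if the project
    compiles and its test suite passes when built under JDK 17."""
    env = dict(os.environ, JAVA_HOME=jdk17_home)  # point the build at JDK 17
    result = subprocess.run(["mvn", "-q", "verify"], cwd=repo_dir, env=env)
    return result.returncode == 0
```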
Abstract
For many practical applications of quantum computing, the slowest and most costly steps involve coherently accessing classical data. We help address this challenge by applying mass production techniques, which can sometimes allow us to perform an operation many times in parallel for a cost comparable to that of a single execution [1-3]. We combine existing mass-production results with modern approaches for loading classical data using "quantum read-only memory." We show that quantum mass-production techniques offer no benefit under a cost model that focuses purely on the number of non-Clifford gates. However, analyzing the constant factors in a more nuanced cost model, we find that it may be possible to obtain a reduction in cost of an order of magnitude or more for a variety of reasonably sized fault-tolerant quantum algorithms. We present several applications of quantum mass-production techniques beyond naive parallelization, including a strategy for reducing the cost of serial calls to the same data-loading step.
Abstract
Despite the advent of legislation such as the General Data Protection Regulation (GDPR) with its associated "Right to be Forgotten" (RTBF), few, if any, studies have measured user reactions to realistic edge cases involving public-interest content. Surveying both users covered by and excluded from RTBF, this vignette-based survey experiment sought to better understand how users think about delisting content from search engine results and what factors influence user perceptions. While leaving information accessible in search engine results generally leads to warmer feelings towards those search engines than delisting it, we find that users do prefer different outcomes depending on contextual elements specific to given cases. We also find that whether a country has active RTBF legislation does seem to be associated with both knowledge of and attitudes towards RTBF, but is unlikely to explain these differences entirely. These results indicate a complex context around removing public-interest content from search engines' results; it is essential that experts sensitive to local context perform the review, to ensure that removal requests are handled in a way that meets users' expectations.
An Empirical Study of Time of Day Breakpoints in Traffic Light Plans
Eliav Buchnik
Tom Kalvari
Jack Haddad
Dan Karliner
Danny Veikherman
Shai Ferster
Ori Rottenstreich
2025
Abstract
The fixed-time strategy is a common approach to traffic signal control in which signal plans are simple and periodic, and can be implemented easily without detection mechanisms. A traffic light is associated with several daily plans, each applied over several consecutive hours. Time-of-day breakpoints (TODs) are the times of day at which the plan changes. TODs are often selected based on traffic, aiming to divide the day into groups of consecutive hours with similar traffic characteristics within each group. We present a methodology for studying time-of-day breakpoints in practice, and use it to estimate and analyze TODs in the city of Rio de Janeiro, Brazil, based on traffic properties derived from traffic trajectories. Our study covers over 900 of the city's intersections. We examine properties such as the number of daily plans and the times at which plans start. We also provide traffic-aware insights on the potential for improving the selection of TODs, and identify key intersections where adjusting TODs could reduce average delay times; we find potential improvements at over 8% of the examined intersections. These findings provide valuable insights for traffic engineers seeking to optimize signal timing.
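As an illustration of the segmentation idea above (not the paper's estimation methodology, which works from observed trajectories), here is a minimal dynamic-programming sketch that splits 24 hourly volumes, a stand-in for richer traffic characteristics, into contiguous plans:

```python
import numpy as np

def tod_breakpoints(hourly_volume: np.ndarray, n_plans: int) -> list:
    """Split 24 hourly traffic volumes into `n_plans` contiguous groups,
    minimizing total within-group squared deviation. Returns the hours
    at which each plan starts (the time-of-day breakpoints)."""
    n = len(hourly_volume)
    # cost[i][j]: squared deviation of hours i..j-1 around their mean
    cost = np.zeros((n + 1, n + 1))
    for i in range(n):
        for j in range(i + 1, n + 1):
            seg = hourly_volume[i:j]
            cost[i][j] = ((seg - seg.mean()) ** 2).sum()
    # dp[k][j]: best cost of covering hours 0..j-1 with k plans
    dp = np.full((n_plans + 1, n + 1), np.inf)
    dp[0][0] = 0.0
    back = np.zeros((n_plans + 1, n + 1), dtype=int)
    for k in range(1, n_plans + 1):
        for j in range(1, n + 1):
            for i in range(j):
                c = dp[k - 1][i] + cost[i][j]
                if c < dp[k][j]:
                    dp[k][j], back[k][j] = c, i
    # Walk back through the table to recover each plan's start hour.
    bps, j = [], n
    for k in range(n_plans, 0, -1):
        j = int(back[k][j])
        bps.append(j)
    return sorted(bps)  # plan start hours, e.g. [0, 6, 10, 17]

# Toy daily profile with morning and evening peaks:
volumes = np.array([120., 80, 60, 55, 70, 180, 420, 650, 600, 480, 430, 450,
                    470, 460, 440, 480, 560, 680, 640, 450, 320, 260, 200, 150])
print(tod_breakpoints(volumes, 4))
```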
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Wenda Xu
Dhruv Madeka
Lei Li
William Wang
Rishabh Agarwal
2025
Abstract
Recent advances in knowledge distillation (KD) have enabled smaller student models to approach the performance of larger teacher models. However, popular methods such as supervised KD and on-policy KD are adversely impacted by knowledge gaps between teacher and student in practical scenarios. Supervised KD suffers from a distribution mismatch between training on a static dataset and inference over the student's own generated outputs. Conversely, on-policy KD, which uses student-generated samples for training, can suffer from low-quality training examples with which teacher models are not familiar, resulting in inaccurate teacher feedback. To address these limitations, we introduce Speculative Knowledge Distillation (SKD), a novel approach that leverages cooperation between student and teacher models to generate high-quality training data on the fly while aligning with the student's inference-time distribution. In SKD, the student proposes tokens, and the teacher replaces poorly ranked ones based on its own distribution, adaptively transferring high-quality knowledge. We evaluate SKD on various text generation tasks, including translation, summarization, math, and instruction following, and show that SKD consistently outperforms existing KD methods across different domains, data sizes, and model initialization strategies.
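A minimal sketch of the interleaved sampling loop described above, assuming Hugging Face-style causal LMs; the top-k acceptance rule and all parameter values here are illustrative assumptions, not the paper's exact criterion:

```python
import torch

@torch.no_grad()
def skd_generate(student, teacher, input_ids, max_new_tokens=64, top_k=25):
    """Interleaved sampling sketch: the student proposes each token; if the
    proposal falls outside the teacher's top-k, the teacher resamples it.
    The resulting sequences (with teacher feedback) would then serve as the
    on-the-fly distillation data."""
    ids = input_ids  # shape (1, seq_len)
    for _ in range(max_new_tokens):
        s_probs = torch.softmax(student(ids).logits[:, -1, :], dim=-1)
        proposal = torch.multinomial(s_probs, num_samples=1)
        t_probs = torch.softmax(teacher(ids).logits[:, -1, :], dim=-1)
        top = torch.topk(t_probs, top_k, dim=-1).indices
        if (proposal == top).any():   # teacher accepts the student's token
            next_id = proposal
        else:                         # teacher replaces a poorly ranked token
            next_id = torch.multinomial(t_probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=-1)
    return ids
```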
Improving simulation-based origin-destination demand calibration using sample segment counts data
Arwa Alanqary
Yechen Li
The 12th Triennial Symposium on Transportation Analysis (TRISTAN XII), Okinawa, Japan (2025)
Abstract
This paper introduces a novel approach to demand estimation that utilizes partial observations of segment-level track counts. Building on established simulation-based demand estimation methods, we present a modified formulation that integrates sample track counts as a regularization term. This approach effectively addresses the underdetermination challenge in demand estimation, moving beyond the conventional reliance on a prior OD matrix. The proposed formulation aims to preserve the distribution of the observed track counts while optimizing the demand to align with observed path-level travel times. We tested this approach on Seattle's highway network with various congestion levels. Our findings reveal significant enhancements in the solution quality, particularly in accurately recovering ground truth demand patterns at both the OD and segment levels.
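A minimal sketch of the modified objective this abstract describes; the symbol names and the squared-error forms are our illustrative assumptions, not the paper's exact formulation:

```python
def calibration_objective(demand, simulate, times_obs, counts_obs, lam=1.0):
    """demand:     candidate OD demand vector being optimized
    simulate:   black-box traffic simulator returning (path-level travel
                times, counts on the sampled segments) for a given demand
    times_obs:  observed path-level travel times
    counts_obs: observed sample segment (track) counts"""
    times_sim, counts_sim = simulate(demand)
    fit = ((times_sim - times_obs) ** 2).sum()    # align with travel times
    reg = ((counts_sim - counts_obs) ** 2).sum()  # track-count regularizer
    return fit + lam * reg  # lam sets the regularization strength
```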
PolyPath: Adapting a Large Multimodal Model for Multislide Pathology Report Generation
Lin Yang
Shravya Shetty
Tiam Jaroensri
Faruk Ahmed
Daniel Golden
Modern Pathology (2025)
Abstract
The interpretation of histopathology cases underlies many important diagnostic and treatment decisions in medicine. Notably, this process typically requires pathologists to integrate and summarize findings across multiple slides per case. Existing vision-language capabilities in computational pathology have so far been largely limited to small regions of interest, larger regions at low magnification, or single whole-slide images (WSIs). This limits interpretation of findings that span multiple high-magnification regions across multiple WSIs. By making use of Gemini 1.5 Flash, a large multimodal model with a 1-million token context window, we demonstrate the ability to generate bottom-line diagnoses from up to 40,000 image patches of size 768 × 768 pixels from multiple WSIs at 10× magnification. This is the equivalent of up to 11 hours of video at 1 fps. Expert pathologist evaluations demonstrate that the generated report text is clinically accurate and equivalent to or preferred over the original reporting for 68% (95% CI, 60%-76%) of multi-slide examples with up to 5 slides. Although performance decreased for examples with ≥6 slides, this study demonstrates the promise of leveraging the long-context capabilities of modern large multimodal models for the uniquely challenging task of medical report generation where each case can contain thousands of image patches.
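The quoted video equivalence is straightforward arithmetic: one patch per frame at 1 fps gives

$$40{,}000\ \text{frames} \times \frac{1\ \text{s}}{\text{frame}} = 40{,}000\ \text{s} \approx 11.1\ \text{hours}.$$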
Abstract
Suppose Alice has a distribution ${P}$ and Bob has a distribution ${Q}$. Alice wants to draw a sample $a\sim {P}$ and Bob a sample $b \sim {Q}$ such that $a = b$ with as high of probability as possible. It is well-known that, by sampling from an optimal coupling between the distributions, Alice and Bob can achieve $\Pr[a = b] = 1 - D_{\text{tv}}({P},{Q})$, where $D_{\text{tv}}({P},{Q})$ is the total variation distance between ${P}$ and ${Q}$.
What if Alice and Bob must solve this same problem \emph{without communicating at all?} Perhaps surprisingly, with access to public randomness, they can still achieve $\Pr[a = b] \geq \frac{1 - D_{\text{tv}}({P},{Q})}{1 + D_{\text{tv}}({P},{Q})} \geq 1-2D_{\text{tv}}({P},{Q})$ using a simple protocol based on the Weighted MinHash algorithm. This bound was shown to be optimal in the worst-case by Bavarian, Ghazi, Haramaty, Kamath, Rivest, and Sudan [ToC 2020].
In this work, we revisit the ``communication-free coupling'' problem. We provide a simpler proof of the optimality result from [Bavarian et al., 2020]. Moreover we show that, while the \emph{worst-case} success probability of Weighted MinHash cannot be improved, an equally simple protocol based on Gumbel sampling offers a Pareto improvement: for every pair of distributions ${P}$ and ${Q}$, Gumbel sampling achieves an equal or higher value of $\Pr[a = b]$ than Weighted MinHash.
Importantly, this improvement translates to practice. We demonstrate an application of communication-free coupling to \emph{speculative decoding}, a recent method for accelerating autoregressive large language models [Leviathan, Kalman, Matias, ICML 2023].
We show that communication-free protocols can be used to construct \emph{drafter-invariant speculative decoding} schemes, which have the desirable property that their output is fixed given a fixed random seed, regardless of which drafter is used for speculation. In experiments on a language generation task, Gumbel sampling outperforms Weighted MinHash.
Code is available at \url{https://github.com/majid-daliri/DISD}.
Finally, we study the coupling problem in the setting where communication is \emph{bounded}, rather than completely eliminated. We describe a protocol that uses just $O(\log(n/\epsilon))$ bits of communication to achieve $\Pr[a = b] = 1 - D_{\text{tv}}({P},{Q}) - \epsilon$, i.e., to essentially match the guarantee of an optimal coupling.
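A minimal sketch of the Gumbel-sampling protocol for discrete distributions (variable names ours): with shared public randomness, each party adds the same Gumbel noise to its own log-probabilities and takes an argmax, which is an exact sample from its distribution by the Gumbel-max trick:

```python
import numpy as np

def gumbel_sample(log_probs, seed):
    """One party's side of the communication-free protocol: perturb your own
    log-probabilities with SHARED Gumbel noise and take the argmax. The
    argmax of log p_i + g_i is an exact sample from p (Gumbel-max trick)."""
    g = np.random.default_rng(seed).gumbel(size=len(log_probs))
    return int(np.argmax(log_probs + g))

# Alice and Bob sample without communicating, sharing only the seed:
P = np.array([0.5, 0.3, 0.2])
Q = np.array([0.4, 0.4, 0.2])
seed = 1234  # public randomness
a = gumbel_sample(np.log(P), seed)
b = gumbel_sample(np.log(Q), seed)
# Over random seeds, Pr[a == b] is at least (1 - Dtv)/(1 + Dtv), and for
# every (P, Q) it matches or beats the Weighted MinHash protocol.
```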
Differentiable Approximations for Distance Queries
David M. Mount
Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
Abstract
The widespread use of gradient-based optimization has motivated the adaptation of various classical algorithms into differentiable solvers compatible with learning pipelines. In this paper, we investigate enhancing traditional geometric query problems so that the result consists of both the geometric function and its gradient. Specifically, we study the fundamental problem of distance queries against a set of points P in R^d, which also underlies various similarity measures for learning algorithms.
The main result of this paper is a multiplicative (1+epsilon)-approximation of the Euclidean distance to P which is differentiable at all points in R^d \ P with asymptotically optimal bounds on the norms of its gradient and Hessian, from a data structure with storage and query time matching state-of-the-art results for approximate nearest-neighbor searching. The approximation is realized as a regularized distance through a partition-of-unity framework, which efficiently blends multiple local approximations, over a suitably defined covering of space, into a smooth global approximation. In order to obtain the local distance approximations in a manner that facilitates blending, we develop a new approximate Voronoi diagram based on a simple point-location data structure, simplifying away both the lifting transformation and ray shooting.
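To illustrate what a differentiable distance query provides to a learning pipeline, here is the simplest smooth stand-in: a LogSumExp soft-min over exact distances. This is emphatically not the paper's partition-of-unity construction, which additionally achieves a (1+epsilon) guarantee with sublinear query time and bounded gradient and Hessian norms:

```python
import torch

def soft_distance(q, P, beta=50.0):
    """Smooth approximation of min_i ||q - p_i||. Larger beta tracks the
    true minimum more closely at the price of larger gradient and Hessian
    norms, the trade-off the paper's construction controls rigorously."""
    d = torch.linalg.norm(q[None, :] - P, dim=1)  # distance to each p_i
    return -torch.logsumexp(-beta * d, dim=0) / beta

P = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
q = torch.tensor([0.6, 0.4], requires_grad=True)
soft_distance(q, P).backward()
print(q.grad)  # the gradient flows through the distance query
```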
Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale
Anil Raghunath Iyer
Neel Bagora
Chang Yu
Olivier Pomerleau
Vivek Kumar
Prunthaban Kanthakumar
USENIX Annual Technical Conference (2025)
Abstract
Low-latency message delivery is crucial for real-time systems. Data originating from a producer must be delivered to consumers, potentially distributed in clusters across metropolitan and continental boundaries. With the growing scale of computing, there can be several thousand consumers of the data. Such systems require a robust messaging system capable of transmitting messages containing data across clusters and efficiently delivering them to consumers. The system must offer guarantees like ordering and at-least-once delivery while avoiding overload on consumers, allowing them to consume messages at their own pace.
This paper presents the design of Fast ACS (an abbreviation for Ads Copy Service), a file-based ordered message delivery system that leverages a combination of two-sided (inter-cluster) and one-sided (intra-cluster) communication primitives—namely, Remote Procedure Call and Remote Direct Memory Access, respectively—to deliver messages. The system has been successfully deployed to dozens of production clusters and scales to accommodate several thousand consumers within each cluster, which amounts to Tbps-scale intra-cluster consumer traffic at peak. Notably, Fast ACS delivers messages to consumers across the globe within a few seconds or even sub-seconds (p99) based on the message volume and consumer scale, at a low resource cost.
Society-Centric Product Innovation in the Era of Customer Obsession
International Journal of Science and Research Archive (IJSRA), Volume 14 - Issue 1 (2025)
Abstract
This article provides a comprehensive analysis of the evolving landscape of innovation in the technology sector, with a focus on the intersection of technological progress and social responsibility. It explores key challenges facing the industry, including the erosion of public trust, digital privacy concerns, and the impact of automation on workforce dynamics. It investigates the emergence and implementation of responsible innovation frameworks across various organizations, highlighting the transformation from traditional development approaches to more society-centric models. By examining how companies balance the pace of innovation with ethical responsibilities, integrate social considerations into their development processes, and address digital inequities across diverse demographics, the article underscores the transformative potential of these frameworks. Through insights into cross-functional teams, impact assessment tools, and stakeholder engagement strategies, it demonstrates how responsible innovation drives both sustainable business value and societal progress.
Simulation-Based Inference: A Practical Guide
Michael Deistler
Jan Boelts
Peter Steinbach
Guy Moss
Thomas Moreau
Manuel Gloeckler
Pedro L. C. Rodriguez
Julia Linhart
Janne K. Lappalainen
Benjamin Kurt Miller
Pedro J. Goncalves
Cornelius Schröder
Jakob H. Macke
arXiv (2025)
Abstract
A central challenge in many areas of science and engineering is to identify model parameters that are consistent with empirical data and prior knowledge. Bayesian inference offers a principled framework for this task, but it can be computationally prohibitive when models are defined by stochastic simulators. Simulation-Based Inference (SBI) provides a suite of methods to overcome this limitation and has enabled scientific discoveries in fields such as particle physics, astrophysics, and neuroscience. The core idea of SBI is to train neural networks on data generated by a simulator, without requiring access to likelihood evaluations. Once trained, the neural network can rapidly perform inference on empirical observations without requiring additional optimization or simulations. In this tutorial, we provide a practical guide for practitioners aiming to apply SBI methods. We outline a structured SBI workflow and offer practical guidelines and diagnostic tools for every stage of the process: from setting up the simulator and prior, through choosing the SBI method and neural network architecture and training the inference model, to validating results and interpreting the inferred parameters. We illustrate these steps through examples from astrophysics, psychophysics, and neuroscience. This tutorial empowers researchers to apply state-of-the-art SBI methods, facilitating efficient parameter inference for scientific discovery.
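As a concrete starting point, here is a minimal workflow sketch using the open-source `sbi` package maintained by several of the authors; the API names reflect recent versions and may differ, and the simulator is a toy stand-in:

```python
import torch
from sbi.inference import SNPE        # neural posterior estimation
from sbi.utils import BoxUniform

# Toy stochastic simulator: any black box mapping parameters to data works.
def simulator(theta):
    return theta + 0.1 * torch.randn_like(theta)

prior = BoxUniform(low=-2 * torch.ones(2), high=2 * torch.ones(2))

# 1. Simulate a training set from the prior (no likelihood evaluations).
theta = prior.sample((2000,))
x = simulator(theta)

# 2. Train a neural density estimator on the (theta, x) pairs.
inference = SNPE(prior=prior)
inference.append_simulations(theta, x).train()
posterior = inference.build_posterior()

# 3. Amortized inference: condition on an empirical observation.
x_obs = torch.tensor([0.5, -0.3])
samples = posterior.sample((1000,), x=x_obs)
```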
Abstract
Recent advances in large foundation models (FMs) have enabled learning general-purpose representations in natural language, vision, and audio. Yet geospatial artificial intelligence (GeoAI) still lacks widely adopted foundation models that generalize across tasks. We argue that a key bottleneck is the absence of unified, general-purpose, and transferable representations for geospatially embedded objects (GEOs). Such objects include points, polylines, and polygons in geographic space, enriched with semantic context and critical for geospatial reasoning. Much current GeoAI research compares GEOs to tokens in language models, where patterns of human movement and spatiotemporal interactions yield contextual meaning similar to patterns of words in text. However, modeling GEOs introduces challenges fundamentally different from language, including spatial continuity, variable scale and resolution, temporal dynamics, and data sparsity. Moreover, privacy constraints and global variation in mobility further complicate modeling and generalization. This paper formalizes these challenges, identifies key representational gaps, and outlines research directions for building foundation models that learn behavior-informed, transferable representations of GEOs from large-scale human mobility data.
Better autoregressive regression with LLMs via regression-aware fine-tuning
Zhao Meng
Aditya Menon
The Thirteenth International Conference on Learning Representations (2025)
Abstract
Decoder-based large language models (LLMs) have proven highly versatile, with remarkable successes even on problems ostensibly removed from traditional language generation. One such example is solving regression problems, where the targets are real numbers rather than textual tokens. A common approach to using LLMs on such problems is to perform fine-tuning based on the cross-entropy loss, and to use autoregressive sampling at inference time. Another approach relies on fine-tuning a separate predictive head with a suitable loss such as squared error. While each approach has had success, there has been limited study of principled ways of using decoder LLMs for regression. In this work, we compare different prior works under a unified view, and introduce regression-aware fine-tuning (RAFT), a novel approach based on the Bayes-optimal decision rule. We demonstrate how RAFT improves over established baselines on several benchmarks and model families.
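Under squared error, the Bayes-optimal decision rule is the posterior mean E[y | x]. Here is a hedged sketch of that inference-time rule as a Monte Carlo average over sampled numeric completions, assuming a Hugging Face-style interface; RAFT's contribution is fine-tuning the model so that a rule of this kind works well, and the parsing details below are our assumptions:

```python
import re
import torch

@torch.no_grad()
def bayes_optimal_predict(model, tokenizer, prompt, n_samples=32):
    """Approximate E[y | x] by averaging the numbers decoded from sampled
    completions, the Bayes-optimal prediction under squared error."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, do_sample=True,
                             num_return_sequences=n_samples, max_new_tokens=8)
    values = []
    for text in tokenizer.batch_decode(outputs, skip_special_tokens=True):
        match = re.search(r"-?\d+(\.\d+)?", text[len(prompt):])  # crude parse
        if match:
            values.append(float(match.group()))
    if not values:
        raise ValueError("no numeric completion sampled")
    return sum(values) / len(values)  # Monte Carlo estimate of E[y | x]
```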
Databases in the Era of Memory-Centric Computing
Yannis Chronis
Anastasia Ailamaki
Lawrence Benson
Jana Gičeva
Eric Seldar
Lisa Wu Wills
2025
Abstract
The increasing disparity between processor core counts and memory bandwidth, coupled with the rising cost and underutilization of memory, introduces a performance and cost Memory Wall and presents a significant challenge to the scalability of database systems. We argue that current processor-centric designs are unsustainable, and we advocate for a shift towards memory-centric computing, where disaggregated memory pools enable cost-effective scaling and robust performance. Database systems are uniquely positioned to leverage memory-centric systems because of their intrinsic data-centric nature. We demonstrate how memory-centric database operations can be realized with current hardware, paving the way for more efficient and scalable data management in the cloud.