Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
 
        
    1 - 15 of 10795 publications
  
  
            
        
        
          
              For many practical applications of quantum computing, the slowest and most costly steps involve coherently accessing classical data. We help address this challenge by applying mass production techniques, which can sometimes allow us to perform operations many times in parallel for a cost that is comparable to a single execution [1-3]. We combine existing mass-production results with modern approaches for loading classical data using "quantum read-only memory." We show that quantum mass production techniques offer no benefit when we consider a cost model that focuses purely on the number of non-Clifford gates. However, analyzing the constant factors in a more nuanced cost model, we find that it may be possible to obtain a cost reduction of an order of magnitude or more for a variety of reasonably sized fault-tolerant quantum algorithms. We present several applications of quantum mass-production techniques beyond naive parallelization, including a strategy for reducing the cost of serial calls to the same data loading step.
              
  
          
        
      
    
        
        
          
          
          
              AI coding assistants are rapidly becoming integral to modern software development. A key challenge in this space is the continual need to migrate and modernize codebases in response to evolving software ecosystems. Traditionally, such migrations have relied on rule-based systems and human intervention. With the advent of powerful large language models (LLMs), AI-driven agentic frameworks offer a promising alternative, but their effectiveness remains underexplored. In this paper, we introduce FreshBrew, a novel benchmark for evaluating AI-based agentic frameworks on project-level Java migrations. We benchmark several such frameworks, powered by state-of-the-art LLMs, and compare their performance against established rule-based tools. Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, Gemini 2.5 Flash, can successfully migrate 56.5% of projects to JDK 17. Our empirical analysis reveals the critical strengths and limitations of current agentic approaches, offering actionable insight into their real-world applicability. By releasing FreshBrew publicly upon acceptance, we aim to facilitate rigorous, reproducible evaluation and catalyze progress in AI-driven codebase modernization.
              
  
          
        
      
    
        
          
            
Oculomics: Current Concepts and Evidence
Zhuoting Zhu, Yueye Wang, Ziyi Qi, Wenyi Hu, Xiayin Zhang, Siegfried Wagner, Yujie Wang, An Ran Ran, Joshua Ong, Ethan Waisberg, Mouayad Masalkhi, Alex Suh, Yih Chung Tham, Carol Y. Cheung, Xiaohong Yang, Honghua Yu, Zongyuan Ge, Wei Wang, Bin Sheng, Andrew G. Lee, Alastair Denniston, Peter van Wijngaarden, Pearse Keane, Ching-Yu Cheng, Mingguang He, Tien Yin Wong
Progress in Retinal and Eye Research (2025)
              The eye provides novel insights into general health, as well as the pathogenesis and development of systemic diseases. In the past decade, growing evidence has demonstrated that the eye's structure and function mirror multiple systemic health conditions, especially cardiovascular diseases, neurodegenerative disorders, and kidney impairment. This has given rise to the field of oculomics: the application of ophthalmic biomarkers to understand mechanisms and to detect and predict disease. The development of this field has been accelerated by three major advances: 1) the availability and widespread clinical adoption of high-resolution, non-invasive ophthalmic imaging ("hardware"); 2) the availability of large studies to interrogate associations ("big data"); and 3) the development of novel analytical methods, including artificial intelligence (AI) ("software"). Oculomics offers an opportunity to enhance our understanding of the interplay between the eye and the body, while supporting the development of innovative diagnostic, prognostic, and therapeutic tools. These advances have been further accelerated by developments in AI, coupled with large-scale datasets linking ocular imaging data with systemic health data. Oculomics enables the detection, screening, diagnosis, and monitoring of many systemic health conditions. Furthermore, oculomics with AI allows prediction of the risk of systemic diseases, enabling risk stratification and opening new avenues for individualized risk prediction and prevention, facilitating personalized medicine. In this review, we summarise current concepts and evidence in the field of oculomics, highlighting the progress that has been made, the remaining challenges, and opportunities for future research.
              
  
          
        
      
    
        
          
            
Life at the Boundary of Chemical Kinetics and Program Execution
Thomas Fischbacher
Physical Review E (2025)
              This work introduces a generic quantitative framework for studying processes that involve interactions of polymer sequences. Possible applications range from quantitative studies of the reaction kinetics of polymerization processes to explorations of the behavior of chemical implementations of computational, including basic life-like, processes. In this way, we establish a bridge between thermodynamic and computational aspects of systems that are defined in terms of sequence interactions. As a by-product of these investigations, we clarify some common confusion around the notion of "autocatalysis". Using a Markov process model of polymer sequence composition, with the Markov process's parameters evolving via an ODE that arises when taking both the "chemical" many-particle limit and the "rarefied interactions" limit, this approach enables, for example, accurate quantitative explorations of entropy generation in systems where computation is driven by relaxation to thermodynamic equilibrium. The computational framework internally uses the Scheme programming language's intrinsic continuation mechanisms to provide nondeterministic evaluation primitives that let the user specify example systems in straightforward, purely functional code, making the exploration of all relevant sequence composition constellations, which would otherwise be tedious to write code for, automatic and hidden from the user. As the original motivation for this work came from investigations into emergent program evolution in computational substrates of the form discussed in recent work on "Computational Life" \cite{alakuijala2024computational}, a major focus is on giving a deeper explanation of key requirements for the possible emergence of self-replicators, especially in settings whose behavior is governed by real-world physics rather than ad-hoc rules that may be difficult to implement in a physical system. A collection of fully worked-out examples elucidates how this modeling approach relates quantitatively to Metropolis Monte Carlo simulations as well as exact or approximate analytic approaches, and how it can be used to study a broad range of different systems. These examples can also serve as starting points for further explorations.
              
  
          
        
      
    
        
          
            
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan, Chen Wu, Charles Ding, Aditi Raghunathan
2025
              We design a suite of minimal algorithmic tasks that are a loose abstraction of open-ended real-world tasks. This allows us to cleanly and controllably quantify the creative limits of present-day language models. Much like real-world tasks that require a creative, far-sighted leap of thought, our tasks require an implicit, open-ended stochastic planning step that either (a) discovers new connections in an abstract knowledge graph (as in wordplay, drawing analogies, or research) or (b) constructs new patterns (as in designing math problems or new proteins). On these tasks, we argue empirically and conceptually that next-token learning is myopic, while multi-token approaches, namely teacherless training and diffusion models, comparatively excel at producing diverse and original output. Second, to elicit randomness without hurting coherence, we find that injecting noise at the input layer (dubbed seed-conditioning) works surprisingly well, matching (and in some conditions exceeding) temperature sampling from the output layer. Thus, our work offers a principled, minimal test-bed for analyzing open-ended creative skills, and offers new arguments for going beyond next-token learning and temperature sampling.
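              The abstract contrasts output-layer temperature sampling with input-layer noise injection (seed-conditioning). As background, the temperature-sampling baseline can be sketched as follows (an illustrative sketch with hypothetical names, not code from the paper):

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Sample a token index from softmax(logits / temperature).

    Low temperatures concentrate mass on the highest logit (near-greedy);
    high temperatures flatten the distribution (more random, more diverse).
    """
    rng = rng or np.random.default_rng(0)
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract the max logit for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return int(rng.choice(len(p), p=p))

# Near-zero temperature behaves like argmax over the logits.
token = sample_with_temperature([2.0, 1.0, 0.1], temperature=1e-3)
```

              Seed-conditioning, by contrast, keeps decoding deterministic and moves the randomness to the model's input; the paper's finding is that this can match or exceed the output-layer knob sketched above.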
              
  
          
        
      
    
        
          
            
Necro-reaper: Pruning away Dead Memory Traffic in Warehouse-Scale Computers
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Association for Computing Machinery (2025)
              Memory bandwidth is emerging as a critical bottleneck in warehouse-scale computing (WSC). This work reveals that a significant portion of memory traffic in WSC is unnecessary: writebacks of deallocated data and fetches of uninitialized data. The issue is particularly acute in WSC, where short-lived heap allocations larger than a cache line are prevalent. To address this problem, this work proposes a pragmatic approach tailored to WSC. Leveraging the existing WSC ecosystem of vertical integration, profile-guided compilation flows, and customized memory allocators, this work presents Necro-reaper, a novel software/hardware co-design that avoids dead memory traffic without requiring the hardware tracking of prior work. New ISA instructions enable the hardware to avoid unnecessary dead traffic, while extended software components, including a profile-guided compiler and memory allocator, optimize the utilization of these instructions. Evaluation across a diverse set of 10 WSC workloads demonstrates that Necro-reaper achieves a geomean memory traffic reduction of 26% and a geomean IPC increase of 6%.
              
  
          
        
      
    
        
        
          
          
          
              During remote communication, participants often share both digital and physical content, such as product designs, digital assets, and environments, to enhance mutual understanding. Recent advances in augmented communication have enabled users to swiftly create and share digital 2D copies of physical objects from video feeds into a shared space. However, conventional 2D representations of digital objects limit spatial referencing in immersive environments. To address this, we propose Thing2Reality, an Extended Reality (XR) meeting platform that facilitates spontaneous discussions of both digital and physical items during remote sessions. With Thing2Reality, users can quickly materialize ideas or objects in immersive environments and share them as conditioned multiview renderings or 3D Gaussians. Thing2Reality enables users to interact with remote objects or discuss concepts in a collaborative manner. Our user studies revealed that the ability to interact with and manipulate 3D representations of objects significantly enhances the efficiency of discussions, with the potential to augment discussion of 2D artifacts.
              
  
          
        
      
    
        
          
            
VaultGemma
Lynn Chua, Prem Eruvbetine, Chiyuan Zhang, Thomas Mesnard, Borja De Balle Pigem, Daogao Liu, Amer Sinha, Pritish Kamath, Yangsibo Huang, Christopher A. Choquette-Choo, George Kaissis, Armand Joulin, Da Yu, Ryan McKenna
arXiv (2025)
          
              In this work, we present VaultGemma 1B, a model from the Gemma family fully trained with differential privacy. VaultGemma 1B is a 1-billion-parameter pretrained model based on the Gemma 2 series of models and uses the same training dataset. We will be releasing a tech report and the weights of this model.
              
  
          
        
      
    
        
          
            
Probing non-equilibrium topological order on a quantum processor
Melissa Will, Tyler Cochran, Bernhard Jobst, Norhan Eassa, Michael Knap, Adam Gammon-Smith, Frank Pollmann
Nature, 645 (2025), 348–353
              Out-of-equilibrium phases in many-body systems constitute a new paradigm in quantum matter—they exhibit dynamical properties that may otherwise be forbidden by equilibrium thermodynamics. Among these non-equilibrium phases are periodically driven (Floquet) systems, which are generically difficult to simulate classically because of their high entanglement. Here we realize a Floquet topologically ordered state on an array of superconducting qubits. We image the characteristic dynamics of its chiral edge modes and characterize its emergent anyonic excitations. Devising an interferometric algorithm allows us to introduce and measure a bulk topological invariant to probe the dynamical transmutation of anyons for system sizes up to 58 qubits. Our work demonstrates that quantum processors can provide key insights into the thus-far largely unexplored landscape of highly entangled non-equilibrium phases of matter.
              
  
          
        
      
    
        
          
            
Empirical Privacy Variance
Ruicheng Xian, Chiyuan Zhang, Fan Wu, Yuzheng Hu, Pritish Kamath, David Forsyth, Yuhang Liu, Lydia Zakynthinou
International Conference on Machine Learning (ICML) (2025)
              We propose the notion of empirical privacy variance and study it in the context of differentially private fine-tuning of language models. Specifically, we show that models calibrated to the same (ε,δ)-DP guarantee using DP-SGD with different hyperparameter configurations can exhibit significant variations in empirical privacy, which we quantify through the lens of memorization. We investigate the generality of this phenomenon across multiple dimensions and discuss why it is surprising and relevant. Through regression analysis, we examine how individual and composite hyperparameters influence empirical privacy. The results reveal a no-free-lunch trade-off: existing practices of hyperparameter tuning in DP-SGD, which focus on optimizing utility under a fixed privacy budget, often come at the expense of empirical privacy. To address this, we propose refined heuristics for hyperparameter selection that explicitly account for empirical privacy, showing that they are both precise and practically useful. Finally, we take preliminary steps to understand empirical privacy variance. We propose two hypotheses, identify limitations in existing techniques like privacy auditing, and outline open questions for future research.
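              For background, the DP-SGD hyperparameters the abstract refers to (clipping norm, noise multiplier, learning rate) all enter a single update step, which can be sketched as follows (an illustrative sketch with hypothetical names and defaults, not code from the paper):

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_mult=1.0, lr=0.1, rng=None):
    """One DP-SGD update (illustrative): clip each per-example gradient
    to clip_norm, average the clipped gradients, then add Gaussian noise
    scaled by noise_mult * clip_norm / batch_size before stepping."""
    rng = rng or np.random.default_rng(0)
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    mean_grad = np.mean(clipped, axis=0)
    sigma = noise_mult * clip_norm / len(per_example_grads)
    noise = rng.normal(0.0, sigma, size=mean_grad.shape)
    return params - lr * (mean_grad + noise)
```

              A privacy accountant maps these knobs (plus batch size and step count) to an (ε, δ) guarantee; the paper's point is that configurations calibrated to the same (ε, δ) can still differ widely in empirical privacy.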
              
  
          
        
      
    
        
          
            
              Score-based Causal Representation Learning: Linear and General Transformations
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
                        Burak Varici, Emre Acarturk, Abhishek Kumar, Ali Tajer
            Journal of Machine Learning Research (JMLR) (2025)
          
          
        
        
        
          
              Abstract
          
          
              This paper addresses intervention-based causal representation learning (CRL) under a general nonparametric latent causal model and an unknown transformation that maps the latent variables to the observed variables. Linear and general transformations are investigated. The paper addresses both the identifiability and achievability aspects. Identifiability refers to determining algorithm-agnostic conditions that ensure the recovery of the true latent causal variables and the underlying latent causal graph. Achievability refers to the algorithmic aspects and addresses designing algorithms that achieve identifiability guarantees. By drawing novel connections between score functions (i.e., the gradients of the logarithm of density functions) and CRL, this paper designs a score-based class of algorithms that ensures both identifiability and achievability. First, the paper focuses on linear transformations and shows that one stochastic hard intervention per node suffices to guarantee identifiability. It also provides partial identifiability guarantees for soft interventions, including identifiability up to mixing with parents for general causal models and perfect recovery of the latent graph for sufficiently nonlinear causal models. Second, it focuses on general transformations and demonstrates that two stochastic hard interventions per node are sufficient for identifiability. This is achieved by defining a differentiable loss function whose global optima ensure identifiability for general CRL. Notably, one does not need to know which pair of interventional environments has the same node intervened. Finally, the theoretical results are empirically validated via experiments on structured synthetic data and image data.
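The score functions the paper builds on are simply gradients of log-densities. As a minimal, self-contained illustration (a 1-D Gaussian, unrelated to the paper's actual estimators), the closed-form score matches a finite-difference gradient of the log-pdf:

```python
import math

def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def score_numeric(logpdf, x, eps=1e-5):
    """Finite-difference estimate of the score, d/dx log p(x)."""
    return (logpdf(x + eps) - logpdf(x - eps)) / (2 * eps)

def score_gaussian(x, mu, sigma):
    """Closed form: d/dx log N(x; mu, sigma^2) = -(x - mu) / sigma^2."""
    return -(x - mu) / sigma ** 2
```

In the paper's setting the densities are those of the latent variables under different interventions, and changes in the score across environments are what reveal the latent graph.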
              
  
          
        
      
    
        
          
            
              Faster electronic structure quantum simulation by spectrum amplification
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
                        Guang Hao Low, Robbie King, Alec White, Rolando Somma, Dominic Berry, Qiushi Han, Albert Eugene DePrince III
          
            arXiv (2025) (to appear)
          
          
        
        
        
          
              Abstract
          
          
              We discover that many interesting electronic structure Hamiltonians have a compact and close-to-frustration-free sum-of-squares representation with a small energy gap. We show that this gap enables spectrum amplification in estimating ground state energies, which improves the cost scaling of previous approaches from the block-encoding normalization factor $\lambda$ to just $\sqrt{\lambda E_{\text{gap}}}$. For any constant-degree polynomial basis of fermionic operators, a sum-of-squares representation with optimal gap can be efficiently computed using semi-definite programming. Although the gap can be made arbitrarily small with an exponential-size basis, we find that the degree-$2$ spin-free basis, in combination with approximating two-body interactions by a new Double-Factorized (DF) generalization of Tensor-Hyper-Contraction (THC), gives an excellent balance of gap, $\lambda$, and block-encoding costs. For classically-hard FeMoco complexes -- candidate applications for first useful quantum advantage -- this combination improves the Toffoli gate cost of the first estimates with DF [Phys. Rev. Research 3, 033055] or THC [PRX Quantum 2, 030305] by over two orders of magnitude.
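The headline scaling improvement, from $\lambda$ to $\sqrt{\lambda E_{\text{gap}}}$, is easy to quantify. With purely illustrative numbers (lam = 4000, e_gap = 0.4, in matching energy units; these are not values from the paper), the ratio works out to two orders of magnitude:

```python
import math

def spectrum_amplification_speedup(lam, e_gap):
    """Ratio of the old cost scaling (lambda) to the new one (sqrt(lambda * E_gap)).

    Equals sqrt(lam / e_gap), so the advantage grows as the gap shrinks
    relative to the block-encoding normalization factor.
    """
    return lam / math.sqrt(lam * e_gap)
```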
              
  
          
        
      
    
        
          
            
              Mastering Multiple-Expert Routing: Realizable H-Consistency and Strong Guarantees for Learning to Defer
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
                        Anqi Mao
            Proceedings of the 42nd International Conference on Machine Learning (ICML 2025)
          
          
        
        
        
          
              Abstract
          
          
              The problem of learning to defer with multiple experts consists of optimally assigning input instances to experts, balancing the trade-off between their accuracy and computational cost. This is a critical challenge in natural language generation, but also in other fields such as image processing and medical diagnostics. Recent studies have proposed surrogate loss functions to optimize deferral, but challenges remain in ensuring their consistency properties. This paper introduces novel surrogate loss functions and efficient algorithms with strong theoretical learning guarantees. We address open questions regarding realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for both single-stage (jointly learning predictor and deferral function) and two-stage (learning only the deferral function with a fixed expert) learning scenarios. For single-stage deferral, we introduce a family of new realizable $H$-consistent surrogate losses and further prove $H$-consistency for a selected member. For two-stage deferral, we derive new surrogate losses that achieve realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for the two-expert scenario and, under natural assumptions, the multiple-expert scenario. Additionally, we provide enhanced theoretical guarantees under low-noise assumptions for both scenarios. Finally, we report the results of experiments using our proposed surrogate losses, comparing their performance against existing baselines.
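The accuracy/cost trade-off in deferral can be sketched with a toy router. This is not the paper's surrogate-loss method, only the decision rule that a learned deferral function ultimately implements; the per-expert error estimates and consultation costs below are made-up inputs:

```python
def route(error_estimates, expert_costs):
    """Pick the expert minimizing estimated error plus consultation cost.

    error_estimates[j]: estimated probability that expert j errs on this input
                        (index 0 can stand for the cheap base predictor).
    expert_costs[j]:    cost of consulting expert j, on the same scale.
    Returns the index of the chosen expert.
    """
    totals = [err + cost for err, cost in zip(error_estimates, expert_costs)]
    return min(range(len(totals)), key=totals.__getitem__)
```

The paper's contribution is precisely how to train the functions producing such estimates so that this rule is provably consistent.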
              
  
          
        
      
    
        
          
            
              Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
                        Fei Wang
            Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025) (to appear)
          
          
        
        
        
          
              Abstract
          
          
              Retrieval-Augmented Generation (RAG), while effective in integrating external knowledge to address the limitations of large language models (LLMs), can be undermined by imperfect retrieval, which may introduce irrelevant, misleading, or even malicious information. Despite its importance, previous studies have rarely explored the behavior of RAG through a joint analysis of how errors from imperfect retrieval arise and propagate, and how potential conflicts arise between the LLMs' internal knowledge and external sources. Through controlled analysis under realistic conditions, we find that imperfect retrieval augmentation may be inevitable and quite harmful. We identify knowledge conflicts between LLM-internal and external retrieved knowledge as a bottleneck to overcome in the post-retrieval stage of RAG. To render LLMs resilient to imperfect retrieval, we propose Astute RAG, a novel RAG approach that adaptively elicits essential information from LLMs' internal knowledge, iteratively consolidates internal and external knowledge with source-awareness, and finalizes the answer according to information reliability. Our experiments using Gemini and Claude demonstrate that Astute RAG significantly outperforms previous robustness-enhanced RAG methods. Notably, Astute RAG is the only approach that matches or exceeds the performance of LLMs without RAG under worst-case scenarios. Further analysis reveals that Astute RAG effectively resolves knowledge conflicts, improving the reliability and trustworthiness of RAG systems.
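As a loose, toy illustration of source-aware consolidation only: the actual Astute RAG elicits and consolidates knowledge via LLM prompting rather than a weighted vote, and the source names and reliability weights here are invented.

```python
def consolidate(internal_answer, retrieved):
    """Toy source-aware consolidation.

    Each source votes for its candidate answer, weighted by a reliability
    score; the model's internal knowledge participates as one more source.
    retrieved: list of (source_id, answer, reliability) tuples.
    """
    sources = [("internal", internal_answer, 0.6)] + list(retrieved)
    votes = {}
    for _source, answer, reliability in sources:
        votes[answer] = votes.get(answer, 0.0) + reliability
    return max(votes, key=votes.get)
```

The point the abstract makes is that when internal and external knowledge conflict, the final answer should follow information reliability rather than defaulting to the retrieved passages.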
              
  
          
        
      
    
        
          
            
              Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale
            
          
        
        
          
            
              
                
                  
                    
                
              
            
              
                
                  
                    
                    
    
    
    
    
    
                      
                        Anil Raghunath Iyer, Neel Bagora, Chang Yu, Olivier Pomerleau, Vivek Kumar, Prunthaban Kanthakumar
            USENIX Annual Technical Conference (2025)
          
          
        
        
        
          
              Abstract
          
          
              Low-latency message delivery is crucial for real-time systems. Data originating from a producer must be delivered to consumers, potentially distributed in clusters across metropolitan and continental boundaries. With the growing scale of computing, there can be several thousand consumers of the data. Such systems require a robust messaging system capable of transmitting messages containing data across clusters and efficiently delivering them to consumers. The system must offer guarantees like ordering and at-least-once delivery while avoiding overload on consumers, allowing them to consume messages at their own pace.
This paper presents the design of Fast ACS (an abbreviation for Ads Copy Service), a file-based ordered message delivery system that leverages a combination of two-sided (inter-cluster) and one-sided (intra-cluster) communication primitives—namely, Remote Procedure Call and Remote Direct Memory Access, respectively—to deliver messages. The system has been successfully deployed to dozens of production clusters and scales to accommodate several thousand consumers within each cluster, which amounts to Tbps-scale intra-cluster consumer traffic at peak. Notably, Fast ACS delivers messages to consumers across the globe within a few seconds or even sub-seconds (p99) based on the message volume and consumer scale, at a low resource cost.
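The delivery guarantees described above (ordering, at-least-once delivery, and consumers pulling at their own pace) can be sketched with a toy in-memory channel. This illustrates the semantics only, not Fast ACS's file-based log or its RPC/RDMA transport:

```python
class OrderedChannel:
    """Toy in-order, at-least-once delivery.

    Messages carry sequence numbers; the consumer acknowledges a prefix,
    and anything unacknowledged is redelivered on the next pull. Consumers
    pull batches at their own pace, so a slow consumer is never overloaded.
    """

    def __init__(self):
        self.log = []    # append-only message log
        self.acked = 0   # consumer has durably processed log[:acked]

    def publish(self, msg):
        self.log.append(msg)

    def pull(self, max_batch):
        """Deliver the next unacknowledged messages, in order."""
        window = self.log[self.acked:self.acked + max_batch]
        return list(enumerate(window, start=self.acked))

    def ack(self, upto_seq):
        """Acknowledge every message with sequence number <= upto_seq."""
        self.acked = max(self.acked, upto_seq + 1)
```

Because delivery is a prefix of an append-only log, redelivery after a consumer crash repeats messages but never reorders them, which is the at-least-once, ordered contract the paper describes.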
              
  
          
        
      
    