Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.


    Productionizing Quantum Mass Production
    Bill Huggins
    Nathan Wiebe
    arXiv (2026) (to appear)
    Abstract: For many practical applications of quantum computing, the slowest and most costly steps involve coherently accessing classical data. We help address this challenge by applying mass production techniques, which can sometimes allow us to perform operations many times in parallel for a cost that is comparable to a single execution [1-3]. We combine existing mass-production results with modern approaches for loading classical data using "quantum read-only memory." We show that quantum mass production techniques offer no benefit when we consider a cost model that focuses purely on the number of non-Clifford gates. However, analyzing the constant factors in a more nuanced cost model, we find that it may be possible to obtain a reduction in cost of an order of magnitude or more for a variety of reasonably-sized fault-tolerant quantum algorithms. We present several applications of quantum mass-production techniques beyond naive parallelization, including a strategy for reducing the cost of serial calls to the same data loading step.
    FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration
    Diganta Misra
    Yanqi Luo
    Anjali Sridhar
    Justine Gehring
    Silvio Soares Ribeiro Junior
    2026
    Abstract: AI coding assistants are rapidly becoming integral to modern software development. A key challenge in this space is the continual need to migrate and modernize codebases in response to evolving software ecosystems. Traditionally, such migrations have relied on rule-based systems and human intervention. With the advent of powerful large language models (LLMs), AI-driven agentic frameworks offer a promising alternative, but their effectiveness remains underexplored. In this paper, we introduce FreshBrew, a novel benchmark for evaluating AI-based agentic frameworks on project-level Java migrations. We benchmark several such frameworks, powered by state-of-the-art LLMs, and compare their performance against established rule-based tools. Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, Gemini 2.5 Flash, can successfully migrate 56.5% of projects to JDK 17. Our empirical analysis reveals the critical strengths and limitations of current agentic approaches, offering actionable insights into their real-world applicability. By releasing FreshBrew publicly upon acceptance, we aim to facilitate rigorous, reproducible evaluation and catalyze progress in AI-driven codebase modernization.
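    To make the evaluation protocol concrete, here is a minimal harness sketch in Python. All helper names (checkout, builds_under_jdk17, the migrate callable) are hypothetical stand-ins; the abstract does not describe FreshBrew's actual interface, and the success criterion shown (a clean JDK 17 build) is an assumption.

        import os
        import subprocess
        import tempfile

        def checkout(repo_url):
            """Clone a benchmark project into a fresh working directory."""
            workdir = tempfile.mkdtemp()
            subprocess.run(["git", "clone", "--depth", "1", repo_url, workdir], check=True)
            return workdir

        def builds_under_jdk17(workdir):
            """Assumed success criterion: the migrated project builds on JDK 17."""
            env = dict(os.environ, JAVA_HOME="/usr/lib/jvm/java-17")  # hypothetical path
            result = subprocess.run(["mvn", "-q", "verify"], cwd=workdir, env=env)
            return result.returncode == 0

        def evaluate(repo_urls, migrate):
            """Run an agent's migration step on each repo; report the success rate."""
            successes = 0
            for url in repo_urls:
                workdir = checkout(url)
                migrate(workdir)  # the agentic framework edits the project in place
                if builds_under_jdk17(workdir):
                    successes += 1
            return successes / len(repo_urls)  # e.g. 0.565 for the best model on 228 repos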
    Data Selection for ERMs
    Alexander Shlimovich
    Steve Hanneke
    Amir Yehudayoff
    Shay Moran
    2025
    Abstract: Learning theory has traditionally followed a model-centric approach, focusing on designing optimal algorithms for a fixed natural learning task (e.g., linear classification or regression). In this paper, we adopt a complementary data-centric perspective, whereby we fix a natural learning rule and focus on optimizing the training data. Specifically, we study the following question: given a learning rule \(\mathcal{A}\) and a data selection budget \(n\), how well can \(\mathcal{A}\) perform when trained on at most \(n\) data points selected from a population of \(N\) points? We investigate when it is possible to select \(n \ll N\) points and achieve performance comparable to training on the entire population. We address this question across a variety of empirical risk minimizers. Our results include optimal data-selection bounds for mean estimation, linear classification, and linear regression. Additionally, we establish two general results: a taxonomy of error rates in binary classification and in stochastic convex optimization. Finally, we propose several open questions and directions for future research.
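    In symbols (our restatement of the question above, reusing the abstract's notation): choose a subsample so that the learning rule's error nearly matches full-population training,
    \[
    \min_{S \subseteq P,\; |S| \le n} \mathrm{err}\bigl(\mathcal{A}(S)\bigr) \;\approx\; \mathrm{err}\bigl(\mathcal{A}(P)\bigr), \qquad n \ll N, \quad |P| = N .
    \]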
    A classical quadratic speedup for Planted kXOR
    Meghal Gupta
    Noah G. Singer
    William He
    Ryan O'Donnell
    (2025)
    Abstract: A recent work of Schmidhuber et al. (QIP, SODA, & Phys. Rev. X 2025) exhibited a quantum algorithm for the noisy planted kXOR problem running quartically faster than all known classical algorithms. In this work, we design a new classical algorithm that is quadratically faster than the best previous one, in the case of large constant k. Thus for such k, the quantum advantage of Schmidhuber et al. becomes only quadratic. Our algorithm, which also works in the semirandom case, combines tools from sublinear-time algorithms (essentially, the birthday paradox) and polynomial anticoncentration.
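    Schematically, and ignoring constants, the arithmetic behind "quartic becomes quadratic" reads as follows: if the quantum algorithm runs in roughly the fourth root of the old classical time, then halving the classical exponent leaves only a square-root gap,
    \[
    T_{\mathrm{quantum}} \sim T_{\mathrm{classical}}^{1/4}, \qquad T_{\mathrm{classical}}' \sim T_{\mathrm{classical}}^{1/2} \;\;\Longrightarrow\;\; T_{\mathrm{quantum}} \sim \bigl(T_{\mathrm{classical}}'\bigr)^{1/2} .
    \]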
    Abstract: Graph Neural Networks (GNNs) have become indispensable tools in many domains, such as social network analysis, financial fraud detection, and drug discovery. Prior research primarily concentrated on improving prediction accuracy while overlooking how reliable the model predictions are. Conformal prediction on graphs emerges as a promising solution, offering statistically sound uncertainty estimates with a pre-defined coverage level. Despite the promising progress, existing works only focus on achieving model coverage guarantees without considering fairness in the coverage within different demographic groups. To bridge the gap between conformal prediction and fair coverage across different groups, we pose the fundamental question: Can fair GNNs enable the uncertainty estimates to be fairly applied across demographic groups? To answer this question, we provide a comprehensive analysis of the uncertainty estimation in fair GNNs employing various strategies. We prove theoretically that fair GNNs can enforce consistent uncertainty bounds across different demographic groups, thereby minimizing bias in uncertainty estimates. Furthermore, we conduct extensive experiments on five commonly used datasets across seven state-of-the-art fair GNN models to validate our theoretical findings. Additionally, based on the theoretical and empirical insights, we identify and analyze the key strategies from various fair GNN models that contribute to ensuring equalized uncertainty estimates. Our work establishes a solid foundation for future exploration of the practical implications and potential adjustments needed to enhance fairness in GNN applications across various domains.
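    For context, the two guarantees contrasted in this abstract are, in their textbook form (notation ours, not the paper's): standard conformal prediction promises marginal coverage, while equalized coverage asks for the same bound within every demographic group,
    \[
    \Pr\bigl[y \in C(x)\bigr] \ge 1 - \alpha \qquad \text{vs.} \qquad \Pr\bigl[y \in C(x) \,\big|\, g(x) = g\bigr] \ge 1 - \alpha \;\; \text{for every group } g .
    \]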
    Abstract: Understanding and controlling the reasoning processes of large language models (LLMs) is crucial for their reliable deployment. In this work, we investigate the latent representation of self-evaluation behavior (the ability of a model to assess its own reasoning steps), a capability vital for robust reasoning. Through targeted steering vector computation, we identify a direction within LLM activations that represents this self-evaluation behavior. Crucially, we demonstrate that this steering vector for self-evaluation exhibits remarkable cross-contextual efficacy, working well across different domains (e.g., math and medicine) and languages (e.g., English and Spanish). This suggests that the identified latent direction captures a fundamental, abstract representation of self-evaluation within the LLM's internal state, offering a promising avenue for interpretable and controllable reasoning across diverse applications.
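    A common recipe for such steering vectors, and plausibly close to what "targeted steering vector computation" means here, is the difference of mean activations over contrastive prompt sets. The NumPy sketch below illustrates that generic recipe, not the paper's exact procedure.

        import numpy as np

        def steering_vector(acts_pos, acts_neg):
            """Difference-in-means direction between layer activations collected on
            prompts that do vs. don't exhibit self-evaluation (arrays of shape [n, d])."""
            v = acts_pos.mean(axis=0) - acts_neg.mean(axis=0)
            return v / np.linalg.norm(v)  # unit-norm latent direction

        def steer(hidden, v, alpha=4.0):
            """Add the direction to a layer's hidden states at inference time;
            alpha is a strength knob, typically tuned on validation prompts."""
            return hidden + alpha * v  # broadcasts over [batch, seq, d]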
    Perceptual Audio Coding: A 40-Year Historical Perspective
    Juergen Herre
    Schuyler Quackenbush
    Minje Kim
    2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2025)
    Abstract: In the history of audio and acoustic signal processing, perceptual audio coding has certainly excelled as a bright success story through its ubiquitous deployment in virtually all digital media devices, such as computers, tablets, mobile phones, set-top boxes, and digital radios. From a technology perspective, perceptual audio coding has undergone tremendous development, from the first very basic perceptually driven coders (including the popular mp3 format) to today's full-blown integrated coding/rendering systems. This paper provides a historical overview of this research journey by pinpointing the pivotal development steps in the evolution of perceptual audio coding. Finally, it provides thoughts about future directions in this area.
    Our Approach to Protecting AI Training Data
    Cindy Muya
    Jason Novak
    Cindee Madison
    Ben Kamber
    Niha Vempati
    Jeremy Wiesner
    Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043 (2025)
    Abstract: Google has over 25 years of experience protecting data from inappropriate access and unauthorized use. In the era of AI, Google has extended these best practices in data protection to ensure that the right data is used the right way to train models. This paper presents a number of these best practices, describes how Google applies them in its systems, and explains how Google Cloud customers can use Google Cloud capabilities to implement these practices themselves. Protecting data requires both technical controls to enable safe data use at scale and governance processes to ensure that companies have visibility and control over how their data is used. This fundamentally requires: understanding data and ensuring it has sufficient metadata in the form of attributes, controlling the data and implementing policies to allow (or disallow) certain usage based on those attributes, transforming data to enable its usage in policy-compliant ways, and human oversight and governance. Protecting data in AI inherits these requirements and introduces new ones to account for unique AI-specific risks, including memorization/recitation and the costs of training foundational models. Meeting these new risks requires new capabilities, including enhanced understanding of data and model lineage as well as an increased ability to control data usage through checks for policy compliance at the time a training job is configured, before it is run. This white paper offers an in-depth look at data protection best practices and Google's data protection capabilities, and is one of a series of publications about Google's Secure AI Framework (SAIF). Building upon its secure development practices, Google has developed and deployed a number of capabilities to understand, control, and transform data in its infrastructure so that data is both protected and used appropriately. This involves robust annotation systems to represent metadata and enable granular understanding of data at both an item and dataset level, policy engines that evaluate machine-readable policies on that data using the metadata attributes, and sensors to understand how data is flowing across Google's systems and raise alerts when policy violations occur. Moreover, Google has developed de-identification and anonymization systems to transform data to make it policy compliant and safer to use for AI training.
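    The "attributes plus policy engine" pattern described above can be pictured as a small compliance check evaluated when a training job is configured. The Python toy below uses invented attribute names purely for illustration; it only shows the shape of the idea, not Google's systems.

        from dataclasses import dataclass, field

        @dataclass
        class Dataset:
            """Toy stand-in: metadata attributes annotated onto a dataset."""
            attributes: set = field(default_factory=set)

        def allowed_for_training(ds):
            """Evaluate a machine-readable policy against a dataset's attributes
            before a training job runs. Attribute names are invented examples."""
            if "contains_user_data" in ds.attributes:
                # User data must be transformed (de-identified) before AI training.
                return "deidentified" in ds.attributes
            return "license_cleared" in ds.attributes

        # Checked at job-configuration time, before any training starts:
        print(allowed_for_training(Dataset({"contains_user_data", "deidentified"})))  # True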
    Improving Informally Romanized Language Identification
    Adrian Benton
    Christo Kirov
    Proceedings of EMNLP (2025) (to appear)
    Abstract: The Latin script is often used informally to write languages with non-Latin native scripts. In many cases (e.g., most languages in India), there is no orthography, meaning that there is no conventional spelling of words in the Latin script, hence there will be high spelling variability in written text. Such romanization can render languages that are normally easily distinguished based on script highly confusable, such as Hindi and Urdu. In this work, we present methods to improve language identification of romanized text by improving the synthesis of training sets. We find that training on synthetic samples which incorporate natural spelling variation yields higher language identification system accuracy than including available naturally occurring examples in the training set or even training higher capacity models. We demonstrate new state-of-the-art language identification performance on romanized text from 20 Indic languages in the Bhasha-Abhijnaanam evaluation set (Madhani et al., 2023a), improving test F1 from the reported 74.7% (using a pretrained neural model) to 85.4% using a linear classifier trained solely on synthetic data and 88.2% when also training on available harvested text.
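    As a rough picture of the final setup, which the abstract describes as "a linear classifier trained solely on synthetic data", one might train a character n-gram model like the scikit-learn sketch below. The tiny inline samples are placeholders for the paper's large synthesized training sets; the feature and model choices are our assumptions, not the authors' code.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Placeholder synthetic romanized samples with spelling variation;
        # the paper generates such samples at scale for 20 Indic languages.
        texts = ["mera naam kya hai", "mera nam kya he",
                 "tumhala kasa vatata", "tula kay vatate"]
        labels = ["hi", "hi", "mr", "mr"]

        clf = make_pipeline(
            TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)),  # character n-grams
            LogisticRegression(max_iter=1000),  # the linear classifier
        )
        clf.fit(texts, labels)
        print(clf.predict(["mera naam"]))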
    Abstract: We study differential privacy (DP) in a multi-party setting where each party only trusts a (known) subset of the other parties with its data. Specifically, given a trust graph where vertices correspond to parties and neighbors are mutually trusting, we give a DP algorithm for aggregation with a much better privacy-utility trade-off than in the well-studied local model of DP (where each party trusts no other party). We further study a robust variant where each party trusts all but an unknown subset of at most t of its neighbors (where t is a given parameter), and give an algorithm for this setting. We complement our algorithms with lower bounds, and discuss implications of our work for other tasks in private learning and analytics.
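    Informally, and in our notation rather than the paper's, the trust-graph requirement sits between central DP and local DP: party i's data need only be protected from the view of parties outside its trusted neighborhood N(i),
    \[
    \Pr\bigl[\mathrm{View}_A \in S \,\big|\, x_i\bigr] \;\le\; e^{\varepsilon}\, \Pr\bigl[\mathrm{View}_A \in S \,\big|\, x_i'\bigr] \qquad \text{for all } A \subseteq V \setminus \bigl(N(i) \cup \{i\}\bigr) .
    \]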
    Data Quality Issues in Multilingual Speech Datasets: The Need for Sociolinguistic Awareness and Proactive Language Planning
    Mingfei Lau
    Allen Chen
    Yeming Fang
    Tingting Xu
    Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (ACL), Vienna, Austria (2025), 7466–7492
    Life at the Boundary of Chemical Kinetics and Program Execution
    Thomas Fischbacher
    Physical Review E (2025)
    Abstract: This work introduces a generic quantitative framework for studying processes that involve interactions of polymer sequences. Possible applications range from quantitative studies of the reaction kinetics of polymerization processes to explorations of the behavior of chemical implementations of computational, including basic life-like, processes. In this way, we establish a bridge between the thermodynamic and computational aspects of systems that are defined in terms of sequence interactions. As a by-product of these investigations, we clarify some common confusion around the notion of "autocatalysis". Using a Markov process model of polymer sequence composition, with the dynamical evolution of the Markov process's parameters given by an ODE that arises when taking the double "chemical" many-particle limit as well as the "rarefied interactions" limit, this approach enables, for example, accurate quantitative explorations of entropy generation in systems where computation is driven by relaxation to thermodynamic equilibrium. The computational framework internally utilizes the Scheme programming language's intrinsic continuation mechanisms to provide nondeterministic evaluation primitives that allow the user to specify example systems in straight purely functional code, making exploration of all possible relevant sequence composition constellations, which would otherwise be tedious to write code for, automatic and hidden from the user. As the original motivation for this work came from investigations into the emergent program evolution that arises in computational substrates of the form discussed in recent work on "Computational Life" (Alakuijala et al., 2024), a major focus of attention is on giving a deeper explanation of the key requirements for the possible emergence of self-replicators, especially in settings whose behavior is governed by real-world physics rather than ad hoc rules that may be difficult to implement in a physical system. A collection of fully worked-out examples elucidates how this modeling approach is quantitatively related to Metropolis Monte Carlo based simulations as well as exact or approximate analytic approaches, and how it can be utilized to study a broad range of different systems. These examples can also serve as starting points for further explorations.
    Small Models, Big Results: Achieving Superior Intent Extraction through Decomposition
    Danielle Cohen
    Yoni Halpern
    Noam Kahlon
    Joel Oren
    Omri Berkovitch
    Sapir Caduri
    Ido Dagan
    Tal Efros
    2025
    Abstract: Understanding user intents from UI interaction trajectories remains a challenging, yet crucial, frontier in intelligent agent development. While massive, datacenter-based, multi-modal large language models (MLLMs) possess greater capacity to handle the complexities of such sequences, smaller models, which can run on-device to provide a privacy-preserving, low-cost, and low-latency user experience, struggle with accurate intent inference. We address these limitations by introducing a novel decomposed approach: first, we perform structured interaction summarization, capturing key information from each user action. Second, we perform intent extraction using a fine-tuned model operating on the aggregated summaries. This method improves intent understanding in resource-constrained models, even surpassing the base performance of large MLLMs.
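    The decomposition reads naturally as a two-stage pipeline. This sketch uses hypothetical model callables (summarize_action for stage one, extract_intent for stage two), since the paper's interfaces are not given.

        def infer_intent(trajectory, summarize_action, extract_intent):
            """Two-stage on-device intent inference, as described above.
            summarize_action: small model invoked once per UI action (stage 1).
            extract_intent:   fine-tuned model run once on the joined summaries (stage 2).
            Both callables are hypothetical stand-ins."""
            summaries = [summarize_action(action) for action in trajectory]
            return extract_intent("\n".join(summaries))

        # Toy usage with stand-in models:
        summarize_action = lambda a: f"user tapped {a['target']}"
        extract_intent = lambda text: "compose an email" if "compose" in text else "unknown"
        print(infer_intent([{"target": "compose button"}], summarize_action, extract_intent))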
    Abstract: Building on the linear programming approach to competitive equilibrium pricing, we develop a general method for constructing iterative auctions that achieve Vickrey-Clarke-Groves (VCG) outcomes. We show how to transform a linear program characterizing competitive equilibrium prices into one that characterizes universal competitive equilibrium (UCE) prices, which elicit precisely the information needed to compute VCG payments. By applying a primal-dual algorithm to these transformed programs, we derive iterative auctions that maintain a single price path, eliminating the overhead and incentive problems associated with multiple price paths used solely for payment calculations. We demonstrate the versatility of our method by developing novel UCE auctions for multi-unit settings and deriving an iterative UCE variant of the Product-Mix auction. The resulting auctions combine the transparency of iterative price discovery with the efficiency and incentive properties of the VCG mechanism.
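    As background for the payment computation mentioned above: the textbook VCG payment charges bidder i the externality it imposes on the others, where \(x^*\) maximizes total reported value and \(x^*_{-i}\) does so with bidder i removed,
    \[
    p_i \;=\; \sum_{j \ne i} v_j\bigl(x^*_{-i}\bigr) \;-\; \sum_{j \ne i} v_j\bigl(x^*\bigr) .
    \]
    UCE prices elicit exactly the information needed to evaluate this expression, which is what lets the derived auctions maintain a single price path.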
    Abstract: The accelerating pace of innovation is fundamentally reshaping product development, creating a complex environment that demands rapid decision-making and efficient information management. To remain competitive, organizations must integrate Generative AI (GenAI) tools into their Product Lifecycle Management (PLM) processes. This integration is crucial because traditional PLM systems, often built on decades-old architectures, struggle to manage modern product complexity, vast data volumes, and interconnected supply chains. Limitations such as data silos, inflexible change management, and inadequate collaboration capabilities hinder the agility required today. GenAI offers transformative potential by automating complex tasks, enhancing data analysis, and facilitating more dynamic design and collaboration within the PLM ecosystem. This integration is not merely an upgrade but an essential evolution to overcome the inherent architectural and process constraints of legacy systems, which impede the speed and data fluidity necessary in the current market.