Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Abstract
AI coding assistants are rapidly becoming integral to modern software development. A key challenge in this space is the continual need to migrate and modernize codebases in response to evolving software ecosystems. Traditionally, such migrations have relied on rule-based systems and human intervention. With the advent of powerful large language models (LLMs), AI-driven agentic frameworks offer a promising alternative, but their effectiveness remains underexplored. In this paper, we introduce FreshBrew, a novel benchmark for evaluating AI-based agentic frameworks on project-level Java migrations. We benchmark several such frameworks, powered by state-of-the-art LLMs, and compare their performance against established rule-based tools. Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, Gemini 2.5 Flash, successfully migrates 56.5% of projects to JDK 17. Our empirical analysis reveals the critical strengths and limitations of current agentic approaches, offering actionable insight into their real-world applicability. By releasing FreshBrew publicly upon acceptance, we aim to facilitate rigorous, reproducible evaluation and catalyze progress in AI-driven codebase modernization.
Abstract
For many practical applications of quantum computing, the slowest and most costly steps involve coherently accessing classical data. We help address this challenge by applying mass-production techniques, which can sometimes allow us to perform an operation many times in parallel for a cost comparable to that of a single execution [1-3]. We combine existing mass-production results with modern approaches for loading classical data using "quantum read-only memory." We show that quantum mass-production techniques offer no benefit under a cost model that counts only non-Clifford gates. However, analyzing the constant factors in a more nuanced cost model, we find that it may be possible to reduce costs by an order of magnitude or more for a variety of reasonably sized fault-tolerant quantum algorithms. We present several applications of quantum mass-production techniques beyond naive parallelization, including a strategy for reducing the cost of serial calls to the same data-loading step.
Mix&Slice
Marco Rosa
Encyclopedia of Cryptography, Security and Privacy, Springer Nature Switzerland (2025), pp. 1550-1555
Abstract
Mix&Slice is an encryption technique that enables efficient and robust access revocation on resources stored at external cloud providers. The technique makes use of a transformation that provides strong inter-dependency in the encrypted representation of a resource. To perform access revocation, it is then sufficient to re-encrypt a small portion of the resource to have guarantees that the resource (and any of its parts) will become unintelligible to those from whom access has been revoked.
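The all-or-nothing interdependency can be illustrated with a toy sketch. The block below is not the actual Mix&Slice construction (which builds its mixing rounds out of AES); it uses a Rivest-style package transform with SHA-256 as the PRF, chosen here only to show the property the abstract relies on: every output slice is needed to recover any block of the resource.

```python
import hashlib
import os

def prf(key: bytes, i: int) -> bytes:
    """SHA-256 as a toy pseudorandom function, keyed and indexed."""
    return hashlib.sha256(key + i.to_bytes(4, "big")).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def package(blocks):
    """All-or-nothing transform over 32-byte blocks: every output
    slice is required to invert any part of the input."""
    k = os.urandom(32)                       # ephemeral inner key
    slices = [xor(m, prf(k, i)) for i, m in enumerate(blocks)]
    mask = k
    for i, c in enumerate(slices):
        mask = xor(mask, prf(c, i))          # hide k under all slices
    return slices + [mask]

def unpackage(slices_and_mask):
    *slices, mask = slices_and_mask
    k = mask
    for i, c in enumerate(slices):
        k = xor(k, prf(c, i))                # recover inner key
    return [xor(c, prf(k, i)) for i, c in enumerate(slices)]
```

Revocation then only needs to re-encrypt a single slice under a fresh key: anyone holding the old keys but not the new one recovers the wrong inner key, and every block decrypts to garbage, which mirrors the guarantee described above at a fraction of the cost of full re-encryption.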
Faster electronic structure quantum simulation by spectrum amplification
Guang Hao Low
Robbie King
Dominic Berry
Qiushi Han
Albert Eugene DePrince III
Alec White
Rolando Somma
Physical Review X, 15 (2025), pp. 041016
Abstract
The most advanced techniques using fault-tolerant quantum computers to estimate the ground-state energy of a chemical Hamiltonian involve compression of the Coulomb operator through tensor factorizations, enabling efficient block encodings of the Hamiltonian. A natural challenge of these methods is the degree to which block-encoding costs can be reduced. We address this challenge through the technique of spectral amplification, which magnifies the spectrum of the low-energy states of Hamiltonians that can be expressed as sums of squares. Spectral amplification enables estimating ground-state energies with significantly improved cost scaling: the block-encoding normalization factor Λ is effectively reduced to √(2Λ E_gap), where E_gap ≪ Λ is the lowest energy of the sum-of-squares Hamiltonian. To achieve this, we show that sum-of-squares representations of the electronic structure Hamiltonian are efficiently computable by a family of classical simulation techniques that approximate the ground-state energy from below. To optimize further, we also develop a novel factorization that provides a trade-off between the two leading Coulomb integral factorization schemes, namely double factorization and tensor hypercontraction, and that, when combined with spectral amplification, yields a factor of 4 to 195 speedup over the state of the art in ground-state energy estimation for models of iron-sulfur complexes and a CO2-fixation catalyst.
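Schematically, and using only the quantities named in the abstract, the effective normalization entering the energy-estimation cost improves from the block-encoding norm to its geometric mean with the low-energy scale:

```latex
\Lambda \;\longrightarrow\; \sqrt{2\,\Lambda\,E_{\mathrm{gap}}},
\qquad E_{\mathrm{gap}} \ll \Lambda ,
```

so the speedup factor is roughly \(\sqrt{\Lambda / (2 E_{\mathrm{gap}})}\), large precisely when the sum-of-squares ground energy sits far below the block-encoding norm.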
Scaling Large Language Models For Next-Generation Single-Cell Analysis
Syed Asad Rizvi
Daniel Levine
Aakash Patel
Shiyang Zhang
Eric Wang
Curtis Jamison Perry
Nicole Mayerli Constante
Sizhuang He
David Zhang
Cerise Tang
Zhuoyang Lyu
Rayyan Darji
Chang Li
Emily Sun
David Jeong
Lawrence Zhao
Jennifer Kwan
David Braun
Brian Hafler
Hattie Chung
Rahul M. Dhodapkar
Paul Jaeger
Jeffrey Ishizuka
David van Dijk
biorxiv (2025)
Abstract
Single-cell RNA sequencing has transformed our understanding of cellular diversity, yet current single-cell foundation models (scFMs) remain limited in their scalability, flexibility across diverse tasks, and ability to natively integrate textual information. In this work, we build upon the Cell2Sentence (C2S) framework, which represents scRNA-seq profiles as textual “cell sentences,” to train Large Language Models (LLMs) on a corpus comprising over one billion tokens of transcriptomic data, biological text, and metadata. Scaling the model to 27 billion parameters yields consistent improvements in predictive and generative capabilities and supports advanced downstream tasks that require synthesis of information across multi-cellular contexts. Targeted fine-tuning with modern reinforcement learning techniques produces strong performance in perturbation response prediction, natural language interpretation, and complex biological reasoning. This predictive strength directly enabled a dual-context virtual screen that uncovered a striking context split for the kinase inhibitor silmitasertib (CX-4945), suggesting its potential as a synergistic, interferon-conditional amplifier of antigen presentation. Experimental validation in human cell models unseen during training confirmed this hypothesis, demonstrating that C2S-Scale can generate biologically grounded, testable discoveries of context-conditioned biology. C2S-Scale unifies transcriptomic and textual data at unprecedented scales, surpassing both specialized single-cell models and general-purpose LLMs to provide a platform for next-generation single-cell analysis and the development of “virtual cells.”
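The "cell sentence" idea underlying C2S can be sketched in a few lines: a cell's expression vector becomes a sequence of gene names ordered by descending expression, which an LLM can then consume as ordinary text. The function below is an illustrative toy, not the exact rank transformation of the Cell2Sentence work, and the gene names and counts in the example are hypothetical.

```python
def cell_to_sentence(expression, gene_names, top_k=100):
    """Render one scRNA-seq profile as a 'cell sentence':
    gene names sorted by descending expression, zero counts dropped."""
    ranked = sorted(zip(gene_names, expression), key=lambda gx: -gx[1])
    return " ".join(g for g, x in ranked[:top_k] if x > 0)

# Hypothetical five-gene profile:
# cell_to_sentence([0, 5, 3, 0, 9], ["ACTB", "CD3D", "IL7R", "MS4A1", "NKG7"])
# -> "NKG7 CD3D IL7R"
```

Once profiles are text, pretraining and fine-tuning reuse the ordinary LLM toolchain, which is what lets the corpus mix transcriptomes with papers and metadata.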
Google’s Approach to Protecting Privacy in the Age of AI
Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043 (2025)
Abstract
AI products introduce new privacy challenges. Finding the right privacy solution is central to developing innovative products, especially as AI models increasingly handle user data. In this paper, we propose a framework to reason about privacy in AI, and discuss how Privacy Enhancing Technologies (PETs) enable novel user experiences by reducing privacy risks in the AI development lifecycle. We argue that privacy protections are not inherently at odds with utility; in contrast, we discuss how building privacy into products from the start can create better, more trustworthy experiences for everyone.
Abstract
The global adoption of Large Language Models (LLMs) in healthcare shows promise for enhancing clinical workflows and improving patient outcomes. However, Automatic Speech Recognition (ASR) errors in critical medical entities remain a significant challenge. These errors can lead to severe consequences if undetected. This study investigates the prevalence and impact of ASR errors in medical transcription across Africa, Europe, and North America. By examining variations in accented English across three continents, we analyze the impact of regional speech patterns on ASR performance. Our research quantifies both the potential and limitations of LLMs in mitigating ASR inaccuracies within various medical settings, with particular attention to performance variations across regional accents and medical terminology. Our findings highlight significant disparities in ASR accuracy across regions and identify specific conditions under which LLM corrections prove most effective.
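ASR transcription errors of the kind discussed above are conventionally measured as word error rate (WER), optionally restricted to medical-entity tokens. The study's exact protocol is not reproduced here; the snippet below is the standard Levenshtein-based WER as a reference point, and the clinical phrases in the example are invented.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / #ref words,
    computed with the standard Levenshtein dynamic program over words."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                      # delete all remaining ref words
    for j in range(len(h) + 1):
        d[0][j] = j                      # insert all remaining hyp words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[len(r)][len(h)] / len(r)
```

A single substituted dosage unit, e.g. `wer("metformin 500 mg twice daily", "metformin 500 milligrams twice daily")`, already costs 0.2 on a five-word utterance, which is why entity-level errors dominate clinical risk even at low overall WER.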
UWB Radar-based Heart Rate Monitoring: A Transfer Learning Approach
Elzbieta Gruzewska
Sebastien Baur
Matthew Baugh
Sharanya Srinivas
Matthew Thompson
Pramod Rudrapatna
Michael A. Sanchez
Lawrence Z. Cai
Timothy JA Chico
Robert F Storey
Emily Maz
Umesh Telang
Shravya Shetty
Mayank Daswani
arXiv (2025)
Abstract
Radar technology presents untapped potential for continuous, contactless, and passive heart rate monitoring via consumer electronics like mobile phones. However, the variety of available radar systems and the lack of standardization mean that a large new paired dataset must be collected for each radar system. This study demonstrates transfer learning between frequency-modulated continuous wave (FMCW) and impulse-radio ultra-wideband (IR-UWB) radar systems, both increasingly integrated into consumer devices. FMCW radar utilizes a continuous chirp, while IR-UWB radar employs short pulses. Our mm-wave FMCW radar operated at 60 GHz with a 5.5 GHz bandwidth (2.7 cm resolution, 3 receiving antennas [Rx]), and our IR-UWB radar at 8 GHz with a 500 MHz bandwidth (30 cm resolution, 2 Rx). Using a novel 2D+1D ResNet architecture we achieved a mean absolute error (MAE) of 0.85 bpm and a mean absolute percentage error (MAPE) of 1.42% for heart rate monitoring with FMCW radar (N=119 participants, an average of 8 hours per participant). This model maintained performance (under 5 MAE/10% MAPE) across various body positions and heart rate ranges, with a 98.9% recall. We then fine-tuned a variant of this model, trained on single-antenna and single-range bin FMCW data, using a small (N=376, avg 6 minutes per participant) IR-UWB dataset. This transfer learning approach yielded a model with MAE 4.1 bpm and MAPE 6.3% (97.5% recall), a 25% MAE reduction over the IR-UWB baseline. This demonstration of transfer learning between radar systems for heart rate monitoring has the potential to accelerate its introduction into existing consumer devices.
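For reference, the two headline metrics above are straightforward to state precisely; the heart-rate values in the example are made-up numbers, not study data.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the units of the inputs (bpm here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a percentage of the true value."""
    return 100.0 * sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical two-window example: true rates 60 and 80 bpm,
# predictions 61 and 78 bpm -> MAE 1.5 bpm.
```

Note that MAPE weights the same absolute error more heavily at low heart rates, which is why the abstract reports both metrics alongside recall.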
Concerns Beyond the Accuracy of AI Output
DORA, Google (2025)
Abstract
Generative AI's potential for hallucinations and inaccuracies is by far the most discussed limitation in AI-assisted software development. But whether developers have other concerns about using generative AI in their coding practice has not been thoroughly explored. This article describes the results of in-depth interviews with developers about their concerns about generative AI in coding beyond the tools' accuracy, and discusses related policy implications for organizations developing software.
Abstract
The accelerating pace of innovation is fundamentally reshaping product development, creating a complex environment that demands rapid decision-making and efficient information management. To remain competitive, organizations must integrate Generative AI (GenAI) tools into their Product Lifecycle Management (PLM) processes. This integration is crucial because traditional PLM systems, often built on decades-old architectures, struggle to manage modern product complexity, vast data volumes, and interconnected supply chains [1]. Limitations such as data silos, inflexible change management, and inadequate collaboration capabilities hinder the agility required today [3]. GenAI offers transformative potential by automating complex tasks, enhancing data analysis, and facilitating more dynamic design and collaboration within the PLM ecosystem [5]. This integration is not merely an upgrade but an essential evolution to overcome the inherent architectural and process constraints of legacy systems, which impede the speed and data fluidity necessary in the current market.
Sensible Agent: A Framework for Unobtrusive Interaction with Proactive AR Agents
Min Xia
Nels Numan
Dinesh Manocha
Proceedings of the 39th Annual ACM Symposium on User Interface Software and Technology (UIST), ACM (2025), pp. 22
Abstract
Proactive AR agents promise context-aware assistance, but their interactions often rely on explicit voice prompts or responses, which can be disruptive or socially awkward. We introduce Sensible Agent, a framework designed for unobtrusive interaction with these proactive agents. Sensible Agent dynamically adapts both “what” assistance to offer and, crucially, “how” to deliver it, based on real-time multimodal context sensing. Informed by an expert workshop (n=12) and a data annotation study (n=40), the framework leverages egocentric cameras, multimodal sensing, and Large Multimodal Models (LMMs) to infer context and suggest appropriate actions delivered via minimally intrusive interaction modes. We demonstrate our prototype on an XR headset through a user study (n=10) in both AR and VR scenarios. Results indicate that Sensible Agent significantly reduces perceived intrusiveness and interaction effort compared to a voice-prompted baseline, while maintaining high utility.
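The "how" half of the framework amounts to a context-conditioned policy over delivery modes. The sketch below is a hypothetical illustration of that idea only: the context keys, mode names, and rules are invented for this example, whereas the paper's policy is informed by the annotation study and LMM inference rather than hand-written rules.

```python
def choose_delivery(context: dict) -> str:
    """Toy policy: pick the least intrusive delivery mode the context allows.
    Keys (in_conversation, noisy, hands_busy) are hypothetical."""
    if context.get("in_conversation") or context.get("noisy"):
        return "glanceable_visual"       # never speak over the user
    if context.get("hands_busy"):
        return "short_audio"             # eyes and hands stay on the task
    return "visual_plus_voice_confirm"   # default low-effort confirmation
```

Separating this mode choice from the "what" decision is what lets the same suggested action surface as a glanceable card in a meeting but as a brief audio prompt while cooking.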
PROTECT: A Framework to Foster Digital Resilience for Youth Navigating Technology-Facilitated Abuse
Diana Freed
Natalie Bazarova
Dan Cosley
Patrick Gage Kelley
Social Sciences Journal, 14(6) (2025)
Abstract
Youth are increasingly exposed to a broad range of technology-facilitated abuse that challenges their safety and well-being. Building on previous work that examined youth help-seeking behaviors, coping strategies, threats they encounter, and the social support systems around them, we articulate a framework, called PROTECT (Problem recognition, Reaching out, Organizing support, Training, Engaging experts, Continuous support, and Tackling safety measures), which integrates existing models of support, help-seeking, and digital skills to offer a high-level, structured approach for adults who serve as a support system for youth navigating technology-facilitated abuse. The framework unpacks the social and contextual dynamics that influence help-seeking behaviors, providing a foundation for educators, advocates, health professionals, developers, and other adult stakeholders to design and develop trauma-informed, timely interventions that promote resilience.
Abstract
Despite nearly two decades of network verification research, verification tooling continues to see limited real-world adoption, and outages continue to occur. Drawing on interviews with network engineers and our own experience as a large network operator, we ask why. These conversations reveal that the culprit is traditional verification's reliance on hand-crafted network models, which leads to issues with coverage, correctness, maintainability, and fidelity, ultimately hindering practical applicability and adoption.
To address this, we call for the research community to embrace "model-free verification" through network emulation. Recent technology advancements – maturation of orchestration infrastructure and vendor-provided container images – make it possible to leverage emulation to obtain a high-fidelity converged dataplane from actual router control plane code, and then apply established dataplane verification techniques to this extracted state. We prototype such a system with open-source components, and present early results showing this approach can accurately verify configurations previously untestable, paving the way for more robust, practical network verification.
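Once the emulated control plane converges, the extracted forwarding state can be checked with established dataplane techniques. As a hedged illustration (not the authors' prototype, and with router names and routes invented), the toy below runs longest-prefix-match reachability over per-router forwarding tables of the kind such a system would extract.

```python
import ipaddress

def next_hop(fib, dst):
    """Longest-prefix match over one router's extracted FIB,
    given as a list of (prefix, next_hop) pairs."""
    addr = ipaddress.ip_address(dst)
    best_len, best_nh = -1, None
    for prefix, nh in fib:
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_nh = net.prefixlen, nh
    return best_nh

def reachable(fibs, src, dst, max_hops=16):
    """Walk next hops router-to-router; 'local' means delivered."""
    node = src
    for _ in range(max_hops):
        nh = next_hop(fibs[node], dst)
        if nh is None:
            return False        # no route: black hole
        if nh == "local":
            return True
        node = nh
    return False                # hop budget exhausted: likely a loop
```

Because the FIBs come from real router code running under emulation rather than from a hand-written model, checks like this exercise the vendor's actual route-selection behavior, which is the fidelity argument the abstract makes.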
Abstract
We present new efficient algorithms for high-dimensional calibration via reduction to the TreeSwap algorithm of Dagan et al.
Abstract
Settler colonialism has led to ancestral language endangerment and extinction on a mass scale. It has also forced 'global' languages such as English on Indigenous communities worldwide. In Australia, post-contact languages, including creoles and local varieties of international languages, emerged as a result of forced contact with English speakers. These contact varieties are widely used, but to date they have been poorly supported by language technologies. This oversight presents barriers to participation in civil and economic society for Indigenous communities using these languages. It also reproduces the minoritisation of contemporary Indigenous sociolinguistic identities. This paper concerns the question of whether (and, if so, how) Indigenous people may be supported by technologies for their non-ancestral languages. We argue that multiple real-world opportunities exist, and explore this position through a case study of a project that aims to improve Automated Speech Recognition for Australian Aboriginal English. We discuss how we integrated culturally appropriate processes into the project. We call for increased support for languages used by Indigenous communities, including contact varieties, providing practical economic and socio-cultural benefits.