Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

people standing in front of a screen with images and a chipboard

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
1 - 15 of 10795 publications
    Productionizing Quantum Mass Production
    Bill Huggins
    Nathan Wiebe
    arXiv for now (2026) (to appear)
    Preview abstract For many practical applications of quantum computing, the slowest and most costly steps involve coherently accessing classical data. We help address this challenge by applying mass production techniques, which can sometimes allow us to perform operations many times in parallel for a cost that is comparable to a single execution[1-3]. We combine existing mass-production results with modern approaches for loading classical data using ``quantum read-only memory.'' We show that quantum mass production techniques offer no benefit when we consider a cost model that focuses purely on the number of non-Clifford gates. However, analyzing the constant factors in a more nuanced cost model, we find that it may be possible to obtain a reduction in cost of an order or magnitude or more for a variety reasonably-sized fault-tolerant quantum algorithms. We present several applications of quantum mass-production techniques beyond naive parallelization, including a strategy for reducing the cost of serial calls to the same data loading step. View details
    FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration
    Diganta Misra
    Yanqi Luo
    Anjali Sridhar
    Justine Gehring
    Silvio Soares Ribeiro Junior
    2026
    Preview abstract AI coding assistants are rapidly becoming integral to modern software development. A key challenge in this space is the continual need to migrate and modernize codebases in response to evolving software ecosystems. Traditionally, such migrations have relied on rule-based systems and human intervention. With the advent of powerful large language models (LLMs), AI-driven agentic frameworks offer a promising alternative—but their effectiveness remains underexplored. In this paper, we introduce FreshBrew, a novel benchmark for evaluating AI-based agentic frameworks on project-level Java migrations. We benchmark several such frameworks, powered by state-of-the-art LLMs, and compare their performance against established rule-based tools. Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, Gemini 2.5 Flash, can successfully migrate 56.5% of projects to JDK 17. Our empirical analysis reveals novel insights into the critical strengths and limitations of current agentic approaches, offering actionable insights into their real-world applicability. By releasing FreshBrew publicly upon acceptance, we aim to facilitate rigorous, reproducible evaluation and catalyze progress in AI-driven codebase modernization. View details
    Our Approach to Protecting AI Training Data
    Cindy Muya
    Jason Novak
    Cindee Madison
    Ben Kamber
    Niha Vempati
    Jeremy Wiesner
    Google, Google, Google, 1600 Amphitheatre Parkway, Mountain View, CA, 94043 (2025) (2025)
    Preview abstract Google has over 25 years experience protecting data from inappropriate access and unauthorized use. In the era of AI, Google has extended these best practices in data protection to ensure that the right data is used the right way to train models. This paper presents a number of these best practices, describes how Google applies them in its systems, and describes how Google Cloud customers can use Google Cloud capabilities to implement these practices themselves. Protecting data requires both technical controls to enable safe data use at scale, and governance processes to ensure that companies have visibility and control over how their data is used. This fundamentally requires: understanding data and ensuring it has sufficient metadata in the form of attributes, controlling the data and implementing policies to allow (or disallow) certain usage based on those attributes, transforming data to enable its usage in policy compliant ways, and human oversight and governance. Protecting data in AI inherits these requirements and introduces new requirements to account for unique AI-specific risks including memorization/recitation and the costs of training foundational models. Meeting these new risks requires new capabilities including enhanced understanding of data and model lineage as well as an increased ability to control data usage through checks on data for policy compliance at the time a training job is configured before it is run. This white paper offers an in-depth look at data protection best practices and Google’s data protection capabilities, and is one of a series of publications about Google's Secure AI Framework (SAIF). Building upon its secure development practices, Google has developed and deployed a number of capabilities to understand, control, and transform data in its infrastructure so that data is both protected and used appropriately. This involves robust annotation systems to represent metadata and enable granular understanding of data at both an item and dataset level, policy engines that evaluate machine readable policies on that data using the metadata attributes, and sensors to understand how data is flowing across Google’s systems and raise alerts when policy violations occur. Moreover, Google has developed de-identification and anonymization systems to transform data to make it policy compliant and safer to use for AI training. View details
    Leveraging Per-Example Privacy for Machine Unlearning
    Nazanin Mohammadi Sepahvand
    Anvith Thudi
    Ashmita Bhattacharyya
    Nicolas Papernot
    Eleni Triantafillou
    Daniel M. Roy
    Karolina Dziugaite
    International Conference on Machine Learning (ICML) (2025)
    Preview abstract This work focuses on developing fine-grained theoretical insights to quantify unlearning difficulty at the level of individual data points for fine-tuning-based unlearning. Unlike other unlearning methods that lack theoretical guarantees for non-convex models, our approach builds on recent advances in differential privacy to provide per-instance guarantees using Rényi divergence. While our theoretical analysis applies to Langevin dynamics, we empirically demonstrate that the derived guarantees—and their trends—continue to hold for fine-tuning, even in the absence of explicit noise. Our results show that per-instance privacy levels computed from training dynamics reliably predict unlearning difficulty, offering a principled and practical way to assess unlearning performance. Furthermore, our method identifies harder-to-unlearn data more effectively than existing heuristics, providing a more precise tool for guiding unlearning strategies. These findings pave the way for adaptive and efficient unlearning methods tailored to the properties of specific data points. View details
    Smartwatch-Based Walking Metrics Estimation
    Amir Farjadian
    Anupam Pathak
    Alicia Kokoszka
    Jonathan Hsu
    Kyle DeHolton
    Lawrence Cai
    Shwetak Patel
    Mark Malhotra
    Jonathan Wang
    Shun Liao
    2025
    Preview abstract Gait parameters are important health indicators of neurological control, musculoskeletal health and fall risk, but traditional analysis requires specialized laboratory equipment. While smartphone inertial measurement units (IMUs) enable estimation of gait metrics, their real-world use may be limited by inconsistent placement and user burden. With a fixed on-wrist placement, smartwatches offer a stable, convenient and continuous monitoring potential, but wrist-based sensing presents inherent challenges due to the indirect coupling between arm swing and leg movement. This paper introduces a novel multi-head deep learning model leveraging IMU signals from a consumer smartwatch, along with user height information to estimate a comprehensive suite of spatio-temporal walking metrics, including step length , gait speed, swing time, stance time, and double support time. Results from 250 participants across two countries demonstrate that the model achieves high validity (Pearson r > 0.7) and reliability (ICC > 0.7) for most gait metrics, comparable or exceeding leading smartphone-based approaches. This work, the largest in-lab, smartwatch-based gait study to date, highlights the feasibility of gait monitoring using ubiquitous consumer smartwatches. View details
    Preview abstract Many everyday tasks ranging from fixing appliances, cooking recipes to car maintenance require expert knowledge, especially when tasks are complex and multi-step. Despite growing interest in AI agents, there is a scarcity of dialogue-video datasets grounded for real world task assistance. In this paper, we propose a simple yet effective approach that transforms single-person instructional videos into task-guidance two-person dialogues, aligned with fine grained steps and video-clips. Our fully automatic approach, powered by large language models, offers an efficient alternative to the substantial cost and effort required for manual data collection. Using this technique, we build HowToDIV, a large-scale dataset containing 507 conversations, 6636 question-answer pairs and 24 hours of videoclips across diverse tasks in cooking, mechanics, and planting. Each session includes multi-turn conversation where an expert teaches a novice user how to perform a task step by step, while observing user's surrounding through a camera and microphone equipped wearable device. We establish the baseline benchmark performance on HowToDIV dataset through Gemma-3 model, for future research on this new task of dialogues for procedural-task assistance. Our dataset and code are publicly available at our project page: https://github.com/google/howtodiv. View details
    SSDTrain: Faster Large Language Model Training Using SSD-Based Activation Offloading
    Kun Wu
    Jeongmin Brian Park
    Mert Hidayetoğlu
    Vikram Sharma Mailthody
    Sitao Huang
    Steven Lumetta
    Wen-mei Hwu
    Design Automation Conference (DAC) (2025)
    Preview abstract The scaling up of Large Language Models (LLMs) demands more memory than current GPUs can provide, hindering the training process. To address this challenge, we propose SSDTrain to efficiently offload activations, the intermediate tensors produced during LLM training, to SSDs. This approach reduces GPU memory usage without impacting performance by adaptively overlapping data transfers with computation. SSDTrain is compatible with popular deep learning frameworks like PyTorch, Megatron, and DeepSpeed, and it employs techniques such as tensor deduplication, forwarding, and adaptive offloading to further enhance efficiency. We conduct extensive experiments on Llama, BERT, and T5. Results demonstrate that SSDTrain effectively reduces 45% of the activation peak memory usage. It can perfectly overlap the IO with the computation without introducing performance penalty. SSDTrain can achieve a performance boost of up to 31% compared to the conventional training strategy using the same GPU systems. View details
    Preview abstract When dividing items among agents, two of the most widely studied fairness notions are envy-freeness and proportionality. We consider a setting where m chores are allocated to n agents and the disutility of each chore for each agent is drawn from a probability distribution. We show that an envy-free allocation exists with high probability provided that m ≥ 2n, and moreover, m must be at least n + Θ(n) in order for the existence to hold. On the other hand, we prove that a proportional allocation is likely to exist as long as m = ω(1), and this threshold is asymptotically tight. Our results reveal a clear contrast with the allocation of goods, where a larger number of items is necessary to ensure existence for both notions. View details
    Preview abstract Mainstream artificial neural network models, such as Deep Neural Networks (DNNs) are computation-heavy and energy-hungry. Weightless Neural Networks (WNNs) are natively built with RAM-based neurons and represent an entirely distinct type of neural network computing compared to DNNs. WNNs are extremely low-latency, low-energy, and suitable for efficient, accurate, edge inference. The WNN approach derives an implicit inspiration from the decoding process observed in the dendritic trees of biological neurons, making neurons based on Random Access Memories (RAMs) and/or Lookup Tables (LUTs) ready-to-deploy neuromorphic digital circuits. Since FPGAs are abundant in LUTs, LUT based WNNs are a natural fit for implementing edge inference in FPGAs. WNNs has been demonstrated to be an energetically efficient AI model, both in software, as well as in hardware. For instance, the most recent DWN – Differential Weightless Neural Network – model demonstrates up to 135× reduction in energy costs in FPGA implementations compared to other multiplication-free approaches, such as binary neural networks (BNNs) and DiffLogicNet, up to 9% higher accuracy in deployments on constrained devices, and culminate in up to 42.8× reduction in circuit area for ultra-low-cost chip implementations. This tutorial will help participants understand how WNNs work, why WNNs were underdogs for such a long time, and be introduced to the most recent members of the WNN family, such as BTHOWeN , LogicWiSARD, COIN, ULEEN and DWN, and contrast to BNNs and LogicNets. View details
    Preview abstract The dominant paradigm in image retrieval systems today is to search large databases using global image features, and re-rank those initial results with local image feature matching techniques. This design, dubbed \emph{global-to-local}, stems from the computational cost of local matching approaches, which can only be afforded for a small number of retrieved images. However, emerging efficient local feature search approaches have opened up new possibilities, in particular enabling detailed retrieval at large scale, to find partial matches which are often missed by global feature search. In parallel, global feature-based re-ranking has shown promising results with high computational efficiency. In this work, we leverage these building blocks to introduce a \emph{local-to-global} retrieval paradigm, where efficient local feature search meets effective global feature re-ranking. Critically, we propose a re-ranking method where global features are computed on-the-fly, based on the local feature retrieval similarities. Such re-ranking-only global features, dubbed \emph{similarity embeddings}, leverage multidimensional scaling techniques to create embeddings which respect the local similarities obtained during search, enabling a significant re-ranking boost. Experimentally, we demonstrate unprecedented retrieval performance on the Revisited Oxford and Paris datasets, setting new state-of-the-art results. View details
    Google's Approach for Secure AI Agents
    Santiago (Sal) Díaz
    Kara Olive
    Google (2025)
    Preview abstract As part of Google's ongoing efforts to define best practices for secure AI systems, we’re sharing our aspirational framework for secure AI agents. We advocate for a hybrid, defense-in-depth strategy that combines the strengths of traditional, deterministic security controls with dynamic, reasoning-based defenses. This approach is grounded in three core principles: agents must have well-defined human controllers, their powers must be carefully limited, and their actions and planning must be observable. This paper reflects our current thinking and the direction of our efforts as we work towards ensuring that AI agents can be powerful, useful, and secure by default. View details
    Heterogeneous graph neural networks for species distribution modeling
    Christine Kaeser-Chen
    Keith Anderson
    Michelangelo Conserva
    Elise Kleeman
    Maxim Neumann
    Matt Overlan
    Millie Chapman
    Drew Purves
    arxiv (2025)
    Preview abstract Species distribution models (SDMs) are necessary for measuring and predicting occurrences and habitat suitability of species and their relationship with environmental factors. We introduce a novel presence-only SDM with graph neural networks (GNN). In our model, species and locations are treated as two distinct node sets, and the learning task is predicting detection records as the edges that connect locations to species. Using GNN for SDM allows us to model fine-grained interactions between species and the environment. We evaluate the potential of this methodology on the six-region dataset compiled by National Center for Ecological Analysis and Synthesis (NCEAS) for benchmarking SDMs. For each of the regions, the heterogeneous GNN model is comparable to or outperforms previously-benchmarked single-species SDMs as well as a feed-forward neural network baseline model. View details
    LeakyFeeder: In-Air Gesture Control Through Leaky Acoustic Waves
    Yongjie Yang
    Tao Chen
    Zhenlin An
    Shirui Cao
    Shangguan Longfei
    SenSys 2025 - The 23rd ACM Conference on Embedded Networked Sensor Systems (2025)
    Preview abstract We present LeekyFeeder, a mobile application that explores the acoustic signals leaked from headphones to reconstruct gesture motions around the ear for fine-grained gesture control. To achieve this goal, LeekyFeeder reuses the speaker and feed-forward microphones on active noise cancellation (ANC) headphones as a SONAR system, emitting an inaudible frequency-modulated continuous-wave (FMCW) signal to track gesture reflections over time. Since this single-receiver SONAR system is unable to differentiate reflection angles and further disentangle signal reflections from different gesture parts, we draw on principles of multi-modal learning to frame gesture motion reconstruction as a multi-modal translation task and propose a deep learning-based approach to fill the information gap between low-dimensional FMCW ranging readings and high-dimensional 3D hand movements. We implement LeekyFeeder on a pair of Google Pixel Buds and conduct experiments to examine the efficacy and robustness of LeekyFeeder in various conditions. Experiments based on six gesture types inspired by Apple Vision Pro demonstrate that LeekyFeeder achieves a PCK performance of 89% at 3cm across ten users, with an average MPJPE and MPJRPE error of 2.71cm and 1.88cm, respectively. View details
    Preview abstract The problem of learning to defer with multiple experts consists of optimally assigning input instances to experts, balancing the trade-off between their accuracy and computational cost. This is a critical challenge in natural language generation, but also in other fields such as image processing, and medical diagnostics. Recent studies have proposed surrogate loss functions to optimize deferral, but challenges remain in ensuring their consistency properties. This paper introduces novel surrogate loss functions and efficient algorithms with strong theoretical learning guarantees. We address open questions regarding realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for both single-stage (jointly learning predictor and deferral function) and two-stage (learning only the deferral function with a fixed expert) learning scenarios. For single-stage deferral, we introduce a family of new realizable $H$-consistent surrogate losses and further prove $H$-consistency for a selected member. For two-stage deferral, we derive new surrogate losses that achieve realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for the two-expert scenario and, under natural assumptions, multiple-expert scenario. Additionally, we provide enhanced theoretical guarantees under low-noise assumptions for both scenarios. Finally, we report the results of experiments using our proposed surrogate losses, comparing their performance against existing baselines. View details
    Participatory AI Considerations for Advancing Racial Health Equity
    Andrea G. Parker
    Jatin Alla
    Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI) (2025) (to appear)
    Preview abstract Health-related artificial intelligence (health AI) systems are being rapidly created, largely without input from racially minoritized communities who experience persistent health inequities and stand to be negatively affected if these systems are poorly designed. Addressing this problematic trend, we critically review prior work focused on the participatory design of health AI innovations (participatory AI research), surfacing eight gaps in this work that inhibit racial health equity and provide strategies for addressing these gaps. Our strategies emphasize that “participation” in design must go beyond typical focus areas of data collection, annotation, and application co-design, to also include co-generating overarching health AI agendas and policies. Further, participatory AI methods must prioritize community-centered design that supports collaborative learning around health equity and AI, addresses root causes of inequity and AI stakeholder power dynamics, centers relationalism and emotion, supports flourishing, and facilitates longitudinal design. These strategies will help catalyze research that advances racial health equity. View details