Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Sort By
1 - 15 of 10473 publications
A Call to Action: Advancing the Conversation Around Neurodivergent Education-Employment Transitions
Dannie Lynn Fountain
Vicki Baker
Kevin Danley
Closing the Gap (2025)
Preview abstract
Neurodiversity is still largely stigmatized and excluded from DEIB frameworks and related organizational initiatives, despite the increased recognition regarding the benefits of neuroinclusion within the education and corporate spheres. We seek to address this knowledge-to-practice gap through the creation of the Neurodiversity Engagement Framework. By highlighting supports needed for neurodivergent individuals, and those that support them, the framework helps neurodivergent individuals navigate within and across higher education and industry contexts. Informed by an interdisciplinary review of literature from higher education, industry, and corporate leadership contexts, the Neurodiversity Engagement Framework brings to light prevailing challenges within practices and policies, serving as a guide for the creation of a more supportive foundation for neurodiverse individuals to thrive. In this manuscript, readers are encouraged to consider the myriad of impacts that neurodiversity has on higher education and industry experiences and the ways that organizations can be more proactive in their support of this growing population. To conclude, we offer a roadmap for future research and practice to further elucidate ways academic and corporation leaders and policymakers can effectively support neurodivergent individuals.
View details
Differentiable Approximations for Distance Queries
David M. Mount
Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
Preview abstract
The widespread use of gradient-based optimization has motivated the adaptation of various classical algorithms into differentiable solvers compatible with learning pipelines. In this paper, we investigate the enhancement of traditional geometric query problems such that the result consists of both the geometric function as well as its gradient. Specifically, we study the fundamental problem of distance queries against a set of points P in R^d, which also underlies various similarity measures for learning algorithms.
The main result of this paper is a multiplicative (1+epsilon)-approximation of the Euclidean distance to P which is differentiable at all points in R^d \ P with asymptotically optimal bounds on the norms of its gradient and Hessian, from a data structure with storage and query time matching state-of-the-art results for approximate nearest-neighbor searching. The approximation is realized as a regularized distance through a partition-of-unity framework, which efficiently blends multiple local approximations, over a suitably defined covering of space, into a smooth global approximation. In order to obtain the local distance approximations in a manner that facilitates blending, we develop a new approximate Voronoi diagram based on a simple point-location data structure, simplifying away both the lifting transformation and ray shooting.
View details
The Case for Leveraging Transport Signals to Improve Internet Speed Test Efficiency
Cristina Leon
Computer Communication Review (2025) (to appear)
Preview abstract
Internet speed tests are an important tool to enable consumers and regulators to monitor the quality of Internet access. However, increased Internet speeds to the home and an increased demand for speed testing pose scaling challenges to providers of speed tests, who must maintain costly infrastructure to keep up with this demand. In recent years, this has led the popular NDT speed test to limit data transfer to a total of 250MB, which comes at the cost of accuracy for high bandwidth speed test clients.
In this paper, we observe that the NDT speed test server’s congestion control algorithm (BBRv1) is also trying to estimate the capacity of the connection. We leverage this observation and signals from BBR to improve the accuracy and efficiency of speed tests. We first show how leveraging signals from BBR can more than double the accuracy of a 10MB test–from 17% to 43%–for clients with speeds over 400Mbps.
We then show how using BBR signals to adaptively end the speed test reduces data transfer by 36% and increased accuracy by 13% for high bandwidth clients, relative to a 100MB fixed length test. Even accounting for clients that never observe enough samples to utilize the BBR signal, this adaptive approach still uses 25% less data than a fixed 100MB test with 37-44% higher accuracy.
View details
Matryoshka Model Learning for Improved Elastic Student Models
Cho-Jui Hsieh
Chetan Verma
Inderjit Dhillon
Xin Liu
Wen Chen
Ngot Bui
Yang Zhang
2025
Preview abstract
Production machine learning models in the industry are often devel-oped with a primary focus on maximizing model quality. However,these models must ultimately operate within the resource con-straints of their serving infrastructure, including limitations on com-pute, memory and bandwidth. The rapid evolution of serving hard-ware, particularly with advancements in accelerator technology,necessitates periodic retraining to leverage newer, more efficientinfrastructure. This cyclical retraining process is resource-intensive,demanding significant model development time and incurring sub-stantial training costs. This challenge is further amplified by thetrend towards increasingly complex models, which inherently re-quire greater computational resources for training and deployment.While prior work has explored techniques like supernet sub-modelextraction to address training efficiency, a critical gap remains: theefficient generation of a spectrum of high-quality models froman existing production model, a common requirement in diverseindustrial applications. To bridge this gap, we introduce a novel ap-proach leveraging a "Teaching Assistant" (TA) model, derived froma given production model (referred to as the Student model). Wedemonstrate that through co-training the Student and TA modelswith Matryoshka structure while using online distillation, we notonly enhance the Student model’s performance but also enable theflexible creation of a model family offering a compelling trade-offbetween model quality and model size.
View details
Preview abstract
Large Language Models (LLMs) have demonstrated impressive capabilities across a range of natural language processing tasks. In particular, improvements in reasoning abilities and the expansion of context windows have opened new avenues for leveraging these powerful models.
NL2SQL is challenging in that the natural language question is inherently ambiguous, while the SQL generation requires a precise understanding of complex data schema and semantics. One approach to this semantic ambiguous problem is to provide more and sufficient contextual information.
In this work, we explore the performance and the latency trade-offs of the extended context window (a.k.a., long context) offered by Google's state-of-the-art LLM (\textit{gemini-1.5-pro}).
We study the impact of various contextual information, including column example values, question and SQL query pairs, user-provided hints, SQL documentation, and schema. To the best of our knowledge, this is the first work to study how the extended context window and extra contextual information can help NL2SQL generation with respect to both accuracy and latency cost.
We show that long context LLMs are robust and do not get lost in the extended contextual information. Additionally, our long-context NL2SQL pipeline based on Google's \textit{gemini-pro-1.5} achieve a strong performance with 67.41\% on BIRD benchmark (dev) without finetuning and expensive self-consistency based techniques.
View details
Preview abstract
In this article, we describe our human-centered research focused on understanding the role of collaboration and teamwork in productive software development. We describe creation of a logs-based metric to identify collaboration through observable events and a survey-based multi-item scale to assess team functioning.
View details
The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
Pratik Fegade
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (2025), pp. 5-17
Preview abstract
A chief enabler of large-scale deep learning is the distribution of computation across multiple interconnected hardware accelerators. In order to unlock the maximum possible performance, a compiler must first select a reasonable strategy to parallelize a model's operations. Since neural network architectures admit multiple flavors of parallelism, determining the proper strategy for each instruction is a critical (albeit non-trivial) task. To solicit new ideas toward solving this challenging combinatorial optimization problem, we organized the ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning, a multi-month competition focused on advancing the state-of-the-art for model partitioning algorithms. In this paper, we offer a retrospective of this event, including the basic problem formulation, key challenges & opportunities, our new benchmark suite, and the quality of submissions received.
View details
Preview abstract
As part of Google's ongoing efforts to define best practices for secure AI systems, we’re sharing our aspirational framework for secure AI agents. We advocate for a hybrid, defense-in-depth strategy that combines the strengths of traditional, deterministic security controls with dynamic, reasoning-based defenses. This approach is grounded in three core principles: agents must have well-defined human controllers, their powers must be carefully limited, and their actions and planning must be observable. This paper reflects our current thinking and the direction of our efforts as we work towards ensuring that AI agents can be powerful, useful, and secure by default.
View details
Sufficient Context: A New Lens on Retrieval Augmented Generation Systems
Hailey Joren
Jianyi Zhang
Chun-Sung Ferng
Ankur Taly
International Conference on Learning Representations (ICLR) (2025)
Preview abstract
Augmenting LLMs with context leads to improved performance across many applications. Despite much research on Retrieval Augmented Generation (RAG) systems, an open question is whether errors arise because LLMs fail to utilize the context from retrieval or the context itself is insufficient to answer the query. To shed light on this, we develop a new notion of sufficient context, along with a method to classify instances that have enough information to answer the query. We then use sufficient context to analyze several models and datasets. By stratifying errors based on context sufficiency, we find that larger models with higher baseline performance (Gemini 1.5 Pro, GPT 4o, Claude 3.5) excel at answering queries when the context is sufficient, but often output incorrect answers instead of abstaining when the context is not. On the other hand, smaller models with lower baseline performance (Llama 3.1, Mistral 3, Gemma 2) hallucinate or abstain often, even with sufficient context. We further categorize cases when the context is useful, and improves accuracy, even though it does not fully answer the query and the model errs without the context. Building on our findings, we explore ways to reduce hallucinations in RAG systems, including a new selective generation method that leverages sufficient context information for guided abstention. Our method improves the fraction of correct answers among times where the model responds by 2--10% for Gemini, GPT, and Gemma.
View details
Fast Tensor Completion via Approximate Richardson Iteration
Mehrdad Ghadiri
Yunbum Kook
Ali Jadbabaie
Proceedings of the 42nd International Conference on Machine Learning (2025)
Preview abstract
We study tensor completion (TC) through the lens of low-rank tensor decomposition (TD). Many TD algorithms use fast alternating minimization methods, which solve highly structured linear regression problems at each step (e.g., for CP, Tucker, and tensor-train decompositions). However, such algebraic structure is lost in TC regression problems, making direct extensions unclear. To address this, we propose a lifting approach that approximately solves TC regression problems using structured TD regression algorithms as blackbox subroutines, enabling sublinear-time methods. We theoretically analyze the convergence rate of our approximate Richardson iteration based algorithm, and we demonstrate on real-world tensors that its running time can be 100x faster than direct methods for CP completion.
View details
Benchmarking and improving algorithms for attributing satellite-observed contrails to flights
Vincent Rudolf Meijer
Rémi Chevallier
Allie Duncan
Kyle McConnaughay
Atmospheric Measurement Techniques, 18 (2025), pp. 3495-3532
Preview abstract
Condensation trail (contrail) cirrus clouds cause a substantial fraction of aviation's climate impact. One proposed method for the mitigation of this impact involves modifying flight paths to avoid particular regions of the atmosphere that are conducive to the formation of persistent contrails, which can transform into contrail cirrus. Determining the success of such avoidance maneuvers can be achieved by ascertaining which flight formed each nearby contrail observed in satellite imagery. The same process can be used to assess the skill of contrail forecast models. The problem of contrail-to-flight attribution is complicated by several factors, such as the time required for a contrail to become visible in satellite imagery, high air traffic densities, and errors in wind data. Recent work has introduced automated algorithms for solving the attribution problem, but it lacks an evaluation against ground-truth data. In this work, we present a method for producing synthetic contrail detections with predetermined contrail-to-flight attributions that can be used to evaluate – or “benchmark” – and improve such attribution algorithms. The resulting performance metrics can be employed to understand the implications of using these observational data in downstream tasks, such as forecast model evaluation and the analysis of contrail avoidance trials, although the metrics do not directly quantify real-world performance. We also introduce a novel, highly scalable contrail-to-flight attribution algorithm that leverages the characteristic compounding of error induced by simulating contrail advection using numerical weather models. The benchmark shows an improvement of approximately 25 % in precision versus previous contrail-to-flight attribution algorithms, without compromising recall.
View details
Preview abstract
Mainstream artificial neural network models, such as Deep Neural Networks (DNNs) are computation-heavy and energy-hungry. Weightless Neural Networks (WNNs) are natively built with RAM-based neurons and represent an entirely distinct type of neural network computing compared to DNNs. WNNs are extremely low-latency, low-energy, and suitable for efficient, accurate, edge inference. The WNN approach derives an implicit inspiration from the decoding process observed in the dendritic trees of biological neurons, making neurons based on Random Access Memories (RAMs) and/or Lookup Tables (LUTs) ready-to-deploy neuromorphic digital circuits. Since FPGAs are abundant in LUTs, LUT based WNNs are a natural fit for implementing edge inference in FPGAs.
WNNs has been demonstrated to be an energetically efficient AI model, both in software, as well as in hardware. For instance, the most recent DWN – Differential Weightless Neural Network – model demonstrates up to 135× reduction in energy costs in FPGA implementations compared to other multiplication-free approaches, such as binary neural networks (BNNs) and DiffLogicNet, up to 9% higher accuracy in deployments on constrained devices, and culminate in up to 42.8× reduction in circuit area for ultra-low-cost chip implementations. This tutorial will help participants understand how WNNs work, why WNNs were underdogs for such a long time, and be introduced to the most recent members of the WNN family, such as BTHOWeN , LogicWiSARD, COIN, ULEEN and DWN, and contrast to BNNs and LogicNets.
View details
PAIGE: Examining Student Learning Outcomes and Experiences with Personalized AI-Generated Podcasts
Tiffany Do
Usama Bin Shafqat
Elsie Ling
Νikhil Sarda
2025
Preview abstract
Generative AI is revolutionizing content creation and holds promise for real-time, personalized educational experiences. We investigated the effectiveness of converting textbook chapters into AI-generated podcasts and explored the impact of personalizing these podcasts
for individual learner profiles. We conducted a 3x3 user study with 180 college students in the United States, comparing traditional textbook reading with both generalized and personalized AI-generated podcasts across three textbook subjects. The personalized podcasts were tailored to students’ majors, interests, and learning styles. Our findings show that students found the AI-generated podcast format to be more enjoyable than textbooks and that personalized podcasts led to significantly improved learning outcomes, although this was subject-specific. These results highlight that AI-generated podcasts can offer an engaging and effective modality
transformation of textbook material, with personalization enhancing content relevance. We conclude with design recommendations for leveraging AI in education, informed by student feedback.
View details
"It is important to consult" a linguist: Verb-Argument Constructions in ChatGPT and human experts' medical and financial advice
Chris Stewart
Alistair Windsor
J. Elliott Casal
PLOS One (2025)
Preview abstract
This paper adopts a Usage-Based Construction Grammar perspective to compare human- and AI-generated language, focusing on Verb-Argument Constructions (VACs) as a lens for analysis. Specifically, we examine solicited advice texts in two domains—Finance and Medicine—produced by humans and ChatGPT across different GPT models (3.5, 4, and 4o) and interfaces (3.5 Web vs. 3.5 API). Our findings reveal broad consistency in the frequency and distribution of the most common VACs across human- and AI-generated texts, though ChatGPT exhibits a slightly higher reliance on the most frequent constructions. A closer examination of the verbs occupying these constructions uncovers significant differences in the meanings conveyed, with a notable growth away from human-like language production in macro level perspectives (e.g., length) and towards humanlike verb-VAC patterns with newer models. These results underscore the potential of VACs as a powerful tool for analyzing AI-generated language and tracking its evolution over time.
View details
Neural Pathways to Program Success: Hopfield Networks for PERT Analysis
Proceedings of Technology and Engineering Management Society Conference (TEMSCON Global), IEEE (2025)
Preview abstract
Project and task scheduling under uncertainty remains a fundamental challenge in program and project management, where accurate estimation of task durations and dependencies is critical for delivering complex, multi project systems. The Program Evaluation and Review Technique provides a probabilistic framework to model task variability and critical paths. In this paper, the author presents a novel formulation of PERT scheduling as an energy minimization problem within a Hopfield neural network architecture. By mapping task start times and precedence constraints into a neural computation framework, the networks inherent optimization dynamics is exploited to approximate globally consistent schedules. The author addresses key theoretical issues related to energy function differentiability, constraint encoding, and convergence, and extends the Hopfield model for structured precedence graphs. Numerical simulations on synthetic project networks comprising up to 1000 tasks demonstrate the viability of this approach, achieving near optimal makespans with minimal constraint violations. The findings suggest that neural optimization models offer a promising direction for scalable and adaptive project tasks scheduling under uncertainty in areas such as the agentic AI workflows, microservice based applications that the modern AI systems are being built upon.
View details