Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Sort By
1 - 15 of 10462 publications
Reasoning-SQL: Reinforcement Learning with Partial Rewards for Reasoning-Enhanced Text-to-SQL
Mohammadreza Pourreza
Shayan Talaei
Hailong Li
Azalia Mirhoseini
Amin Saberi
Conference on Language Modeling (COLM) (2025) (to appear)
Preview abstract
Text-to-SQL is a challenging task involving multiple reasoning-intensive subtasks, including natural language understanding, database schema comprehension, and precise SQL query formulation. Existing approaches often rely on handcrafted reasoning paths with inductive biases that can limit their overall effectiveness. Motivated by the recent success of reasoning-enhanced models such as DeepSeek R1 and OpenAI o1, which effectively leverage reward-driven self-exploration to enhance reasoning capabilities and generalization, we propose a novel set of partial rewards tailored specifically for the Text-to-SQL task. Our reward set includes schema-linking, AI feedback, n-gram similarity, and syntax check, explicitly designed to address the reward sparsity issue prevalent in reinforcement learning (RL). Leveraging group relative policy optimization (GRPO), our approach explicitly encourages large language models (LLMs) to develop intrinsic reasoning skills necessary for accurate SQL query generation. With models of different sizes, we demonstrate that RL-only training with our proposed rewards consistently achieves higher accuracy and superior generalization compared to supervised fine-tuning (SFT). Remarkably, our RL-trained 14B-parameter model significantly outperforms larger proprietary models, e.g. o3-mini by 4% and Gemini-1.5-Pro-002 by 3% on the BIRD benchmark. These highlight the efficacy of our proposed RL-training framework with partial rewards for enhancing both accuracy and reasoning capabilities in Text-to-SQL tasks.
View details
Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale
Anil Raghunath Iyer
Neel Bagora
Chang Yu
Olivier Pomerleau
Vivek Kumar
Prunthaban Kanthakumar
Usenix Annual Technical Conference (2025)
Preview abstract
Low-latency message delivery is crucial for real-time systems. Data originating from a producer must be delivered to consumers, potentially distributed in clusters across metropolitan and continental boundaries. With the growing scale of computing, there can be several thousand consumers of the data. Such systems require a robust messaging system capable of transmitting messages containing data across clusters and efficiently delivering them to consumers. The system must offer guarantees like ordering and at-least-once delivery while avoiding overload on consumers, allowing them to consume messages at their own pace.
This paper presents the design of Fast ACS (an abbreviation for Ads Copy Service), a file-based ordered message delivery system that leverages a combination of two-sided (inter-cluster) and one-sided (intra-cluster) communication primitives—namely, Remote Procedure Call and Remote Direct Memory Access, respectively—to deliver messages. The system has been successfully deployed to dozens of production clusters and scales to accommodate several thousand consumers within each cluster, which amounts to Tbps-scale intra-cluster consumer traffic at peak. Notably, Fast ACS delivers messages to consumers across the globe within a few seconds or even sub-seconds (p99) based on the message volume and consumer scale, at a low resource cost.
View details
PAIGE: Examining Student Learning Outcomes and Experiences with Personalized AI-Generated Podcasts
Tiffany Do
Usama Bin Shafqat
Elsie Ling
Νikhil Sarda
2025
Preview abstract
Generative AI is revolutionizing content creation and holds promise for real-time, personalized educational experiences. We investigated the effectiveness of converting textbook chapters into AI-generated podcasts and explored the impact of personalizing these podcasts
for individual learner profiles. We conducted a 3x3 user study with 180 college students in the United States, comparing traditional textbook reading with both generalized and personalized AI-generated podcasts across three textbook subjects. The personalized podcasts were tailored to students’ majors, interests, and learning styles. Our findings show that students found the AI-generated podcast format to be more enjoyable than textbooks and that personalized podcasts led to significantly improved learning outcomes, although this was subject-specific. These results highlight that AI-generated podcasts can offer an engaging and effective modality
transformation of textbook material, with personalization enhancing content relevance. We conclude with design recommendations for leveraging AI in education, informed by student feedback.
View details
Record Number of Members Visit U.S. Congress to Talk Tech Policy
IEEE Spectrum (2025)
Preview abstract
This IEEE Spectrum article reflects on advocacy for U.S. technological leadership during my Congressional visit through IEEE-USA. Leading an expert group of other distinguished IEEE members, we urged lawmakers to support critical initiatives. Key priorities included sustained funding for federal research institutions like NIST, NASA, and the NSF, reauthorizing the SBIR/STTR programs vital for small business innovation, and passing the CREATE AI Act to democratize AI resources by establishing the National AI Research Resource (NAIRR).
We also emphasized strengthening the STEM talent pipeline through the CHIPS and Science Act and expanding high-skilled immigrant visas. We highlighted rapid AI advancements, such as autonomous vehicles, the surge in FDA-approved AI based medical devices, as underscoring the need for these strategic investments and policy actions. The article conveys a sense of urgency, calling for concrete congressional action to ensure the U.S. maintains its technological edge while also sharing my personal experiences.
View details
"Accessibility people, you go work on that thing of yours over there": Addressing Disability Inclusion in AI Product Organizations
Sanika Moharana
Erin Buehler
Michael Madaio
Vinita Tibdewal
Proceedings of AIES 2025 (2025) (to appear)
Preview abstract
The rapid emergence of generative AI models and AI powered systems has surfaced a variety of concerns around responsibility, safety, and inclusion. Some of these concerns address specific vulnerable communities, including people with disabilities. At the same time, these systems may introduce harms upon disabled users that do not fit neatly into existing accessibility classifications, and may not be addressed by current accessibility practices. In this paper, we investigate how stakeholders across a variety of job types are encountering and addressing potentially negative impacts of AI on users with disabilities. Through interviews with 25 practitioners, we identify emerging challenges related to AI’s impact on disabled users, systemic obstacles that contribute to problems, and effective strategies for impacting change. Based on these findings, we offer suggestions for improving existing processes for creating AI-powered systems and supporting practitioners in developing skills to address these emerging challenges.
View details
Scalable Private Partition Selection via Adaptive Weighting
Justin Y. Chen
Forty-second International Conference on Machine Learning (2025)
Preview abstract
In the differentially private partition selection problem (a.k.a. private set union, private key discovery), users hold subsets of items from an unbounded universe. The goal is to output as many items as possible from the union of the users' sets while maintaining user-level differential privacy. Solutions to this problem are a core building block for many privacy-preserving ML applications including vocabulary extraction in a private corpus, computing statistics over categorical data and learning embeddings over user-provided items.
We propose an algorithm for this problem, MaxAdaptiveDegree(MAD), which adaptively reroutes weight from items with weight far above the threshold needed for privacy to items with smaller weight, thereby increasing the probability that less frequent items are output. Our algorithm can be efficiently implemented in massively parallel computation systems allowing scalability to very large datasets. We prove that our algorithm stochastically dominates the standard parallel algorithm for this problem. We also develop a two-round version of our algorithm, MAD2R, where results of the computation in the first round are used to bias the weighting in the second round to maximize the number of items output. In experiments, our algorithms provide the best results across the board among parallel algorithms and scale to datasets with hundreds of billions of items, up to three orders of magnitude larger than those analyzed by prior sequential algorithms.
View details
Silent Data Corruption by 10× Test Escapes Threatens Reliable Computing
Rama Govindaraju
Eric Liu
Subhasish Mitra
Mike Fuller
IEEE (2025) (to appear)
Preview abstract
Summary:
Silent Data Corruption by 10x Test Escapes Threatens Reliable Computing" highlights a critical issue: manufacturing defects, dubbed "test escapes," are evading current testing methods at an alarming rate, ten times higher than industry targets. These defects lead to Silent Data Corruption (SDC), where applications produce incorrect outputs without error indications, costing companies significantly in debugging, data recovery, and service disruptions. The paper proposes a three-pronged approach: quick diagnosis of defective chips directly from system-level behaviors, in-field detection using advanced testing and error detection techniques like CASP, and new, rigorous test experiments to validate these solutions and improve manufacturing testing practices.
View details
GitChameleon 2.0: Evaluating AI Code Generation Against Python Library Version Incompatibilities
Diganta Misra
Nizar Islah
Brice Rauby
Zihan Wang
Justine Gehring
Antonio Orvieto
Muawiz Chaudhary
Eilif Muller
Irina Rish
Samira Ebrahimi Kahou
Massimo Caccia
2025
Preview abstract
The rapid evolution of software libraries poses a considerable hurdle for code generation, necessitating continuous adaptation to frequent version updates while preserving backward compatibility. While existing code evolution benchmarks provide valuable insights, they typically lack execution-based evaluation for generating code compliant with specific library versions. To address this, we introduce GitChameleon 2.0, a novel, meticulously curated dataset comprising 328 Python code completion problems, each conditioned on specific library versions and accompanied by executable unit tests. GitChameleon 2.0 rigorously evaluates the capacity of contemporary large language models (LLMs), LLM-powered agents, code assistants, and RAG systems to perform version-conditioned code generation that demonstrates functional accuracy through execution. Our extensive evaluations indicate that state-of-the-art systems encounter significant challenges with this task; enterprise models achieving baseline success rates in the 48-51% range, underscoring the intricacy of the problem. By offering an execution-based benchmark emphasizing the dynamic nature of code libraries, GitChameleon 2.0 enables a clearer understanding of this challenge and helps guide the development of more adaptable and dependable AI code generation methods.
View details
Preview abstract
As large language models (LLMs) improve in their capacity to serve as personal AI assistants, their ability to output uniquely tailored, personalized responses that align with the soft preferences of their users is imperative for maximizing user satisfaction and retention. However, lay users are notoriously bad at prompt specification and often struggle with conveying their latent preferences to AI assistants. To resolve this, we demonstrate that activation steering, an inference-time method, can effectively control the response of the LLMs towards expressing different preferences. In contrast to memory-based personalization methods that require long user history, steering is extremely lightweight and easily-controllable via an interpretable linear strength factor. We further conduct a within-subjects user study (n=14) to investigate how end users personalize their conversations through three different steerable chatbot interfaces. The results demonstrate the effectiveness of preference-based steering for aligning real-world conversations with user preferences, and we discuss qualitative findings on how diverse values around control, transparency, and usability of personalization lead users to prefer different interfaces.
View details
Preview abstract
We investigate Learning from Label Proportions (LLP), a partial information setting where examples in a training set are grouped into bags, and only aggregate label values in each bag are available. Despite the partial observability, the goal is still to achieve small regret at the level of individual examples. We give results on the sample complexity of LLP under square loss, showing that our sample complexity is essentially optimal. From an algorithmic viewpoint, we rely on carefully designed variants of Empirical Risk Minimization, and Stochastic Gradient Descent algorithms, combined with ad hoc variance reduction techniques. On one hand, our theoretical results improve in important ways on the existing literature on LLP, specifically in the way the sample complexity depends on the bag size. On the other hand, we validate our algorithmic solutions on several datasets, demonstrating improved empirical performance (better accuracy for less samples) against recent baselines.
View details
Heterogeneous graph neural networks for species distribution modeling
Christine Kaeser-Chen
Keith Anderson
Michelangelo Conserva
Elise Kleeman
Maxim Neumann
Matt Overlan
Millie Chapman
Drew Purves
arxiv (2025)
Preview abstract
Species distribution models (SDMs) are necessary for measuring and predicting occurrences and habitat suitability of species and their relationship with environmental factors. We introduce a novel presence-only SDM with graph neural networks (GNN). In our model, species and locations are treated as two distinct node sets, and the learning task is predicting detection records as the edges that connect locations to species. Using GNN for SDM allows us to model fine-grained interactions between species and the environment. We evaluate the potential of this methodology on the six-region dataset compiled by National Center for Ecological Analysis and Synthesis (NCEAS) for benchmarking SDMs. For each of the regions, the heterogeneous GNN model is comparable to or outperforms previously-benchmarked single-species SDMs as well as a feed-forward neural network baseline model.
View details
The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
Pratik Fegade
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (2025), pp. 5-17
Preview abstract
A chief enabler of large-scale deep learning is the distribution of computation across multiple interconnected hardware accelerators. In order to unlock the maximum possible performance, a compiler must first select a reasonable strategy to parallelize a model's operations. Since neural network architectures admit multiple flavors of parallelism, determining the proper strategy for each instruction is a critical (albeit non-trivial) task. To solicit new ideas toward solving this challenging combinatorial optimization problem, we organized the ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning, a multi-month competition focused on advancing the state-of-the-art for model partitioning algorithms. In this paper, we offer a retrospective of this event, including the basic problem formulation, key challenges & opportunities, our new benchmark suite, and the quality of submissions received.
View details
On the Design of the Binaural Rendering Library for Eclipsa Audio Immersive Audio Container
Tomasz Rudzki
Gavin Kearney
AES 158th Convention of the Audio Engineering Society (2025)
Preview abstract
Immersive Audio Media and Formats (IAMF), also known as Eclipsa Audio, is an open-source audio container developed to accommodate multichannel and scene-based audio formats. Headphone-based delivery of IAMF audio requires efficient binaural rendering. This paper introduces the Open Binaural Renderer (OBR), which is designed to render IAMF audio. It discusses the core rendering algorithm, the binaural filter design process as well as real-time implementation of the renderer in a form of an open-source C++ rendering library. Designed for
multi-platform compatibility, the renderer incorporates a novel approach to binaural audio processing, leveraging a combination of spherical harmonic (SH) based virtual listening room model and anechoic binaural filters. Through its design, the IAMF binaural renderer provides a robust solution for delivering high-quality immersive audio across diverse platforms and applications.
View details
Visualizing Dynamics of Charges and Strings in (2+1)D Lattice Gauge Theories
Tyler Cochran
Bernhard Jobst
Yuri Lensky
Gaurav Gyawali
Norhan Eassa
Melissa Will
Aaron Szasz
Dmitry Abanin
Rajeev Acharya
Laleh Beni
Trond Andersen
Markus Ansmann
Frank Arute
Kunal Arya
Abe Asfaw
Juan Atalaya
Brian Ballard
Alexandre Bourassa
Michael Broughton
David Browne
Brett Buchea
Bob Buckley
Tim Burger
Nicholas Bushnell
Anthony Cabrera
Juan Campero
Hung-Shen Chang
Jimmy Chen
Benjamin Chiaro
Jahan Claes
Agnetta Cleland
Josh Cogan
Roberto Collins
Paul Conner
William Courtney
Alex Crook
Ben Curtin
Sayan Das
Laura De Lorenzo
Agustin Di Paolo
Paul Donohoe
ILYA Drozdov
Andrew Dunsworth
Alec Eickbusch
Aviv Elbag
Mahmoud Elzouka
Vinicius Ferreira
Ebrahim Forati
Austin Fowler
Brooks Foxen
Suhas Ganjam
Robert Gasca
Élie Genois
William Giang
Dar Gilboa
Raja Gosula
Alejo Grajales Dau
Dietrich Graumann
Alex Greene
Steve Habegger
Monica Hansen
Sean Harrington
Paula Heu
Oscar Higgott
Jeremy Hilton
Robert Huang
Ashley Huff
Bill Huggins
Cody Jones
Chaitali Joshi
Pavol Juhas
Hui Kang
Amir Karamlou
Kostyantyn Kechedzhi
Trupti Khaire
Bryce Kobrin
Alexander Korotkov
Fedor Kostritsa
John Mark Kreikebaum
Vlad Kurilovich
Dave Landhuis
Tiano Lange-Dei
Brandon Langley
Kim Ming Lau
Justin Ledford
Kenny Lee
Loick Le Guevel
Wing Li
Alexander Lill
Will Livingston
Daniel Lundahl
Aaron Lunt
Sid Madhuk
Ashley Maloney
Salvatore Mandra
Leigh Martin
Orion Martin
Cameron Maxfield
Seneca Meeks
Anthony Megrant
Reza Molavi
Sebastian Molina
Shirin Montazeri
Ramis Movassagh
Charles Neill
Michael Newman
Murray Ich Nguyen
Chia Ni
Kris Ottosson
Alex Pizzuto
Rebecca Potter
Orion Pritchard
Ganesh Ramachandran
Matt Reagor
David Rhodes
Gabrielle Roberts
Kannan Sankaragomathi
Henry Schurkus
Mike Shearn
Aaron Shorter
Noah Shutty
Vladimir Shvarts
Vlad Sivak
Spencer Small
Clarke Smith
Sofia Springer
George Sterling
Jordan Suchard
Alex Sztein
Doug Thor
Mert Torunbalci
Abeer Vaishnav
Justin Vargas
Sergey Vdovichev
Guifre Vidal
Steven Waltman
Shannon Wang
Brayden Ware
Kristi Wong
Cheng Xing
Jamie Yao
Ping Yeh
Bicheng Ying
Juhwan Yoo
Grayson Young
Yaxing Zhang
Ningfeng Zhu
Yu Chen
Vadim Smelyanskiy
Adam Gammon-Smith
Frank Pollmann
Michael Knap
Nature, 642 (2025), 315–320
Preview abstract
Lattice gauge theories (LGTs) can be used to understand a wide range of phenomena, from elementary particle scattering in high-energy physics to effective descriptions of many-body interactions in materials. Studying dynamical properties of emergent phases can be challenging, as it requires solving many-body problems that are generally beyond perturbative limits. Here we investigate the dynamics of local excitations in a LGT using a two-dimensional lattice of superconducting qubits. We first construct a simple variational circuit that prepares low-energy states that have a large overlap with the ground state; then we create charge excitations with local gates and simulate their quantum dynamics by means of a discretized time evolution. As the electric field coupling constant is increased, our measurements show signatures of transitioning from deconfined to confined dynamics. For confined excitations, the electric field induces a tension in the string connecting them. Our method allows us to experimentally image string dynamics in a (2+1)D LGT, from which we uncover two distinct regimes inside the confining phase: for weak confinement, the string fluctuates strongly in the transverse direction, whereas for strong confinement, transverse fluctuations are effectively frozen. We also demonstrate a resonance condition at which dynamical string breaking is facilitated. Our LGT implementation on a quantum processor presents a new set of techniques for investigating emergent excitations and string dynamics.
View details
Linear Elastic Caching via Ski Rental
Todd Lipcon
The biennial Conference on Innovative Data Systems Research (2025)
Preview abstract
In this work we study the Linear Elastic Caching problem, where the goal is to minimize the total cost of a cache inclusive of not just its misses, but also its memory footprint integrated over time. We demonstrate a theoretical connection to the classic ski rental problem and propose a practical algorithm that combines online caching algorithms with ski rental policies. We also introduce a lightweight machine learning-based algorithm for ski rental that is optimized for production workloads and is easy to integrate within existing database systems. Evaluations on both production workloads in Google Spanner and publicly available traces show that the proposed elastic caching approach can significantly reduce the total cache cost compared to traditional fixed-size cache policies.
View details