Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

This paper discusses the migration of data orchestration workflows from a legacy tool like Autosys to a modern, cloud-based solution, Google Cloud Composer. It explores the transition from traditional job scheduling to Directed Acyclic Graph (DAG)-based workflows using Apache Airflow, culminating in the deployment and management of these workflows in Cloud Composer. The benefits and challenges of this migration are examined, highlighting the advantages of scalability, flexibility, and cloud integration offered by Cloud Composer.
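For readers unfamiliar with Airflow, a minimal sketch of what a simple Autosys-style job chain might look like as an Airflow DAG deployable to Cloud Composer; the DAG id, schedule, and commands are illustrative placeholders, not taken from the paper:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="legacy_batch_job",            # hypothetical name for a migrated Autosys job chain
    start_date=datetime(2025, 1, 1),
    schedule_interval="0 2 * * *",        # nightly at 02:00, like an Autosys calendar condition
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo 'extract step'")
    transform = BashOperator(task_id="transform", bash_command="echo 'transform step'")
    load = BashOperator(task_id="load", bash_command="echo 'load step'")

    # Dependencies that Autosys would express as job success conditions.
    extract >> transform >> load
```

Uploading this file to the Composer environment's DAGs bucket is enough for the scheduler to pick it up, which is the deployment step the abstract refers to.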
Measuring productivity is equivalent to building a model. All models are wrong, but some are useful. Productivity models are often “worryingly selective” (wrong because of what they omit). This selectivity can be combated by taking a holistic approach that includes multiple measurements of multiple outcomes; productivity models should therefore include multiple outcomes, metrics, and methods.
On the Design of the Binaural Rendering Library for Eclipsa Audio Immersive Audio Container
Tomasz Rudzki
Gavin Kearney
AES 158th Convention of the Audio Engineering Society (2025)
Immersive Audio Media and Formats (IAMF), also known as Eclipsa Audio, is an open-source audio container developed to accommodate multichannel and scene-based audio formats. Headphone-based delivery of IAMF audio requires efficient binaural rendering. This paper introduces the Open Binaural Renderer (OBR), which is designed to render IAMF audio. It discusses the core rendering algorithm and the binaural filter design process, as well as the real-time implementation of the renderer in the form of an open-source C++ rendering library. Designed for multi-platform compatibility, the renderer incorporates a novel approach to binaural audio processing, leveraging a combination of a spherical harmonic (SH) based virtual listening room model and anechoic binaural filters. Through its design, the IAMF binaural renderer provides a robust solution for delivering high-quality immersive audio across diverse platforms and applications.
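OBR itself is a C++ library; the Python/NumPy sketch below only illustrates the general idea of SH-domain binaural rendering (each SH channel convolved with a left and a right binaural filter, then summed per ear), not the actual OBR implementation:

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(ambisonic, filters_left, filters_right):
    """Render an SH-domain (ambisonic) signal to a binaural stereo pair.

    ambisonic:  (num_sh_channels, num_samples) SH-encoded audio.
    filters_*:  (num_sh_channels, filter_len) per-channel binaural filters,
                e.g. combining a virtual listening room model with anechoic HRTFs.
    """
    left = sum(fftconvolve(ch, f) for ch, f in zip(ambisonic, filters_left))
    right = sum(fftconvolve(ch, f) for ch, f in zip(ambisonic, filters_right))
    return np.stack([left, right])

sh_audio = np.random.randn(4, 48000)        # first-order ambisonics, 1 s at 48 kHz (placeholder)
h_left = np.random.randn(4, 256)            # placeholder binaural filters
h_right = np.random.randn(4, 256)
stereo = render_binaural(sh_audio, h_left, h_right)   # shape (2, 48255)
```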
Matryoshka Model Learning for Improved Elastic Student Models
Chetan Verma
Aditya Srinivas Timmaraju
Cho-Jui Hsieh
Ngot Bui
Yang Zhang
Wen Chen
Xin Liu
Inderjit Dhillon
2025
Industry-grade ML models are carefully designed to meet rapidly evolving serving constraints, which requires significant resources for model development. In this paper, we propose MatTA, a framework for training multiple accurate Student models using a novel Teacher-TA-Student recipe. TA models are larger versions of the Student models with higher capacity, and thus allow Student models to better relate to the Teacher model and also bring in more domain-specific expertise. Furthermore, multiple accurate Student models can be extracted from the TA model. Therefore, despite only one training run, our methodology provides multiple servable options to trade off accuracy for lower serving cost. We demonstrate the proposed method, MatTA, on proprietary datasets and models. Its practical efficacy is underscored by live A/B tests within a production ML system, demonstrating a 20% improvement on a key metric. We also demonstrate our method on GPT-2 Medium, a public model, and achieve relative improvements of over 24% on SAT Math and over 10% on the LAMBADA benchmark.
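The abstract does not spell out the training recipe; the PyTorch sketch below illustrates only the generic two-stage soft-label distillation pattern (Teacher into TA, then TA into Student) that the Teacher-TA-Student idea builds on, with dummy logits standing in for real models:

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: KL divergence between temperature-softened distributions."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

# Dummy logits standing in for Teacher, TA, and Student forward passes.
batch, classes = 8, 100
teacher_logits = torch.randn(batch, classes)
ta_logits = torch.randn(batch, classes, requires_grad=True)
student_logits = torch.randn(batch, classes, requires_grad=True)

# Stage 1: the larger TA model learns from the Teacher.
ta_loss = distill_loss(ta_logits, teacher_logits.detach())
# Stage 2: a smaller, servable Student learns from the TA.
student_loss = distill_loss(student_logits, ta_logits.detach())
```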
A Recipe for Improving Remote Sensing Zero Shot Generalization
Aviad Barzilai
Yotam Gigi
Vered Silverman
Yehonathan Refael
Bolous Jaber
Amr Helmy
3rd ML4RS Workshop at ICLR 2025
Foundation models have had a significant impact across various AI applications, enabling use cases that were previously impossible. Visual language models (VLMs), in particular, have outperformed other techniques in many tasks. In remote sensing (RS), foundation models have shown improvements across various applications. However, unlike other fields, the use of VLMs with large-scale remote sensing image-text datasets remains limited.
In this work, we first introduce two novel image-caption datasets for training of remote sensing foundation models. The first dataset pairs aerial and satellite imagery, aligned with Google Maps data, with high-quality captions generated using Gemini. The second utilizes public web images and their corresponding alt-text, filtered to the remote sensing domain, resulting in a highly diverse dataset.
We show that using these datasets to pre-train MaMMUT, a VLM architecture, results in state-of-the-art zero-shot classification and cross-modal retrieval performance on well-known public benchmarks. Second, we leverage this newly pre-trained VLM to generate inference attention maps for a novel class query (i.e., a class unseen during training). We subsequently propose an iterative self-supervised fine-tuning approach where samples aligned with these attention maps are iteratively pseudo-labeled and utilized for model training.
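As a rough illustration of the iterative pseudo-labeling idea, the sketch below assumes hypothetical scoring and fine-tuning callables; it is not the paper's training code:

```python
from typing import Callable, Iterable, List, Tuple

def iterative_pseudo_label(
    score_fn: Callable[[object, str], float],        # hypothetical attention-alignment score
    fine_tune_fn: Callable[[List[Tuple[object, str]]], None],
    unlabeled_images: Iterable[object],
    class_query: str,
    rounds: int = 3,
    threshold: float = 0.8,
) -> None:
    """Iteratively pseudo-label samples aligned with the query and fine-tune on them."""
    for _ in range(rounds):
        pseudo_labeled = [
            (img, class_query)
            for img in unlabeled_images
            if score_fn(img, class_query) >= threshold   # keep samples aligned with the attention maps
        ]
        fine_tune_fn(pseudo_labeled)                     # one self-supervised fine-tuning round
```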
Linear Elastic Caching via Ski Rental
Todd Lipcon
The biennial Conference on Innovative Data Systems Research (2025)
In this work we study the Linear Elastic Caching problem, where the goal is to minimize the total cost of a cache inclusive of not just its misses, but also its memory footprint integrated over time. We demonstrate a theoretical connection to the classic ski rental problem and propose a practical algorithm that combines online caching algorithms with ski rental policies. We also introduce a lightweight machine learning-based algorithm for ski rental that is optimized for production workloads and is easy to integrate within existing database systems. Evaluations on both production workloads in Google Spanner and publicly available traces show that the proposed elastic caching approach can significantly reduce the total cache cost compared to traditional fixed-size cache policies.
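The classic deterministic ski-rental rule the abstract alludes to can be sketched as a break-even test for cache retention; the cost units and eviction rule below are illustrative, not the paper's algorithm or Spanner's actual policy:

```python
import time

def should_evict(last_access_time, memory_cost_per_second, miss_cost, now=None):
    """Deterministic break-even ski-rental rule applied to cache retention.

    Keep "renting" (holding the item in memory) until the rent paid since the
    last access reaches the "buy" price (the cost of a miss if the item is
    requested again); after that, evict.  Hypothetical cost units.
    """
    now = time.time() if now is None else now
    rent_paid = (now - last_access_time) * memory_cost_per_second
    return rent_paid >= miss_cost

# Example: an item costing 0.001 units/sec to hold, with a miss cost of 5 units,
# is evicted once it has sat idle for 5000 seconds.
print(should_evict(last_access_time=0, memory_cost_per_second=0.001, miss_cost=5, now=6000))
```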
Heterogenous graph neural networks for species distribution modeling
Christine Kaeser-Chen
Keith Anderson
Michelangelo Conserva
Elise Kleeman
Maxim Neumann
Matt Overlan
Millie Chapman
Drew Purves
arXiv (2025)
Species distribution models (SDMs) are necessary for measuring and predicting occurrences and habitat suitability of species and their relationship with environmental factors. We introduce a novel presence-only SDM with graph neural networks (GNN). In our model, species and locations are treated as two distinct node sets, and the learning task is predicting detection records as the edges that connect locations to species. Using GNN for SDM allows us to model fine-grained interactions between species and the environment. We evaluate the potential of this methodology on the six-region dataset compiled by the National Center for Ecological Analysis and Synthesis (NCEAS) for benchmarking SDMs. For each of the regions, the heterogeneous GNN model is comparable to or outperforms previously-benchmarked single-species SDMs as well as a feed-forward neural network baseline model.
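A toy sketch of the bipartite edge-prediction setup (locations and species as two node sets, detection records as edges) might look as follows; the embedding sizes and dot-product scorer are illustrative stand-ins, not the paper's GNN architecture:

```python
import torch
import torch.nn as nn

class BipartiteEdgePredictor(nn.Module):
    """Toy presence-only edge scorer between location nodes and species nodes."""

    def __init__(self, num_locations, num_species, dim=64):
        super().__init__()
        self.loc_emb = nn.Embedding(num_locations, dim)
        self.sp_emb = nn.Embedding(num_species, dim)

    def forward(self, loc_ids, sp_ids):
        # Score a (location, species) pair; high scores predict a detection edge.
        return (self.loc_emb(loc_ids) * self.sp_emb(sp_ids)).sum(dim=-1)

model = BipartiteEdgePredictor(num_locations=1000, num_species=500)
edge_logits = model(torch.tensor([0, 1, 2]), torch.tensor([10, 20, 30]))
```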
The global adoption of Large Language Models (LLMs) in healthcare shows promise for enhancing clinical workflows and improving patient outcomes. However, Automatic Speech Recognition (ASR) errors in critical medical entities remain a significant challenge. These errors can lead to severe consequences if undetected. This study investigates the prevalence and impact of ASR errors in medical transcription across Africa, Europe, and North America. By examining variations in accented English across three continents, we analyze the impact of regional speech patterns on ASR performance. Our research quantifies both the potential and limitations of LLMs in mitigating ASR inaccuracies within various medical settings, with particular attention to performance variations across regional accents and medical terminology. Our findings highlight significant disparities in ASR accuracy across regions and identify specific conditions under which LLM corrections prove most effective.
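A hypothetical sketch of LLM-based post-correction of a medical transcript; the prompt wording and the `generate_fn` hook are placeholders rather than the study's setup, and any real deployment would need clinical validation:

```python
def correct_transcript(transcript: str, generate_fn) -> str:
    """Ask an LLM to correct likely ASR errors in medical entities.

    `generate_fn` is a placeholder for whatever LLM API is available; it takes a
    prompt string and returns generated text.
    """
    prompt = (
        "The following is an automatic transcript of a clinical conversation. "
        "Correct only words that are likely speech-recognition errors in "
        "medication names, dosages, or diagnoses, and leave everything else unchanged.\n\n"
        + transcript
    )
    return generate_fn(prompt)
```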
Avoid global outages by partitioning cloud applications to reduce blast radius
Karan Anand
https://cloud.google.com/ (2025)
Cloud application development faces the inherent challenge of balancing rapid innovation with high availability. This blog post details how Google Workspace's Site Reliability Engineering team addresses this conflict by implementing vertical partitioning of serving stacks. By isolating application servers and storage into distinct partitions, the "blast radius" of code changes and updates is significantly reduced, minimizing the risk of global outages. This approach, which complements canary deployments, enhances service availability, provides flexibility for experimentation, and facilitates data localization. While challenges such as data model complexities and inter-service partition misalignment exist, the benefits of improved reliability and controlled deployments make partitioning a crucial strategy for maintaining robust cloud applications.
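As a conceptual sketch (not Google Workspace's actual mechanism), deterministic partition assignment can be as simple as hashing a tenant key, so that rollouts proceed one partition at a time and a bad release is contained to one slice of users:

```python
import hashlib

NUM_PARTITIONS = 8  # illustrative; real partition counts and keys will differ

def partition_for(tenant_id: str) -> int:
    """Deterministically map a tenant to a serving/storage partition."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# The same tenant always lands in the same partition, so a change rolled out to
# partition 3 only affects the tenants that hash to it.
print(partition_for("example-customer.com"))
```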
A Call to Action: Advancing the Conversation Around Neurodivergent Education-Employment Transitions
Dannie Lynn Fountain
Vicki Baker
Kevin Danley
Closing the Gap (2025)
Neurodiversity is still largely stigmatized and excluded from DEIB frameworks and related organizational initiatives, despite the increased recognition regarding the benefits of neuroinclusion within the education and corporate spheres. We seek to address this knowledge-to-practice gap through the creation of the Neurodiversity Engagement Framework. By highlighting supports needed for neurodivergent individuals, and those that support them, the framework helps neurodivergent individuals navigate within and across higher education and industry contexts. Informed by an interdisciplinary review of literature from higher education, industry, and corporate leadership contexts, the Neurodiversity Engagement Framework brings to light prevailing challenges within practices and policies, serving as a guide for the creation of a more supportive foundation for neurodiverse individuals to thrive. In this manuscript, readers are encouraged to consider the myriad impacts that neurodiversity has on higher education and industry experiences and the ways that organizations can be more proactive in their support of this growing population. To conclude, we offer a roadmap for future research and practice to further elucidate ways academic and corporate leaders and policymakers can effectively support neurodivergent individuals.
Creativity in software development is frequently overlooked, specifically in the design of developer tools, which often focus on productivity. This is likely because creativity is not always seen as a goal in software engineering; in part, this can be explained by the unique way in which software engineers relate to creativity as centered around reusability rather than novelty. However, creativity is a critical aspect of software engineering, and importantly, there is a clear possibility for AI to impact creativity in both positive and negative ways. In this article, we explore the differences in goals for designing AI tools for productivity compared to creativity and propose strategies to elevate creativity in the software engineering workflow. Specifically, we apply seamful design to AI-powered software development to consider the role of seamfulness in software development workflows as a way to support creativity.
Record Number of Members Visit U.S. Congress to Talk Tech Policy
IEEE Spectrum (2025)
This IEEE Spectrum article reflects on advocacy for U.S. technological leadership during my Congressional visit through IEEE-USA. Leading an expert group of other distinguished IEEE members, we urged lawmakers to support critical initiatives. Key priorities included sustained funding for federal research institutions like NIST, NASA, and the NSF, reauthorizing the SBIR/STTR programs vital for small business innovation, and passing the CREATE AI Act to democratize AI resources by establishing the National AI Research Resource (NAIRR).
We also emphasized strengthening the STEM talent pipeline through the CHIPS and Science Act and expanding high-skilled immigrant visas. We highlighted rapid AI advancements, such as autonomous vehicles and the surge in FDA-approved AI-based medical devices, as underscoring the need for these strategic investments and policy actions. The article conveys a sense of urgency, calling for concrete congressional action to ensure the U.S. maintains its technological edge, while also sharing my personal experiences.
Mitigating Clinician Information Overload: Generative AI for Integrated EHR and RPM Data Analysis
Shashank Kapoor
Aman Raj
2025
Generative AI (GenAI), particularly Large Language Models (LLMs), offers powerful capabilities for interpreting the complex data landscape in healthcare. In this paper, we present a comprehensive overview of the capabilities, requirements, and applications of GenAI for deriving clinical insights and improving clinical efficiency. We first provide some background on the forms and sources of patient data, namely real-time Remote Patient Monitoring (RPM) streams and traditional Electronic Health Records (EHR). The sheer volume and heterogeneity of this combined data present significant challenges to clinicians and contribute to information overload.
In addition, we explore the potential of LLM-powered applications for improving clinical efficiency. These applications can enhance navigation of longitudinal patient data and provide actionable clinical decision support through natural language dialogue. We discuss the opportunities this presents for streamlining clinician workflows and personalizing care, alongside critical challenges such as data integration complexity, ensuring data quality and RPM data reliability, maintaining patient privacy, validating AI outputs for clinical safety, mitigating bias, and ensuring clinical acceptance. We believe this work represents the first summarization of GenAI techniques for managing clinician data overload due to combined RPM/EHR data complexities.
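As a minimal sketch of the data-integration step, the helper below combines RPM readings and EHR notes into a single prompt; the function, field names, and prompt wording are hypothetical, and model choice, output validation, and privacy safeguards are out of scope here:

```python
def build_clinician_summary_prompt(ehr_notes: list, rpm_readings: list) -> str:
    """Combine longitudinal EHR notes and recent RPM readings into one prompt string."""
    rpm_lines = [
        f"{r['timestamp']}: {r['metric']} = {r['value']}" for r in rpm_readings
    ]
    return (
        "Summarize the key clinical changes for the treating physician.\n\n"
        "Recent remote monitoring data:\n" + "\n".join(rpm_lines) + "\n\n"
        "Relevant EHR notes:\n" + "\n".join(ehr_notes)
    )

prompt = build_clinician_summary_prompt(
    ehr_notes=["2025-01-03: started ACE inhibitor for hypertension."],
    rpm_readings=[{"timestamp": "2025-01-10T08:00", "metric": "systolic_bp", "value": 152}],
)
```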
Sufficient Context: A New Lens on Retrieval Augmented Generation Systems
Hailey Joren
Jianyi Zhang
Chun-Sung Ferng
Ankur Taly
International Conference on Learning Representations (ICLR) (2025)
Augmenting LLMs with context leads to improved performance across many applications. Despite much research on Retrieval Augmented Generation (RAG) systems, an open question is whether errors arise because LLMs fail to utilize the context from retrieval or because the context itself is insufficient to answer the query. To shed light on this, we develop a new notion of sufficient context, along with a method to classify instances that have enough information to answer the query. We then use sufficient context to analyze several models and datasets. By stratifying errors based on context sufficiency, we find that larger models with higher baseline performance (Gemini 1.5 Pro, GPT 4o, Claude 3.5) excel at answering queries when the context is sufficient, but often output incorrect answers instead of abstaining when the context is not. On the other hand, smaller models with lower baseline performance (Llama 3.1, Mistral 3, Gemma 2) hallucinate or abstain often, even with sufficient context. We further categorize cases where the context is useful and improves accuracy even though it does not fully answer the query and the model errs without it. Building on our findings, we explore ways to reduce hallucinations in RAG systems, including a new selective generation method that leverages sufficient context information for guided abstention. Our method improves the fraction of correct answers among cases where the model responds by 2–10% for Gemini, GPT, and Gemma.
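A minimal sketch of selective generation with guided abstention, assuming hypothetical answer, sufficiency, and confidence components; the combination rule is illustrative, not the paper's exact method:

```python
def selective_answer(query, context, answer_fn, sufficiency_fn, confidence_fn, tau=0.5):
    """Answer only when a sufficiency classifier and a confidence signal both support it.

    `answer_fn`, `sufficiency_fn` (does the context contain enough information?),
    and `confidence_fn` are stand-ins for components the abstract describes at a
    high level.
    """
    answer = answer_fn(query, context)
    sufficient = sufficiency_fn(query, context)                 # True / False
    confident = confidence_fn(query, context, answer) >= tau
    if sufficient and confident:
        return answer
    return "I don't know."   # abstain instead of risking a hallucinated answer
```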
Scaling Wearable Foundation Models
Girish Narayanswamy
Kumar Ayush
Yuzhe Yang
Orson Xu
Shun Liao
Shyam Tailor
Jake Sunshine
Tim Althoff
Shrikanth (Shri) Narayanan
Jiening Zhan
Mark Malhotra
Shwetak Patel
Samy Abdel-Ghaffar
Daniel McDuff
2025
Wearable sensors have become ubiquitous thanks to a variety of health tracking features. The resulting continuous and longitudinal measurements from everyday life generate large volumes of data. However, making sense of these observations for scientific and actionable insights is non-trivial. Inspired by the empirical success of generative modeling, where large neural networks learn powerful representations from vast amounts of text, image, video, or audio data, we investigate the scaling properties of wearable sensor foundation models across compute, data, and model size. Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, accelerometer, electrodermal activity, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM, a multimodal foundation model built on the largest wearable-signals dataset with the most extensive range of sensor modalities to date. Our results establish the scaling laws of LSM for tasks such as imputation, interpolation and extrapolation across both time and sensor modalities. Moreover, we highlight how LSM enables sample-efficient downstream learning for tasks including exercise and activity recognition.
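A generic masked-modeling sketch for imputation-style pretraining on minute-level wearable windows; the masking scheme, window length, and channel count below are illustrative, not LSM's actual configuration:

```python
import numpy as np

def mask_sensor_window(window: np.ndarray, mask_ratio: float = 0.3, rng=None):
    """Randomly mask timesteps of a (timesteps, channels) sensor window.

    The model would be trained to reconstruct the hidden timesteps, which is the
    self-supervised objective behind imputation, interpolation, and extrapolation.
    """
    rng = np.random.default_rng() if rng is None else rng
    masked = window.copy()
    mask = rng.random(window.shape[0]) < mask_ratio   # choose timesteps to hide
    masked[mask] = 0.0                                # zero out the selected minutes
    return masked, mask

window = np.random.rand(60, 6)   # e.g. 60 minutes x 6 sensor channels (illustrative)
masked_window, mask = mask_sensor_window(window)
```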