Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10595 publications
    Preview abstract Browser fingerprinting enables persistent cross-site user tracking via subtle techniques that often evade conventional defenses or cause website breakage when script-level blocking countermeasures are applied. Addressing these challenges requires detection methods offering both function-level precision to minimize breakage and inherent robustness against code obfuscation and URL manipulation. We introduce ByteDefender, the first system leveraging V8 engine bytecode to detect fingerprinting operations specifically at the JavaScript function level. A Transformer-based classifier, trained offline on bytecode sequences, accurately identifies functions exhibiting fingerprinting behavior. We develop and evaluate lightweight signatures derived from this model to enable low-overhead, on-device matching against function bytecode during compilation but prior to execution, which only adds a 4% (average) latency to the page load time. This mechanism facilitates targeted, real-time prevention of fingerprinting function execution, thereby preserving legitimate script functionality. Operating directly on bytecode ensures inherent resilience against common code obfuscation and URL-based evasion. Our evaluation on the top 100k websites demonstrates high detection accuracy at both function- and script-level, with substantial improvements over state-of-the-art AST-based methods, particularly in robustness against obfuscation. ByteDefender offers a practical framework for effective, precise, and robust fingerprinting mitigation. View details
    Triaging mammography with artificial intelligence: an implementation study
    Sarah M. Friedewald
    Sunny Jansen
    Fereshteh Mahvar
    Timo Kohlberger
    David V. Schacht
    Sonya Bhole
    Dipti Gupta
    Scott Mayer McKinney
    Stacey Caron
    David Melnick
    Mozziyar Etemadi
    Samantha Winter
    Alejandra Maciel
    Luca Speroni
    Martha Sevenich
    Arnav Agharwal
    Rubin Zhang
    Gavin Duggan
    Shiro Kadowaki
    Atilla Kiraly
    Jie Yang
    Basil Mustafa
    Krish Eswaran
    Shravya Shetty
    Breast Cancer Research and Treatment (2025)
    Preview abstract Purpose Many breast centers are unable to provide immediate results at the time of screening mammography, which results in delayed patient care. Implementing artificial intelligence (AI) could identify patients who may have breast cancer and accelerate the time to diagnostic imaging and biopsy diagnosis. Methods In this prospective, randomized, unblinded, controlled implementation study we enrolled 1000 screening participants between March 2021 and May 2022. The experimental group used an AI system to prioritize a subset of cases for same-visit radiologist evaluation, and same-visit diagnostic workup if necessary. The control group followed the standard of care. The primary operational endpoints were time to additional imaging (TA) and time to biopsy diagnosis (TB). Results The final cohort included 463 experimental and 392 control participants. The one-sided Mann-Whitney U test was employed for analysis of TA and TB. In the control group, the TA was 25.6 days [95% CI 22.0–29.9] and TB was 55.9 days [95% CI 45.5–69.6]. In comparison, the experimental group's mean TA was reduced by 25% (6.4 fewer days [one-sided 95% CI > 0.3], p<0.001) and mean TB was reduced by 30% (16.8 fewer days [95% CI > 5.1], p=0.003). The time reduction was more pronounced for AI-prioritized participants in the experimental group. All participants eventually diagnosed with breast cancer were prioritized by the AI. Conclusions Implementing AI prioritization can accelerate care timelines for patients requiring additional workup, while maintaining the efficiency of delayed interpretation for most participants. Reducing diagnostic delays could contribute to improved patient adherence, decreased anxiety and addressing disparities in access to timely care. View details
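    The abstract above names a one-sided Mann-Whitney U test as the analysis method. As a rough illustration of what that test computes, the sketch below derives the U statistic and a normal-approximation p-value in plain Python. The function name, the continuity correction, the omission of a tie correction, and the sample values are all assumptions made for illustration; none of the numbers come from the study.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u_less(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """One-sided Mann-Whitney U test for H1: values in x tend to be smaller than in y.

    Returns (U, p). U counts the pairs (x_i, y_j) with x_i > y_j, with ties
    counted as 0.5. The p-value uses a normal approximation with a continuity
    correction and NO tie correction, so it is only a rough guide.
    """
    n1, n2 = len(x), len(y)
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u + 0.5 - mean_u) / sd_u
    # Standard normal CDF at z: P(U <= observed) under the null hypothesis.
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return u, p


# Invented waiting times in days, purely illustrative (not study data).
experimental_days = [1.0, 2.0, 3.0]
control_days = [4.0, 5.0, 6.0]
u_stat, p_value = mann_whitney_u_less(experimental_days, control_days)
print(u_stat, round(p_value, 3))
```

    A small U means few experimental values exceed control values, which is the direction the one-sided alternative looks for. For real analyses an established statistics library is the better choice, since it handles ties and exact small-sample distributions.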
    Quantum simulation with sum-of-squares spectral amplification
    Robbie King
    Guang Hao Low
    Rolando Somma
    arXiv:2505.01528 (2025)
    Preview abstract We introduce sum-of-squares spectral amplification (SOSSA), a framework for improving quantum simulation algorithms relevant to low-energy problems. SOSSA first represents the Hamiltonian as a sum-of-squares and then applies spectral amplification to amplify the low-energy spectrum. The sum-of-squares representation can be obtained using semidefinite programming. We show that SOSSA can improve the efficiency of traditional methods in several simulation tasks involving low-energy states. Specifically, we provide fast quantum algorithms for energy and phase estimation that improve over the state-of-the-art in both query and gate complexities, complementing recent results on fast time evolution of low-energy states. To further illustrate the power of SOSSA, we apply it to the Sachdev-Ye-Kitaev model, a representative strongly correlated system, where we demonstrate asymptotic speedups by a factor of the square root of the system size. Notably, SOSSA was recently used in [G.H. Low et al., arXiv:2502.15882 (2025)] to achieve state-of-the-art costs for phase estimation of real-world quantum chemistry systems. View details
    SSDTrain: Faster Large Language Model Training Using SSD-Based Activation Offloading
    Kun Wu
    Jeongmin Brian Park
    Mert Hidayetoğlu
    Vikram Sharma Mailthody
    Sitao Huang
    Steven Lumetta
    Wen-mei Hwu
    Design Automation Conference (DAC) (2025)
    Preview abstract The scaling up of Large Language Models (LLMs) demands more memory than current GPUs can provide, hindering the training process. To address this challenge, we propose SSDTrain to efficiently offload activations, the intermediate tensors produced during LLM training, to SSDs. This approach reduces GPU memory usage without impacting performance by adaptively overlapping data transfers with computation. SSDTrain is compatible with popular deep learning frameworks like PyTorch, Megatron, and DeepSpeed, and it employs techniques such as tensor deduplication, forwarding, and adaptive offloading to further enhance efficiency. We conduct extensive experiments on Llama, BERT, and T5. Results demonstrate that SSDTrain reduces activation peak memory usage by 45%. It can perfectly overlap the I/O with the computation without introducing a performance penalty. SSDTrain can achieve a performance boost of up to 31% compared to the conventional training strategy using the same GPU systems. View details
    Preview abstract The peer-review process is broken and the problem is getting worse, especially in AI: large conferences like NeurIPS increasingly struggle to adequately review huge numbers of paper submissions. I propose a scalable solution that, foremost, recognizes reviewing as important, necessary work and rewards it with crypto-coins owned and managed by the conferences themselves. The idea is at its core quite simple: paper submissions require work (reviews, meta-reviews, etc.) to be done, and therefore the submitter must pay for that work. Each reviewer submits their review to be approved by some designated conference officer (e.g. PC chair, Area Chair, etc.), and upon approval is paid a single coin for a single review. If three reviews are required, the cost of submission should be three coins + a tax that covers payments to all the volunteers who organize the conference. After some one-time startup costs to fairly distribute coins, the process should be relatively stable with new coins minted only when a conference grows. View details
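    The coin flow described in the abstract above (a submitter pays one coin per required review plus a tax, and each approved review earns its reviewer one coin) can be sketched as a small ledger. This is only an illustration under stated assumptions: the class name, the escrow bucket that holds review payments until approval, the organizer pool, and the tax of one coin are all hypothetical and are not specified in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ConferenceLedger:
    """Toy ledger for a conference-managed review coin (hypothetical design)."""

    reviews_per_paper: int = 3   # the abstract's example uses three reviews
    tax: int = 1                 # assumed value; the abstract gives no amount
    balances: Dict[str, int] = field(default_factory=dict)
    escrow: int = 0              # coins held until reviews are approved (assumption)
    organizer_pool: int = 0      # tax collected for conference volunteers

    def submission_cost(self) -> int:
        # One coin per required review, plus the tax.
        return self.reviews_per_paper + self.tax

    def submit_paper(self, author: str) -> None:
        cost = self.submission_cost()
        if self.balances.get(author, 0) < cost:
            raise ValueError(f"{author} cannot afford the submission cost of {cost}")
        self.balances[author] -= cost
        self.escrow += self.reviews_per_paper
        self.organizer_pool += self.tax

    def approve_review(self, reviewer: str) -> None:
        # Called by a designated officer once a review is accepted.
        if self.escrow < 1:
            raise ValueError("no coins in escrow to pay for this review")
        self.escrow -= 1
        self.balances[reviewer] = self.balances.get(reviewer, 0) + 1


ledger = ConferenceLedger(balances={"alice": 4})
ledger.submit_paper("alice")
for reviewer in ("bob", "carol", "dave"):
    ledger.approve_review(reviewer)

total_coins = sum(ledger.balances.values()) + ledger.escrow + ledger.organizer_pool
print(ledger.balances, ledger.organizer_pool, total_coins)
```

    In this toy run the total number of coins stays constant, which matches the abstract's point that new coins are minted only when a conference grows.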
    On Design Principles for Private Adaptive Optimizers
    Abhradeep Guha Thakurta
    Arun Ganesh
    Privacy-Preserving Machine Learning Workshop 2025 (2025) (to appear)
    Preview abstract The spherical noise added to gradients in differentially private (DP) training undermines the performance of adaptive optimizers like AdaGrad and Adam, and hence many recent works have proposed algorithms to address this challenge. However, the empirical results in these works focus on simple tasks and models and the conclusions may not generalize to model training in practice. In this paper we survey several of these variants, and develop better theoretical intuition for them as well as perform empirical studies comparing them. We find that a common intuition of aiming for unbiased estimates of second moments of gradients in adaptive optimizers is misguided, and instead that a simple technique called scale-then-privatize (which does not achieve unbiased second moments) has more desirable theoretical behaviors and outperforms all other variants we study on a small-scale language model training task. We additionally argue that scale-then-privatize causes the noise addition to better match the application of correlated noise mechanisms which are more desirable to use in practice. View details
    Preview abstract This paper investigates the theoretical underpinnings of the widely successful pretrain-then-adapt strategy for foundation models. We introduce a Bayesian model selection criterion, termed the downstream free energy, which quantifies the adaptability of a pretrained checkpoint by measuring, under the downstream data distribution, the concentration of favorable solutions near the checkpoint. However, minimizing this downstream free energy is infeasible without access to downstream data. To address this, we show that under certain conditions, minimizing the upstream free energy – which can be estimated using only upstream data – can serve as a reliable proxy. We validate this theoretical insight through preliminary experiments, showing that commonly used pretraining heuristics effectively lower upstream free energy, leading to better downstream performance. View details
    Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation
    Trevor Cohn
    Jianyuan Guo
    Advances in Neural Information Processing Systems (NeurIPS) (2025)
    Preview abstract Sign Language Translation (SLT) aims to map sign language videos to spoken language text. A common approach leverages gloss annotations as an intermediate representation, decomposing SLT into two sub-tasks: video-to-gloss recognition and gloss-to-text translation. While effective, this paradigm relies on expert-annotated gloss labels, which are costly and increasingly unavailable in many datasets, limiting scalability. To address this challenge, we propose a gloss-free pseudo gloss generation framework that eliminates the need for human-annotated glosses while preserving the structured intermediate representation. Specifically, we prompt a Large Language Model (LLM) with example text-gloss pairs to extract potential sign-related gloss words from the text by leveraging its in-context learning capability. To mitigate the inherent misalignment between generated pseudo glosses and sign sequences in the video, we further refine their order by formulating the alignment as a weakly supervised learning problem. With the reordered pseudo-glosses, additional alignment losses such as CTC can be incorporated to enhance supervision. We train our SLT model—comprising a vision encoder and a translator—under a three-stage pipeline, effectively bridging the gap between sign and spoken language. Despite its simplicity, our approach outperforms previous state-of-the-art gloss-free frameworks across three SLT benchmarks and achieves competitive results with gloss-based methods. View details
    Benchmarking and improving algorithms for attributing satellite-observed contrails to flights
    Vincent Rudolf Meijer
    Rémi Chevallier
    Allie Duncan
    Kyle McConnaughay
    Atmospheric Measurement Techniques, 18 (2025), pp. 3495-3532
    Preview abstract Condensation trail (contrail) cirrus clouds cause a substantial fraction of aviation's climate impact. One proposed method for the mitigation of this impact involves modifying flight paths to avoid particular regions of the atmosphere that are conducive to the formation of persistent contrails, which can transform into contrail cirrus. Determining the success of such avoidance maneuvers can be achieved by ascertaining which flight formed each nearby contrail observed in satellite imagery. The same process can be used to assess the skill of contrail forecast models. The problem of contrail-to-flight attribution is complicated by several factors, such as the time required for a contrail to become visible in satellite imagery, high air traffic densities, and errors in wind data. Recent work has introduced automated algorithms for solving the attribution problem, but it lacks an evaluation against ground-truth data. In this work, we present a method for producing synthetic contrail detections with predetermined contrail-to-flight attributions that can be used to evaluate – or “benchmark” – and improve such attribution algorithms. The resulting performance metrics can be employed to understand the implications of using these observational data in downstream tasks, such as forecast model evaluation and the analysis of contrail avoidance trials, although the metrics do not directly quantify real-world performance. We also introduce a novel, highly scalable contrail-to-flight attribution algorithm that leverages the characteristic compounding of error induced by simulating contrail advection using numerical weather models. The benchmark shows an improvement of approximately 25 % in precision versus previous contrail-to-flight attribution algorithms, without compromising recall. View details
    Preview abstract Natural disasters, including earthquakes, wildfires and cyclones, pose a huge risk to human lives as well as infrastructure assets. An effective response to disaster depends on the ability to rapidly and efficiently assess the intensity of damage. Artificial Intelligence (AI) and Generative Artificial Intelligence (GenAI) present a breakthrough solution, capable of combining knowledge from multiple types and sources of data, simulating realistic scenarios of disaster, and identifying emerging trends at a speed previously unimaginable. In this paper, we present a comprehensive review of the prospects of AI and GenAI in damage assessment for various natural disasters, highlighting both their strengths and limitations. We discuss their application to multimodal data such as text, image, video, and audio, and also cover major issues of data privacy, security, and ethical use of the technology during crises. The paper also recognizes the threat of Generative AI misuse, in the form of dissemination of misinformation and adversarial attacks. Finally, we outline avenues of future research, emphasizing the need for secure, reliable, and ethical Generative AI systems for disaster management in general. We believe that this work represents the first comprehensive survey of GenAI techniques being used in the field of Disaster Assessment and Response. View details
    Preview abstract Data science, which transforms raw data into actionable insights, is critical for data-driven decision-making. However, these tasks are often complex, involving steps like exploring multiple data sources and synthesizing findings to deliver clear answers. While large language model (LLM) agents show significant promise in automating this process, they often struggle with heterogeneous data formats and generate sub-optimal analysis plans, as verifying plan correctness is inherently difficult without ground-truth labels for such open-ended tasks. To overcome these limitations, we introduce DS-STAR, a novel data science agent. Specifically, DS-STAR makes three key contributions: (1) a data file analysis module that automatically reads and extracts context from diverse data formats, including unstructured types; (2) a verification step where an LLM-based judge evaluates the sufficiency of the analysis plan at each stage; and (3) a sequential planning mechanism that starts with a simple, executable plan and iteratively refines it based on DS-STAR's feedback until its sufficiency is confirmed. This iterative refinement allows DS-STAR to reliably navigate complex analyses involving varied data sources. Our experiments show that DS-STAR achieves state-of-the-art performance, improving accuracy on the challenging DABStep benchmark from 41.0% to 45.2% and on Kramabench from 31.3% to 44.7%. These results demonstrate the effectiveness of our approach for practical, multi-step data science tasks. View details
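    The plan, verify, and refine cycle described in the abstract above can be pictured as a simple control loop. The sketch below is a generic illustration and not DS-STAR's implementation: the function name, the callable signatures, the round limit, and the toy stand-ins for the executor, judge, and refiner are all hypothetical (in the paper's setting the judge would be LLM-based).

```python
from typing import Callable, List

Plan = List[str]


def refine_until_sufficient(
    initial_plan: Plan,
    execute: Callable[[Plan], str],
    is_sufficient: Callable[[Plan, str], bool],
    refine: Callable[[Plan, str], Plan],
    max_rounds: int = 10,
) -> Plan:
    """Start from a simple plan and refine it until a judge accepts it.

    Stops early when the judge confirms sufficiency; otherwise returns the
    last plan after max_rounds (the round limit is an added safeguard).
    """
    plan = list(initial_plan)
    for _ in range(max_rounds):
        result = execute(plan)
        if is_sufficient(plan, result):
            return plan
        plan = refine(plan, result)
    return plan


# Toy stand-ins so the loop can run without any model.
def toy_execute(plan: Plan) -> str:
    return f"{len(plan)} steps ran"


def toy_judge(plan: Plan, result: str) -> bool:
    return len(plan) >= 3


def toy_refine(plan: Plan, result: str) -> Plan:
    return plan + [f"step {len(plan) + 1}"]


final_plan = refine_until_sufficient(["step 1"], toy_execute, toy_judge, toy_refine)
print(final_plan)
```

    Passing the executor, judge, and refiner in as callables keeps the loop independent of any particular model or data format.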
    Preview abstract In the Max k-Weight SAT (aka Max SAT with Cardinality Constraint) problem, we are given a CNF formula with n variables and m clauses together with a positive integer k. The goal is to find an assignment where at most k variables are set to one that satisfies as many constraints as possible. Recently, Jain et al. (SODA 2023) gave an FPT approximation scheme (FPT-AS) with running time 2^O((dk/ε)^d) * (n + m)^O(1) for Max k-Weight SAT when the incidence graph is K_{d,d}-free. They asked whether a polynomial-size approximate kernel exists. In this work, we answer this question positively by giving a (1 − ε)-approximate kernel with (dk/ε)^O(d) variables. This also implies an improved FPT-AS with running time (dk/ε)^O(dk) * (n + m)^O(1) for the problem. Our approximate kernel is based mainly on a couple of greedy strategies together with a sunflower lemma-style reduction rule. View details
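    To make the problem statement above concrete, the sketch below solves tiny Max k-Weight SAT instances by exhaustive search. It is not the kernel or the approximation scheme from the paper: it simply tries every set of at most k variables to set to one, so it is exponential and only suitable for toy inputs. The function names and the signed-integer clause encoding (+i for variable i, -i for its negation) are assumptions for illustration.

```python
from itertools import combinations
from typing import List, Set, Tuple

Clause = List[int]  # +i means variable i is true, -i means variable i is false


def satisfied_count(clauses: List[Clause], ones: Set[int]) -> int:
    """Count clauses satisfied when exactly the variables in `ones` are set to one."""
    count = 0
    for clause in clauses:
        if any((lit > 0 and lit in ones) or (lit < 0 and -lit not in ones)
               for lit in clause):
            count += 1
    return count


def max_k_weight_sat(n: int, clauses: List[Clause], k: int) -> Tuple[int, Set[int]]:
    """Brute force: best assignment with at most k of the n variables set to one."""
    best_count, best_ones = -1, set()
    for size in range(min(k, n) + 1):
        for chosen in combinations(range(1, n + 1), size):
            ones = set(chosen)
            count = satisfied_count(clauses, ones)
            if count > best_count:
                best_count, best_ones = count, ones
    return best_count, best_ones


clauses = [[1, 2], [-1, 3], [2, 3], [-2]]
best_count, best_ones = max_k_weight_sat(3, clauses, 1)
print(best_count, best_ones)
```

    With k = 1 at most three of the four clauses in this toy instance can be satisfied, while raising k to 3 allows all four (for example by setting variables 1 and 3 to one), which shows how the cardinality constraint changes the optimum.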
    Supporting the Digital Safety of At-Risk Users: Lessons Learned from 9+ Years of Research and Training
    Tara Matthews
    Patrick Gage Kelley
    Lea Kissner
    Andreas Kramm
    Andrew Oplinger
    Andy Schou
    Stephan Somogyi
    Dalila Szostak
    Jill Woelfer
    Lawrence You
    Izzie Zahorian
    ACM Transactions on Computer-Human Interaction, 32(3) (2025), pp. 1-39
    Preview abstract Creating information technologies intended for broad use that allow everyone to participate safely online—which we refer to as inclusive digital safety—requires understanding and addressing the digital-safety needs of a diverse range of users who face elevated risk of technology-facilitated attacks or disproportionate harm from such attacks—i.e., at-risk users. This article draws from more than 9 years of our work at Google to understand and support the digital safety of at-risk users—including survivors of intimate partner abuse, people involved with political campaigns, content creators, youth, and more—in technology intended for broad use. Among our learnings is that designing for inclusive digital safety across widely varied user needs and dynamic contexts is a wicked problem with no “correct” solution. Given this, we describe frameworks and design principles we have developed to help make at-risk research findings practically applicable to technologies intended for broad use and lessons we have learned about communicating them to practitioners. View details
    Shadow Hamiltonian Simulation
    Rolando Somma
    Robbie King
    Tom O'Brien
    Nature Communications, 16 (2025), pp. 2690
    Preview abstract Simulating quantum dynamics is one of the most important applications of quantum computers. Traditional approaches for quantum simulation involve preparing the full evolved state of the system and then measuring some physical quantity. Here, we present a different and novel approach to quantum simulation that uses a compressed quantum state that we call the "shadow state". The amplitudes of this shadow state are proportional to the time-dependent expectations of a specific set of operators of interest, and it evolves according to its own Schrödinger equation. This evolution can be simulated on a quantum computer efficiently under broad conditions. Applications of this approach to quantum simulation problems include simulating the dynamics of exponentially large systems of free fermions or free bosons, the latter example recovering a recent algorithm for simulating exponentially many classical harmonic oscillators. These simulations are hard for classical methods and also for traditional quantum approaches, as preparing the full states would require exponential resources. Shadow Hamiltonian simulation can also be extended to simulate expectations of more complex operators such as two-time correlators or Green's functions, and to study the evolution of operators themselves in the Heisenberg picture. View details