Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.


Showing 1–15 of 11,257 publications
    Abstract: Responsive user interfaces dynamically adjust to device-specific aspects such as screen size, aspect ratio, and display resolution. However, traditional responsive design does not account for a user's current constraints or for the criticality of the task being performed via the UI. Misalignment between the UI design, the user's context, and task criticality can lead to user error. This disclosure describes techniques, implemented with user permission, for dynamically modifying the layout, information density, and/or interactive physics of a user interface based on a dual-factor analysis of user cognitive state and task criticality. The user's cognitive state can be inferred from behavioral telematics; task criticality can be inferred from semantic analysis. Based on these analyses, the information density and other parameters of the user interface are adjusted automatically, for example by applying or relaxing restrictions on interactivity, adjusting the visual prominence of UI elements, adding friction where appropriate, or hiding certain parts of the interface.
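As a minimal sketch of the dual-factor idea described above, the toy policy below maps an inferred cognitive-load score and a task-criticality score to a UI density level. The thresholds, weights, and level names are illustrative assumptions, not the disclosure's actual design.

```python
def ui_density(cognitive_load, task_criticality):
    """Toy dual-factor policy: map inferred cognitive load and task
    criticality (both assumed to lie in [0, 1]) to an information-density
    level. All thresholds and levels here are hypothetical."""
    risk = 0.5 * cognitive_load + 0.5 * task_criticality
    if risk > 0.7:
        return "minimal"  # hide non-essential elements, add friction
    if risk > 0.4:
        return "reduced"  # lower visual prominence of secondary elements
    return "full"

print(ui_density(0.9, 0.8))  # high load on a critical task
```

A real system would infer the two inputs continuously (the abstract mentions behavioral telematics and semantic analysis) rather than take them as fixed scores.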
    ConvApparel: A Benchmark Dataset and Validation Framework for User Simulators in Conversational Recommenders
    Guy Tennenholtz
    Jihwan Jeong
    Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL-26), Rabat, Morocco (2026), pp. 5270-5304
    Abstract: LLM-based user simulators are a scalable solution for improving conversational AI, but a critical realism gap undermines their effectiveness. To close this gap, we introduce a framework for building and validating high-fidelity simulators. We present a novel dataset of human-AI shopping conversations designed to capture a wide spectrum of user experiences. To measure fidelity, we propose a hybrid evaluation protocol that combines statistical alignment with a learned, discriminator-based Human-Likeness Score. Our most sophisticated simulator, trained via reinforcement learning with iterative critique, achieves a significant leap in realism. Critically, we demonstrate through counterfactual validation that our simulator, trained exclusively on optimal interactions, realistically adapts its behavior to suboptimal system responses, mirroring real user reactions and marking a key advance in creating reliable simulators for robust AI development.
    Abstract: How many T gates are needed to approximate an arbitrary n-qubit quantum state to within a given precision ε? Improving prior work of Low, Kliuchnikov, and Schaeffer, we show that the optimal asymptotic scaling is Θ(√(2^n log(1/ε)) + log(1/ε)) if we allow an unlimited number of ancilla qubits. We also show that this is the optimal T-count for implementing an arbitrary diagonal n-qubit unitary to within error ε. We describe an application to batched synthesis of single-qubit unitaries: we can approximate a tensor product of m = O(log log(1/ε)) arbitrary single-qubit unitaries to within error ε with the same asymptotic T-count as is required to approximate just one single-qubit unitary.
    CoDaS: AI Co-Data-Scientist for Biomarker Discovery via Wearable Sensors
    Juro Gottweis
    CJ Park
    Salman Rahman
    Ahmed Metwally
    Hong Yu
    Ivor Rendulic
    Yuzhe Yang
    Petar Sirkovic
    Daniel McDuff
    Shwetak Patel
    Nicolas Stroppa
    Yubin Kim
    Mark Malhotra
    Orson Xu
    Sam Schmidgall
    Tim Althoff
    Elahe Vedadi
    Cynthia Breazeal
    Hae Won Park
    (2026)
    Abstract: We study the d-dimensional knapsack problem. We are given a set of items, each with a d-dimensional cost vector and a profit, along with a d-dimensional budget vector. The goal is to select a set of items that does not exceed the budget in any dimension and maximizes the total profit. A polynomial-time approximation scheme (PTAS) with running time n^{Θ(d/ε)} has long been known for this problem, where ε is the error parameter and n is the encoding size. Despite decades of active research, the best running time of a PTAS has remained O(n^{⌈d/ε⌉ − d}). Unfortunately, existing lower bounds cover only the special case of two dimensions (d = 2), and do not answer whether there is an n^{o(d/ε)}-time PTAS for larger values of d. In this work, we show that the running time of the best-known PTAS cannot be improved by more than a polylogarithmic factor assuming the Exponential Time Hypothesis (ETH). Our techniques are based on a robust reduction from 2-CSP, which embeds 2-CSP constraints into a desired number of dimensions. Then, using a recent result of [Bafna, Karthik, and Minzer, STOC'25], we exhibit a tight trade-off between d and ε for all regimes of the parameters, assuming d is sufficiently large. Informally, our result also shows that under ETH, for any function f there is no f(d/ε) · n^{õ(d/ε)}-time (1−ε)-approximation for d-dimensional knapsack, where n is the number of items and õ hides polylogarithmic factors in d/ε.
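To make the problem setting above concrete, here is a brute-force exact solver for d-dimensional knapsack. It is exponential time and purely illustrative of the feasibility and objective definitions; the paper's contribution concerns PTAS running-time lower bounds, not this algorithm.

```python
from itertools import combinations

def d_knapsack_exact(items, budget):
    """Brute-force exact solver for the d-dimensional knapsack problem.

    items:  list of (cost_vector, profit) pairs; each cost vector has d entries.
    budget: d-dimensional budget vector.
    Returns the maximum total profit over all subsets that stay within
    the budget in every dimension. Exponential time -- illustration only.
    """
    d = len(budget)
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            totals = [sum(c[k] for c, _ in subset) for k in range(d)]
            if all(totals[k] <= budget[k] for k in range(d)):
                best = max(best, sum(p for _, p in subset))
    return best

items = [([2, 3], 5), ([3, 1], 4), ([1, 2], 3)]
print(d_knapsack_exact(items, [4, 4]))  # → 7 (items 2 and 3 fit together)
```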
    Abstract: This whitepaper seeks to elucidate the implications that the capabilities of developing quantum architectures have for blockchain vulnerabilities and mitigation strategies. First, we provide new resource estimates for breaking the 256-bit Elliptic Curve Discrete Logarithm Problem, the core of modern blockchain cryptography. We demonstrate that Shor's algorithm for this problem can execute with either <1200 logical qubits and <90 million Toffoli gates or <1450 logical qubits and <70 million Toffoli gates. In the interest of responsible disclosure, we use a zero-knowledge proof to validate these results without disclosing attack vectors. On superconducting architectures with 1e-3 physical error rates and planar connectivity, those circuits can execute in minutes using fewer than half a million physical qubits. We introduce a critical distinction between fast-clock (such as superconducting and photonic) and slow-clock (such as neutral atom and ion trap) architectures. Our analysis reveals that the first fast-clock CRQCs would enable on-spend attacks on public mempool transactions of some cryptocurrencies. We survey major cryptocurrency vulnerabilities through this lens, identifying systemic risks associated with advanced features in some blockchains such as smart contracts, Proof-of-Stake consensus, and Data Availability Sampling, as well as the enduring concern of abandoned assets. We argue that technical solutions would benefit from accompanying public policy, and we discuss various frameworks of digital salvage to regulate the recovery or destruction of dormant assets while preventing adversarial seizure. We also discuss implications for other digital assets and tokenization, as well as challenges and successful examples of the ongoing transition to Post-Quantum Cryptography (PQC). Finally, we urge all vulnerable cryptocurrency communities to join the ongoing migration to PQC without delay.
    Abstract: Browser fingerprinting is the practice of tracking users across the Web by collecting attributes from their devices and combining them to create unique identifiers. This practice poses major privacy risks, and more than a decade of research has quantified the fingerprinting risk due to various attributes, leading browser developers to implement many privacy-enhancing changes. Early work used Shannon entropy to quantify risk. However, Shannon entropy can grow with dataset size, limiting the ability to compare datasets and results. Researchers then introduced normalized entropy as a measure for comparing browser fingerprinting datasets of different sizes, and numerous works followed using normalized entropy for this purpose. We identify and address a resulting problem in the fingerprinting literature: normalized entropy is ill-suited to comparing datasets of different sizes, as it decreases as dataset size increases. We show this both analytically and empirically, leveraging a recently published dataset of browser attributes commonly used for fingerprinting. Given the unmet need for a better fingerprinting risk measure, we define a minimal set of desired properties for such a measure: scale-invariance, monotonicity, and estimability. We then propose Tsallis entropy as a more interpretable fingerprinting risk measure. We evaluate Shannon, normalized, and Tsallis entropy with respect to these properties, and prove that only Tsallis entropy satisfies all of them.
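As a quick reference for the three measures compared above, the sketch below computes Shannon entropy, normalized entropy (Shannon divided by log2 of the dataset size), and Tsallis entropy S_q = (1 − Σ p_i^q)/(q − 1) over a column of attribute values. The choice q = 2 is an illustrative assumption, not necessarily the parameter the paper advocates.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def normalized_entropy(values):
    """Shannon entropy divided by log2 of the dataset size."""
    return shannon_entropy(values) / math.log2(len(values))

def tsallis_entropy(values, q=2.0):
    """Tsallis entropy S_q = (1 - sum p_i^q) / (q - 1), q != 1.

    Recovers Shannon entropy (in nats) in the limit q -> 1.
    q = 2 here is an illustrative choice.
    """
    n = len(values)
    return (1 - sum((c / n) ** q for c in Counter(values).values())) / (q - 1)

# Toy fingerprinting attribute: one reported value per browser.
sample = ["a", "b", "a", "b"]
print(shannon_entropy(sample), normalized_entropy(sample), tsallis_entropy(sample))
```

Running the same functions on samples of different sizes drawn from the same underlying distribution is one way to observe the size-dependence the paper analyzes.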
    Phoenix: Rowhammer Attacks on DDR5 with Self-Correcting Synchronization
    Michele Marazzi
    Kaveh Razavi
    Salman Qazi
    Diego Meyer
    Patrick Jattke
    IEEE Security & Privacy (S&P) (2026)
    Abstract: A growing body of qualitative research has identified contextual risk factors that elevate people's chances of experiencing digital-safety attacks. However, the lack of quantitative data on the population-level distribution of these risk factors prevents policymakers and tech companies from developing targeted, evidence-based interventions to improve digital safety. To address this gap, we surveyed 5,001 adults in the United States to analyze (1) the frequency of, and relationship between, digital-safety attacks (e.g., scams, harassment, account hacking), and (2) how these attacks align with 10 contextual risk factors. Nearly half of our respondents identify as resource-constrained, which significantly correlates with a higher likelihood of experiencing four common attacks. We also present qualitative insights that expand our understanding of the factors beyond the existing literature (e.g., "prominence" included high-visibility roles in local communities). This study provides the first large-scale quantitative analysis correlating digital-safety attacks with contextual risk factors and demographics.
    Abstract: There are growing concerns about AI-generated image-based sexual abuse (AI-IBSA), also known as nonconsensual sexualized "deepfakes." Empirical research on AI-IBSA, however, remains very limited. This study surveyed 7,231 respondents across Australia, the United Kingdom, and the United States to investigate community attitudes and perceptions regarding AI-IBSA. Through a vignette study, we explored the relationship between public familiarity with AI-IBSA, normative concerns about consent, and context-dependent judgments that vary based on the target's identity, relational status, and how the content was used. Our findings reveal strong condemnation of AI-IBSA, yet respondents demonstrated low familiarity with the technology, and their views varied depending on context. AI-IBSA targeting intimate partners was viewed as more unacceptable than targeting celebrities, and content created solely for personal use was seen as less unacceptable than content intended for distribution. The study highlights the need for approaches that go beyond technical fixes and punitive measures, advocating for a multifaceted response that integrates ethical data governance, digital sexual literacy, and restorative justice approaches.
    Abstract: Voice activity detection (VAD) plays a vital role in enabling applications such as speech recognition. We analyze the impact of window size on the accuracy of three VAD algorithms, Silero, WebRTC, and Root Mean Square (RMS), across a set of diverse real-world digital audio streams. We additionally explore the use of hysteresis on top of each VAD output. Our results offer practical references for optimizing VAD systems: Silero significantly outperforms WebRTC and RMS, and hysteresis provides a benefit for WebRTC.
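The hysteresis mentioned above can be sketched as a simple state machine over per-frame VAD decisions: the speech state turns on only after several consecutive speech frames and off only after several consecutive non-speech frames, suppressing single-frame flicker. The counts below are illustrative assumptions, not the paper's tuned settings.

```python
def apply_hysteresis(frames, on_count=3, off_count=5):
    """Smooth raw per-frame VAD decisions with hysteresis.

    frames: iterable of booleans (raw per-frame speech decisions).
    The output state flips to speech only after `on_count` consecutive
    True frames, and back to non-speech only after `off_count`
    consecutive False frames. Counts here are illustrative.
    """
    out, state, run = [], False, 0
    for f in frames:
        if f != state:
            run += 1
            if (not state and run >= on_count) or (state and run >= off_count):
                state, run = not state, 0
        else:
            run = 0
        out.append(state)
    return out

raw = [False, True, True, True, True, False, True, True, True, False]
print(apply_hysteresis(raw, on_count=3, off_count=2))
```

Note that a brief dropout inside a speech run (the single False above) does not end the speech state, which is the benefit hysteresis provides on noisy detectors.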
    Abstract: We introduce KVCIS (KV-Cache Importance Scoring), a novel approach to KV-cache compression that predicts token importance from intermediate-layer activations before attention is computed. Unlike existing methods (H2O, StreamingLLM, Scissorhands) that make compression decisions based on attention scores computed during generation, KVCIS enables proactive compression at cache insertion time, determining how to store each token before paying the computational cost of attention. We discover a two-level importance structure in decoder-only transformers: the beginning-of-sequence (BOS) token acts as an "attention sink" receiving ~76% of attention, while the remaining ~24% is distributed across content tokens with a 10-11× importance spread. A simple linear probe achieves R² = 0.998 overall and R² = 0.68-0.79 for discriminating among content tokens. Extensive validation across 3 model families (Llama, Mistral, Gemma), 8 layer depths, context lengths from 256 to 2048 tokens, and multiple downstream tasks demonstrates 50% memory reduction with zero degradation on NarrativeQA (F1 = 0.064, matching the baseline exactly), while uniform quantization degrades by 7.8% at the same compression ratio. KVCIS consistently achieves 5-8× better quality preservation than uniform quantization across all tested context lengths. The memory savings enable increased batch sizes and longer context support; the probe itself adds minimal overhead (~16KB direction vector, 0.06ms per token). This work extends activation-based probing from safety classification to inference optimization, demonstrating that intermediate-layer activations encode predictive signals about token importance for generation.
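The linear-probe idea above can be sketched with synthetic data: fit a single direction vector by least squares so that a dot product with each token's activation predicts its importance, then keep only the tokens with the highest predicted scores. Everything here (dimensions, the synthetic importance target, the 50% keep ratio) is an illustrative assumption, not KVCIS itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: intermediate-layer activations per token, plus a
# scalar importance target (e.g. attention mass the token later receives).
hidden_dim, n_tokens = 64, 512
direction_true = rng.normal(size=hidden_dim)
activations = rng.normal(size=(n_tokens, hidden_dim))
importance = activations @ direction_true + 0.01 * rng.normal(size=n_tokens)

# Fit the linear probe: one direction vector, found by least squares
# (the analogue of the paper's small direction-vector probe).
probe, *_ = np.linalg.lstsq(activations, importance, rcond=None)
predicted = activations @ probe

# Tokens with the lowest predicted importance are candidates for
# compression at cache-insertion time, before attention is computed.
keep = np.argsort(predicted)[n_tokens // 2:]  # keep the top 50% of tokens
print(len(keep))
```

The appeal of this scheme is that scoring a token costs one dot product, so the decision can be made when the token enters the cache rather than after attention has been paid for.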
    Bi-level Hierarchical Neural Contextual Bandits for Online Recommendation
    Yunzhe Qi
    Yikun Ban
    Allan Stewart
    Chuanwei Ruan
    Jiachuan He
    Shishir Kumar Prasad
    Haixun Wang
    Jingrui He
    Transactions on Machine Learning Research (2026)
    Abstract: Contextual bandit algorithms aim to identify the optimal choice among a set of candidate arms based on their contextual information. Among these, neural contextual bandit algorithms have demonstrated generally superior performance compared to traditional linear and kernel-based methods. Nevertheless, neural methods are not inherently suitable for handling a large number of candidate arms due to their high computational cost when performing neural exploration. Motivated by the widespread availability of arm category information (e.g., movie genres, retailer types), we formulate contextual bandits as a bi-level recommendation problem based on the accessible arm category information, and propose a novel neural bandit framework, named H2N-Bandit, which utilizes a bi-level hierarchical neural structure to mitigate the substantial computational cost found in conventional neural bandit methods. To demonstrate its effectiveness, we provide the regret bound for H2N-Bandit under over-parameterized neural bandit settings. Furthermore, to illustrate its efficiency, we conduct extensive experiments on multiple real-world public datasets with various specifications, showing that H2N-Bandit can significantly reduce the computational cost over existing non-linear methods while achieving better or comparable performance against state-of-the-art baselines.
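The bi-level selection idea above can be illustrated with a much simpler non-neural sketch: first choose a category, then score only the arms inside it, so no pass over the full arm catalogue is needed. This epsilon-greedy toy stands in for the paper's hierarchical neural structure and is not H2N-Bandit itself; all names and values are hypothetical.

```python
import random

def bilevel_select(category_values, arm_values, epsilon=0.1):
    """Two-stage epsilon-greedy arm selection over categories.

    category_values: dict mapping category -> estimated category value.
    arm_values:      dict mapping category -> {arm: estimated value}.
    Only the arms in the chosen category are scored, which is the
    computational saving the bi-level formulation targets.
    """
    if random.random() < epsilon:
        category = random.choice(list(category_values))
    else:
        category = max(category_values, key=category_values.get)
    arms = arm_values[category]
    if random.random() < epsilon:
        return category, random.choice(list(arms))
    return category, max(arms, key=arms.get)

cats = {"movies": 0.9, "books": 0.4}
arms = {"movies": {"m1": 0.8, "m2": 0.6}, "books": {"b1": 0.5}}
print(bilevel_select(cats, arms, epsilon=0.0))  # greedy: ('movies', 'm1')
```

In the paper's setting, the two value estimators would be neural networks with exploration bonuses rather than fixed tables.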
    Managing and Securing Google's Fleet of Multi-Node Servers
    Richard Hanley
    Havard Skinnemoen
    Andrés Lagar-Cavilla
    Michael Wong
    Jeff Andersen
    Kishan Prasad
    Patrick Leis
    Shiva Rao
    Chris Koch
    Jad Baydoun
    Anna Sapek
    Communications of the ACM, 69:3 (2026), pp. 82 - 92
    Abstract: Server hardware and software co-design for a secure, efficient cloud.
    Abstract: The advent of 3D Gaussian Splatting has revolutionized graphics rendering by offering high visual quality and fast rendering speed. However, training large-scale scenes at high quality remains challenging due to the substantial memory required to store Gaussians and optimizer states. To address these limitations, we propose GS-Offload, a fast and memory-efficient training system for 3D Gaussian Splatting. GS-Offload stores Gaussians and optimizer states in host memory and selectively transfers only the necessary data to GPU memory on demand, significantly reducing GPU memory usage. With carefully designed software pipelining and CPU-side optimizer acceleration, GS-Offload achieves training speed near that of GPU-only setups while significantly lowering GPU memory demands.
