Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
1 - 15 of 10133 publications
Exploring the Feasibility of Remote Cardiac Auscultation Using Earphones
Tao Chen
Yongjie Yang
Xiuzhen Guo
Jie Xiong
Shangguan Longfei
MobiCom 2024: The 30th Annual International Conference On Mobile Computing And Networking
Preview abstract
People over 65 account for 80% of COVID deaths in the United States. In response to the pandemic, federal and state governments and commercial insurers are promoting video visits, through which the elderly can access specialists at home over the Internet without the risk of COVID exposure. However, current video visits rely on little more than video observation and conversation: the specialist cannot assess the patient's health by performing auscultation.
This paper tries to address this key missing component in video visits by proposing Asclepius, a hardware-software solution that turns the patient's earphones into a stethoscope, allowing the specialist to hear the patient's fine-grained heart sounds (i.e., phonocardiogram, or PCG, signals) in video visits. To achieve this goal, we contribute a low-cost plug-in peripheral that repurposes the earphone's speaker into a microphone and uses it to capture the patient's minute PCG signals from the ear canal. As the PCG signals suffer from strong attenuation and multi-path effects when propagating from the heart to the ear canals, we then propose efficient signal processing algorithms coupled with a data-driven approach to de-reverberate and further correct the amplitude and frequency distortion in raw PCG receptions. We implement Asclepius on a 2-layer PCB and follow the IRB protocol to evaluate its performance with 30 volunteers. Our extensive experiments show that Asclepius can effectively recover PCG signals with different types of earphones. The feedback from cardiologists also confirms the efficacy and efficiency of our system. PCG signal samples and benchmark results can be found at https://asclepius-system.github.io/
View details
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
Yun Zhu
Simon Tong
Lei Meng
Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 18970-18980 (2024)
Preview abstract
In recent years, Large Language Models (LLMs) have demonstrated impressive zero-shot capabilities in text generation tasks expressed through natural language instructions. However, text rewriting is a challenging task, and unintended modifications can negatively impact the system's performance. To address this challenge, we introduce a novel benchmark for text rewriting that covers a wide variety of rewriting types expressed through natural language instructions. Unlike previous benchmarks, which were primarily focused on limited rewrite styles and sentence-level rewriting, our benchmark is specifically designed to facilitate open-ended rewriting of long-form text. Additionally, we present a strong baseline model, RewriteLM, which is an instruction-tuned large language model for text rewriting. The model is trained using supervised fine-tuning, reward training, and reinforcement learning. To minimize human intervention in the data collection process, we develop new data generation strategies: (1) utilizing high-quality, long-form edits from Wikipedia as our primary natural training data source, (2) generating a synthetic dataset that includes diverse edit types and non-Wiki domains using chain-of-thoughts and the capabilities of LLMs, and (3) employing human-designed heuristic rankers to generate preference data. Our experiments demonstrate the effectiveness of our proposed benchmark and baseline model, as well as the benefits of our data collection strategies in minimizing human intervention.
View details
Creative ML Assemblages: The Interactive Politics of People, Processes, and Products
Ramya Malur Srinivasan
Katharina Burgdorf
Jennifer Lena
ACM Conference on Computer Supported Cooperative Work and Social Computing (2024) (to appear)
Preview abstract
Creative ML tools are collaborative systems that afford artistic creativity through their myriad interactive relationships. We propose using "assemblage thinking" to support analyses of creative ML by approaching it as a system in which the elements of people, organizations, culture, practices, and technology constantly influence each other. We model these interactions as "coordinating elements" that give rise to the social and political characteristics of a particular creative ML context, and call attention to three dynamic elements of creative ML whose interactions provide unique context for the social impact of a particular system: people, creative processes, and products. As creative assemblages are highly contextual, we present these as analytical concepts that computing researchers can adapt to better understand the functioning of a particular system or phenomenon and identify intervention points to foster desired change. This paper contributes to theorizing interactions with AI in the context of art, and how these interactions shape the production of algorithmic art.
View details
Learning to Rewrite Prompts for Personalized Text Generation
Qiaozhu Mei
Proceedings of the ACM Web Conference 2024
Preview abstract
Facilitated by large language models (LLMs), personalized text generation has become a rapidly growing research direction. Most existing studies focus on designing specialized models for a particular domain, or they require fine-tuning the LLMs to generate personalized text. We consider a typical scenario in which the large language model, which generates personalized output, is frozen and can only be accessed through APIs. Under this constraint, all one can do is improve the input text (i.e., text prompts) sent to the LLM, a procedure that is usually done manually. In this paper, we propose a novel method to automatically revise prompts for personalized text generation. The proposed method takes the initial prompts generated by a state-of-the-art, multistage framework for personalized generation and rewrites a few critical components that summarize and synthesize the personal context. The prompt rewriter employs a training paradigm that chains together supervised learning (SL) and reinforcement learning (RL), where SL reduces the search space of RL and RL facilitates end-to-end training of the rewriter. Using datasets from three representative domains, we demonstrate that the rewritten prompts outperform both the original prompts and the prompts optimized via supervised learning or reinforcement learning alone. In-depth analysis of the rewritten prompts shows that they are not only human-readable, but also able to guide manual revision of prompts when resources are too limited to train the prompt rewriter with reinforcement learning, or when it is costly to deploy an automatic prompt rewriter for inference.
View details
Locality-Aware Graph Rewiring in GNNs
Federico Barbero
Ameya Velingker
Amin Saberi
Michael Bronstein
Francesco Di Giovanni
ICLR 2024
Preview abstract
Graph Neural Networks (GNNs) are popular models for machine learning on graphs that typically follow the message-passing paradigm, whereby the feature of a node is updated recursively upon aggregating information over its neighbors. While exchanging messages over the input graph endows GNNs with a strong inductive bias, it can also make GNNs susceptible to over-squashing, thereby preventing them from capturing long-range interactions in the given graph. To rectify this issue, graph rewiring techniques have been proposed as a means of improving information flow by altering the graph connectivity. In this work, we identify three desiderata for graph rewiring: (i) reduce over-squashing, (ii) respect the locality of the graph, and (iii) preserve the sparsity of the graph. We highlight fundamental trade-offs that occur between spatial and spectral rewiring techniques; while the former often satisfy (i) and (ii) but not (iii), the latter generally satisfy (i) and (iii) at the expense of (ii). We propose a novel rewiring framework that satisfies all of (i)-(iii) through a locality-aware sequence of rewiring operations. We then discuss a specific instance of such a rewiring framework and validate its effectiveness on several real-world benchmarks, showing that it either matches or significantly outperforms existing rewiring approaches.
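To make the three desiderata concrete, here is a minimal toy sketch of a rewiring pass that only connects nodes within a fixed hop radius (locality) and lets each node initiate a bounded number of new edges (sparsity). It is an illustration under stated assumptions, not the framework proposed in the paper: the function names `hop_distances` and `local_rewire` and the parameters `radius` and `budget` are hypothetical.

```python
from collections import deque


def hop_distances(adj, source, max_hops):
    """Breadth-first search: {node: hops} for every node within max_hops of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the locality radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def local_rewire(adj, radius=2, budget=1):
    """Return a new adjacency dict in which each node initiates at most
    `budget` extra edges, and only to nodes between 2 and `radius` hops away
    in the ORIGINAL graph (hypothetical toy rule)."""
    new_adj = {u: set(vs) for u, vs in adj.items()}
    for u in sorted(adj):
        dist = hop_distances(adj, u, radius)
        candidates = sorted(v for v, d in dist.items() if 2 <= d <= radius)
        added = 0
        for v in candidates:
            if added == budget:
                break
            if v not in new_adj[u]:
                new_adj[u].add(v)
                new_adj[v].add(u)
                added += 1
    return new_adj


# A 5-node path graph 0-1-2-3-4: long-range messages must cross every edge.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
rewired = local_rewire(path, radius=2, budget=1)
```

On the path graph, the pass adds only shortcuts between nodes two hops apart, so distant nodes such as 0 and 4 are never connected directly, and the edge count grows by a small bounded amount.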
View details
Pathfinder: High-Resolution Control-Flow Attacks with Conditional Branch Predictor
Andrew Kwong
Archit Agarwal
Christina Garman
Daniel Genkin
Dean Tullsen
Deian Stefan
Hosein Yavarzadeh
Max Christman
Mohammadkazem Taram
International Conference on Architectural Support for Programming Languages and Operating Systems, ACM (2024)
Preview abstract
This paper presents novel attack primitives that provide adversaries with the ability to read and write the path history register (PHR) and the prediction history tables (PHTs) of the conditional branch predictor in modern Intel CPUs. These primitives enable us to recover the recent control flow (the last 194 taken branches) and, in most cases, a nearly unlimited control flow history of any victim program. Additionally, we present a tool that transforms the PHR into an unambiguous control flow graph, encompassing the complete history of every branch. This work provides case studies demonstrating the practical impact of novel reading and writing/poisoning primitives. It includes examples of poisoning AES to obtain intermediate values and consequently recover the secret AES key, as well as recovering a secret image by capturing the complete control flow of libjpeg routines. Furthermore, we demonstrate that these attack primitives are effective across virtually all protection boundaries and remain functional in the presence of all recent control-flow mitigations from Intel.
View details
Preview abstract
One of the most basic problems for studying the "price of privacy over time" is the so-called private counter problem, introduced by Dwork et al. (2010) and Chan et al. (2011). In this problem, we aim to track the number of events that occur over time, while hiding the existence of every single event. More specifically, in every time step $t\in[T]$ we learn (in an online fashion) that $\Delta_t\geq 0$ new events have occurred, and must respond with an estimate $n_t\approx\sum_{j=1}^t \Delta_j$. The privacy requirement is that all of the outputs together, across all time steps, satisfy event-level differential privacy.
The main question here is how our error needs to depend on the total number of time steps $T$ and the total number of events $n$. Dwork et al. (2015) showed an upper bound of $O\left(\log(T)+\log^2(n)\right)$, and Henzinger et al. (2023) showed a lower bound of $\Omega\left(\min\{\log n, \log T\}\right)$. We show a new lower bound of $\Omega\left(\min\{n,\log T\}\right)$, which is tight w.r.t. the dependence on $T$, and is tight in the sparse case where $\log^2 n=O(\log T)$. Our lower bound has the following implications:
* We show that our lower bound extends to the online thresholds problem, where the goal is to privately answer many "quantile queries" when these queries are presented one-by-one. This resolves an open question of Bun et al. (2017).
* Our lower bound implies, for the first time, a separation between the number of mistakes obtainable by a private online learner and a non-private online learner. This partially resolves a COLT'22 open question published by Sanyal and Ramponi.
* Our lower bound also yields the first separation between the standard model of private online learning and a recently proposed relaxed variant of it, called private online prediction.
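For readers unfamiliar with the private counter problem, the sketch below implements the classical binary (tree) mechanism associated with the works cited above as an upper-bound baseline. It is not the lower-bound construction of this paper; the function names and the lazily filled `noisy_block` cache are illustrative assumptions. Each time step falls into at most one dyadic block per level, so adding Laplace noise of scale (number of levels)/epsilon to every block sum gives event-level epsilon-differential privacy.

```python
import random


def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def binary_mechanism(deltas, epsilon, rng=None):
    """Return noisy running counts n_1..n_T for the stream `deltas`.

    Every estimate is the sum of noisy dyadic block sums that tile [1, t].
    A block is noised once, at the first step where it is complete, and then
    reused, so the procedure only ever looks at past data.
    """
    rng = rng or random.Random()
    T = len(deltas)
    levels = max(1, T.bit_length())   # block sizes 1, 2, 4, ..., 2^(levels-1)
    scale = levels / epsilon          # one event touches at most `levels` blocks

    prefix = [0]
    for d in deltas:
        prefix.append(prefix[-1] + d)

    noisy_block = {}                  # (start, end) -> noisy sum of deltas[start:end]
    estimates = []
    for t in range(1, T + 1):
        total, end = 0.0, 0
        for level in reversed(range(levels)):
            size = 1 << level
            if t & size:              # binary digits of t select the blocks
                start, end = end, end + size
                key = (start, end)
                if key not in noisy_block:
                    true_sum = prefix[end] - prefix[start]
                    noisy_block[key] = true_sum + laplace(scale, rng)
                total += noisy_block[key]
        estimates.append(total)
    return estimates


stream = [1, 0, 2, 1, 0, 3]
counts = binary_mechanism(stream, epsilon=1.0, rng=random.Random(0))
```

Because each estimate sums at most a logarithmic number of noisy blocks, the error of this baseline grows polylogarithmically in $T$, which is the regime the bounds in the abstract refer to.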
View details
Nteasee: A qualitative study of expert and general population perspectives on deploying AI for health in African countries
Iskandar Haykel
Florence Ofori
Kerrie Kauer
Tousif Ahmad
Preview abstract
Background: Artificial Intelligence for health has the potential to significantly change and improve healthcare. However, in most African countries, how to identify culturally and contextually attuned approaches for deploying these solutions is not well understood. To bridge this gap, we conduct a qualitative study to investigate the best practices, fairness indicators and potential biases to mitigate when deploying AI for health in African countries, as well as explore opportunities where artificial intelligence could make a positive impact in health.
Methods: We use a mixed-methods approach combining in-depth interviews (IDIs) and surveys. We conduct 1.5- to 2-hour IDIs with 50 experts in health, policy and AI across 17 countries, and through an inductive approach we conduct a qualitative thematic analysis on expert IDI responses. We administer a blinded 30-minute survey with thought-cases to 672 general population participants across 5 countries in Africa (Ghana, South Africa, Rwanda, Kenya and Nigeria), and analyze responses on quantitative scales, statistically comparing responses by country, age, gender, and level of familiarity with AI. We thematically summarize open-ended responses from surveys.
Results and Conclusion: Our results show generally positive attitudes and high levels of trust, accompanied by moderate levels of concern, among general population participants regarding AI usage for health in Africa. This contrasts with expert responses, where major themes revolved around trust/mistrust, AI ethics concerns, and systemic barriers to overcome, among others. This work presents the first-of-its-kind qualitative research study of the potential of AI for health in Africa with perspectives from both experts and the general population. We hope that this work guides policy makers and drives home the need for education and the inclusion of general population perspectives in decision-making around AI usage.
View details
"We Need Structured Output": Towards User-centered Constraints on Large Language Model Output
Michael Xieyang Liu
Frederick Liu
Alex Fiannaca
Terry Koo
Extended Abstracts of the ACM CHI Conference on Human Factors in Computing Systems (CHI EA '24), ACM (2024), pp. 9 (to appear)
Preview abstract
Large language models can produce creative and diverse responses. However, to integrate them into current developer workflows, it is essential to constrain their outputs to follow specific formats or standards. In this work, we surveyed 51 experienced industry professionals to understand the range of scenarios and motivations driving the need for output constraints from a user-centered perspective. We identified 134 concrete use cases for constraints at two levels: low-level constraints, which ensure the output adheres to a structured format and an appropriate length, and high-level constraints, which require the output to follow semantic and stylistic guidelines without hallucination. Critically, applying output constraints could not only streamline the currently repetitive process of developing, testing, and integrating LLM prompts for developers, but also enhance the user experience of LLM-powered features and applications. We conclude with a discussion of user preferences and needs for articulating intended constraints for LLMs, alongside an initial design for a constraint prototyping tool.
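As an illustration of what a low-level constraint check might look like in a developer workflow, the following minimal sketch validates a model response against a structured format (a JSON object with required keys) and a length limit. It is not the prototyping tool described in the paper; the function name `check_low_level_constraints` and its parameters are hypothetical.

```python
import json


def check_low_level_constraints(text, required_keys, max_chars):
    """Return a list of constraint violations; an empty list means the output passes."""
    violations = []
    if len(text) > max_chars:
        violations.append(f"too long: {len(text)} > {max_chars} characters")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        # Free-form prose around the payload is a typical format failure.
        return violations + [f"not valid JSON: {err.msg}"]
    if not isinstance(parsed, dict):
        return violations + ["top-level value is not a JSON object"]
    missing = [key for key in required_keys if key not in parsed]
    if missing:
        violations.append("missing keys: " + ", ".join(missing))
    return violations


good = '{"title": "Quarterly report", "tags": ["finance"]}'
chatty = "Sure! Here is the JSON you asked for."

good_result = check_low_level_constraints(good, ["title", "tags"], max_chars=200)
chatty_result = check_low_level_constraints(chatty, ["title", "tags"], max_chars=200)
```

High-level constraints (semantic and stylistic guidelines, absence of hallucination) cannot be checked this mechanically, which is part of why the abstract treats the two levels separately.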
View details
Drug Design on Quantum Computers
Raffaele Santagati
Alán Aspuru-Guzik
Matthias Degroote
Leticia Gonzalez
Elica Kyoseva
Nikolaj Moll
Markus Oppel
Robert Parrish
Michael Streif
Christofer Tautermann
Horst Weiss
Nathan Wiebe
Clemens Utschig-Utschig
Nature Physics (2024)
Preview abstract
The promised industrial applications of quantum computers often rest on their anticipated ability to perform accurate, efficient quantum chemical calculations. Computational drug discovery relies on accurate predictions of how candidate drugs interact with their targets in a cellular environment involving several thousands of atoms at finite temperatures. Although quantum computers are still far from being used as daily tools in the pharmaceutical industry, here we explore the challenges and opportunities of applying quantum computers to drug design. We discuss where these could transform industrial research and identify the substantial further developments needed to reach this goal.
View details
Individual Welfare Guarantees in the Autobidding World with Machine-learned Advice
Negin Golrezaei
Patrick Jaillet
Jason Cheuk Nam Liang
Proceedings of the ACM Web Conference 2024, 267–275
Preview abstract
Online advertising channels commonly focus on maximizing total advertiser welfare to enhance channel health, and previous literature has studied augmenting ad auctions with machine learning predictions on advertiser values (also known as machine-learned advice) to improve total welfare. Yet, such improvements could come at the cost of individual bidders' welfare and do not shed light on how particular advertiser bidding strategies impact welfare. Motivated by this, we present an analysis of an individual bidder's welfare loss in the autobidding world for auctions with and without machine-learned advice, and also uncover how advertiser strategies relate to such losses. In particular, we demonstrate how ad platforms can utilize ML advice to improve welfare guarantees at the aggregate and individual bidder levels by setting ML advice as personalized reserve prices when the platform consists of autobidders who maximize value while respecting a return on ad spend (ROAS) constraint. Under parallel VCG auctions with such ML advice-based reserves, we present a worst-case welfare lower-bound guarantee for an individual autobidder, and show that the lower-bound guarantee is positively correlated with ML advice quality as well as the scale of bids induced by the autobidder's bidding strategies. Further, we show that no truthful, possibly randomized, mechanism with anonymous allocations can achieve universally better individual welfare guarantees than VCG in the presence of personalized reserves based on ML advice of equal quality. Moreover, we extend our individual welfare guarantee results to generalized first price (GFP) and generalized second price (GSP) auctions. Finally, we present numerical studies using semi-synthetic data derived from ad auction logs of a search ad platform to showcase improvements in individual welfare when setting personalized reserve prices with ML advice.
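The mechanics of personalized reserves can be illustrated with a single-item auction, where VCG reduces to a second-price auction. The sketch below is a simplified assumption-laden example, not the paper's model: it uses one common reserve rule (bidders below their own reserve are excluded, and the winner pays the larger of their reserve and the highest remaining bid), and the names `second_price_with_reserves` and `satisfies_roas` are hypothetical.

```python
def second_price_with_reserves(bids, reserves):
    """Single-item second-price auction with per-bidder reserve prices.

    bids, reserves: dicts keyed by bidder id. In the setting of the abstract,
    the reserves would be derived from ML advice on bidder values.
    Returns (winner, payment), or (None, 0.0) if nobody clears their reserve.
    """
    eligible = {b: bid for b, bid in bids.items() if bid >= reserves.get(b, 0.0)}
    if not eligible:
        return None, 0.0
    winner = max(eligible, key=eligible.get)
    competing = [bid for b, bid in eligible.items() if b != winner]
    payment = max([reserves.get(winner, 0.0)] + competing)
    return winner, payment


def satisfies_roas(total_value, total_spend, target_roas=1.0):
    """ROAS constraint: value obtained must be at least target_roas times spend."""
    return total_value >= target_roas * total_spend


bids = {"a": 10.0, "b": 6.0, "c": 8.0}
reserves = {"a": 9.0, "b": 7.0, "c": 0.0}   # hypothetical ML-advice-based reserves
winner, payment = second_price_with_reserves(bids, reserves)
```

Here bidder "b" is excluded by its reserve, and bidder "a" wins but pays its reserve of 9.0 rather than the competing bid of 8.0, which shows how the reserve, and therefore the quality of the advice behind it, affects individual outcomes.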
View details
HyperAttention: Large-scale Attention in Linear Time
Amin Karbasi
Amir Zandieh
Insu Han
David Woodruff
HyperAttention: Long-context Attention in Near-Linear Time (2024) (to appear)
Preview abstract
In this paper, we introduce a novel approximate attention mechanism dubbed "HyperAttention". Despite the rapidly increasing size and complexity of contexts used with Large Language Models (LLMs), there is still a dire lack of computationally efficient attention mechanisms that scale better than the naive quadratic time. HyperAttention addresses this gap: it delivers provably linear time complexity with respect to the size of the context, while incurring only a negligible loss in downstream quality. Distinctively, it integrates the principles of Locality Sensitive Hashing (LSH), for efficient detection of heavy elements, with uniform column sampling, allowing for a good approximation of both the heavy and light components of the attention matrix. HyperAttention provably approximates the attention layer in linear time, making it the first practical linear-time approximate attention mechanism. Crucially, HyperAttention has a highly modular design, allowing seamless integration of other fast low-level implementations, most notably FlashAttention. Empirical evaluations indicate that HyperAttention surpasses existing methods, achieving orders-of-magnitude speed-ups compared with prevalent state-of-the-art solutions such as FlashAttention. This breakthrough has significant implications for scaling LLMs to much larger contexts.
View details
Preview abstract
We present an approach to modeling an image-space prior on scene motion. Our prior is learned from a collection of motion trajectories extracted from real video sequences depicting natural, oscillatory dynamics such as trees, flowers, candles, and clothes swaying in the wind. We model this dense, long-term motion prior in the Fourier domain: given a single image, our trained model uses a frequency-coordinated diffusion sampling process to predict a spectral volume, which can be converted into a motion texture that spans an entire video. Along with an image-based rendering module, these trajectories can be used for a number of downstream applications, such as turning still images into seamlessly looping videos, or allowing users to realistically interact with objects in real pictures by interpreting the spectral volumes as image-space modal bases, which approximate object dynamics.
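The step from a frequency-domain representation to a motion trajectory can be sketched for a single pixel: a handful of complex Fourier coefficients is turned into a per-frame displacement by summing the corresponding sinusoids. This is a toy illustration assuming integer frequencies measured in cycles per clip; the function name `trajectory_from_spectrum` and the coefficient layout are hypothetical and do not reproduce the paper's spectral volume format.

```python
import cmath
import math


def trajectory_from_spectrum(coeffs, num_frames):
    """Convert sparse Fourier coefficients into a real displacement per frame.

    coeffs: {k: complex} where k is an integer frequency in cycles per clip.
    The displacement at frame t is the sum over k of Re(c_k * exp(2*pi*i*k*t/T)).
    """
    trajectory = []
    for t in range(num_frames):
        value = 0.0
        for k, c in coeffs.items():
            phase = 2.0 * math.pi * k * t / num_frames
            value += (c * cmath.exp(1j * phase)).real
        trajectory.append(value)
    return trajectory


# One slow, large sway plus one faster, smaller oscillation (made-up numbers).
pixel_spectrum = {1: 2.0 + 0.0j, 3: 0.0 - 0.5j}
x_displacement = trajectory_from_spectrum(pixel_spectrum, num_frames=60)
```

Because every component in this toy completes a whole number of cycles over the clip, the trajectory returns to its starting value after the last frame, which is the property a looping video needs.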
View details
Context-aware Transliteration of Romanized South Asian Languages
Christo Kirov
Computational Linguistics, 50 (2) (2024), 475–534
Preview abstract
While most transliteration research is focused on single tokens such as named entities -- e.g., transliteration of "અમદાવાદ" from the Gujarati script to the Latin script "Ahmedabad" -- the informal romanization prevalent in South Asia and elsewhere often requires transliteration of full sentences. The lack of large parallel text collections of full sentence (as opposed to single word) transliterations necessitates incorporation of contextual information into transliteration via non-parallel resources, such as via mono-script text collections. In this paper, we present a number of methods for improving transliteration in context for such a use scenario. Some of these methods in fact improve performance without making use of sentential context, allowing for better quantification of the degree to which contextual information in particular is responsible for system improvements. Our final systems, which ultimately rely upon ensembles including large pretrained language models finetuned on simulated parallel data, yield substantial improvements over the best previously reported results for full sentence transliteration from Latin to native script on all 12 languages in the Dakshina dataset (Roark et al. 2020), with an overall 4.8% absolute (27.1% relative) mean word-error rate reduction.
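Since the headline result is stated as both an absolute and a relative reduction in word-error rate (WER), a short sketch of how those quantities relate may help. The WER function below uses the standard word-level edit distance; the example sentences are made up, and the "implied" baseline is only back-of-the-envelope arithmetic from the two rounded figures quoted above, not a number reported in the abstract.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by reference length (reference must be non-empty)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))          # distances for the empty reference prefix
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline, improved):
    return baseline - improved


def relative_reduction(baseline, improved):
    return (baseline - improved) / baseline


# Made-up example: one wrong word out of four gives a WER of 25%.
example_wer = word_error_rate("a b c d", "a x c d")

# Rough arithmetic from the quoted 4.8 points absolute and 27.1% relative:
# relative = absolute / baseline, so baseline is about absolute / relative.
implied_baseline = 4.8 / 0.271
implied_improved = implied_baseline - 4.8
```

Under that rough arithmetic, the two quoted figures are mutually consistent with a mean baseline WER in the high teens falling to roughly thirteen percent, subject to rounding in the reported values.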
View details
An intentional approach to managing bias in embedding models
Atilla P. Kiraly
Jungyeon Park
Rory Pilgrim
Charles Lau
Heather Cole-Lewis
Shravya Shetty
Krish Eswaran
Leo Anthony Celi
The Lancet Digital Health, 6 (2024), E126-E130
Preview abstract
Advances in machine learning for health care have brought concerns about bias from the research community; specifically, the introduction, perpetuation, or exacerbation of care disparities. Reinforcing these concerns is the finding that medical images often reveal signals about sensitive attributes in ways that are hard to pinpoint by both algorithms and people. This finding raises a question about how to best design general purpose pretrained embeddings (GPPEs, defined as embeddings meant to support a broad array of use cases) for building downstream models that are free from particular types of bias. The downstream model should be carefully evaluated for bias, and audited and improved as appropriate. However, in our view, well-intentioned attempts to prevent the upstream components (GPPEs) from learning sensitive attributes can have unintended consequences on the downstream models. Despite producing a veneer of technical neutrality, the resultant end-to-end system might still be biased or poorly performing. We present reasons, by building on previously published data, to support the reasoning that GPPEs should ideally contain as much information as the original data contain, and highlight the perils of trying to remove sensitive attributes from a GPPE. We also emphasise that downstream prediction models trained for specific tasks and settings, whether developed using GPPEs or not, should be carefully designed and evaluated to avoid bias that makes models vulnerable to issues such as distributional shift. These evaluations should be done by a diverse team, including social scientists, on a diverse cohort representing the full breadth of the patient population for which the final model is intended.
View details