Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Preview abstract
This paper presents a multifunctional wearable sensing system that integrates flexible laser-induced graphene (LIG) sensors and an open-source analog front-end (AFE) chip. The LIG sensors are fabricated on a polyimide (PI) flexible printed circuit board (FPCB) by CO2 infrared laser direct writing, and provide repeatable, high-precision temperature sensing, humidity measurement, and strain detection. Temperature characterization shows the resistive LIG sensor has a sensitivity of -0.0493 %/°C, with linear-fit R-square factors ≥ 0.9973 across -40 °C to 125 °C. The capacitive humidity sensor exhibits a 23.6-fold capacitance increase at 95% relative humidity (RH) compared to a dry environment. Our proposed AFE chip contains a hybrid folded-cascode operational amplifier (OPAMP) and a successive-approximation-register analog-to-digital converter (SAR ADC). Designed using an open-source analog flow and fabricated in the GF180 OpenPDK, the AFE chip serves as a flexible, universal readout platform adaptable to various sensing applications. A real-time finger-bending detection demonstration validates the system's functionality. The multifunctional sensing capability provided by the wearable system is attractive for personal healthcare applications. This work underscores the integration of the LIG sensors and the AFE chip, both developed with open-source tools, which facilitates rapid and affordable prototyping of multifunctional flexible wearable sensing systems.
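To make the quoted sensitivity figure concrete, here is a minimal sketch of how a linear resistive model could convert a measured LIG resistance to temperature. The reference resistance R0 and reference temperature T0 are hypothetical calibration values, not figures from the paper.

```python
# Minimal sketch: resistance-to-temperature conversion for a resistive sensor
# with the sensitivity quoted in the abstract, S = -0.0493 %/degC.
# R0 and T0 are assumed calibration values for illustration only.

S = -0.0493 / 100.0   # fractional resistance change per degC
R0 = 1_000.0          # resistance in ohms at the reference temperature (assumed)
T0 = 25.0             # reference temperature in degC (assumed)

def temperature_from_resistance(r_ohms: float) -> float:
    """Invert the linear model R(T) = R0 * (1 + S * (T - T0)) for T."""
    return T0 + (r_ohms / R0 - 1.0) / S

# A drop in resistance implies a rise in temperature, since S < 0.
print(temperature_from_resistance(970.0))  # ~85.9 degC
```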
View details
Relational Affect in Dyadic Interactions
CHI Conference on Human Factors in Computing Systems (2024)
Preview abstract
Relational affect is the affective response (encompassing emotion, expression, feeling) that emerges from an interaction between two people. The case study presented here introduces the concept of relational affect through a human perceptual rating task. Forty-five raters watched short video clips of two people interacting and described their perceived emotion of the individuals and that of the overall interaction. Our qualitative analysis of the rater responses showed that raters used a variety of schemes to reason about emotion, including expressions, context, and perceived appraisal of the event. These reasoning schemes were notably different for perceived individual emotion and relational affect. Our findings show that the vocabulary used for relational affect is distinct from that of individual emotion, and that relational affect as a phenomenon deepens our understanding of social interactions, moving the field a step closer to realizing the goal of fluid interactions between people and technology.
View details
LabelMaker: Automatic Semantic Label Generation from RGB-D Trajectories
Silvan Weder
Hermann Blum
Francis Engelmann
Marc Pollefeys
3DV (2024)
Preview abstract
Semantic annotations are indispensable to train or evaluate perception models, yet very costly to acquire. This work introduces a fully automated 2D/3D labeling framework that, without any human intervention, can generate labels for RGB-D scans with accuracy equal to (or better than) that of comparable manually annotated datasets such as ScanNet. Our approach is based on an ensemble of state-of-the-art segmentation models and 3D lifting through neural rendering. We demonstrate the effectiveness of our LabelMaker pipeline by generating significantly better labels for the ScanNet datasets and automatically labeling the previously unlabeled ARKitScenes dataset. Code and models are available at https://labelmaker.org/
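As a rough illustration of the ensembling idea (the paper's pipeline additionally lifts labels to 3D via neural rendering), a per-pixel majority vote over several segmentation models might look like the sketch below; the shapes and class count are made up.

```python
import numpy as np

def consensus_labels(predictions: np.ndarray) -> np.ndarray:
    """Majority vote per pixel over (num_models, H, W) integer label maps."""
    num_classes = int(predictions.max()) + 1
    # Count votes for each class at every pixel, then take the winner.
    votes = np.stack([(predictions == c).sum(axis=0) for c in range(num_classes)])
    return votes.argmax(axis=0)

# Hypothetical example: 3 models labeling a 4x4 image with 5 classes.
preds = np.random.randint(0, 5, size=(3, 4, 4))
print(consensus_labels(preds))
```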
View details
Preview abstract
Stereotypes are oversimplified beliefs and ideas about particular groups of people. These cognitive biases are omnipresent in our language, reflected in human-generated datasets, and potentially learned and perpetuated by language technologies. Although mitigating stereotypes in language technologies is necessary for preventing harms, stereotypes can impose varying levels of risk on targeted individuals and social groups depending on the context in which they appear. The technical challenges of detecting stereotypes are rooted in the societal nuances of stereotyping, making it impossible to capture all of the intertwined interactions of social groups in diverse cultural contexts in one generic benchmark. This paper delves into the nuances of detecting stereotypes in an annotation task with humans from various regions of the world. We iteratively disambiguate our definition of the task, refining it as detecting "generalizing language", and contribute a multilingual, annotated dataset consisting of sentences mentioning a wide range of social identities in 9 languages, labeled on whether they make broad statements and assumptions about those groups. We experiment with training generalizing-language detection models, which provide insight about the linguistic contexts in which stereotypes can appear, facilitating future research in addressing the dynamic, social aspects of stereotypes.
View details
Knowledge Distillation with Perturbed Loss: From a Vanilla Teacher to a Proxy Teacher
Rongzhi Zhang
Chao Zhang
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024), ACM, pp. 4278 - 4289
Preview abstract
Knowledge distillation is a popular technique to transfer knowledge from a large teacher model to a small student model. Typically, the student learns to imitate the teacher by minimizing the KL divergence between its output distribution and the teacher's output distribution. In this work, we argue that such a learning objective is sub-optimal because there exists a discrepancy between the teacher's output distribution and the ground truth label distribution. Therefore, forcing the student to blindly imitate the unreliable teacher output distribution leads to inferior performance. To this end, we propose a novel knowledge distillation objective, PTLoss, by first representing the vanilla KL-based distillation loss function via a Maclaurin series and then perturbing the leading-order terms in this series. This perturbed loss implicitly transforms the original teacher into a proxy teacher with a distribution closer to the ground truth distribution. We establish the theoretical connection between this "distribution closeness" and the student model's generalizability, which enables us to select the PTLoss perturbation coefficients in a principled way. Extensive experiments on six public benchmark datasets demonstrate the effectiveness of PTLoss with teachers of different scales.
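For context, the vanilla KL-based objective that PTLoss perturbs looks roughly like the standard knowledge-distillation loss sketched below; the temperature value and the t-squared scaling convention are common defaults, not choices taken from the paper.

```python
import torch
import torch.nn.functional as F

def vanilla_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard KD objective: KL divergence between softened teacher and
    student distributions. This is the baseline PTLoss perturbs, not PTLoss."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # The t^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * t * t

# Example with a batch of 4 examples and 10 classes.
student, teacher = torch.randn(4, 10), torch.randn(4, 10)
print(vanilla_kd_loss(student, teacher).item())
```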
View details
Large Language Models as a Proxy For Human Evaluation in Assessing the Comprehensibility of Disordered Speech Transcription
Richard Cave
Katie Seaver
Jordan Green
Rus Heywood
Proceedings of ICASSP, IEEE (2024)
Preview abstract
Automatic Speech Recognition (ASR) systems, despite significant advances in recent years, still have much room for improvement, particularly in the recognition of disordered speech. Even so, erroneous transcripts from ASR models can help people with disordered speech be better understood, especially if the transcription doesn't significantly change the intended meaning. Evaluating the efficacy of ASR for this use case requires a methodology for measuring the impact of transcription errors on the intended meaning and comprehensibility. Human evaluation is the gold standard for this, but it can be laborious, slow, and expensive. In this work, we tune and evaluate large language models for this task and find them to be a much better proxy for human evaluators than other commonly used metrics. We further present a case study using the presented approach to assess the quality of personalized ASR models, in order to make model deployment decisions and correctly set user expectations for model quality as part of our trusted tester program.
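In outline, the proxy-evaluation setup might look like the sketch below, where an LLM is prompted to score how well a transcript preserves the intended meaning. The prompt wording and the call_llm stub are hypothetical placeholders; the paper's models are tuned for the task rather than prompted off the shelf.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: substitute a real LLM client. Returns a canned score here
    # so the sketch runs end to end.
    return "4"

def comprehensibility_score(intended: str, transcript: str) -> int:
    """Ask an LLM to rate, 1-5, how well the transcript preserves meaning."""
    prompt = (
        "On a scale of 1 (meaning lost) to 5 (meaning preserved), rate how well "
        "the transcript preserves the intended meaning.\n"
        f"Intended: {intended}\nTranscript: {transcript}\nScore:"
    )
    return int(call_llm(prompt).strip())

print(comprehensibility_score("I want a glass of water", "I want a class of water"))
```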
View details
Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding
Alizée Pace
Hugo Yèche
Bernhard Schölkopf
Gunnar Rätsch
The Twelfth International Conference on Learning Representations (2024)
Preview abstract
A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding: unobserved variables may influence both the actions taken by the agent and the outcomes observed in the data. Hidden confounding can compromise the validity of any causal conclusion drawn from the data and presents a major obstacle to effective offline RL. In this paper, we tackle the problem of hidden confounding in the nonidentifiable setting. We propose a definition of uncertainty due to confounding bias, termed delphic uncertainty, which uses variation over compatible world models, and differentiate it from the well-known epistemic and aleatoric uncertainties. We derive a practical method for estimating the three types of uncertainty, and construct a pessimistic offline RL algorithm to account for them. Our method does not assume identifiability of the unobserved confounders, and attempts to reduce the amount of confounding bias. We demonstrate through extensive experiments and ablations the efficacy of our approach on a sepsis management benchmark, as well as on real electronic health records. Our results suggest that nonidentifiable confounding bias can be addressed in practice to improve offline RL solutions.
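A loose sketch of the core idea, measuring uncertainty as variation across world models that are all compatible with the offline data, is given below; the ensemble, the spread statistic, and all numbers are illustrative stand-ins, not the paper's estimator.

```python
import numpy as np

def delphic_style_uncertainty(world_models, state, action):
    """Illustrative only: spread of predicted returns across an ensemble of
    world models compatible with the data. The paper's delphic-uncertainty
    estimator is more involved than a simple max-min spread."""
    predictions = np.array([m(state, action) for m in world_models])
    return predictions.max() - predictions.min()

# Hypothetical ensemble: each "model" maps (state, action) to a return estimate.
rng = np.random.default_rng(0)
models = [lambda s, a, w=w: float(s @ w + a) for w in rng.normal(size=(5, 3))]
state, action = np.ones(3), 0.5
print(delphic_style_uncertainty(models, state, action))
```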
View details
Preview abstract
Recent efforts to address hallucinations in Large Language Models (LLMs) have focused on attributed text generation, which supplements generated texts with citations of supporting sources for post-generation fact-checking and corrections. Yet, these citations often point to entire documents or paragraphs, burdening users with extensive verification work. In this paper, we introduce a locally-attributable text generation approach, prioritizing concise attributions. Our method, named "Attribute First, then Generate", breaks down the conventional end-to-end generation process into three intuitive steps: content selection, sentence planning, and sequential sentence generation. By initially identifying relevant source segments ("select first") and then conditioning the generation process on them ("then generate"), we ensure these segments also act as the output's fine-grained attributions ("select" becomes "attribute"). Tested on multi-document summarization and long-form question answering, our method not only yields more concise citations than the baselines but also maintains, and in some cases enhances, both generation quality and attribution accuracy. Furthermore, it significantly reduces the time required for fact verification by human assessors.
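A schematic sketch of the three-step decomposition (content selection, sentence planning, sequential sentence generation) appears below. The call_llm stub, prompts, and data shapes are hypothetical, not the paper's implementation.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: substitute a real LLM client. Echoes a stub so the
    # pipeline can be exercised end to end.
    return "stub response"

def attribute_first_generate(source_docs: list[str], question: str) -> list[dict]:
    # Step 1: content selection -- pick source segments relevant to the task.
    selected = call_llm(
        f"List the passages relevant to: {question}\n\n" + "\n---\n".join(source_docs)
    ).splitlines()
    # Step 2: sentence planning -- assign supporting segments to each planned sentence.
    plan = [{"supporting_segments": [seg]} for seg in selected if seg.strip()]
    # Step 3: sequential generation -- write each sentence conditioned only on its
    # segments, which then double as that sentence's fine-grained attribution.
    output = []
    for item in plan:
        sentence = call_llm("Write one sentence grounded only in: "
                            + " ".join(item["supporting_segments"]))
        output.append({"sentence": sentence, "attribution": item["supporting_segments"]})
    return output

print(attribute_first_generate(["doc one text", "doc two text"], "What happened?"))
```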
View details
Preview abstract
Background: Physical activity levels worldwide have declined over recent decades, with the average number of daily steps decreasing steadily since 1995. Given that physical inactivity is a major modifiable risk factor for chronic disease and mortality, increasing physical activity is a clear opportunity to improve population health on a broad scale. The current study assesses the cost-effectiveness and budget impact of a Fitbit-based intervention among healthy but insufficiently active adults to quantify the potential clinical and economic value for a commercially insured population in the U.S.

Methods: An economic model was developed to compare physical activity, health outcomes, costs, and quality-adjusted life-years (QALYs) associated with usual care and a Fitbit-based intervention consisting of a consumer wearable device alongside goal-setting and feedback features provided in a companion software application. Improvement in physical activity was measured as mean daily step count. The effects of increased daily step count were characterized as reduced short-term healthcare costs and decreased incidence of chronic diseases, with corresponding improvement in health utility and reduced disease costs. Published literature, standardized costing resources, and data from a National Institutes of Health-funded research program were utilized. Cost-effectiveness and budget impact analyses were performed for a hypothetical cohort of middle-aged adults.

Results: The base-case cost-effectiveness results found the Fitbit intervention to be dominant (less costly and more effective) compared to usual care. Discounted 15-year incremental costs and QALYs were -$1,257 and 0.011, respectively. In probabilistic analyses, the Fitbit intervention was dominant in 93% of simulations and either dominant or cost-effective (defined as less than $150,000/QALY gained) in 99.4% of simulations. In budget impact analyses conducted from the perspective of a U.S. commercial payer, the Fitbit intervention was estimated to save approximately $6.5 million over 2 years and $8.5 million over 5 years for a cohort of 8,000 participants. Although the economic analysis results were very robust, the short-term healthcare cost savings were the most uncertain in this population and warrant further research.

Conclusions: There is abundant evidence documenting the benefits of wearable activity trackers when used to increase physical activity as measured by daily step counts. Our research provides additional health economic evidence supporting implementation of wearable-based interventions to improve population health, and offers compelling support for payers to consider including wearable-based physical activity interventions as part of a comprehensive portfolio of preventive health offerings for their insured populations.
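The headline result can be sanity-checked with standard cost-effectiveness arithmetic, sketched below using the incremental cost, incremental QALYs, and willingness-to-pay threshold quoted in the abstract (this is generic health-economics logic, not the paper's model).

```python
# Dominance / ICER check using the figures quoted in the abstract.
incremental_cost = -1257.0   # discounted 15-year incremental cost (USD)
incremental_qalys = 0.011    # discounted 15-year incremental QALYs
threshold = 150_000.0        # willingness-to-pay per QALY gained (USD)

if incremental_cost <= 0 and incremental_qalys > 0:
    # Less costly and more effective: no ICER needed.
    print("Dominant over usual care")
elif incremental_cost / incremental_qalys < threshold:
    print(f"Cost-effective: ICER = {incremental_cost / incremental_qalys:,.0f} $/QALY")
else:
    print("Not cost-effective at the stated threshold")
```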
View details
Towards Realistic Synthetic User-Generated Content: A Scaffolding Approach to Generating Online Discussions
Barbara Ikica
Hamidreza Alvari
Mehdi Hafezi Manshadi
(2024)
Preview abstract
The emergence of synthetic data represents a pivotal shift in modern machine learning, offering a solution to satisfy the need for large volumes of data in domains where real data is scarce, highly private, or difficult to obtain. We investigate the feasibility of creating realistic, large-scale synthetic datasets of user-generated content, noting that such content is increasingly prevalent and a source of frequently sought information. Large language models (LLMs) offer a starting point for generating synthetic social media discussion threads, due to their ability to produce diverse responses that typify online interactions. However, as we demonstrate, straightforward application of LLMs yields limited success in capturing the complex structure of online discussions, and standard prompting mechanisms lack sufficient control. We therefore propose a multi-step generation process, predicated on the idea of creating compact representations of discussion threads, referred to as scaffolds. Our framework is generic yet adaptable to the unique characteristics of specific social media platforms. We demonstrate its feasibility using data from two distinct online discussion platforms. To address the fundamental challenge of ensuring the representativeness and realism of synthetic data, we propose a portfolio of evaluation measures to compare various instantiations of our framework.
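The paper does not publish a concrete scaffold schema, but the idea of a compact, structural thread representation that an LLM later fleshes out might look like this hypothetical sketch.

```python
from dataclasses import dataclass, field

@dataclass
class ScaffoldNode:
    """Hypothetical scaffold element: structure and stance, but no text yet."""
    role: str                                   # e.g. "question", "answer", "rebuttal"
    stance: str                                 # e.g. "supportive", "critical", "neutral"
    replies: list["ScaffoldNode"] = field(default_factory=list)

# A compact representation of a thread; an LLM would generate the actual
# post text for each node, conditioned on this structure.
thread = ScaffoldNode("question", "neutral", replies=[
    ScaffoldNode("answer", "supportive"),
    ScaffoldNode("answer", "critical",
                 replies=[ScaffoldNode("rebuttal", "supportive")]),
])
print(thread)
```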
View details
Recent Books and Journal Articles in Public Opinion, Survey Methods, Survey Statistics, Big Data, Data Science, and User Experience Research. 2023 Update
Mario Callegaro
Survey Practice, 17 (2024)
Preview abstract
Welcome to the 16th edition of this column on recent books and journal articles in the field of public opinion, survey methods, survey statistics, Big Data, data science, and user experience research.
Special issues of journals have a space in this article because, in our view, they are like edited books. We also include review papers from the Annual Reviews journal series because these papers are seminal state-of-the-art write-ups: a mini book, if you wish, on a specific subject.
This article updates the books and journals listed in the 2022 article. As in previous years, the books are organized by topic, which should help readers focus on their interests.
You will note that we use very broad definitions of public opinion, survey methods, survey statistics, Big Data, data science, and user experience research. This is because many books published in different outlets can be very useful to readers of Survey Practice, even if they do not come from traditional sources of survey content.
It is unlikely we have exhaustively listed all new books in each subcategory; we did our best, scouting different resources and websites, but we take full responsibility for any omissions. The list is also limited to books published in English and available for purchase (as an ebook or in print) at the time of this review (April 2024) with a printed copyright year of 2023. Books are listed based on relevance to the topic, and no judgment is made about the quality of the content; we leave that to the readers.
If you want to send information for the next issue, please send it to surveypractice.new.books@gmail.com.
View details
The Case for Validating Inputs in Software-Defined WANs
Rishabh Iyer
Isaac Keslassy
Sylvia Ratnasamy
The 23rd ACM Workshop on Hot Topics in Networks (HOTNETS ’24), ACM, Irvine, CA (2024) (to appear)
Preview abstract
We highlight a problem that the networking community has largely overlooked: ensuring that the inputs to network controllers in software-defined WANs are accurate. We show that "incorrect" inputs are a common cause of major outages in practice and propose new directions to address them.
View details
Minimizing Live Experiments in Recommender Systems: User Simulation to Evaluate Preference Elicitation Policies
Martin Mladenov
James Pine
Hubert Pham
Shane Li
Xujian Liang
Anton Polishko
Li Yang
Ben Scheetz
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-24), Washington, DC (2024), pp. 2925-2929
Preview abstract
Evaluation of policies in recommender systems (RSs) typically involves A/B testing using live experiments on real users to assess a new policy's impact on relevant metrics. This "gold standard" comes at a high cost, however, in terms of cycle time, user cost, and potential user retention. In developing policies for onboarding new users, these costs can be especially problematic, since onboarding occurs only once. In this work, we describe a simulation methodology used to augment (and reduce) the use of live experiments. We illustrate its deployment for the evaluation of preference elicitation algorithms used to onboard new users of the YouTube Music platform. By developing counterfactually robust user behavior models, and a simulation service that couples such models with production infrastructure, we are able to test new algorithms in a way that reliably predicts their performance on key metrics when deployed live, sometimes more reliably than live experiments due to the scale at which simulation can be realized. We describe our domain, our simulation models and platform, and results of experiments and deployment, and suggest future steps needed to further realistic simulation as a powerful complement to live experiments.
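In outline, evaluating an elicitation policy against a simulated user model instead of a live experiment might look like the sketch below; the behavior model, catalog, and satisfaction metric are stand-ins, not the production components described in the paper.

```python
import random

def simulated_user_pick(options, prefs):
    """Hypothetical behavior model: the user picks the offered option they like most."""
    return max(options, key=lambda o: prefs.get(o, 0.0))

def evaluate_policy(policy, n_users=1000, catalog=("rock", "jazz", "pop", "rap")):
    """Mean satisfaction of simulated users onboarded with the given policy."""
    total = 0.0
    for _ in range(n_users):
        prefs = {genre: random.random() for genre in catalog}  # sampled user model
        shown = policy(catalog)                                # elicitation step
        total += prefs[simulated_user_pick(shown, prefs)]
    return total / n_users

# Naive policy: always show the first two catalog items.
print(evaluate_policy(lambda catalog: catalog[:2]))
```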
View details
Expressing and Analyzing Quantum Algorithms with Qualtran
Charles Yuan
Anurudh Peduri
arXiv:2409.04643 (2024)
Preview abstract
Quantum computing's transition from theory to reality has spurred the need for novel software tools to manage the increasing complexity, sophistication, toil, and chance for error of quantum algorithm development. We present Qualtran, an open-source library for representing and analyzing quantum algorithms. Using carefully chosen abstractions and data structures, we can simulate and test algorithms, automatically generate information-rich diagrams, and tabulate resource requirements. Qualtran offers a "standard library" of algorithmic building blocks that are essential for modern cost-minimizing compilations. Its capabilities are showcased through the re-analysis of key algorithms in Hamiltonian simulation, chemistry, and cryptography. The resulting architecture-independent resource counts can be forwarded to our implementation of cost models to estimate physical costs like wall-clock time and number of physical qubits assuming a surface-code architecture. Qualtran provides a foundation for explicit constructions and reproducible analysis, fostering greater collaboration within the quantum algorithm development community. We believe tools like Qualtran will accelerate progress in the field.
View details
Preview abstract
We're roughly 10 years into the OpenConfig journey. We have implementations in hand from various vendors, and we've gained significant operational experience in the domains of streaming telemetry and in developing configuration systems that leverage the developed models. What have we learned? Are the abstractions we've generated the right ones? If not, why? Were we too influenced by the tools and inertia of the time when we made some critical decisions? How do we need to evolve going forward? This discussion is part retrospective, part introspective: a candid look at where we've been and what we need to think about as we evolve the next generation of our management (and control) planes. What should we be thinking about as network engineers who write software?
View details