Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Current approaches to Analog Layout Automation apply ML techniques such as Graph Convolutional Neural Networks (GCN) to translate netlist to layout. While these ML approaches have proven to be effective, they lack the powerful reasoning capabilities, intuitive human interface, and standard evaluation benchmarks that have been improving at a rapid development pace in Large Language Models (LLMs). The GLayout framework introduced in this work translates analog layout into an expressive, technology-generic, compact text representation. Then, an LLM is taught to understand analog layout through fine-tuning and in-context learning using Retrieval Augmented Generation (RAG). The LLM is able to successfully lay out unseen circuits based on new information provided in-context. We train 3.8, 7, and 22 billion parameter quantized LLMs on a dataset of fewer than 50 unique circuits, and text documents providing layout knowledge. The 22B parameter model is tuned in 2 hours on a single NVIDIA A100 GPU. The open-source evaluation set is proposed as an automation benchmark for LLM layout automation tasks, and ranges from 2-transistor circuits to a ∆Σ ADC. The 22B model completes 70% of the tasks in the evaluation set, and is able to pass DRC and LVS verification on unseen 4-transistor blocks.
View details
Perspective Chapter: Assessment of Subjective and Objective Sleep Quality from Wrist-Worn Wearable Data
Ben Yetton
Daniel McDuff
Andrew Barakat
Allen Jiang
Nicholas Allen
Logan Schneider
Ari Winbush
Conor Heneghan
Researchers are interested in measuring both objective and subjective assessments of sleep, and associated phenomena such as sleepiness, quality and restoration. Predicting perceived sleep quality accurately from objective measurements remains an unsolved and interesting problem. Previous studies using polysomnograms (PSG) and actigraphy have shown poor concordance between objective metrics and subjective sleep quality, but were often limited by study duration and size (e.g., one or two nights of PSG, study population in the low 100s). In this chapter, we consider whether consumer sleep trackers could significantly improve the assessment of subjective sleep quality through longer periods of assessment and larger data scale. We describe a recent study that modeled two subjective sleep quality metrics (PROMIS Sleep-Related Impairment (SI) and Sleep Disturbance (SD) Index) from objective sleep metrics acquired from a consumer wearable device (Fitbit). However, the goodness-of-fit parameter remains relatively low, even with the increased data availability and scale of data provided by consumer wearables. Specifically, for a well-characterized normative population of 2106 adults, we see that a linear multivariate model produces an R² of 0.107 for predicting SI and an R² of 0.147 for SD, consistent with prior results using PSG and actigraphy. We conclude that subjective sleep quality remains broadly a psychological construct that cannot be fully modeled solely by objective sleep metrics.
View details
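For readers unfamiliar with the goodness-of-fit figure quoted in the abstract above, the following is a minimal sketch of how an R² value is obtained from a linear multivariate model. The data are synthetic and the feature count, coefficients, and the helper name `r_squared` are illustrative assumptions, not taken from the study.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    # Prepend a column of ones so the model includes an intercept term.
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residuals = y - X1 @ coef
    ss_res = float(np.sum(residuals ** 2))        # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2))   # total variation
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (hypothetical): three "objective" features and a
# noisy "subjective" score that depends only weakly on them.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
y = 0.3 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(size=n)

r2 = r_squared(X, y)
print(f"R^2 = {r2:.3f}")
```

A low R² in this setting means most of the variation in the outcome is left unexplained by the linear combination of predictors, which is the sense in which the abstract describes the fit as relatively low.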
Efficiency of the Generalized Second-Price Auction for Value Maximizers
Hanrui Zhang
Proceedings of the ACM on Web Conference 2024, 46–56
We study the price of anarchy of the generalized second-price auction where bidders are value maximizers (i.e., autobidders). We show that in general the price of anarchy can be as bad as 0. For comparison, the price of anarchy of running VCG is 1/2 in the autobidding world. We further show a fine-grained price of anarchy with respect to the discount factors (i.e., the ratios of click probabilities between lower slots and the highest slot in each auction) in the generalized second-price auction, which highlights the qualitative relation between the smoothness of the discount factors and the efficiency of the generalized second-price auction.
View details
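As background for the abstract above, here is a minimal sketch of the textbook generalized second-price (GSP) rule with discount factors: bidders are ranked by bid, and the winner of each slot pays, per click, the bid of the bidder ranked immediately below. This is the standard mechanism only, not the paper's autobidding model or its price-of-anarchy analysis; the function name `gsp_outcome` and the example bids are hypothetical.

```python
def gsp_outcome(bids, discounts):
    """Allocate slots by descending bid and charge the next-highest bid.

    bids:      dict mapping bidder name -> per-click bid
    discounts: click-probability ratios relative to the top slot,
               e.g. [1.0, 0.5] for two slots
    Returns a list of (bidder, slot_index, price_per_click,
    expected_payment), where expected_payment is discount * price.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    outcome = []
    for slot, (bidder, _) in enumerate(ranked[:len(discounts)]):
        # The price is the bid ranked just below, or 0 if there is none.
        next_bid = ranked[slot + 1][1] if slot + 1 < len(ranked) else 0.0
        outcome.append((bidder, slot, next_bid, discounts[slot] * next_bid))
    return outcome

result = gsp_outcome({"a": 3.0, "b": 2.0, "c": 1.0}, [1.0, 0.5])
for row in result:
    print(row)
```

The discount factors only scale how many clicks a slot is expected to receive; the per-click price in each slot is still set by the next bid down.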
Open Se Cura: First Silicon Results of an Auditable and Transparent Hardware Root of Trust System using Open EDA in 16-nm
Guanchen Tao
Ming-Hung Chen
Bangfei Pan
Kai Yick
Dennis Sylvester
Mehdi Saligane
IEEE Solid-State Circuits Magazine, 16(2024), pp. 58-66
Hardware Root of Trust (HRoT) is essential for Internet-of-Things (IoT) devices as it provides critical user data protection. However, each novel use case significantly lengthens the development time for an HRoT system. Furthermore, most HRoT solutions are proprietary, and users lack permission to inspect and audit such systems [1-2]. This paper introduces Open Se Cura, which is an open-source framework designed to expedite the implementation of secure and transparent HRoT systems. It utilizes open-source Electronic Design Automation (EDA) tools like OpenROAD [3-4] and OpenFASOC [5-6], along with open-source Process Design Kits (PDKs), to present a transparent and auditable approach to hardware-software co-design platforms. This approach enables fast and trustworthy HRoT system implementation and is made openly available to reproduce its results and security efficacy [7]. Our reference design is showcased through FPGA emulation, and the first measurement results of a silicon implementation in 16nm of Open Se Cura security domain subsets integrated using open-source EDA are presented.
View details
The Case for Globalizing Fairness: A Mixed Methods Study on the Perceptions of Colonialism, AI and Health in Africa
Iskandar Haykel
Aisha Walcott-Bryant
Sanmi Koyejo
With growing machine learning (ML) and large language model applications in healthcare, there have been calls for fairness in ML to understand and mitigate ethical concerns these systems may pose. Fairness has implications for health in Africa, which already has inequitable power imbalances between the Global North and South. This paper seeks to explore fairness for global health, with Africa as a case study.
We conduct a scoping review to propose fairness attributes for consideration in the African context and delineate where they may come into play in different ML-enabled medical modalities. We then conduct qualitative research studies with 625 general population study participants in 5 countries in Africa and 28 experts in ML, Health, and/or policy focussed on Africa to obtain feedback on the proposed attributes. We delve specifically into understanding the interplay between AI, health and colonialism.
Our findings demonstrate that among experts there is a general mistrust that technologies that are solely developed by former colonizers can benefit Africans, and that associated resource constraints due to pre-existing economic and infrastructure inequities can be linked to colonialism. General population survey responses found that, on average, about 40% of people associate an undercurrent of colonialism with AI, and this was most dominant among participants from South Africa. However, the majority of the general population participants surveyed did not think there was a direct link between AI and colonialism. Colonial history, country of origin, and national income level were specific axes of disparities that participants felt would cause an AI tool to be biased.
This work serves as a basis for policy development around Artificial Intelligence for health in Africa and can be expanded to other regions.
View details
Computational Methodologies for Understanding, Automating, and Evaluating User Interfaces
Yuwen Lu
Yue Jiang
Christof Lutteroth
Toby Jia-Jun Li
Jeffery Nichols
Wolfgang Stuerzlinger
Building on the success of the first two workshops on user interfaces (UIs) at CHI 2022 and CHI 2023, this workshop aims to advance the research field by further exploring current research trends, such as applying large language models and visual language models. Previous work has explored computational approaches to understanding and adapting UIs using constraint-based optimization models and machine learning-based data-driven approaches. In addition to further delving into these established UI research areas, we aim to trigger the exploration into the application of the latest advancements in general-purpose large language and vision-language models within the UI domain. We will encourage participants to explore novel methods for understanding, automating, and evaluating UIs. The proposed workshop seeks to bring together academic researchers and industry practitioners interested in computational approaches for UIs to discuss the needs and opportunities for future user interface algorithms, models, and applications.
View details
Statistical Analysis of Cardiovascular Diseases Dataset of BRFSS
Ashank Anshuman
Aakarshit Uppal
Indrajit Mukherjee
Open Access Library Journal, 11 (2024)
Cardiovascular Diseases (CVDs) remain a leading cause of death in the United States. These diseases, including coronary heart disease, heart attack, and stroke, pose significant health risks. Accurate prediction of CVD probability can aid in prevention and management. To address this challenge, we analyzed data from the Behavioral Risk Factor Surveillance System (BRFSS) spanning 1995-2017. We developed innovative methods to handle missing data and normalize values. Deep learning models were employed to predict risk factors and, subsequently, the likelihood of CVDs. Our models were implemented using TensorFlow and trained on a high-performance computing server. The models accurately predicted risk factors with over 90% accuracy, enabling targeted interventions. We successfully predicted CVD probability with greater than 95% accuracy, providing valuable insights for healthcare providers. An online portal was developed to forecast CVD trends over the next 31 years, facilitating proactive planning and resource allocation.
View details
ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation
Akshita Jha
Sarah Laszlo
Rida Qadri
Chandan Reddy
ACL (2024)
Recent studies have highlighted the issue of varying degrees of stereotypical depictions for different identity groups. However, these existing approaches have several key limitations, including a noticeable lack of coverage of identity groups in their evaluation, and the range of their associated stereotypes. Additionally, these studies often lack a critical distinction between inherently visual stereotypes, such as 'brown' or 'sombrero', and culturally influenced stereotypes like 'kind' or 'intelligent'. In this work, we address these limitations by grounding our evaluation of regional, geo-cultural stereotypes in the generated images from Text-to-Image models by leveraging existing textual resources. We employ existing stereotype benchmarks to evaluate stereotypes and focus exclusively on the identification of visual stereotypes within the generated images spanning 135 identity groups. We also compute the offensiveness across identity groups, and check the feasibility of identifying stereotypes automatically. Further, through a detailed case study and quantitative analysis, we reveal how the default representations of all identity groups have a more stereotypical appearance, and for historically marginalized groups, how the images across different attributes are visually more similar than other groups, even when explicitly prompted otherwise.
View details
Using large language models to accelerate communication for eye gaze typing users with ALS
Subhashini Venugopalan
Katie Seaver
Xiang Xiao
Sri Jalasutram
Ajit Narayanan
Bob MacDonald
Emily Kornman
Daniel Vance
Blair Casey
Steve Gleason
(2024)
Accelerating text input in augmentative and alternative communication (AAC) is a long-standing area of research with bearings on the quality of life in individuals with profound motor impairments. Recent advances in large language models (LLMs) pose opportunities for re-thinking strategies for enhanced text entry in AAC. In this paper, we present SpeakFaster, consisting of an LLM-powered user interface for text entry in a highly-abbreviated form, saving 57% more motor actions than traditional predictive keyboards in offline simulation. A pilot study on a mobile device with 19 non-AAC participants demonstrated motor savings in line with simulation and relatively small changes in typing speed. Lab and field testing on two eye-gaze AAC users with amyotrophic lateral sclerosis demonstrated text-entry rates 29–60% above baselines, due to significant saving of expensive keystrokes based on LLM predictions. These findings form a foundation for further exploration of LLM-assisted text entry in AAC and other user interfaces.
View details
This paper introduces a novel deep neural network architecture for solving the inverse scattering problem in frequency domain with wide-band data, by directly approximating the inverse map, thus avoiding the expensive optimization loop of classical methods. The architecture is motivated by the filtered back-projection formula in the full aperture regime and with homogeneous background, and it leverages the underlying equivariance of the problem and compressibility of the integral operator. This drastically reduces the number of training parameters, and therefore the computational and sample complexity of the method. In particular, we obtain an architecture whose number of parameters scales sub-linearly with respect to the dimension of the inputs, while its inference complexity scales super-linearly but with very small constants. We provide several numerical tests that show that the current approach results in better reconstruction than optimization-based techniques such as full-waveform inversion, but at a fraction of the cost while being competitive with state-of-the-art machine learning methods.
View details
In-Context Learning (ICL) is an emergent capability of Large Language Models (LLMs). Only a few demonstrations enable LLMs to be used as a black box for new tasks. Previous studies have shown that using LLMs' outputs as labels is effective in training models to select demonstrations. Such a label is expected to estimate the utility of a demonstration in ICL; however, it has not been well understood how different labeling strategies affect results on target tasks. This paper presents an analysis of different utility functions by focusing on LLMs' output probability given ground-truth output, and task-specific reward given LLMs' prediction. Unlike the previous work, we introduce a novel labeling method, incremental utility, which estimates how much incremental knowledge is brought into the LLMs by a demonstration. We conduct experiments with instruction-tuned LLMs on binary/multi-class classification, segmentation, and translation across Arabic, English, Finnish, Japanese, and Spanish. Our results show that (1) the probability is effective when the probability values are distributed across the whole value range (on the classification tasks), and (2) the downstream metric is more robust when nuanced reward values are provided with long outputs (on the segmentation and translation tasks). We then show that the proposed incremental utility further helps ICL by contrasting how the LLMs perform with and without the demonstrations.
View details
This is the seventh installment of the Developer Productivity for Humans column. This installment focuses on software quality: what it means, how developers see it, how we break it down into 4 types of quality, and the impact these have on each other.
View details
Understanding and Designing for Trust in AI Powered Developer Tooling
Ugam Kumar
Quinn Madison
IEEE Software (2024)
Trust is central to how developers engage with AI. In this article, we discuss what we learned from developers about their level of trust in AI-enhanced developer tooling, how we translated those findings into product design recommendations to support customization, and the challenges we encountered along the way.
View details
RFC 9632 - Finding and Using Geofeed Data
RFC Editor (2024), pp. 23
This document specifies how to augment the Routing Policy Specification Language (RPSL) inetnum: class to refer specifically to geofeed comma-separated values (CSV) data files and describes an optional scheme that uses the Resource Public Key Infrastructure (RPKI) to authenticate the geofeed data files. This document obsoletes RFC 9092.
View details
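To make the mechanism in the abstract above concrete, the following is a rough sketch of the two steps a consumer of geofeed data might perform: locating the geofeed reference in an RPSL inetnum: object, then reading the CSV file it points to. It assumes the reference appears as a `geofeed:` attribute and that the CSV follows the prefix, country, region, city, postal-code layout of RFC 8805. The function names, the example object, and the example URL are hypothetical, and the optional RPKI authentication step is not shown.

```python
import csv
import ipaddress

def find_geofeed_url(inetnum_text):
    """Return the value of the first 'geofeed:' attribute, or None."""
    for line in inetnum_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "geofeed":
            return value.strip()
    return None

def parse_geofeed(csv_text):
    """Parse geofeed CSV text into a list of dicts, skipping '#' comments."""
    lines = [l for l in csv_text.splitlines()
             if l.strip() and not l.lstrip().startswith("#")]
    rows = []
    for fields in csv.reader(lines):
        fields = fields + [""] * (5 - len(fields))  # pad missing columns
        prefix, country, region, city, postal = (f.strip() for f in fields[:5])
        rows.append({
            "prefix": ipaddress.ip_network(prefix),  # validates the prefix
            "country": country,
            "region": region,
            "city": city,
            "postal": postal,
        })
    return rows

# Hypothetical inetnum: object and geofeed file contents.
inetnum = """inetnum: 192.0.2.0 - 192.0.2.255
netname: EXAMPLE-NET
geofeed: https://example.com/geofeed.csv
"""
geofeed_csv = """# example geofeed file
192.0.2.0/24,US,US-AL,Alabaster,
"""

url = find_geofeed_url(inetnum)
rows = parse_geofeed(geofeed_csv)
print(url, rows[0]["prefix"], rows[0]["country"])
```

In practice the file would be fetched from the URL rather than supplied inline; fetching and signature checking are left out to keep the sketch self-contained.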
With the increase in the number of privacy regulations, small development teams are forced to make privacy decisions on their own. In this paper, we conduct a mixed-method survey study, including statistical and qualitative analysis, to evaluate the privacy perceptions, practices, and knowledge of members involved in various phases of the Software Development Life Cycle (SDLC). Our survey includes 362 participants from 23 countries, encompassing roles such as product managers, developers, and testers. Our results show diverse definitions of privacy across SDLC roles, emphasizing the need for a holistic privacy approach throughout SDLC. We find that software teams, regardless of their region, are less familiar with privacy concepts (such as anonymization), relying on self-teaching and forums. Most participants are more familiar with GDPR and HIPAA than other regulations, with multi-jurisdictional compliance being their primary concern. Our results advocate the need for role-dependent solutions to address the privacy challenges, and we highlight research directions and educational takeaways to help improve privacy-aware SDLC.
View details