Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu
Zhanpeng Zeng
Vikas Singh
International Conference on Machine Learning (2024)
Preview abstract
Transformers are the backbone of powerful foundation models for many Vision and Natural Language Processing tasks. But their compute and memory/storage footprint is large, so serving such models is expensive, often requiring high-end hardware. To mitigate this difficulty, Post-Training Quantization seeks to modify a pre-trained model and quantize it to eight bits or lower, significantly boosting compute/memory/latency efficiency. Such models have been successfully quantized to four bits with some performance loss. In this work, we outline a simple scheme to quantize Transformer-based models to just two bits (plus some overhead) with only a small drop in accuracy. Key to our formulation is a concept borrowed from harmonic analysis called Fusion Frames. Our main finding is that the quantization must take place not in the original weight space, but instead in the Fusion Frame representations. If quantization is interpreted as the addition of noise, our casting of the problem allows invoking an extensive body of known consistent recovery and noise robustness guarantees. Further, if desired, denoising filters are known in closed form. We show empirically, via a variety of experiments, that (almost) two-bit quantization for Transformer models promises sizable efficiency gains.
View details
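The idea in the FrameQuant abstract above, quantizing a redundant representation of the weights instead of the weights themselves, can be illustrated with a small sketch. This is not the paper's method: it swaps Fusion Frames for a plain Parseval frame built from a random orthogonal matrix, and the names `parseval_frame` and `quantize_uniform`, the sizes `n` and `m`, and the uniform 2-bit quantizer are all assumptions made for illustration.

```python
import numpy as np

def parseval_frame(n, m, rng):
    """Return an n x m matrix F with F @ F.T = I_n (a Parseval frame, m >= n).

    Assumption: a random frame stands in for the Fusion Frames of the paper.
    """
    q, _ = np.linalg.qr(rng.standard_normal((m, m)))  # q is m x m orthogonal
    return q[:n, :]  # the first n rows are orthonormal, so F @ F.T = I_n

def quantize_uniform(x, bits):
    """Round x onto at most 2**bits evenly spaced levels over [x.min(), x.max()]."""
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    return np.round((x - lo) / scale) * scale + lo

rng = np.random.default_rng(0)
n, m = 64, 80                      # hypothetical sizes; m > n adds redundancy
w = rng.standard_normal(n)         # stand-in for one weight vector
F = parseval_frame(n, m, rng)

coeffs = F.T @ w                   # analysis: m redundant coefficients
coeffs_q = quantize_uniform(coeffs, bits=2)  # quantize in the frame domain
w_hat = F @ coeffs_q               # synthesis: map back to weight space

rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
print(f"relative reconstruction error: {rel_err:.3f}")
```

Because `F @ F.T` is the identity, the unquantized coefficients reconstruct `w` exactly, so any error in `w_hat` comes from the quantization step alone. The extra `m - n` coefficients are one reading of the "plus some overhead" mentioned in the abstract, though the paper's actual accounting may differ.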
Data Exchange Markets via Utility Balancing
Aditya Bhaskara
Sungjin Im
Kamesh Munagala
Govind S. Sankar
WebConf (2024)
Preview abstract
This paper explores the design of a balanced data-sharing marketplace for entities with heterogeneous datasets and machine learning models that they seek to refine using data from other agents. The goal of the marketplace is to encourage participation for data sharing in the presence of such heterogeneity. Our market design approach for data sharing focuses on interim utility balance, where participants contribute and receive equitable utility from refinement of their models. We present such a market model for which we study computational complexity, solution existence, and approximation algorithms for welfare maximization and core stability. We finally support our theoretical insights with simulations on a mean estimation task inspired by road traffic delay estimation.
View details
Large Language Models as a Proxy For Human Evaluation in Assessing the Comprehensibility of Disordered Speech Transcription
Richard Cave
Katie Seaver
Jordan Green
Rus Heywood
Proceedings of ICASSP, IEEE (2024)
Preview abstract
Automatic Speech Recognition (ASR) systems, despite significant advances in recent years, still have much room for improvement, particularly in the recognition of disordered speech. Even so, erroneous transcripts from ASR models can help people with disordered speech be better understood, especially if the transcription doesn’t significantly change the intended meaning. Evaluating the efficacy of ASR for this use case requires a methodology for measuring the impact of transcription errors on the intended meaning and comprehensibility. Human evaluation is the gold standard for this, but it can be laborious, slow, and expensive. In this work, we tune and evaluate large language models for this task and find them to be a much better proxy for human evaluators than other commonly used metrics. We further present a case study using the presented approach to assess the quality of personalized ASR models to make model deployment decisions and correctly set user expectations for model quality as part of our trusted tester program.
View details
Reinforcement Learning-Enhanced Cloud-Based Open Source Analog Circuit Generator for Standard and Cryogenic Temperatures in 130-nm and 180-nm OpenPDKs
Ali Hammoud
Anhang Li
Ayushman Tripathi
Wen Tian
Harsh Khandeparkar
Ryan Wans
Boris Murmann
Dennis Sylvester
Mehdi Saligane
Preview abstract
This work introduces an open-source, process-technology-agnostic framework for hierarchical circuit netlist, layout, and Reinforcement Learning (RL) optimization. The layout, netlist, and optimization Python API is fully modular and publicly installable via PyPI. It features a bottom-up hierarchical construction, which allows for complete design reuse across provided PDKs. The modular hierarchy also facilitates parallel circuit design iterations on cloud platforms. To illustrate its capabilities, a two-stage OpAmp with a 5T first stage, common-source second stage, and Miller compensation is implemented. We instantiate the OpAmp in two different open-source process design kits (OpenPDKs) using both room-temperature models and cryogenic (4K) models. With a human-designed version as the baseline, we leveraged the parameterization capabilities of the framework and applied the RL optimizer to adapt to the power consumption limits suitable for cryogenic applications while maintaining gain and bandwidth performance. Using the modular RL optimization framework, we achieve a 6x reduction in power consumption compared to manually designed circuits while maintaining gain to within 2%.
View details
Drug Design on Quantum Computers
Raffaele Santagati
Alán Aspuru-Guzik
Matthias Degroote
Leticia Gonzalez
Elica Kyoseva
Nikolaj Moll
Markus Oppel
Robert Parrish
Michael Streif
Christofer Tautermann
Horst Weiss
Nathan Wiebe
Clemens Utschig-Utschig
Nature Physics (2024)
Preview abstract
The promised industrial applications of quantum computers often rest on their anticipated ability to perform accurate, efficient quantum chemical calculations. Computational drug discovery relies on accurate predictions of how candidate drugs interact with their targets in a cellular environment involving several thousands of atoms at finite temperatures. Although quantum computers are still far from being used as daily tools in the pharmaceutical industry, here we explore the challenges and opportunities of applying quantum computers to drug design. We discuss where these could transform industrial research and identify the substantial further developments needed to reach this goal.
View details
Preview abstract
In this paper, we present SCOREQ, a novel approach for speech quality prediction. SCOREQ is a triplet loss function for contrastive regression that addresses the domain generalisation shortcoming exhibited by state of the art no-reference speech quality metrics. In the paper we: (i) illustrate the problem of L2 loss training failing at capturing the continuous nature of the mean opinion score (MOS) labels; (ii) demonstrate the lack of generalisation through a benchmarking evaluation across several speech domains; (iii) outline our approach and explore the impact of the architectural design decisions through incremental evaluation; (iv) evaluate the final model against state of the art models for a wide variety of data and domains. The results show that the lack of generalisation observed in state of the art speech quality metrics is addressed by SCOREQ. We conclude that using a triplet loss function
View details
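The SCOREQ abstract above describes a triplet loss for contrastive regression over mean opinion score (MOS) labels. The sketch below shows only a generic triplet margin loss in which the positive is whichever candidate has a MOS closer to the anchor's. The selection rule, the fixed `margin`, and the helper names `triplet_margin_loss` and `pick_triplet` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Hinge on (distance to positive) - (distance to negative) + margin."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

def pick_triplet(mos, anchor_idx, i, j):
    """Return (positive_idx, negative_idx) for candidates i and j.

    Assumed rule: the candidate whose MOS is closer to the anchor's MOS is
    treated as the positive, the other as the negative.
    """
    if abs(mos[i] - mos[anchor_idx]) <= abs(mos[j] - mos[anchor_idx]):
        return i, j
    return j, i

# Toy data: three clips with hypothetical MOS labels and 2-D embeddings.
mos = np.array([4.5, 4.3, 1.2])
emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0]])

pos, neg = pick_triplet(mos, anchor_idx=0, i=2, j=1)
loss_ok = triplet_margin_loss(emb[0], emb[pos], emb[neg])
loss_swapped = triplet_margin_loss(emb[0], emb[neg], emb[pos])
print(pos, neg, loss_ok, loss_swapped)
```

In this toy case the embedding distances already follow the MOS ordering, so the loss is zero. Swapping the positive and negative gives a positive loss, which is the signal that would push embeddings toward an ordering consistent with the continuous labels.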
Preview abstract
Almost no modern software system is written from scratch, and developers are required to effectively learn to use third-party libraries and software services. Thus, many practitioners and researchers have looked for ways to create effective documentation that supports developers’ learning. However, few efforts have focused on how people actually use the documentation. In this paper, we report on an exploratory, multi-phase, mixed methods empirical study of documentation page-view logs from four cloud-based industrial services. By analyzing page-view logs for over 100,000 users, we find diverse patterns of documentation page visits. Moreover, we show statistically that which documentation pages people visit often correlates with user characteristics such as past experience with the specific product, on the one hand, and with future adoption of the API on the other hand. We discuss the implications of these results on documentation design and propose documentation page-view log analysis as a feasible technique for design audits of documentation, from ones written for software developers to ones designed to support end users (e.g., Adobe Photoshop).
View details
TextMesh: Generation of Realistic 3D Meshes From Text Prompts
Christina Tsalicoglou
Fabian Manhardt
Michael Niemeyer
3DV 2024 (2024)
Preview abstract
The ability to generate highly realistic 2D images from mere text prompts has recently made huge progress in terms of speed and quality, thanks to the advent of image diffusion models. Naturally, the question arises whether this can also be achieved in the generation of 3D content from such text prompts. To this end, a new line of methods has recently emerged that tries to harness diffusion models, trained on 2D images, for supervision of 3D model generation using view-dependent prompts. While achieving impressive results, these methods have two major drawbacks. First, rather than commonly used 3D meshes, they generate neural radiance fields (NeRFs), making them impractical for most real applications. Second, these approaches tend to produce over-saturated models, giving the output a cartoonish-looking effect. Therefore, in this work we propose a novel method for generation of highly realistic-looking 3D meshes. To this end, we extend NeRF to employ an SDF backbone, leading to improved 3D mesh extraction. In addition, we propose a novel way to finetune the mesh texture, removing the effect of high saturation and improving the details of the output 3D mesh.
View details
Preview abstract
Machine learning has a pseudoscience problem. An abundance of ethical issues arising from the use of machine learning (ML)-based technologies—by now, well documented—is inextricably entwined with the systematic epistemic misuse of these tools. We take a recent resurgence of deep learning-assisted physiognomic research as a case study in the relationship between ML-based pseudoscience and attendant social harms—the standard purview of “AI ethics.” In practice, the epistemic and ethical dimensions of ML misuse often arise from shared underlying reasons and are resolvable by the same pathways. Recent use of ML toward the ends of predicting protected attributes from photographs highlights the need for philosophical, historical, and domain-specific perspectives of particular sciences in the prevention and remediation of misused ML.
View details
Preview abstract
Inter-sentence pauses are the silences that occur between sentences in a paragraph or a dialogue.
They are an important aspect of long-form speech prosody, as they can affect the naturalness, intelligibility, and effectiveness of communication.
However, the user perception of inter-sentence pauses in long-form speech synthesis is not well understood. Previous work often evaluates pause modelling in conjunction with other prosodic features, making it hard to explicitly study how raters perceive differences in inter-sentence pause lengths.
In this paper, using multiple text-to-speech (TTS) datasets that cover different content types, domains, and settings, we investigate how sensitive raters are to changes to the durations of inter-sentence pauses in long-form speech by comparing ground truth audio samples with renditions that have manipulated pause durations.
This experimental design is meant to allow us to draw conclusions regarding the utility that can be expected from similar evaluations when applied to synthesized long-form speech.
We find that, using standard evaluation methodologies, raters are not sensitive to variations in pause lengths unless these deviate exceedingly from the norms or expectations of the speech context.
View details
Sleep patterns and risk of chronic disease as measured by long-term monitoring with commercial wearable devices in the All of Us Research Program
Neil S. Zheng
Jeffrey Annis
Hiral Master
Lide Han
Karla Gleichauf
Melody Nasser
Peyton Coleman
Stacy Desine
Douglas M. Ruderfer
John Hernandez
Logan D. Schneider
Evan L. Brittain
Nature Medicine (2024)
Preview abstract
Poor sleep health is associated with increased all-cause mortality and incidence of many chronic conditions. Previous studies have relied on cross-sectional and self-reported survey data or polysomnograms, which have limitations with respect to data granularity, sample size and longitudinal information. Here, using objectively measured, longitudinal sleep data from commercial wearable devices linked to electronic health record data from the All of Us Research Program, we show that sleep patterns, including sleep stages, duration and regularity, are associated with chronic disease incidence. Of the 6,785 participants included in this study, 71% were female, 84% self-identified as white and 71% had a college degree; the median age was 50.2 years (interquartile range = 35.7, 61.5) and the median sleep monitoring period was 4.5 years (2.5, 6.5). We found that rapid eye movement sleep and deep sleep were inversely associated with the odds of incident atrial fibrillation and that increased sleep irregularity was associated with increased odds of incident obesity, hyperlipidemia, hypertension, major depressive disorder and generalized anxiety disorder. Moreover, J-shaped associations were observed between average daily sleep duration and hypertension, major depressive disorder and generalized anxiety disorder. These findings show that sleep stages, duration and regularity are all important factors associated with chronic disease development and may inform evidence-based recommendations on healthy sleeping habits.
View details
The Case for Globalizing Fairness: A Mixed Methods Study on the Perceptions of Colonialism, AI and Health in Africa
Iskandar Haykel
Aisha Walcott-Bryant
Sanmi Koyejo
Preview abstract
With growing machine learning (ML) and large language model applications in healthcare, there have been calls for fairness in ML to understand and mitigate ethical concerns these systems may pose. Fairness has implications for health in Africa, which already has inequitable power imbalances between the Global North and South. This paper seeks to explore fairness for global health, with Africa as a case study.
We conduct a scoping review to propose fairness attributes for consideration in the African context and delineate where they may come into play in different ML-enabled medical modalities. We then conduct qualitative research studies with 625 general population study participants in 5 countries in Africa and 28 experts in ML, Health, and/or policy focussed on Africa to obtain feedback on the proposed attributes. We delve specifically into understanding the interplay between AI, health and colonialism.
Our findings demonstrate that among experts there is a general mistrust that technologies that are solely developed by former colonizers can benefit Africans, and that associated resource constraints due to pre-existing economic and infrastructure inequities can be linked to colonialism. General population survey responses showed that, on average, about 40% of people associate an undercurrent of colonialism with AI, and this was most dominant among participants from South Africa. However, the majority of the general population participants surveyed did not think there was a direct link between AI and colonialism. Colonial history, country of origin, and national income level were specific axes of disparities that participants felt would cause an AI tool to be biased.
This work serves as a basis for policy development around Artificial Intelligence for health in Africa and can be expanded to other regions.
View details
How we use GenAI in SRE
CommitConf, Madrid (2024)
Preview abstract
Google services are powered by the largest network of computers in the world. Site Reliability Engineers (SREs) make sure that the whole stack is cool: datacenters are safe and well provisioned; we have fallback mechanisms and data integrity; and we design our stack properly, using the right storage, replication and software trade-offs.
Generative AI is a great tool to make us super-effective: having access to tools to generate our most toily configurations, to classify risks and events, to manage large swaths of machines with agents or to automate complex workflows cheaply.
This talk will cover the journey that SRE started years ago to become a truly AI-First discipline and the latest advancements in tooling, practices and workflows.
View details
Creativity, Generative AI, and Software Development: A Research Agenda
Victoria Jackson
Bogdan Vasilescu
Daniel Russo
Paul Ralph
Maliheh Izadi
Rafael Prikladnicki
Anielle Lisboa
Andre van der Hoek
Preview abstract
Creativity has always been considered a major differentiator to separate the good from the great, and we believe the importance of creativity to software development will only increase as GenAI becomes embedded in developer tool-chains and working practices. This paper uses the McLuhan tetrad alongside scenarios of how GenAI may disrupt software development more broadly, to identify potential impacts GenAI may have on creativity within software development. The impacts are discussed along with a future research agenda comprising six connected themes that consider how individual capabilities, team capabilities, the product, unintended consequences, society, and human aspects can be affected.
View details
Preview abstract
A vast amount of human discussion, storytelling, content creation, and reporting now occurs on social media platforms. As such, social media posts are often quoted on web pages as context. In this paper, we argue that these quotations and their surrounding page context provide a rich, platform-independent source of data for studying the intersection of natural language and social media. We introduce a taxonomy of quotation roles that categorizes how social media posts are used within content. We release a dataset of 38M social quotes derived from the Common Crawl, and role labels for a subset assessed by human raters. We show that the interplay of accounts, roles, and topics across the web graph reveals valuable social diffusion patterns, and that roles can be predicted with fine-tuned large language models from web context.
View details