Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10132 publications
    Automatic Speech Recognition of Conversational Speech in Individuals with Disordered Speech
    Bob MacDonald
    Rus Heywood
    Richard Cave
    Katie Seaver
    Antoine Desjardins
    Jordan Green
    Journal of Speech, Language, and Hearing Research (2024) (to appear)
    Purpose: This study examines the effectiveness of automatic speech recognition (ASR) for individuals with speech disorders, addressing the gap in performance between read and conversational ASR. We analyze the factors influencing this disparity and the effect of speech mode-specific training on ASR accuracy. Method: Recordings of read and conversational speech from 27 individuals with various speech disorders were analyzed using both (1) one speaker-independent ASR system trained and optimized for typical speech and (2) multiple ASR models that were personalized to the speech of the participants with disordered speech. Word Error Rates (WERs) were calculated for each speech mode (read vs. conversational) and subject. Linear mixed-effect models were used to assess the impact of speech mode and disorder severity on ASR accuracy. We investigated nine variables, classified as technical, linguistic, or speech impairment factors, for their potential influence on the performance gap. Results: We found a significant performance gap between read and conversational speech in both personalized and unadapted ASR models. Speech impairment severity notably impacted recognition accuracy in unadapted models for both speech modes and in personalized models for read speech. Linguistic attributes of utterances were the most influential on accuracy, though atypical speech characteristics also played a role. Including conversational speech samples in model training notably improved recognition accuracy. Conclusions: We observed a significant performance gap in ASR accuracy between read and conversational speech for individuals with speech disorders. This gap was largely due to the linguistic complexity and unique characteristics of speech disorders in conversational speech. Training personalized ASR models using conversational speech significantly improved recognition accuracy, demonstrating the importance of domain-specific training and highlighting the need for further research into ASR systems capable of handling disordered conversational speech effectively.
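The Word Error Rate named in the abstract above is conventionally defined as the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The following is a minimal sketch of that standard definition only; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and the study's own scoring pipeline (text normalization, tooling) is not described in this listing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Illustrative only: tokenization is plain whitespace splitting, which is an
    assumption, not the normalization used in the study.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.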
    BEYOND THE CODE: AI REGULATIONS AS THE SECRET COMPASS OF ENGINEERING MANAGERS
    Proceedings of the American Society for Engineering Management 2024 International Annual Conference (2024)
    Technology is a product of society. As technology evolves, the norms governing it must mature to enable its proper use within society. The interest in Artificial Intelligence (AI) has surged following the introduction of ChatGPT. Firms, both large and small, are competing to develop new products and solutions involving AI. Amidst these developments, leading corporations such as Google and Microsoft have proactively committed to upholding responsible innovation in AI development. Governments worldwide are responding with the creation of guidelines and regulations in the field. Notably, in March 2024, the United Nations General Assembly (UNGA) adopted a landmark resolution on AI. At the heart of these developments in AI are engineering managers who leverage technical advances to build products and services that create value. To effectively harness AI for human benefit, engineering managers must be aware of these evolving regulations governing AI. Some regulations, such as the Digital Markets Act (DMA) and the General Data Protection Regulation (GDPR), have far-reaching consequences for organizations globally. Having a working knowledge of these statutory requirements will enable engineering managers to identify the opportunities and constraints in leveraging AI technology while building products and services. It will allow them to make informed decisions about data collection methods, model training processes, the deployment of AI systems, and metrics for their evaluation. At scale, it can become a competitive advantage for the firms they work in, as explored through real-world examples in this paper.
    Table-based reasoning with large language models (LLMs) is a promising direction to tackle many table understanding tasks, such as table-based question answering and fact verification. Compared with generic reasoning, table-based reasoning requires the extraction of underlying semantics from both free-form questions and semi-structured tabular data. Chain-of-Thought and similar approaches incorporate the reasoning chain in the form of textual context, but it is still an open question how to effectively leverage tabular data in the reasoning chain. We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts. Specifically, we guide LLMs using in-context learning to iteratively generate operations and update the table to represent a tabular reasoning chain. LLMs can therefore dynamically plan the next operation based on the results of the previous ones. This continuous evolution of the table forms a chain, showing the reasoning process for a given tabular problem. The chain carries structured information of the intermediate results, enabling more accurate and reliable predictions. Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks across multiple LLM choices.
    Security & Privacy Product Inclusion
    Dave Kleidermacher
    Emmanuel Arriaga
    Eric Wang
    Sebastian Porst
    Roger Piqueras Jover
    arXiv (2024)
    In this paper, we explore the challenges of ensuring security and privacy for users from diverse demographic backgrounds. We propose a threat modeling approach to identify potential risks and countermeasures for product inclusion in security and privacy. We discuss various factors that can affect a user's ability to achieve a high level of security and privacy, including low-income demographics, poor connectivity, shared device usage, ML fairness, etc. We present results from a global security and privacy user experience survey and discuss the implications for product developers. Our work highlights the need for a more inclusive approach to security and privacy and provides a framework for researchers and practitioners to consider when designing products and services for a diverse range of users.
    Distributed Tracing for InterPlanetary File System
    Marshall David Miller
    Rachel Han
    Haorui Guo
    2024 International Symposium on Parallel Computing and Distributed Systems (PCDS), IEEE, pp. 1-5
    The InterPlanetary File System (IPFS) is on its way to becoming the backbone of the next generation of the web. However, it suffers from several performance bottlenecks, particularly on the content retrieval path, which are often difficult to debug. This is because content retrieval involves multiple peers on the decentralized network and the issue could lie anywhere in the network. Traditional debugging tools are insufficient to help web developers who face the challenge of slow-loading websites and detrimental user experience. This limits the adoption and future scalability of IPFS. In this paper, we aim to gain valuable insights into how content retrieval requests propagate within the IPFS network as well as identify potential performance bottlenecks which could lead to opportunities for improvement. We propose a custom tracing framework that generates and manages traces for crucial events that take place on each peer during content retrieval. The framework leverages event semantics to build a timeline of each protocol involved in the retrieval, helping developers pinpoint problems. Additionally, it is resilient to malicious behaviors of the peers in the decentralized environment. We have implemented this framework on top of an existing IPFS implementation written in Java called Nabu. Our evaluation shows that the framework can identify network delays and issues with each peer involved in content retrieval requests at a very low overhead.
    Measurement is one of the essential components of quantum algorithms, and for superconducting qubits it is often the most error prone. Here, we demonstrate a model-based readout optimization achieving low measurement errors while avoiding detrimental side-effects. For simultaneous and mid-circuit measurements across 17 qubits we observe 1.5% error per qubit with a duration of 500 ns end-to-end and minimal excess reset error from residual resonator photons. We also suppress measurement-induced state transitions and achieve a qubit leakage rate limited by natural heating. This technique can scale to hundreds of qubits and be used to enhance the performance of error-correcting codes as well as near-term applications.
    Quantifying urban park use in the USA at scale: empirical estimates of realised park usage using smartphone location data
    Michael T Young
    Swapnil Vispute
    Stylianos Serghiou
    Akim Kumok
    Yash Shah
    Kevin J. Lane
    Flannery Black-Ingersoll
    Paige Brochu
    Monica Bharel
    Sarah Skenazy
    Shailesh Bavadekar
    Mansi Kansal
    Evgeniy Gabrilovich
    Gregory A. Wellenius
    Lancet Planetary Health (2024)
    Background: A large body of evidence connects access to greenspace with substantial benefits to physical and mental health. In urban settings where access to greenspace can be limited, park access and use have been associated with higher levels of physical activity, improved physical health, and lower levels of markers of mental distress. Despite the potential health benefits of urban parks, little is known about how park usage varies across locations (between or within cities) or over time. Methods: We estimated park usage among urban residents (identified as residents of urban census tracts) in 498 US cities from 2019 to 2021 from aggregated and anonymised opted-in smartphone location history data. We used descriptive statistics to quantify differences in park usage over time, between cities, and across census tracts within cities, and used generalised linear models to estimate the associations between park usage and census tract level descriptors. Findings: In spring (March 1 to May 31) 2019, 18·9% of urban residents visited a park at least once per week, with average use higher in northwest and southwest USA, and lowest in the southeast. Park usage varied substantially both within and between cities; was unequally distributed across census tract-level markers of race, ethnicity, income, and social vulnerability; and was only moderately correlated with established markers of census tract greenspace. In spring 2019, a doubling of walking time to parks was associated with a 10·1% (95% CI 5·6–14·3) lower average weekly park usage, adjusting for city and social vulnerability index. The median decline in park usage from spring 2019 to spring 2020 was 38·0% (IQR 28·4–46·5), coincident with the onset of physical distancing policies across much of the country. We estimated that the COVID-19-related decline in park usage was more pronounced for those living further from a park and those living in areas of higher social vulnerability. Interpretation: These estimates provide novel insights into the patterns and correlates of park use and could enable new studies of the health benefits of urban greenspace. In addition, the availability of an empirical park usage metric that varies over time could be a useful tool for assessing the effectiveness of policies intended to increase such activities.
    SAC126 - DNSSEC Delegation Signer (DS) Record Automation
    Internet Corporation for Assigned Names and Numbers (ICANN), ICANN Security and Stability Advisory Committee (SSAC) Reports and Advisories (2024), pp. 39
    The deployment of Domain Name System (DNS) Security Extensions (DNSSEC) has been hindered by a number of obstacles. This report focuses on one: the management of Delegation Signer (DS) records, which connect a child zone’s DNSSEC public key and signatures to the chain of trust provided by its parent zone (e.g., a zone corresponding to a top-level domain). DNSSEC is not simply enabled by signing a delegated domain’s DNS zone with DNSSEC signatures. It is also necessary to configure (and later maintain) appropriate DS records, which involves coordinated actions by the DNS operator, registrant, registrar, and registry. In the case where the domain’s DNS service is operated by the registrar, this process can be reduced to a simple internal operation by the registrar. If the functions are separated, this is not possible. This report is therefore focused on when the domain’s DNS service is not operated by the registrar, but by a third-party DNS operator. In such a scenario, current practice holds the registrant responsible for coordinating DS maintenance. The registrant (or someone appointed by them) needs to first obtain DNSSEC public key parameters from the DNS operator, and convey these parameters to the registrar (potentially via a reseller). The registrar will then need to relay these DNSSEC public key parameters to the registry, who will use them to create and publish the DS record in the parent zone. This process often involves idiosyncratic interfaces for each combination of DNS operator and registrar, requiring a level of engagement and time investment, awareness, and understanding that often do not match with what the registrant knows or expects. The complexity of the process further introduces opportunity for error. This can be alleviated by employing automation for the data exchanges required for DS maintenance so that, when the domain’s DNS service is operated by a third party, registries or registrars can, without human involvement, obtain all information needed for keeping DS records up to date. Various approaches to achieve this are possible, such as a scheme where the registry or registrar actively contacts the Child DNS operator, or vice versa. The different approaches come with different challenges with respect to authentication, timing, and efficiency. The IETF has standardized specifications around the first approach, where the parent pulls information from the Child DNS operator, and operational experience has been gained over recent years. However, some standardization gaps remain (such as to improve efficiency and error handling). In addition, the industry could benefit from further development of best practices in deploying the technology. The SSAC believes that automated DS maintenance should be a goal for the domain name industry. To make this a reality, the SSAC makes several recommendations with the goal to spur industry players and ICANN towards an industry best practice for DNSSEC DS automation.
    Slow concept drift is a ubiquitous, yet under-studied problem in practical machine learning systems. Although recent data is more indicative of future data in these settings, naively prioritizing these instances runs the risk of losing valuable information from the past. We propose an optimization-driven approach towards balancing instance importance over large training windows. First, we model instance relevance using a mixture of multiple timescales of decay, allowing us to capture rich temporal trends. Second, we learn an auxiliary scorer model that recovers the appropriate mixture of timescales as a function of the instance itself. Finally, we propose a nested optimization objective for learning the scorer, by which it maximizes forward transfer for the learned model. Experiments on a large real-world dataset of 39M photos over a 9-year period show up to 15% relative gains in accuracy compared to other robust learning baselines. We replicate our gains on two collections of real-world datasets for non-stationary learning, and extend our work to continual learning settings where, too, we beat SOTA methods by large margins.
    Hardware-Assisted Fault Isolation: Going Beyond the Limits of Software-Based Sandboxing
    Anjo Vahldiek-Oberwagner
    Tal Garfinkel
    Deian Stefan
    Michael LeMay
    Evan Johnson
    Mohammadkazem Taram
    Chris Fallin
    Ravi Sahita
    Joey Rudek
    Shravan Narayan
    Dean Tullsen
    IEEE Micro (2024)
    Hardware-assisted Fault Isolation (HFI) is a minimal extension to current processors that supports secure, flexible, and efficient in-process isolation. HFI addresses the limitations of software-based isolation (SFI) systems, including runtime overheads, limited scalability, vulnerability to Spectre attacks, and limited compatibility with existing code. HFI can be seamlessly integrated into existing SFI systems (e.g., WebAssembly), or directly sandbox unmodified native binaries. To ease adoption, HFI proposes incremental changes to existing high-performance processors.
    The Case for Validating Inputs in Software-Defined WANs
    Rishabh Iyer
    Isaac Keslassy
    Sylvia Ratnasamy
    The 23rd ACM Workshop on Hot Topics in Networks (HOTNETS ’24), ACM, Irvine, CA (2024) (to appear)
    We highlight a problem that the networking community has largely overlooked: ensuring that the inputs to network controllers in software-defined WANs are accurate. We show that “incorrect” inputs are a common cause of major outages in practice and propose new directions to address these.
    In this paper we study users' opinions about the privacy of their mobile health apps. We look at what they write in app reviews in the 'Health & Fitness' category on the Google Play store. We identified 2832 apps in this category (based on 1K minimum installs). Using NLP/LLM analyses, we find that 76% of these apps have at least some privacy reviews. In total this yields over 164,000 reviews about privacy, from over 150 countries and in 25 languages. Our analysis identifies top themes and offers an approximation of how widespread these issues are around the world. We show that the top 4 themes (Data Sharing and Exposure, Permission Requests, Location Tracking, and Data Collection) are issues of concern in over 70 countries. Our automatically generated thematic summaries reveal interesting aspects that deserve further research around user suspicions (unneeded data collection), user requests (more fine-grained control over data collection and data access), as well as user behavior (uninstalling apps).
    This is the seventh installment of the Developer Productivity for Humans column. This installment focuses on software quality: what it means, how developers see it, how we break it down into 4 types of quality, and the impact these have on each other.
    Large Language Models have been able to replicate their success from text generation to coding tasks. While a lot of work has made it clear that they have remarkable performance on tasks such as code completion and editing, it is still unclear why. We help bridge this gap by exploring the degree to which auto-regressive models understand the logical constructs of the underlying programs. We propose CAPP, a counterfactual testing framework to evaluate whether large code models understand programming concepts. With only black-box access to the model, we use CAPP to evaluate 10 popular large code models for 5 different programming concepts. Our findings suggest that current models lack understanding of concepts such as data flow and control flow.
    FieldSwap: Data Augmentation for Effective Form-Like Document Extraction
    Seth Ebner
    IEEE 40th International Conference on Data Engineering (ICDE) (2024), pp. 4722-4732
    Extracting structured data from visually rich documents like invoices, receipts, financial statements, and tax forms is key to automating many business workflows. However, building extraction models in this domain often demands a large collection of high-quality training examples. To address this challenge, we introduce FieldSwap, a novel data augmentation technique specifically designed for such extraction problems. FieldSwap generates synthetic training examples by replacing key phrases indicative of one field with those corresponding to another. Our experiments on five diverse datasets demonstrate that incorporating FieldSwap-augmented data into the training process can enhance model performance by 1-11 F1 points, particularly when dealing with limited training data (10-100 documents). Additionally, we propose algorithms for automatically inferring key phrases from the training data. Our findings indicate that FieldSwap is effective regardless of whether key phrases are manually provided by human experts or inferred automatically.
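The key-phrase substitution described in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Example` type, the `field_swap` function, the flat-string document representation, and the sample key phrases are hypothetical, and the paper's actual procedure (which works on visually rich documents) is not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str   # hypothetical flattened snippet: key phrase followed by a value
    field: str  # label of the field the value belongs to


def field_swap(example: Example, key_phrases: dict[str, list[str]]) -> list[Example]:
    """Create synthetic examples by swapping the source field's key phrase
    for each key phrase of every other field, relabeling accordingly."""
    synthetic = []
    for source_phrase in key_phrases.get(example.field, []):
        if source_phrase not in example.text:
            continue  # this key phrase does not occur in the snippet
        for target_field, target_phrases in key_phrases.items():
            if target_field == example.field:
                continue  # only swap to a different field
            for target_phrase in target_phrases:
                # Replace the first occurrence only, leaving the value intact.
                swapped = example.text.replace(source_phrase, target_phrase, 1)
                synthetic.append(Example(swapped, target_field))
    return synthetic
```

For instance, with the hypothetical key phrases `{"due_date": ["Due Date"], "invoice_date": ["Invoice Date", "Issued"]}`, the snippet `"Due Date: 2024-01-31"` labeled `due_date` yields two synthetic `invoice_date` examples, one per target key phrase. A sketch like this ignores whether the swapped value is plausible for the target field, which a real pipeline would likely need to consider.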