Shaun Kane
Authored Publications
Using large language models to accelerate communication for eye gaze typing users with ALS
Subhashini Venugopalan
Katie Seaver
Xiang Xiao
Sri Jalasutram
Ajit Narayanan
Bob MacDonald
Emily Kornman
Daniel Vance
Blair Casey
Steve Gleason
(2024)
Preview abstract
Accelerating text input in augmentative and alternative communication (AAC) is a long-standing area of research with bearings on the quality of life in individuals with profound motor impairments. Recent advances in large language models (LLMs) pose opportunities for re-thinking strategies for enhanced text entry in AAC. In this paper, we present SpeakFaster, consisting of an LLM-powered user interface for text entry in a highly-abbreviated form, saving 57% more motor actions than traditional predictive keyboards in offline simulation. A pilot study on a mobile device with 19 non-AAC participants demonstrated motor savings in line with simulation and relatively small changes in typing speed. Lab and field testing on two eye-gaze AAC users with amyotrophic lateral sclerosis demonstrated text-entry rates 29–60% above baselines, due to significant saving of expensive keystrokes based on LLM predictions. These findings form a foundation for further exploration of LLM-assisted text entry in AAC and other user interfaces.
From Provenance to Aberrations: Image Creator and Screen Reader User Perspectives on Alt Text for AI-Generated Images
Maitraye Das
Alexander J. Fiannaca
CHI Conference on Human Factors in Computing Systems (2024)
Preview abstract
AI-generated images are proliferating as a new visual medium. However, state-of-the-art image generation models do not output alternative (alt) text with their images, rendering them largely inaccessible to screen reader users (SRUs). Moreover, less is known about what information would be most desirable to SRUs in this new medium. To address this, we invited AI image creators and SRUs to evaluate alt text prepared from various sources and write their own alt text for AI images. Our mixed-methods analysis makes three contributions. First, we highlight creators’ perspectives on alt text, as creators are well-positioned to write descriptions of their images. Second, we illustrate SRUs’ alt text needs particular to the emerging medium of AI images. Finally, we discuss the promises and pitfalls of utilizing text prompts written as input for AI models in alt text generation, and areas where broader digital accessibility guidelines could expand to account for AI images.
Preview abstract
Generative AI (GAI) is proliferating, and among its many applications are to support creative work (e.g., generating text, images, music) and to enhance accessibility (e.g., captions of images and audio). As GAI evolves, creatives must consider how (or how not) to incorporate these tools into their practices. In this paper, we present interviews at the intersection of these applications. We learned from 10 creatives with disabilities who intentionally use and do not use GAI in and around their creative work. Their mediums ranged from audio engineering to leatherwork, and they collectively experienced a variety of disabilities, from sensory to motor to invisible disabilities. We share cross-cutting themes of their access hacks, how creative practice and access work become entangled, and their perspectives on how GAI should and should not fit into their workflows. In turn, we offer qualities of accessible creativity with responsible AI that can inform future research.
“They only care to show us the wheelchair”: disability representation in text-to-image AI models
Avery Mack
Rida Qadri
CHI Conference on Human-Computer Interaction (2024)
Preview abstract
This paper reports on disability representation in images output from text-to-image (T2I) generative AI systems. Through eight focus groups with 25 people with disabilities, we found that models repeatedly presented reductive archetypes for different disabilities. Often these representations reflected broader societal stereotypes and biases, which our participants were concerned to see reproduced through T2I. Our participants discussed further challenges with using these models including the current reliance on prompt engineering to reach satisfactorily diverse results. Finally, they offered suggestions for how to improve disability representation with solutions like showing multiple, heterogeneous images for a single prompt and including the prompt with images generated. Our discussion reflects on tensions and tradeoffs we found among the diverse perspectives shared to inform future research on representation-oriented generative AI system evaluation metrics and development processes.
Preview abstract
Accessibility solutions often focus on the experiences of people with more severe disabilities, such as those who are unable to perform certain tasks unassisted. However, disability exists on a spectrum, and people with more moderate disabilities may not be included in research, or may not be considered disabled within research. In this study, we interviewed 12 adults with mild-to-moderate dexterity impairments about their experiences using smartphones and other mobile devices. Our participants did experience accessibility challenges but sometimes struggled to know where to find help for their problems, in part because of discomfort with traditional labels of disability and accessibility. We suggest that individuals with mild to moderate dexterity challenges may benefit from further consideration from the accessibility community and accessibility features that support their needs.
"I wouldn’t say offensive but...": Disability-Centered Perspectives on Large Language Models
Vinitha Gadiraju
Alex Taylor
Robin Brewer
Proceedings of FAccT 2023 (2023) (to appear)
Preview abstract
Large language models (LLMs) trained on real-world data can inadvertently reflect harmful societal biases, particularly toward historically marginalized communities. While previous work has primarily focused on harms related to age and race, emerging research has shown that biases toward disabled communities exist. This study extends prior work exploring the existence of harms by identifying categories of LLM-perpetuated harms toward the disability community. We conducted 19 focus groups, during which 56 participants with disabilities probed a dialog model about disability and discussed and annotated its responses. Participants rarely characterized model outputs as blatantly offensive or toxic. Instead, participants used nuanced language to detail how the dialog model mirrored subtle yet harmful stereotypes they encountered in their lives and dominant media, e.g., inspiration porn and able-bodied saviors. Participants often implicated training data as a cause for these stereotypes and recommended training the model on diverse identities from disability-positive resources. Our discussion further explores representative data strategies to mitigate harm related to different communities through annotation co-design with ML researchers and developers.
Practical Challenges for Investigating Abbreviation Strategies
Elisa Kreiss
CHI 2023 Workshop on Assistive Writing, ACM (2023) (to appear)
Preview abstract
Saying more while typing less is the ideal we strive towards when designing assistive writing technology that can minimize effort. Complementary to efforts on predictive completions is the idea of using a drastically abbreviated version of an intended message, which can then be reconstructed using language models. This paper highlights the challenges that arise from investigating what makes an abbreviation scheme promising for a potential application. We hope that this can provide a guide for designing studies that in turn allow for fundamental insights on efficient and goal-driven abbreviation strategies.
“The less I type, the better”: How AI Language Models can Enhance or Impede Communication for AAC Users
Stephanie Valencia
Richard Cave
Krystal Kallarackal
Katie Seaver
ACM Conference on Human Factors in Computing Systems (ACM CHI) 2023, ACM (2023) (to appear)
Preview abstract
Users of augmentative and alternative communication (AAC) devices sometimes find it difficult to communicate in real time with others due to the time it takes to compose messages. AI technologies such as large language models (LLMs) provide an opportunity to support AAC users by improving the quality and variety of text suggestions. However, these technologies may fundamentally change how users interact with AAC devices as users transition from typing their own phrases to prompting and selecting AI-generated phrases. We conducted a study in which 12 AAC users tested live suggestions from a language model across three usage scenarios: extending short replies, answering biographical questions, and requesting assistance. Our study participants believed that AI-generated phrases could save time, physical and cognitive effort when communicating, but felt it was important that these phrases reflect their own communication style and preferences. This work identifies opportunities and challenges for future AI-enhanced AAC devices.
Designing Responsible AI: Adaptations of UX Practice to Meet Responsible AI Challenges
Qiaosi Wang
Michael Adam Madaio
Shivani Kapania
Lauren Wilcox
ACM Conference on Human Factors in Computing Systems (ACM CHI) 2023, ACM (2023)
Preview abstract
The shift towards Responsible AI (RAI) in the tech industry necessitates new practices and adaptations to roles. To understand practices at the intersection of user experience (UX) and RAI, we conducted an interview study with industrial UX practitioners and RAI subject matter experts, both of whom are actively involved in addressing RAI concerns, both early in and throughout the development of new AI-based prototypes, demos, and products. Many of the specific practices and their associated challenges have yet to be surfaced, and distilling them offers a critical view into how practitioners' roles are adapting to meet present-day RAI challenges. We present and discuss three emerging practices in which RAI is being enacted and reified in UX work. We conclude by arguing that the emerging practices, goals, and types of expertise that surfaced in our study point to an evolution in praxis that suggests important areas for further research in HCI.
SpeakFaster Observer: Long-Term Instrumentation of Eye-Gaze Typing for Measuring AAC Communication
Richard Jonathan Noel Cave
Bob MacDonald
Jon Campbell
Blair Casey
Emily Kornman
Daniel Vance
Jay Beavers
CHI23 Case Studies of HCI in Practice (2023) (to appear)
Preview abstract
Accelerating communication for users with severe motor and speech impairments, in particular for eye-gaze Augmentative and Alternative Communication (AAC) device users, is a long-standing area of research. However, observation of such users' communication over extended durations has been limited. This case study presents the real-world experience of developing and field-testing a tool for observing and curating the gaze typing-based communication of a consented eye-gaze AAC user with amyotrophic lateral sclerosis (ALS) from the perspective of researchers at the intersection of HCI and artificial intelligence (AI). With the intent to observe and accelerate eye-gaze typed communication, we designed a tool and a protocol called the SpeakFaster Observer to measure everyday conversational text entry by the consenting gaze-typing user, as well as several consenting conversation partners of the AAC user. We detail the design of the Observer software and data curation protocol, along with considerations for privacy protection. The deployment of the data protocol from November 2021 to April 2022 yielded a rich dataset of gaze-based AAC text entry in everyday context, consisting of 130+ hours of gaze keypresses and 5.5k+ curated speech utterances from the AAC user and the conversation partners. We present the key statistics of the data, including the speed (8.1±3.9 words per minute) and keypress saving rate (-0.18±0.87) of gaze typing, patterns of utterance repetition and reuse, as well as the temporal dynamics of conversation turn-taking in gaze-based communication. We share our findings and also open-source our data collection tools for furthering research in this domain.