Meredith Ringel Morris


Meredith Ringel Morris is Director and Principal Scientist for Human-AI Interaction in Google DeepMind (formerly in Google Brain), conducting foundational research on Human-AI interaction and Human-Centered AI. Previously, she was Director of People + AI Research in Google Research's Responsible AI organization. She is also an Affiliate Professor at the University of Washington in The Paul G. Allen School of Computer Science & Engineering and in The Information School. Prior to joining Google Research, Dr. Morris was Research Area Manager for Interaction, Accessibility, and Mixed Reality at Microsoft Research, where she founded Microsoft’s Ability research group. Dr. Morris is an ACM Fellow and a member of the ACM SIGCHI Academy. Dr. Morris earned her Sc.B. in Computer Science from Brown University and her M.S. and Ph.D. in Computer Science from Stanford University.
Authored Publications
    As AI systems quickly improve in both breadth and depth of performance, they lend themselves to creating increasingly powerful and realistic agents, including the possibility of agents modeled on specific people. We anticipate that within our lifetimes it may become common practice for people to create a custom AI agent to interact with loved ones and/or the broader world after death. We call these generative ghosts, since such agents will be capable of generating novel content rather than merely parroting content produced by their creator while living. In this paper, we first discuss the design space of potential implementations of generative ghosts. We then discuss the practical and ethical implications of generative ghosts, including potential positive and negative impacts on individuals and society. Based on these considerations, we lay out a research agenda for the AI and HCI research communities to empower people to create and interact with AI afterlives in a safe and beneficial manner.
    AI-generated images are proliferating as a new visual medium. However, state-of-the-art image generation models do not output alternative (alt) text with their images, rendering them largely inaccessible to screen reader users (SRUs). Moreover, less is known about what information would be most desirable to SRUs in this new medium. To address this, we invited AI image creators and SRUs to evaluate alt text prepared from various sources and write their own alt text for AI images. Our mixed-methods analysis makes three contributions. First, we highlight creators’ perspectives on alt text, as creators are well-positioned to write descriptions of their images. Second, we illustrate SRUs’ alt text needs particular to the emerging medium of AI images. Finally, we discuss the promises and pitfalls of utilizing text prompts written as input for AI models in alt text generation, and areas where broader digital accessibility guidelines could expand to account for AI images.
    We propose a framework for classifying the capabilities and behavior of Artificial General Intelligence (AGI) models and their precursors. This framework introduces levels of AGI performance, generality, and autonomy. It is our hope that this framework will be useful in an analogous way to the levels of autonomous driving, by providing a common language to compare models, assess risks, and measure progress along the path to AGI. To develop our framework, we analyze existing definitions of AGI, and distill six principles that a useful ontology for AGI should satisfy. These principles include focusing on capabilities rather than mechanisms; separately evaluating generality and performance; and defining stages along the path toward AGI, rather than focusing on the endpoint. With these principles in mind, we propose “Levels of AGI” based on depth (performance) and breadth (generality) of capabilities, and reflect on how current systems fit into this ontology. We discuss the challenging requirements for future benchmarks that quantify the behavior and capabilities of AGI models against these levels. Finally, we discuss how these levels of AGI interact with deployment considerations such as autonomy and risk, and emphasize the importance of carefully selecting Human-AI Interaction paradigms for responsible and safe deployment of highly capable AI systems.
    Generative AI models, including large language models and multimodal models that include text and other media, are on the cusp of transforming many aspects of modern life, including entertainment, education, civic life, the arts, and a range of professions. There is potential for Generative AI to have a substantive impact on the methods and pace of discovery for a range of scientific disciplines. We interviewed twenty scientists from a range of fields (including the physical, life, and social sciences) to gain insight into whether or how Generative AI technologies might add value to the practice of their respective disciplines, including not only ways in which AI might accelerate scientific discovery (i.e., research), but also other aspects of their profession, including the education of future scholars and the communication of scientific findings. In addition to identifying opportunities for Generative AI to augment scientists’ current practices, we also asked participants to reflect on concerns about AI. These findings can help guide the responsible development of models and interfaces for scientific education, inquiry, and communication.
    Presentation slides commonly use visual patterns for structural navigation, such as titles, dividers, and build slides. However, screen readers do not capture such intention, making it time-consuming and less accessible for blind and visually impaired (BVI) users to linearly consume slides with repeated content. We present Slide Gestalt, an automatic approach that identifies the hierarchical structure in a slide deck. Slide Gestalt computes the visual and textual correspondences between slides to generate hierarchical groupings. Readers can navigate the slide deck from the higher-level section overview to the lower-level description of a slide group or individual elements interactively with our UI. We derived slide consumption and authoring practices from interviews with BVI readers and sighted creators and an analysis of 100 decks. We ran our pipeline on 50 real-world slide decks and a large dataset. Feedback from eight BVI participants showed that Slide Gestalt helped navigate a slide deck by anchoring content more efficiently, compared to using accessible slides.
    SpeakFaster Observer: Long-Term Instrumentation of Eye-Gaze Typing for Measuring AAC Communication
    Richard Jonathan Noel Cave
    Bob MacDonald
    Jon Campbell
    Blair Casey
    Emily Kornman
    Daniel Vance
    Jay Beavers
    CHI23 Case Studies of HCI in Practice (2023) (to appear)
    Accelerating communication for users with severe motor and speech impairments, in particular for eye-gaze Augmentative and Alternative Communication (AAC) device users, is a long-standing area of research. However, observation of such users' communication over extended durations has been limited. This case study presents the real-world experience of developing and field-testing a tool for observing and curating the gaze typing-based communication of a consented eye-gaze AAC user with amyotrophic lateral sclerosis (ALS) from the perspective of researchers at the intersection of HCI and artificial intelligence (AI). With the intent to observe and accelerate eye-gaze typed communication, we designed a tool and a protocol called the SpeakFaster Observer to measure everyday conversational text entry by the consenting gaze-typing user, as well as several consenting conversation partners of the AAC user. We detail the design of the Observer software and data curation protocol, along with considerations for privacy protection. The deployment of the data protocol from November 2021 to April 2022 yielded a rich dataset of gaze-based AAC text entry in everyday context, consisting of 130+ hours of gaze keypresses and 5.5k+ curated speech utterances from the AAC user and the conversation partners. We present the key statistics of the data, including the speed (8.1±3.9 words per minute) and keypress saving rate (-0.18±0.87) of gaze typing, patterns of utterance repetition and reuse, as well as the temporal dynamics of conversation turn-taking in gaze-based communication. We share our findings and also open-source our data collection tools for furthering research in this domain.
    Characterizing Image Accessibility on Wikipedia across Languages
    Elisa Kreiss
    Tiziano Piccardi
    Jesus Adolfo Hermosillo
    Michael S. Bernstein
    Christopher Potts
    Wiki Workshop 2023 (to appear)
    We make a first attempt to characterize image accessibility on Wikipedia across languages, present new experimental results that can inform efforts to assess description quality, and offer some strategies to improve Wikipedia's image accessibility.
    Towards Semantically-Aware UI Design Tools: Design, Implementation, and Evaluation of Semantic Grouping Guidelines
    Peitong Duan
    Bjoern Hartmann
    Karina Nguyen
    Marti Hearst
    ICML 2023 Workshop on Artificial Intelligence and Human-Computer Interaction (2023)
    A coherent semantic structure, where semantically-related elements are appropriately grouped, is critical for proper understanding of a UI. Ideally, UI design tools should help designers establish coherent semantic grouping. To work towards this, we contribute five semantic grouping guidelines that capture how human designers think about semantic grouping and are amenable to implementation in design tools. They were obtained from empirical observations on existing UIs, a literature review, and iterative refinement with UI experts’ feedback. We validated our guidelines through an expert review and heuristic evaluation; results indicate these guidelines capture valuable information about semantic structure. We demonstrate the guidelines’ use for building systems by implementing a set of computational metrics. These metrics detected many of the same severe issues that human design experts marked in a comparative study. Running our metrics on a larger UI dataset suggests many real UIs exhibit grouping violations.
    AI for Accessibility: An Agenda for the Global South
    Vaishnav Kameswaran
    Jerry Young
    Nithya Sambasivan
    Gaurav Aggarwal
    ASSETS 2023 A11yFutures Workshop, ACM (2023)
    AI technologies have the potential to improve the quality of life for marginalized populations, including people with disabilities. However, a majority of these AI solutions are designed for people in the Global North and have, so far, marginalized the needs of people with disabilities in the Global South. Yet, the increased proliferation of AI across the world suggests that this trend will change. This prompts the question: What are key considerations for the design of AI solutions that center the needs of people with disabilities in the Global South: contexts often marked by poverty, limited resource availability, lack of accessible support structures and indifferent societal attitudes towards people with disabilities? In this position paper, we begin to answer this question. To do so, we draw upon a case study of designing a novel AI solution to support the indoor navigation practices of people with visual impairments. We provide guidance to HCI, AI, and Accessibility researchers and practitioners to aid in their quest to design more inclusive AI technologies.
    Generative Agents: Interactive Simulacra of Human Behavior
    Joon Sung Park
    Joseph C. O'Brien
    Percy Liang
    Michael Bernstein
    Proceedings of UIST 2023, ACM (2023)
    Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty-five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time. We demonstrate through ablation that the components of our agent architecture--observation, planning, and reflection--each contribute critically to the believability of agent behavior. By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior.
    Saying more while typing less is the ideal we strive towards when designing assistive writing technology that can minimize effort. Complementary to efforts on predictive completions is the idea to use a drastically abbreviated version of an intended message, which can then be reconstructed using Language Models. This paper highlights the challenges that arise from investigating what makes an abbreviation scheme promising for a potential application. We hope that this can provide a guide for designing studies which consequently allow for fundamental insights on efficient and goal-driven abbreviation strategies.
    As modern, pre-trained ML models have proliferated in recent years, many researchers and practitioners have made significant efforts to prevent AI systems from causing harm. This focus on safety is critical, but a singular focus on safety can come at the exclusion of considering other important stakeholder values and the interactions between those values in the AI systems we build. In this position paper, we propose that the AI community should incorporate ideas from the Value-Sensitive Design framework from the Human-Computer Interaction community to ensure the needs and values of all stakeholders are reflected in the systems we build. We share observations and reflections from our experiences working on AI-supported accessibility technologies and with members of various disability communities to illustrate the tensions that sometimes arise between safety and other values.
    Context-Aware Abbreviation Expansion Using Large Language Models
    Ajit Narayanan
    Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2022 (2022) (to appear)
    Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. Our approach is to expand the abbreviations into full-phrase options by leveraging conversation context with the power of pretrained large language models (LLMs). Through zero-shot, few-shot, and fine-tuning experiments on four public conversation datasets, we show that for replies to the initial turn of a dialog, an LLM with 64B parameters is able to exactly expand over 70% of phrases with abbreviation length up to 10, leading to an effective keystroke saving rate of up to about 77% on these exact expansions. Including a small amount of context in the form of a single conversation turn more than doubles abbreviation expansion accuracies compared to having no context, an effect that is more pronounced for longer phrases. Additionally, the robustness of models against typo noise can be enhanced through fine-tuning on noisy data.
    Context Matters for Image Description Evaluation: Challenges for Referenceless Metrics
    Elisa Kreiss
    Shayan Hooshmand
    Eric Zelikman
    Christopher Potts
    EMNLP 2022 (2022) (to appear)
    Few images on the Web receive alt-text descriptions that would make them accessible to blind and low vision (BLV) users. Image-based NLG systems have progressed to the point where they can begin to address this persistent societal problem, but these systems will not be fully successful unless we evaluate them on metrics that guide their development correctly. Here, we argue against current referenceless metrics -- those that don't rely on human-generated ground-truth descriptions -- on the grounds that they do not align with the needs of BLV users. The fundamental shortcoming of these metrics is that they cannot take context into account, whereas contextual information is highly valued by BLV users. To substantiate these claims, we present a study with BLV participants who rated descriptions along a variety of dimensions. An in-depth analysis reveals that the lack of context-awareness makes current referenceless metrics inadequate for advancing image accessibility, requiring a rethinking of referenceless evaluation metrics for image-based NLG systems.
    Social Simulacra: Creating Populated Prototypes for Social Computing Systems
    Joon Sung Park
    Lindsay Popowski
    Percy Liang
    Michael S. Bernstein
    Proceedings of UIST 2022, ACM (2022) (to appear)
    Prototyping techniques for social computing systems often recruit small groups to test a design, but many challenges that threaten the norms and moderation standards do not arise until a design achieves a larger scale. Can a designer understand how a social system might behave when later populated, and make adjustments before the system falls prey to such challenges? We introduce social simulacra, a technique enabling early prototyping of social computing systems by generating a breadth of possible social interactions that may emerge when the system is populated. Our implementation of social simulacra translates the designer’s description of a community’s goal, rules, and member personas into a set of posts, replies, and anti-social behaviors; shifts these behaviors appropriately in response to design changes; and enables exploration of "what if?" scenarios where community members or moderators intervene. We contribute techniques for prompting a large language model to generate such social interactions, drawing on the observation that large language models have consumed a wide variety of these behaviors on the public web. In evaluations, we show that participants were often unable to distinguish social simulacra from actual community behavior, and that social computing designers could use them to iterate on their designs.
    LaMPost: Evaluation of an AI-assisted Writing Email Editor Prototype for Adults with Dyslexia
    Steven Goodman
    Erin Buehler
    Patrick Clary
    Andy Coenen
    Aaron Michael Donsbach
    Tiffanie Horne
    Bob MacDonald
    Rain Breaw Michaels
    Ajit Narayanan
    Joel Christopher Riley
    Alex Santana
    Rachel Sweeney
    Phil Weaver
    Ann Yuan
    Proceedings of ASSETS 2022, ACM (2022) (to appear)
    Prior work has explored the writing challenges experienced by people with dyslexia, and the potential for new spelling, grammar, and word retrieval technologies to address these challenges. However, the capabilities for natural language generation demonstrated by the latest class of large language models (LLMs) highlight an opportunity to explore new forms of human-AI writing support tools. In this paper, we introduce LaMPost, a prototype email-writing interface that explores the potential for LLMs to power writing support tools that address the varied needs of people with dyslexia. LaMPost draws from our understanding of these needs and introduces novel AI-powered features for email-writing, including outlining main ideas, generating a subject line, suggesting changes, and rewriting a selection. We evaluated LaMPost with 19 adults with dyslexia, identifying many promising routes for further exploration (including the popularity of the “rewrite” and “subject line” features), but also finding that the current generation of LLMs may not surpass the accuracy and quality thresholds required to meet the needs of writers with dyslexia. Surprisingly, we found that participants’ awareness of the AI had no effect on their perception of the system, nor on their feelings of autonomy, expression, and self-efficacy when writing emails. Our findings yield further insight into the benefits and drawbacks of using LLMs as writing support for adults with dyslexia and provide a foundation to build upon in future research.
    The Design Space of Generative Models
    Jess Scon Holbrook
    Chinmay Kulkarni
    NeurIPS 2022 Human-Centered AI Workshop (2022) (to appear)
    Card et al.’s classic paper "The Design Space of Input Devices" established the value of design spaces as a tool for HCI analysis and invention. We posit that developing design spaces for emerging pre-trained, general AI models is necessary for supporting their integration into human-centered systems and practices. We explore what it means to develop an AI model design space by proposing two design spaces relating to pre-trained AI models: the first considers how HCI can impact pre-trained models (i.e., interfaces for models) and the second considers how pre-trained models can impact HCI (i.e., models as an HCI prototyping material).
    LaMDA: Language Models for Dialog Applications
    Aaron Daniel Cohen
    Alena Butryna
    Alicia Jin
    Apoorv Kulshreshtha
    Ben Zevenbergen
    Chung-ching Chang
    Cosmo Du
    Daniel De Freitas Adiwardana
    Dehao Chen
    Dmitry (Dima) Lepikhin
    Erin Hoffman-John
    Igor Krivokon
    James Qin
    Jamie Hall
    Joe Fenton
    Johnny Soraker
    Maarten Paul Bosma
    Marc Joseph Pickett
    Marcelo Amorim Menegali
    Marian Croak
    Maxim Krikun
    Noam Shazeer
    Rachel Bernstein
    Ravi Rajakumar
    Ray Kurzweil
    Romal Thoppilan
    Steven Zheng
    Taylor Bos
    Toju Duke
    Tulsee Doshi
    Vincent Y. Zhao
    Will Rusch
    Yuanzhong Xu
    arXiv (2022)
    We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model’s responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.