
Kristen Olson
Kristen Olson is a Senior User Experience Researcher for AIUX in Google Research. Her research focuses on Generative AI user interface design and human data labeling task design.
Authored Publications
Prompt-based Prototyping with Large Language Models
Edwin Toh
Ellen Jiang
Aaron Michael Donsbach
ACM CHI case study track (2022)
Prototyping is notoriously difficult to do with machine learning (ML), but recent advances in large language models may lower the barriers to people prototyping with ML, through the use of natural language prompts. This case study reports on the real-world experiences of industry professionals (e.g. designers, program managers, front-end developers) prototyping new ML-powered feature ideas via prompt-based prototyping. Through interviews with eleven practitioners during a three-week sprint and a workshop, we find that prompt-based prototyping reduced barriers of access by substantially broadening who can prototype with ML, sped up the prototyping process, and grounded communication between collaborators. Yet, it also introduced new challenges, such as the need to reverse-engineer prompt designs, source example data, and debug and evaluate prompt effectiveness. Taken together, this case study provides important implications that lay the groundwork toward a new future of prototyping with ML.
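As a concrete illustration of what prompt-based prototyping can look like in practice, here is a minimal Python sketch in which an ML-powered feature idea (a meeting-notes summarizer) is prototyped entirely by assembling a few-shot prompt, with no model training. The example data, template, and call_llm stub are hypothetical stand-ins for whatever completion endpoint a team actually uses, not the tooling from the study.

```python
# Minimal sketch of prompt-based prototyping: a "meeting-notes summarizer"
# feature is prototyped with no model training, only a few-shot prompt.
# FEW_SHOT_EXAMPLES, build_prompt, and call_llm are illustrative stand-ins.

FEW_SHOT_EXAMPLES = [
    ("Notes: Discussed Q3 roadmap; Alice owns the launch checklist.",
     "Summary: Q3 roadmap reviewed. Action item: Alice drafts the checklist."),
    ("Notes: Bug triage; login crash is the top priority this sprint.",
     "Summary: Bug triage held. Action item: fix the login crash this sprint."),
]

def build_prompt(notes: str) -> str:
    """Concatenate the input/output examples, then append the new input."""
    shots = "\n\n".join(f"{inp}\n{out}" for inp, out in FEW_SHOT_EXAMPLES)
    return f"{shots}\n\nNotes: {notes}\nSummary:"

def call_llm(prompt: str) -> str:
    """Placeholder for a real text-completion API call; the canned return
    value lets the sketch run offline."""
    return " Design review held. Action item: ship the onboarding flow."

if __name__ == "__main__":
    prompt = build_prompt("Design review; new onboarding flow approved.")
    print(prompt)            # what a designer iterates on
    print(call_llm(prompt))  # what the prototype would display
```

In this style of prototyping, iterating on the feature means editing FEW_SHOT_EXAMPLES and the template string rather than training or fine-tuning a model, which is what lets non-ML roles participate.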
LaMDA: Language Models for Dialog Applications
James Qin
Noam Shazeer
Chung-ching Chang
Joe Fenton
Maarten Paul Bosma
Marc Joseph Pickett
Erin Hoffman-John
Kathy Meier-Hellstern
Vincent Y. Zhao
Marian Croak
Steven Zheng
Cosmo Du
Ravi Rajakumar
Taylor Bos
Tulsee Doshi
Jamie Hall
Ray Kurzweil
Will Rusch
Igor Krivokon
Marcelo Amorim Menegali
Alena Butryna
Johnny Soraker
Dehao Chen
Aaron Daniel Cohen
Ben Zevenbergen
Alicia Jin
Maxim Krikun
Toju Duke
Daniel De Freitas Adiwardana
Apoorv Kulshreshtha
Rachel Bernstein
Romal Thoppilan
Dmitry (Dima) Lepikhin
Yuanzhong Xu
arXiv (2022)
We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows fewer improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements on the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.
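To make the safety-filtering idea above concrete, the following schematic sketch samples several candidate responses, scores each with a separately trained safety classifier, and discards candidates below a threshold before picking one. Every function body, the 0.9 threshold, and the fallback reply are illustrative assumptions, not LaMDA's actual implementation.

```python
# Schematic sketch of filtering candidate responses with a safety classifier,
# as described in the abstract. sample_candidates, safety_score, the 0.9
# threshold, and the fallback reply are all illustrative stand-ins.

import random
from typing import List, Tuple

def sample_candidates(context: str, n: int = 4) -> List[str]:
    """Stand-in for sampling n responses from the dialog model."""
    return [f"candidate {i} responding to: {context!r}" for i in range(n)]

def safety_score(response: str) -> float:
    """Stand-in for a fine-tuned classifier returning P(response is safe)."""
    return random.random()

def respond(context: str, threshold: float = 0.9) -> str:
    scored: List[Tuple[float, str]] = [
        (safety_score(c), c) for c in sample_candidates(context)
    ]
    safe = [(score, c) for score, c in scored if score >= threshold]
    # If every candidate is filtered out, fall back to a canned safe reply.
    return max(safe)[1] if safe else "I'd rather not weigh in on that."

print(respond("Tell me about your day."))
```

The design point this sketch captures is that safety is enforced at response-selection time by a separate classifier, rather than by retraining the base dialog model.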
Discovering the Syntax and Strategies of Natural Language Programming with Generative Language Models
Edwin Toh
Ellen Jiang
Aaron Michael Donsbach
CHI (2022)
In this paper, we present a natural language code synthesis tool, GenLine, backed by a large generative language model and a set of task-specific prompts. To understand the user experience of natural language code synthesis with these types of models, we conducted a user study in which participants applied GenLine to two programming tasks. Our results indicate that while natural language code synthesis can sometimes provide a magical experience, participants still faced challenges. In particular, participants felt that they needed to learn the model's "syntax," despite their input being natural language. Participants also faced challenges in debugging model input, and demonstrated a wide range of variability in the scope and specificity of their requests. From these findings, we discuss design implications for future natural language code synthesis tools built using generative language models.
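Since GenLine itself is not publicly released, the sketch below only gestures at the task-specific-prompt idea the paper describes: a user's natural language request is wrapped in a fixed prompt template before being sent to a generative model, so the model sees consistent framing across requests. The template wording and the complete() stub are assumptions for illustration, not GenLine's actual prompts.

```python
# Rough sketch of a task-specific prompt for natural language code synthesis:
# the template establishes the "edit this code" task, and the user's request
# is slotted in. EDIT_TEMPLATE and complete() are hypothetical, not GenLine's.

EDIT_TEMPLATE = (
    "You edit web page code. Given code and an instruction, "
    "return only the modified code.\n\n"
    "Code:\n{code}\n\n"
    "Instruction: {instruction}\n\n"
    "Modified code:\n"
)

def complete(prompt: str) -> str:
    """Placeholder for a generative language model call (canned output)."""
    return '<button style="color: red;">Submit</button>'

def synthesize_edit(code: str, instruction: str) -> str:
    """Wrap the user's request in the task template, then query the model."""
    return complete(EDIT_TEMPLATE.format(code=code, instruction=instruction))

print(synthesize_edit("<button>Submit</button>", "make the button text red"))
```

The "syntax" the study's participants had to learn corresponds here to discovering what kinds of instruction strings the wrapped model responds to reliably, even though the tool nominally accepts free-form natural language.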