Minsuk Chang

Minsuk Chang is a research scientist at Google DeepMind. He is interested in our and other agents' (in)ability to acquire new skills and knowledge through interaction. He builds and studies the dynamics of learning processes, seeking to understand how agents effectively gather information, adapt to new situations, and expand their repertoire of behaviors.
Authored Publications
    LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
    Michael Xieyang Liu
    Krystal Kallarackal
    Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA '24), ACM (2024)
    Automatic side-by-side evaluation has emerged as a promising approach to evaluating the quality of responses from large language models (LLMs). However, analyzing the results from this evaluation approach raises scalability and interpretability challenges. In this paper, we present LLM Comparator, a novel visual analytics tool for interactively analyzing results from automatic side-by-side evaluation. The tool supports interactive workflows for users to understand when and why a model performs better or worse than a baseline model, and how the responses from two models are qualitatively different. We iteratively designed and developed the tool by closely working with researchers and engineers at Google. This paper details the user challenges we identified, the design and development of the tool, and an observational study with participants who regularly evaluate their models.
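    Below is a minimal sketch of the kind of pairwise-rating aggregation that automatic side-by-side evaluation produces and that a tool like LLM Comparator helps analyze. The record layout and category names are invented for illustration and are not LLM Comparator's actual data model.

```python
# Toy aggregation of side-by-side ratings; fields and categories are
# hypothetical, not LLM Comparator's actual data model.
from collections import defaultdict

# Each record: (prompt_category, verdict), where verdict is "A" if
# model A's response was rated better and "B" if model B's was.
ratings = [
    ("summarization", "A"), ("summarization", "B"),
    ("coding", "A"), ("coding", "A"), ("coding", "B"),
]

wins = defaultdict(lambda: {"A": 0, "B": 0})
for category, verdict in ratings:
    wins[category][verdict] += 1

for category, counts in wins.items():
    total = counts["A"] + counts["B"]
    print(f"{category}: model A preferred in {counts['A'] / total:.0%} of {total} comparisons")
```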
    The Prompt Artists
    Stefania Druga
    Alex Fiannaca
    Pedro Vergani
    Chinmay Kulkarni
    Creativity and Cognition 2023 (2023)
    In this paper, we present the results of a study examining the art practices, artwork, and motivations of prolific users of the latest generation of text-to-image models. Through interviews, observations, and a survey, we present a sampling of the artistic styles, and describe the developed community of practice. We find that: 1) the text prompt and resulting image collectively can be considered the art piece (prompts as art), and 2) prompt templates (prompts with “slots” for others to fill in with their own words) are developed to create generative art pieces. We also find that this community’s premium on unique outputs leads to artists seeking specialized vocabulary to produce distinctive art pieces (e.g., by going to architectural blogs), while others look for “glitches” in the model that can turn into artistic styles in their own right. From these findings, we outline specific implications for design.
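    To make the idea of prompt templates concrete, here is a small illustrative example of a prompt with “slots” for others to fill in with their own words; the template text is invented for this sketch and not drawn from the study.

```python
# Illustrative prompt template with "slots"; the template text is invented.
from string import Template

template = Template("a $subject in the style of $style, $lighting lighting")

# Another community member fills the slots with their own words:
prompt = template.substitute(subject="lighthouse", style="Art Deco", lighting="golden hour")
print(prompt)  # -> a lighthouse in the style of Art Deco, golden hour lighting
```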
    Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
    Jesse Zhang
    Jiahui Zhang
    Karl Pertsch
    Ziyi Liu
    Xiang Ren
    Shao-Hua Sun
    Joseph Lim
    Conference on Robot Learning 2023 (2023)
    We propose BOSS, an approach that automatically learns to solve new long-horizon, complex, and meaningful tasks by autonomously growing a learned skill library. Prior work in reinforcement learning requires expert supervision, in the form of demonstrations or rich reward functions, to learn long-horizon tasks. Instead, our approach BOSS (BOotStrapping your own Skills) learns to accomplish new tasks by performing “skill bootstrapping,” where an agent with a set of primitive skills interacts with the environment to practice new skills without receiving reward feedback for tasks outside of the initial skill set. This bootstrapping phase is guided by large language models (LLMs) that inform the agent of meaningful skills to chain together. Through this process, BOSS builds a wide range of complex and useful behaviors from a basic set of primitive skills. We demonstrate through experiments in realistic household environments that agents trained with our LLM-guided bootstrapping procedure outperform those trained with naive bootstrapping as well as prior unsupervised skill acquisition methods on zero-shot execution of unseen, long-horizon tasks in new environments.
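    The following toy sketch illustrates the bootstrapping loop described in the abstract: an LLM stand-in proposes skills to chain, and successfully executed chains are added back to the library as new, longer-horizon skills. Every function here is a hypothetical placeholder, not the paper's implementation.

```python
# Schematic skill-bootstrapping loop; all helpers are toy placeholders.
import random

def llm_propose_next_skill(skill_library, executed_so_far):
    """Stand-in for the LLM that suggests a meaningful skill to chain next."""
    return random.choice(sorted(skill_library))

def execute_skill(skill):
    """Stand-in for rolling out a skill policy in the environment."""
    return random.random() < 0.7  # pretend execution succeeds 70% of the time

skill_library = {"open drawer", "pick up mug", "place mug in drawer"}

for episode in range(10):
    chain = []
    for _ in range(3):
        skill = llm_propose_next_skill(skill_library, chain)
        if not execute_skill(skill):
            break  # practice attempt failed; start a new episode
        chain.append(skill)
    if len(chain) > 1:
        # A successfully executed chain becomes a new, longer skill.
        skill_library.add(" then ".join(chain))

print(f"skill library grew to {len(skill_library)} entries")
```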
    CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents
    Jeongeun Park
    Seungwon Lim
    Joonhyung Lee
    Sangbeom Park
    Sungjoon Choi
    Youngjae Yu
    IEEE Robotics and Automation Letters (2023), to appear
    In this paper, we focus on inferring whether a given user command is clear, ambiguous, or infeasible in the context of interactive robotic agents utilizing large language models (LLMs). To tackle this problem, we first present an uncertainty estimation method for LLMs to classify whether the command is certain (i.e., clear) or not (i.e., ambiguous or infeasible). Once a command is classified as uncertain, we further distinguish between ambiguous and infeasible commands by leveraging LLMs with situationally aware few-shot prompting in a zero-shot manner. For ambiguous commands, we disambiguate the command by interacting with users via question generation with LLMs. We believe that proper recognition of the given commands can reduce malfunctions and undesired actions of the robot, enhancing the reliability of interactive robotic agents. To evaluate the proposed system, we present a dataset consisting of pairs of high-level commands, scene descriptions, and labels of command type (i.e., clear, ambiguous, or infeasible). We validate the proposed method on the collected dataset in a pick-and-place tabletop simulation. Furthermore, we demonstrate the approach in a real-world human-robot interaction environment, i.e., handover scenarios.
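    As a rough illustration of the three-stage flow described above (certain vs. uncertain, then ambiguous vs. infeasible, then question generation), here is a toy sketch in which simple heuristics stand in for the paper's LLM-based uncertainty estimation and prompting.

```python
# Toy three-stage pipeline; simple heuristics stand in for the paper's
# LLM-based uncertainty estimation and few-shot prompting.
def classify_command(command: str, scene_objects: set) -> str:
    words = set(command.lower().split())
    if not words & scene_objects:
        return "infeasible"  # command references nothing present in the scene
    if {"it", "that"} & words:
        return "ambiguous"   # unresolved referent
    return "clear"

def handle(command: str, scene_objects: set) -> str:
    label = classify_command(command, scene_objects)
    if label == "clear":
        return f"Executing: {command}"
    if label == "ambiguous":
        # In the paper this step is LLM-driven question generation.
        return f"Clarifying question: which object does '{command}' refer to?"
    return f"Rejecting '{command}': not feasible in this scene."

scene = {"mug", "table", "drawer"}
for cmd in ["pick up the mug", "put it on the table", "water the plant"]:
    print(handle(cmd, scene))
```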