Rosanne Liu

Rosanne Liu is a research scientist at Google Brain and the co-founder and executive director of ML Collective, a non-profit organization providing research training for all. She was also a founding member of Uber AI. She obtained her PhD in Computer Science at Northwestern University. Her research has been published at NeurIPS, ICLR, ICML, Science and other top venues, and featured by WIRED, MIT Tech Review and Fortune. She builds communities for underrepresented and underprivileged researchers, organizes symposiums and workshops, and has organized the weekly reading group “Deep Learning: Classics and Trends” since 2018. She serves as the Diversity, Equity & Inclusion chair of ICLR 2022 and 2023.
Authored Publications
    Extremely Simple Activation Shaping for Out-of-Distribution Detection
    Andrija Djurisic
    Arjun Ashok
    Nebojsa Bozanic
    NeurIPS 2022 (2023)
    The separation between training and deployment of machine learning models implies that not all scenarios encountered in deployment can be anticipated during training, and therefore relying solely on advancements in training has its limits. Out-of-distribution (OOD) detection is an important area that stress-tests a model's ability to handle unseen situations: Do models know when they don't know? Existing OOD detection methods either incur extra training steps, additional data or make nontrivial modifications to the trained network. In contrast, in this work, we propose an extremely simple, post-hoc, on-the-fly activation shaping method, ASH, where a large portion (e.g. 90%) of a sample's activation at a late layer is removed, and the rest (e.g. 10%) simplified or lightly adjusted. The shaping is applied at inference time, and does not require any statistics calculated from training data. Experiments show that such a simple treatment enhances in-distribution and out-of-distribution distinction so as to allow state-of-the-art OOD detection on ImageNet, and does not noticeably deteriorate the in-distribution accuracy. Video, animation and code can be found at: https://andrijazz.github.io/ash
    Character-Aware Models Improve Visual Text Rendering
    Chitwan Saharia
    William Chan
    Sharan Narang
    Irina Blok
    RJ Mical
    Mohammad Norouzi
    Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (2023)
    Current image generation models struggle to reliably produce well-formed visual text. In this paper, we investigate a key contributing factor: popular text-to-image models lack character-level input features, making it much harder to predict a word's visual makeup as a series of glyphs. To quantify this effect, we conduct a series of experiments comparing character-aware vs. character-blind text encoders. In the text-only domain, we find that character-aware models provide large gains on a novel spelling task (WikiSpell). Applying our learnings to the visual domain, we train a suite of image generation models, and show that character-aware variants outperform their character-blind counterparts across a range of novel text rendering tasks (our DrawText benchmark). Our models set a much higher state-of-the-art on visual spelling, with 30+ point accuracy gains over competitors on rare words, despite training on far fewer examples.
    Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to direct future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench consists of 207 tasks, contributed by over 400 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on capabilities that are believed to be beyond current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. A team of human experts further performed all tasks, to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with human performance); model performance is remarkably similar across model classes; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit “breakthrough” behavior at a critical scale often involve a significant reasoning or algorithmic component; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.