Sunipa Dev

Sunipa Dev is a Senior Research Scientist in the Responsible AI and Human Centered Technologies organization. Her research centers on auditing AI systems, such as large language models, for fairness, interpretability, transparency, and inclusivity.
Authored Publications
    Large language models (LLMs) trained on real-world data can inadvertently reflect harmful societal biases, particularly toward historically marginalized communities. While previous work has primarily focused on harms related to age and race, emerging research has shown that biases toward disabled communities exist. This study extends prior work exploring the existence of harms by identifying categories of LLM-perpetuated harms toward the disability community. We conducted 19 focus groups, during which 56 participants with disabilities probed a dialog model about disability and discussed and annotated its responses. Participants rarely characterized model outputs as blatantly offensive or toxic. Instead, participants used nuanced language to detail how the dialog model mirrored subtle yet harmful stereotypes they encountered in their lives and dominant media, e.g., inspiration porn and able-bodied saviors. Participants often implicated training data as a cause for these stereotypes and recommended training the model on diverse identities from disability-positive resources. Our discussion further explores representative data strategies to mitigate harm related to different communities through annotation co-design with ML researchers and developers.
    Measurements of fairness in NLP have been critiqued for lacking concrete definitions of the biases or harms measured, and for perpetuating a singular, Western narrative of fairness globally. To combat some of these pivotal issues, methods for curating datasets and benchmarks that target specific harms are rapidly emerging. However, these methods still face the significant challenge of achieving coverage over global cultures and perspectives at scale. To address this, in this paper we highlight the utility and importance of complementary approaches in these curation strategies, which leverage both community engagement and large generative models. We specifically target the harm of stereotyping and demonstrate a pathway to build a benchmark that covers stereotypes about diverse and intersectional identities.
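    As a loose illustration of the model-assisted half of such a curation pipeline, the sketch below enumerates intersectional identity terms and seeds candidate-eliciting prompts. The identity axes, prompt wording, and helper names are invented for illustration, and the community-engagement half (validation and contextualization by affected communities) is not something code can stand in for.

```python
from itertools import product

def intersectional_identities(axes):
    """Cross identity axes (e.g., region x gender) to enumerate the
    intersectional identity terms a benchmark should try to cover."""
    return [" ".join(combo) for combo in product(*axes.values())]

def candidate_prompts(identities):
    """Seed prompts that ask a generative model for commonly held (not
    endorsed) associations; outputs are candidates only, to be reviewed,
    validated, and annotated by raters from the relevant communities."""
    return [f"List phrases people commonly associate with {who}." for who in identities]

# Hypothetical axes, purely for illustration.
axes = {"region": ["North Indian", "South Indian"], "gender": ["women", "men"]}
prompts = candidate_prompts(intersectional_identities(axes))
# Each prompt would be sent to an LLM; every generated candidate is then routed
# to community annotators rather than treated as a confirmed stereotype.
```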
    The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks
    Nikil Selvam
    Daniel Khashabi
    Tushar Khot
    Kai-Wei Chang
    ACL (2023)
    How reliably can we trust the scores obtained from social bias benchmarks as faithful indicators of problematic social biases in a given model? In this work, we study this question by contrasting social biases with non-social biases that might not even be discernible to the human eye. To do so, we empirically simulate various alternative constructions of a given benchmark based on innocuous modifications (such as paraphrasing or random sampling) that maintain the essence of its social bias. On two well-known social bias benchmarks, Winogender (Rudinger et al., 2019) and BiasNLI (Dev et al., 2020), we observe that the choice of these shallow modifications has a surprising effect on the resulting degree of bias across various models. We hope these troubling observations motivate more robust measures of social bias.
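    A minimal sketch of the random-sampling variant of this experiment, under the assumption that a benchmark is a list of examples and that the caller supplies its own bias-scoring function (score_fn below is a placeholder, not the paper's code):

```python
import random
import statistics

def resample_benchmark(examples, k, seed):
    """One 'innocuous' alternative construction: a random subset of size k."""
    rng = random.Random(seed)
    return rng.sample(examples, min(k, len(examples)))

def score_spread(score_fn, model, examples, k=500, n_trials=20):
    """Score the same model on many resampled constructions of a benchmark and
    report how much the resulting bias measurement moves around."""
    scores = [score_fn(model, resample_benchmark(examples, k, seed))
              for seed in range(n_trials)]
    return statistics.mean(scores), statistics.stdev(scores)
```

    A large spread across resampled constructions, relative to the gaps between models, would be exactly the kind of instability the paper warns about.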
    Gender bias in language technologies has been widely studied, but research has mostly been restricted to a binary paradigm of gender. It is important to also consider non-binary gender identities, as excluding them can cause further harm to an already marginalized group. One way in which English-speaking individuals linguistically encode their gender identity is through third-person personal pronoun declarations. This is often done using two or more pronoun forms, e.g., xe/xem, or xe/xem/xyr. In this paper, we comprehensively evaluate state-of-the-art language models for their ability to correctly use declared third-person personal pronouns. As far as we are aware, we are the first to do so. We evaluate language models in both zero-shot and few-shot settings. Models are still far from gendering non-binary individuals accurately in the zero-shot setting, and most also struggle with correctly using gender-neutral pronouns (singular they, them, their, etc.). This poor performance may be due to the lack of representation of non-binary pronouns in pre-training corpora and to memorized associations between pronouns and names. We find an overall improvement in performance for non-binary pronouns when using in-context learning, demonstrating that language models with few-shot capabilities can adapt to using declared pronouns correctly.
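    A rough sketch of the kind of few-shot probe described here, with invented names, pronoun sets, and example sentences rather than the paper's actual templates or data:

```python
def build_prompt(shots, name, declared_pronouns):
    """Assemble a few-shot prompt: each shot declares a person's pronouns and
    shows a sentence using them correctly; the final line introduces a new
    person with declared pronouns and leaves a sentence for the model to finish."""
    lines = [f"{n} uses {p} pronouns. {s}" for n, p, s in shots]
    lines.append(f"{name} uses {declared_pronouns} pronouns. {name} packed a lunch because")
    return "\n".join(lines)

shots = [
    ("Alex", "xe/xem/xyr", "Alex forgot xyr umbrella, so xe borrowed one."),
    ("Sam", "they/them/their", "Sam called their friend before they left."),
]
prompt = build_prompt(shots, "Riya", "xe/xem/xyr")
# The model's completion would then be checked for the declared forms
# ("xe", "xem", "xyr") rather than a misgendering default such as "he" or "she".
```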
    Along with the recent advances in large language modeling, there is growing concern that language technologies may reflect, propagate, and amplify various social stereotypes about groups of people. Publicly available stereotype benchmarks play a crucial role in detecting and mitigating this issue in language technologies to prevent both representational and allocational harms in downstream applications. However, existing stereotype benchmarks are limited in their size and coverage, largely restricted to stereotypes prevalent in Western society. This is especially problematic as language technologies are gaining hold across the globe. To address this gap, we present SeeGULL, a broad-coverage stereotype dataset that expands coverage by utilizing the generative capabilities of large language models such as PaLM and GPT-3, and by leveraging a globally diverse rater pool to validate the prevalence of those stereotypes in society. SeeGULL is an order of magnitude larger than existing benchmarks and contains stereotypes for 179 identity groups spanning 6 continents, 8 different regions, 178 countries, 50 US states, and 31 Indian states and union territories. We also collect fine-grained offensiveness scores for different stereotypes and demonstrate how stereotype perceptions for the same identity group differ across in-region vs. out-region annotators.
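    A simplified sketch of the rater-validation step, assuming candidate (identity, attribute) tuples have already been generated and that each annotation record carries both the rater's region and the identity's region; the field names and aggregation below are illustrative, not SeeGULL's released schema.

```python
from collections import defaultdict

def aggregate(annotations):
    """annotations: list of dicts such as
       {"identity": ..., "identity_region": ..., "attribute": ...,
        "rater_region": ..., "is_stereotype": bool, "offensiveness": int}.
    For each (identity, attribute) tuple and rater group (in-region vs.
    out-region relative to the identity), compute the share of raters who
    marked the tuple as a prevalent stereotype and its mean offensiveness."""
    buckets = defaultdict(list)
    for a in annotations:
        group = "in-region" if a["rater_region"] == a["identity_region"] else "out-region"
        buckets[(a["identity"], a["attribute"], group)].append(a)
    return {
        key: {
            "prevalence": sum(x["is_stereotype"] for x in items) / len(items),
            "mean_offensiveness": sum(x["offensiveness"] for x in items) / len(items),
        }
        for key, items in buckets.items()
    }
```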
    Re-contextualizing Fairness in NLP: The Case of India
    Shaily Bhatt
    In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP) (2022)
    Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West and are not directly portable to other geo-cultural contexts. In this paper, we focus on NLP fairness in the context of India. We start with a brief account of the prominent axes of social disparity in India. We build resources for fairness evaluation in the Indian context and use them to demonstrate prediction biases along some of these axes. We then delve deeper into social stereotypes for Region and Religion, demonstrating their prevalence in corpora and models. Finally, we outline a holistic research agenda to re-contextualize NLP fairness research for the Indian context, accounting for Indian societal context, bridging technological gaps in NLP capabilities and resources, and adapting to Indian cultural values. While we focus on India, this framework can be generalized to other geo-cultural contexts.
    PaLM: Scaling Language Modeling with Pathways
    Sharan Narang
    Jacob Devlin
    Maarten Bosma
    Hyung Won Chung
    Sebastian Gehrmann
    Parker Schuh
    Sasha Tsvyashchenko
    Abhishek Rao
    Yi Tay
    Noam Shazeer
    Nan Du
    Reiner Pope
    James Bradbury
    Guy Gur-Ari
    Toju Duke
    Henryk Michalewski
    Xavier Garcia
    Liam Fedus
    David Luan
    Barret Zoph
    Ryan Sepassi
    David Dohan
    Shivani Agrawal
    Mark Omernick
    Marie Pellat
    Aitor Lewkowycz
    Erica Moreira
    Rewon Child
    Oleksandr Polozov
    Zongwei Zhou
    Michele Catasta
    Jason Wei
    arXiv:2204.02311 (2022)
    Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion-parameter, densely activated Transformer language model, which we call the Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the fine-tuned state of the art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis of bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and potential mitigation strategies.
    On Measurements of Bias and Fairness in NLP
    Emily Sheng
    Jieyu Zhao
    Aubrie Amstutz
    Jiao Sun
    Yu Hou
    Mattie Sanseverino
    Jiin Kim
    Akihiro Nishi
    Nanyun Peng
    Kai-Wei Chang
    AACL (2022)
    Recent studies show that Natural Language Processing (NLP) models propagate societal biases about protected attributes such as gender, race, and nationality. While existing works propose bias evaluation and mitigation methods for various tasks, there remains a need to cohesively understand the biases and normative harms these measures capture and how different measures compare. To address this gap, this work presents a comprehensive survey of existing bias measures in NLP (both intrinsic measures of representations and extrinsic measures of downstream applications) and organizes them through associated NLP tasks, metrics, datasets, societal biases, and corresponding harms. This survey also organizes commonly used NLP fairness metrics into different categories to present advantages, disadvantages, and correlations with general fairness metrics common in machine learning.
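    As one concrete example of the intrinsic, representation-level measures such a survey covers, a WEAT-style (word embedding association test) score can be computed directly on word vectors. The sketch below assumes embeddings are supplied as NumPy arrays and is not code from the survey itself.

```python
import numpy as np

def _cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def _association(w, A, B):
    """s(w, A, B): mean cosine similarity of w to attribute set A minus
    its mean cosine similarity to attribute set B."""
    return np.mean([_cosine(w, a) for a in A]) - np.mean([_cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Effect size between target term vectors X, Y (e.g., two identity groups)
    and attribute term vectors A, B (e.g., pleasant vs. unpleasant words);
    values far from zero indicate a stronger differential association."""
    x_assoc = [_association(x, A, B) for x in X]
    y_assoc = [_association(y, A, B) for y in Y]
    pooled_std = np.std(x_assoc + y_assoc, ddof=1)
    return (np.mean(x_assoc) - np.mean(y_assoc)) / pooled_std
```

    Extrinsic measures, by contrast, score a downstream system's predictions rather than the representations themselves, which is part of why the survey organizes the two families separately.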
    Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West and are not directly portable to other geo-cultural contexts. In this position paper, we outline a holistic research agenda to re-contextualize NLP fairness research for the Indian context, accounting for Indian societal context, bridging technological gaps in capability and resources, and adapting to Indian cultural values. We also report high-level findings from an empirical study of various social stereotypes along the Region and Religion axes in the Indian context, demonstrating their prevalence in corpora and models.
    Socially Aware Bias Measurements for Hindi Language Representations
    Vijit Malik
    Akihiro Nishi
    Nanyun Peng
    Kai-Wei Chang
    NAACL Main Conference (2022)
    Language representations are an efficient tool used across NLP, but they are rife with encoded societal biases. These biases have been studied extensively, but with a primary focus on English language representations and biases common in the context of Western society. In this work, we investigate biases present in Hindi language representations, such as caste- and religion-associated biases. We demonstrate how biases are unique to specific language representations based on the history and culture of the region in which the language is widely spoken, and also how the same societal bias (such as binary gender-associated biases), when investigated across languages, is encoded by different words and text spans. With this work, we emphasize the necessity of social awareness, along with linguistic and grammatical artifacts, when modeling language representations in order to understand the biases encoded.